Home
/
Authors
/
Federico Tombari

Author

Federico Tombari

Other affiliations: École Polytechnique Fédérale de Lausanne, Ludwig Maximilian University of Munich, Peking University ...read more

Bio: Federico Tombari is an academic researcher from Technische Universität München. The author has contributed to research in topics: Computer science & Pose. The author has an hindex of 48, co-authored 278 publications receiving 12522 citations. Previous affiliations of Federico Tombari include École Polytechnique Fédérale de Lausanne & Ludwig Maximilian University of Munich.

Topics: Computer science, Pose, Template matching, Point cloud, Object detection ...read more

Papers published on a yearly basis

2023
2022
2021
2020
2019
2018
2017
2016
2015
2014
2013
2012
2011
2010
2009
2008
2007
2006
2005
2004

Papers

PDF

Open Access

More filters

Proceedings Article•DOI•

Deeper Depth Prediction with Fully Convolutional Residual Networks

[...]

Iro Laina, Christian Rupprecht, Vasileios Belagiannis¹, Federico Tombari, Nassir Navab - Show less +1 more•Institutions (1)

University of Oxford¹

01 Oct 2016

TL;DR: A fully convolutional architecture, encompassing residual learning, to model the ambiguous mapping between monocular images and depth maps is proposed and a novel way to efficiently learn feature map up-sampling within the network is presented.

...read moreread less

Abstract: This paper addresses the problem of estimating the depth map of a scene given a single RGB image. We propose a fully convolutional architecture, encompassing residual learning, to model the ambiguous mapping between monocular images and depth maps. In order to improve the output resolution, we present a novel way to efficiently learn feature map up-sampling within the network. For optimization, we introduce the reverse Huber loss that is particularly suited for the task at hand and driven by the value distributions commonly present in depth maps. Our model is composed of a single architecture that is trained end-to-end and does not rely on post-processing techniques, such as CRFs or other additional refinement steps. As a result, it runs in real-time on images or videos. In the evaluation, we show that the proposed model contains fewer parameters and requires fewer training data than the current state of the art, while outperforming all approaches on depth estimation. Code and models are publicly available.

...read moreread less

1,677 citations

Book Chapter•DOI•

Unique signatures of histograms for local surface description

[...]

Federico Tombari¹, Samuele Salti¹, Luigi Di Stefano¹•Institutions (1)

University of Bologna¹

05 Sep 2010

TL;DR: A novel comprehensive proposal for surface representation is formulated, which encompasses a new unique and repeatable local reference frame as well as a new 3D descriptor.

...read moreread less

Abstract: This paper deals with local 3D descriptors for surface matching. First, we categorize existing methods into two classes: Signatures and Histograms. Then, by discussion and experiments alike, we point out the key issues of uniqueness and repeatability of the local reference frame. Based on these observations, we formulate a novel comprehensive proposal for surface representation, which encompasses a new unique and repeatable local reference frame as well as a new 3D descriptor. The latter lays at the intersection between Signatures and Histograms, so as to possibly achieve a better balance between descriptiveness and robustness. Experiments on publicly available datasets as well as on range scans obtained with Spacetime Stereo provide a thorough validation of our proposal.

...read moreread less

1,479 citations

Proceedings Article•DOI•

SSD-6D: Making RGB-Based 3D Detection and 6D Pose Estimation Great Again

[...]

Wadim Kehl¹, Fabian Manhardt¹, Federico Tombari¹, Slobodan Ilic¹, Nassir Navab¹ - Show less +1 more•Institutions (1)

Technische Universität München¹

01 Oct 2017

TL;DR: In this paper, a novel method for detecting 3D model instances and estimating their 6D pose from RGB data in a single shot is presented, which outperforms state-of-the-art methods that leverage RGBD data on multiple challenging datasets.

...read moreread less

Abstract: We present a novel method for detecting 3D model instances and estimating their 6D poses from RGB data in a single shot. To this end, we extend the popular SSD paradigm to cover the full 6D pose space and train on synthetic model data only. Our approach competes or surpasses current state-of-the-art methods that leverage RGBD data on multiple challenging datasets. Furthermore, our method produces these results at around 10Hz, which is many times faster than the related methods. For the sake of reproducibility, we make our trained networks and detection code publicly available.

...read moreread less

901 citations

Posted Content•

Deeper Depth Prediction with Fully Convolutional Residual Networks

[...]

Iro Laina, Christian Rupprecht, Vasileios Belagiannis¹, Federico Tombari, Nassir Navab - Show less +1 more•Institutions (1)

University of Oxford¹

01 Jun 2016-arXiv: Computer Vision and Pattern Recognition

TL;DR: In this article, a fully convolutional architecture, encompassing residual learning, is proposed to model the ambiguous mapping between monocular images and depth maps, which can be trained end-to-end and does not rely on post-processing techniques such as CRFs or other additional refinement steps.

...read moreread less

827 citations

Proceedings Article•DOI•

CNN-SLAM: Real-Time Dense Monocular SLAM with Learned Depth Prediction

[...]

Keisuke Tateno, Federico Tombari, Iro Laina, Nassir Navab

21 Jul 2017

TL;DR: A method where CNN-predicted dense depth maps are naturally fused together with depth measurements obtained from direct monocular SLAM, based on a scheme that privileges depth prediction in image locations where monocularSLAM approaches tend to fail, e.g. along low-textured regions, and vice-versa.

...read moreread less

Abstract: Given the recent advances in depth prediction from Convolutional Neural Networks (CNNs), this paper investigates how predicted depth maps from a deep neural network can be deployed for the goal of accurate and dense monocular reconstruction. We propose a method where CNN-predicted dense depth maps are naturally fused together with depth measurements obtained from direct monocular SLAM, based on a scheme that privileges depth prediction in image locations where monocular SLAM approaches tend to fail, e.g. along low-textured regions, and vice-versa. We demonstrate the use of depth prediction to estimate the absolute scale of the reconstruction, hence overcoming one of the major limitations of monocular SLAM. Finally, we propose a framework to efficiently fuse semantic labels, obtained from a single frame, with dense SLAM, so to yield semantically coherent scene reconstruction from a single view. Evaluation results on two benchmark datasets show the robustness and accuracy of our approach.

...read moreread less

630 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67

Collapse

Cited by

PDF

Open Access

More filters

Book•

Computer Vision: Algorithms and Applications

[...]

Richard Szeliski

30 Sep 2010

TL;DR: Computer Vision: Algorithms and Applications explores the variety of techniques commonly used to analyze and interpret images and takes a scientific approach to basic vision problems, formulating physical models of the imaging process before inverting them to produce descriptions of a scene.

...read moreread less

Abstract: Humans perceive the three-dimensional structure of the world with apparent ease. However, despite all of the recent advances in computer vision research, the dream of having a computer interpret an image at the same level as a two-year old remains elusive. Why is computer vision such a challenging problem and what is the current state of the art? Computer Vision: Algorithms and Applications explores the variety of techniques commonly used to analyze and interpret images. It also describes challenging real-world applications where vision is being successfully used, both for specialized applications such as medical imaging, and for fun, consumer-level tasks such as image editing and stitching, which students can apply to their own personal photos and videos. More than just a source of recipes, this exceptionally authoritative and comprehensive textbook/reference also takes a scientific approach to basic vision problems, formulating physical models of the imaging process before inverting them to produce descriptions of a scene. These problems are also analyzed using statistical models and solved using rigorous engineering techniques Topics and features: structured to support active curricula and project-oriented courses, with tips in the Introduction for using the book in a variety of customized courses; presents exercises at the end of each chapter with a heavy emphasis on testing algorithms and containing numerous suggestions for small mid-term projects; provides additional material and more detailed mathematical topics in the Appendices, which cover linear algebra, numerical techniques, and Bayesian estimation theory; suggests additional reading at the end of each chapter, including the latest research in each sub-field, in addition to a full Bibliography at the end of the book; supplies supplementary course material for students at the associated website, http://szeliski.org/Book/. Suitable for an upper-level undergraduate or graduate-level course in computer science or engineering, this textbook focuses on basic techniques that work under real-world conditions and encourages students to push their creative boundaries. Its design and exposition also make it eminently suitable as a unique reference to the fundamental techniques and current research literature in computer vision.

...read moreread less

4,146 citations

Journal Article•

“Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks”の学習報告

[...]

杉山拓海

12 Sep 2017-Computers & Graphics

3,940 citations

Journal Article•DOI•

Dynamic Graph CNN for Learning on Point Clouds

[...]

Yue Wang¹, Yongbin Sun¹, Ziwei Liu², Sanjay E. Sarma¹, Michael M. Bronstein³, Justin Solomon¹ - Show less +2 more•Institutions (3)

Massachusetts Institute of Technology¹, University of California, Berkeley², Imperial College London³

10 Oct 2019-ACM Transactions on Graphics

TL;DR: This work proposes a new neural network module suitable for CNN-based high-level tasks on point clouds, including classification and segmentation called EdgeConv, which acts on graphs dynamically computed in each layer of the network.

...read moreread less

Abstract: Point clouds provide a flexible geometric representation suitable for countless applications in computer graphics; they also comprise the raw output of most 3D data acquisition devices. While hand-designed features on point clouds have long been proposed in graphics and vision, however, the recent overwhelming success of convolutional neural networks (CNNs) for image analysis suggests the value of adapting insight from CNN to the point cloud world. Point clouds inherently lack topological information, so designing a model to recover topology can enrich the representation power of point clouds. To this end, we propose a new neural network module dubbed EdgeConv suitable for CNN-based high-level tasks on point clouds, including classification and segmentation. EdgeConv acts on graphs dynamically computed in each layer of the network. It is differentiable and can be plugged into existing architectures. Compared to existing modules operating in extrinsic space or treating each point independently, EdgeConv has several appealing properties: It incorporates local neighborhood information; it can be stacked applied to learn global shape properties; and in multi-layer systems affinity in feature space captures semantic characteristics over potentially long distances in the original embedding. We show the performance of our model on standard benchmarks, including ModelNet40, ShapeNetPart, and S3DIS.

...read moreread less

3,727 citations

The PASCAL Visual Object Classes Challenge

[...]

Jianguo Zhang

01 Jan 2006

3,012 citations

Posted Content•

Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning

[...]

Jean-Bastien Grill, Florian Strub, Florent Altché, Corentin Tallec, Pierre H. Richemond¹, Elena Buchatskaya, Carl Doersch, Bernardo Avila Pires, Zhaohan Daniel Guo, Mohammad Gheshlaghi Azar, Bilal Piot, Koray Kavukcuoglu, Rémi Munos, Michal Valko - Show less +10 more•Institutions (1)

Imperial College London¹

13 Jun 2020-arXiv: Learning

TL;DR: This work introduces Bootstrap Your Own Latent (BYOL), a new approach to self-supervised image representation learning that performs on par or better than the current state of the art on both transfer and semi- supervised benchmarks.

...read moreread less

Abstract: We introduce Bootstrap Your Own Latent (BYOL), a new approach to self-supervised image representation learning. BYOL relies on two neural networks, referred to as online and target networks, that interact and learn from each other. From an augmented view of an image, we train the online network to predict the target network representation of the same image under a different augmented view. At the same time, we update the target network with a slow-moving average of the online network. While state-of-the art methods rely on negative pairs, BYOL achieves a new state of the art without them. BYOL reaches $74.3\%$ top-1 classification accuracy on ImageNet using a linear evaluation with a ResNet-50 architecture and $79.6\%$ with a larger ResNet. We show that BYOL performs on par or better than the current state of the art on both transfer and semi-supervised benchmarks. Our implementation and pretrained models are given on GitHub.

...read moreread less

2,942 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200

Collapse