Author

Martin R. Oswald

Other affiliations: Technische Universität München, University of Bonn, University of Zurich

Bio: Martin R. Oswald is an academic researcher from ETH Zurich. The author has contributed to research on topics including computer science and 3D reconstruction. The author has an h-index of 16 and has co-authored 78 publications receiving 907 citations. Previous affiliations of Martin R. Oswald include Technische Universität München and the University of Bonn.

Papers published on a yearly basis (chart, 2009 to 2023)

Papers

Posted Content•

3D Instance Segmentation via Multi-Task Metric Learning

Jean Lahoud, Bernard Ghanem, Marc Pollefeys, Martin R. Oswald

20 Jun 2019-arXiv: Computer Vision and Pattern Recognition

TL;DR: This work proposes a novel method for instance label segmentation of dense 3D voxel grids that achieves state-of-the-art performance on the ScanNet 3D instance segmentation benchmark.

Abstract: We propose a novel method for instance label segmentation of dense 3D voxel grids. We target volumetric scene representations, which have been acquired with depth sensors or multi-view stereo methods and which have been processed with semantic 3D reconstruction or scene completion methods. The main task is to learn shape information about individual object instances in order to accurately separate them, including connected and incompletely scanned objects. We solve the 3D instance-labeling problem with a multi-task learning strategy. The first goal is to learn an abstract feature embedding, which groups voxels with the same instance label close to each other while separating clusters with different instance labels from each other. The second goal is to learn instance information by densely estimating directional information of the instance's center of mass for each voxel. This is particularly useful for finding instance boundaries in the clustering post-processing step, as well as for scoring the segmentation quality for the first goal. Both synthetic and real-world experiments demonstrate the viability and merits of our approach. In fact, it achieves state-of-the-art performance on the ScanNet 3D instance segmentation benchmark.

116 citations
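
The first training goal described in the abstract above (grouping voxels of the same instance close together in embedding space while separating different instances) can be illustrated with a small pull/push loss. This is only a sketch under assumptions: the hinge form, the margins `pull_margin` and `push_margin`, and the function name `embedding_loss` are illustrative choices and are not taken from the paper. The second goal (direction towards the instance center) is not shown.

```python
import math
from collections import defaultdict


def embedding_loss(embeddings, labels, pull_margin=0.5, push_margin=1.5):
    """Pull/push loss over per-voxel embeddings (hypothetical sketch).

    embeddings: list of equal-length float lists, one per voxel.
    labels: list of instance ids, one per voxel.
    """
    # Group voxel embeddings by instance label.
    groups = defaultdict(list)
    for emb, label in zip(embeddings, labels):
        groups[label].append(emb)

    # Mean embedding of every instance.
    means = {
        label: [sum(col) / len(embs) for col in zip(*embs)]
        for label, embs in groups.items()
    }

    # Pull term: penalize voxels farther than pull_margin from their mean.
    pull = 0.0
    for label, embs in groups.items():
        pull += sum(
            max(0.0, math.dist(e, means[label]) - pull_margin) ** 2 for e in embs
        ) / len(embs)
    pull /= len(groups)

    # Push term: penalize pairs of instance means closer than push_margin.
    ids = sorted(means)
    pairs = [(a, b) for i, a in enumerate(ids) for b in ids[i + 1:]]
    push = 0.0
    if pairs:
        push = sum(
            max(0.0, push_margin - math.dist(means[a], means[b])) ** 2
            for a, b in pairs
        ) / len(pairs)

    return pull + push


voxels = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.0, 5.1]]
print(embedding_loss(voxels, [0, 0, 1, 1]))  # compact, well-separated instances
print(embedding_loss(voxels, [0, 1, 0, 1]))  # instances mixed across clusters
```

With such a loss, a labeling whose instances form tight, distant clusters scores lower than one whose instances overlap, which is the property a clustering post-processing step would rely on.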

Proceedings Article•DOI•

3D Instance Segmentation via Multi-Task Metric Learning

Jean Lahoud¹, Bernard Ghanem¹, Martin R. Oswald², Marc Pollefeys²•Institutions (2)

King Abdullah University of Science and Technology¹, ETH Zurich²

20 Jun 2019

TL;DR: In this paper, a multi-task learning strategy is proposed to learn an abstract feature embedding, which groups voxels with the same instance label close to each other while separating clusters with different instance labels from each other.

106 citations

Book Chapter•DOI•

A Symmetry Prior for Convex Variational 3D Reconstruction

Pablo Speciale¹, Martin R. Oswald¹, Andrea Cohen¹, Marc Pollefeys¹,²•Institutions (2)

ETH Zurich¹, Microsoft²

08 Oct 2016

TL;DR: A novel prior for variational 3D reconstruction that favors symmetric solutions when dealing with noisy or incomplete data and is able to denoise and complete surface geometry and even hallucinate large scene parts is proposed.

Abstract: We propose a novel prior for variational 3D reconstruction that favors symmetric solutions when dealing with noisy or incomplete data. We detect symmetries from incomplete data while explicitly handling unexplored areas to allow for plausible scene completions. The set of detected symmetries is then enforced on their respective support domain within a variational reconstruction framework. This formulation also handles multiple symmetries sharing the same support. The proposed approach is able to denoise and complete surface geometry and even hallucinate large scene parts. We demonstrate in several experiments the benefit of harnessing symmetries when regularizing a surface.

61 citations

Proceedings Article•DOI•

Multi-Label Semantic 3D Reconstruction Using Voxel Blocks

Ian Cherabier¹, Christian Häne², Martin R. Oswald¹, Marc Pollefeys¹•Institutions (2)

ETH Zurich¹, University of California, Berkeley²

01 Oct 2016

TL;DR: This work proposes a way to reduce the memory consumption of existing methods by determining early on in the reconstruction process which labels need to be active in which block, and shows results of joint semantic 3D reconstruction and semantic segmentation with significantly more labels than previous approaches were able to handle.

Abstract: Techniques that jointly perform dense 3D reconstruction and semantic segmentation have recently shown very promising results. One major restriction so far is that they can often only handle a very low number of semantic labels. This is mostly due to their high memory consumption caused by the necessity to store indicator variables for every label and transition. We propose a way to reduce the memory consumption of existing methods. Our approach is based on the observation that many semantic labels are only present at very localized positions in the scene, such as cars. Therefore this label does not need to be active at every location. We exploit this observation by dividing the scene into blocks in which generally only a subset of labels is active. By determining early on in the reconstruction process which labels need to be active in which block the memory consumption can be significantly reduced. In order to recover from mistakes we propose to update the set of active labels during the iterative optimization procedure based on the current solution. We also propose a way to initialize the set of active labels using a boosted classifier. In our experimental evaluation we show the reduction of memory usage quantitatively. Eventually, we show results of joint semantic 3D reconstruction and semantic segmentation with significantly more labels than previous approaches were able to handle.

60 citations
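
The memory argument in the abstract above (indicator variables for every label and transition, reduced by keeping only a subset of labels active per block) can be made concrete with a rough count. This is a sketch under a simplifying assumption that is not taken from the paper: each voxel stores one indicator per label plus one per ordered pair of labels. The function names and the example block sizes are hypothetical.

```python
def dense_indicator_count(num_voxels, num_labels):
    """Indicators when every label is active at every voxel.

    Assumption: one indicator per label plus one per ordered label pair
    (transition) at each voxel.
    """
    return num_voxels * (num_labels + num_labels * num_labels)


def blockwise_indicator_count(voxels_per_block, active_label_sets):
    """Indicators when each block only stores its own active labels."""
    return sum(
        dense_indicator_count(voxels_per_block, len(active))
        for active in active_label_sets
    )


# Four blocks of 8 x 8 x 8 voxels, 10 labels in total, few active per block.
voxels_per_block = 8 * 8 * 8
active = [{0, 1}, {0, 1, 2}, {0}, {0, 1}]

dense = dense_indicator_count(len(active) * voxels_per_block, 10)
sparse = blockwise_indicator_count(voxels_per_block, active)
print(dense, sparse)
```

Under this assumption the count grows quadratically in the number of active labels, so restricting each block to the few labels it actually needs has a large effect.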

Book Chapter•DOI•

Online Invariance Selection for Local Feature Descriptors

Rémi Pautrat¹, Viktor Larsson¹, Martin R. Oswald¹, Marc Pollefeys¹•Institutions (1)

ETH Zurich¹

23 Aug 2020

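TL;DR: This paper proposes Local Invariance Selection at Runtime for Descriptors (LISRD), which learns local descriptors with different levels of invariance together with meta descriptors, and uses the similarity of the meta descriptors across images to select the right invariance when matching the local descriptors.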

Abstract: To be invariant, or not to be invariant: that is the question formulated in this work about local descriptors. A limitation of current feature descriptors is the trade-off between generalization and discriminative power: more invariance means less informative descriptors. We propose to overcome this limitation with a disentanglement of invariance in local descriptors and with an online selection of the most appropriate invariance given the context. Our framework (https://github.com/rpautrat/LISRD) consists in a joint learning of multiple local descriptors with different levels of invariance and of meta descriptors encoding the regional variations of an image. The similarity of these meta descriptors across images is used to select the right invariance when matching the local descriptors. Our approach, named Local Invariance Selection at Runtime for Descriptors (LISRD), enables descriptors to adapt to adverse changes in images, while remaining discriminative when invariance is not required. We demonstrate that our method can boost the performance of current descriptors and outperforms state-of-the-art descriptors in several matching tasks, when evaluated on challenging datasets with day-night illumination as well as viewpoint changes.

55 citations
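
The selection step described in the abstract above (using the similarity of meta descriptors across images to choose which invariance to trust when matching local descriptors) can be sketched as a weighted similarity. This is a minimal illustration under assumptions: the softmax weighting, the dot-product similarity, and the name `combined_similarity` are illustrative and should not be read as the API of the linked LISRD repository.

```python
import math


def dot(a, b):
    """Dot product of two equal-length float lists."""
    return sum(x * y for x, y in zip(a, b))


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def combined_similarity(local_a, local_b, meta_a, meta_b):
    """Similarity of two keypoints across several invariance variants.

    local_a, local_b: one local descriptor per variant for each keypoint.
    meta_a, meta_b: one meta descriptor per variant for each image region.
    Variants whose meta descriptors agree across the two images get more
    weight in the final local-descriptor similarity.
    """
    weights = softmax([dot(ma, mb) for ma, mb in zip(meta_a, meta_b)])
    sims = [dot(la, lb) for la, lb in zip(local_a, local_b)]
    return sum(w * s for w, s in zip(weights, sims))


# Two variants: the first one matches well, the second one does not.
local_a = [[1.0, 0.0], [0.0, 1.0]]
local_b = [[1.0, 0.0], [1.0, 0.0]]

# Meta descriptors that strongly favor the first variant.
print(combined_similarity(local_a, local_b, [[10.0, 0.0], [0.0, 0.0]],
                          [[10.0, 0.0], [0.0, 0.0]]))
# Uninformative meta descriptors: both variants are weighted equally.
print(combined_similarity(local_a, local_b, [[0.0, 0.0], [0.0, 0.0]],
                          [[0.0, 0.0], [0.0, 0.0]]))
```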

Cited by

Journal Article•

Learning report on “Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks”

杉山拓海

12 Sep 2017-Computers & Graphics

3,940 citations

Proceedings Article•DOI•

ScanNet: Richly-Annotated 3D Reconstructions of Indoor Scenes

Angela Dai¹, Angel X. Chang², Manolis Savva², Maciej Halber², Thomas Funkhouser², Matthias Nießner¹•Institutions (2)

Stanford University¹, Princeton University²

21 Jul 2017

TL;DR: This work introduces ScanNet, an RGB-D video dataset containing 2.5M views in 1513 scenes annotated with 3D camera poses, surface reconstructions, and semantic segmentations, and shows that using this data helps achieve state-of-the-art performance on several 3D scene understanding tasks.

Abstract: A key requirement for leveraging supervised deep learning methods is the availability of large, labeled datasets. Unfortunately, in the context of RGB-D scene understanding, very little data is available – current datasets cover a small range of scene views and have limited semantic annotations. To address this issue, we introduce ScanNet, an RGB-D video dataset containing 2.5M views in 1513 scenes annotated with 3D camera poses, surface reconstructions, and semantic segmentations. To collect this data, we designed an easy-to-use and scalable RGB-D capture system that includes automated surface reconstruction and crowdsourced semantic annotation. We show that using this data helps achieve state-of-the-art performance on several 3D scene understanding tasks, including 3D object classification, semantic voxel labeling, and CAD model retrieval.

2,305 citations

Book Chapter•DOI•

Pixelwise View Selection for Unstructured Multi-View Stereo

Johannes L. Schönberger¹, Enliang Zheng², Jan-Michael Frahm², Marc Pollefeys¹,³•Institutions (3)

ETH Zurich¹, University of North Carolina at Chapel Hill², Microsoft³

08 Oct 2016

TL;DR: The core contributions are the joint estimation of depth and normal information, pixelwise view selection using photometric and geometric priors, and a multi-view geometric consistency term for the simultaneous refinement and image-based depth and normal fusion.

Abstract: This work presents a Multi-View Stereo system for robust and efficient dense modeling from unstructured image collections. Our core contributions are the joint estimation of depth and normal information, pixelwise view selection using photometric and geometric priors, and a multi-view geometric consistency term for the simultaneous refinement and image-based depth and normal fusion. Experiments on benchmarks and large-scale Internet photo collections demonstrate state-of-the-art performance in terms of accuracy, completeness, and efficiency.

1,372 citations

Journal Article•DOI•

Deep Learning for 3D Point Clouds: A Survey

Yulan Guo¹, Hanyun Wang², Qingyong Hu³, Hao Liu¹, Li Liu⁴, Mohammed Bennamoun⁵•Institutions (5)

Sun Yat-sen University¹, PLA Information Engineering University², University of Oxford³, National University of Defense Technology⁴, University of Western Australia⁵

01 Dec 2021-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: This paper presents a comprehensive review of recent progress in deep learning methods for point clouds, covering three major tasks, including 3D shape classification, 3D object detection and tracking, and 3D point cloud segmentation.

Abstract: Point cloud learning has lately attracted increasing attention due to its wide applications in many areas, such as computer vision, autonomous driving, and robotics. As a dominating technique in AI, deep learning has been successfully used to solve various 2D vision problems. However, deep learning on point clouds is still in its infancy due to the unique challenges faced by the processing of point clouds with deep neural networks. Recently, deep learning on point clouds has become even thriving, with numerous methods being proposed to address different problems in this area. To stimulate future research, this paper presents a comprehensive review of recent progress in deep learning methods for point clouds. It covers three major tasks, including 3D shape classification, 3D object detection and tracking, and 3D point cloud segmentation. It also presents comparative results on several publicly available datasets, together with insightful observations and inspiring future research directions.

1,021 citations

Posted Content•

ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes

Angela Dai, Angel X. Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, Matthias Nießner

14 Feb 2017-arXiv: Computer Vision and Pattern Recognition

TL;DR: The ScanNet dataset contains 2.5M RGB-D views in 1513 scenes annotated with 3D camera poses, surface reconstructions, and semantic segmentations.

Abstract: A key requirement for leveraging supervised deep learning methods is the availability of large, labeled datasets. Unfortunately, in the context of RGB-D scene understanding, very little data is available -- current datasets cover a small range of scene views and have limited semantic annotations. To address this issue, we introduce ScanNet, an RGB-D video dataset containing 2.5M views in 1513 scenes annotated with 3D camera poses, surface reconstructions, and semantic segmentations. To collect this data, we designed an easy-to-use and scalable RGB-D capture system that includes automated surface reconstruction and crowdsourced semantic annotation. We show that using this data helps achieve state-of-the-art performance on several 3D scene understanding tasks, including 3D object classification, semantic voxel labeling, and CAD model retrieval. The dataset is freely available at this http URL.

978 citations
