Journal ArticleDOI

Accurate, Dense, and Robust Multiview Stereopsis

01 Aug 2010-IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE Computer Society)-Vol. 32, Iss: 8, pp 1362-1376
TL;DR: A novel algorithm for multiview stereopsis that outputs a dense set of small rectangular patches covering the surfaces visible in the images, which outperforms all others submitted so far for four out of the six data sets.
Abstract: This paper proposes a novel algorithm for multiview stereopsis that outputs a dense set of small rectangular patches covering the surfaces visible in the images. Stereopsis is implemented as a match, expand, and filter procedure, starting from a sparse set of matched keypoints, and repeatedly expanding these before using visibility constraints to filter away false matches. The keys to the performance of the proposed algorithm are effective techniques for enforcing local photometric consistency and global visibility constraints. Simple but effective methods are also proposed to turn the resulting patch model into a mesh which can be further refined by an algorithm that enforces both photometric consistency and regularization constraints. The proposed approach automatically detects and discards outliers and obstacles and does not require any initialization in the form of a visual hull, a bounding box, or valid depth ranges. We have tested our algorithm on various data sets including objects with fine surface details, deep concavities, and thin structures, outdoor scenes observed from a restricted set of viewpoints, and "crowded" scenes where moving obstacles appear in front of a static structure of interest. A quantitative evaluation on the Middlebury benchmark [1] shows that the proposed method outperforms all others submitted so far for four out of the six data sets.
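The match, expand, and filter procedure described in the abstract can be sketched in a few lines. This is a toy illustration only: the `neighbors` and `photo_score` callables are hypothetical stand-ins for the paper's patch geometry, photometric consistency test, and visibility reasoning.

```python
def reconstruct(seed_patches, neighbors, photo_score, min_score=0.7, iterations=3):
    """Grow a patch set from sparse seeds, then prune weak patches.

    seed_patches: iterable of hashable patch ids (e.g. grid cells)
    neighbors:    patch id -> iterable of adjacent candidate patch ids
    photo_score:  patch id -> photometric consistency in [0, 1]
    """
    # Match: keep only seeds that are photometrically consistent.
    patches = {p for p in seed_patches if photo_score(p) >= min_score}
    for _ in range(iterations):
        # Expand: try to instantiate new patches next to existing ones.
        for p in list(patches):
            for q in neighbors(p):
                if q not in patches and photo_score(q) >= min_score:
                    patches.add(q)
        # Filter: discard patches that fail the consistency test
        # (the real algorithm uses visibility constraints here).
        patches = {p for p in patches if photo_score(p) >= min_score}
    return patches
```

In the actual algorithm the filter step rejects patches whose visibility across images contradicts the current model, not merely those below a score threshold.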
Citations
Book ChapterDOI
08 Oct 2016
TL;DR: The core contributions are the joint estimation of depth and normal information, pixelwise view selection using photometric and geometric priors, and a multi-view geometric consistency term for the simultaneous refinement and image-based depth and normal fusion.
Abstract: This work presents a Multi-View Stereo system for robust and efficient dense modeling from unstructured image collections. Our core contributions are the joint estimation of depth and normal information, pixelwise view selection using photometric and geometric priors, and a multi-view geometric consistency term for the simultaneous refinement and image-based depth and normal fusion. Experiments on benchmarks and large-scale Internet photo collections demonstrate state-of-the-art performance in terms of accuracy, completeness, and efficiency.
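The photometric priors mentioned in the abstract typically reduce to a windowed normalized cross-correlation (NCC) between intensity patches. A minimal sketch, assuming grayscale NumPy patches of equal shape (the full system's view selection and fusion machinery is far beyond a snippet):

```python
import numpy as np

def ncc(a, b, eps=1e-8):
    """Normalized cross-correlation of two intensity patches, in [-1, 1]."""
    a = (a - a.mean()) / (a.std() + eps)  # zero-mean, unit-variance
    b = (b - b.mean()) / (b.std() + eps)
    return float((a * b).mean())
```

Identical patches score near +1, contrast-inverted patches near -1, which is why NCC is robust to affine brightness changes between views.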

1,372 citations


Cites background from "Accurate, Dense, and Robust Multiview Stereopsis"

  • ...6 and 5(c) show depth/normal maps, and the supplementary material provides more results and comparisons against [9,10,47]....


Journal ArticleDOI
TL;DR: The paper reports the state of the art of UAVs for geomatics applications, giving an overview of different UAV platforms, applications, and case studies, and showing the latest developments in UAV image processing.
Abstract: Unmanned aerial vehicle (UAV) platforms are nowadays a valuable source of data for inspection, surveillance, mapping, and 3D modeling issues. As UAVs can be considered as a low-cost alternative to the classical manned aerial photogrammetry, new applications in the short- and close-range domain are introduced. Rotary or fixed-wing UAVs, capable of performing the photogrammetric data acquisition with amateur or SLR digital cameras, can fly in manual, semiautomated, and autonomous modes. Following a typical photogrammetric workflow, 3D results like digital surface or terrain models, contours, textured 3D models, vector information, etc. can be produced, even on large areas. The paper reports the state of the art of UAV for geomatics applications, giving an overview of different UAV platforms, applications, and case studies, showing also the latest developments of UAV image processing. New perspectives are also addressed.

1,358 citations

Journal ArticleDOI
TL;DR: This paper presents RGB-D Mapping, a full 3D mapping system that utilizes a novel joint optimization algorithm combining visual features and shape-based alignment to achieve globally consistent maps.
Abstract: RGB-D cameras (such as the Microsoft Kinect) are novel sensing systems that capture RGB images along with per-pixel depth information. In this paper we investigate how such cameras can be used for building dense 3D maps of indoor environments. Such maps have applications in robot navigation, manipulation, semantic mapping, and telepresence. We present RGB-D Mapping, a full 3D mapping system that utilizes a novel joint optimization algorithm combining visual features and shape-based alignment. Visual and depth information are also combined for view-based loop-closure detection, followed by pose optimization to achieve globally consistent maps. We evaluate RGB-D Mapping on two large indoor environments, and show that it effectively combines the visual and shape information available from RGB-D cameras.

1,223 citations


Cites background or methods from "Accurate, Dense, and Robust Multiview Stereopsis"

  • ...In the vision and graphics communities, there has been a large amount of work on dense reconstruction from videos (e.g. Pollefeys et al. 2008) and photos (e.g. Debevec et al. 1996; Furukawa and Ponce 2010), mostly on objects or outdoor scenes....


  • ...For example, patch-based multi-view stereo (PMVS; Furukawa and Ponce, 2010) can generate quite accurate reconstructions using a visual consistency measure, and it would be exciting to apply these techniques to our maps....


  • ...In the conference version of this paper (Henry et al. 2010), we used SIFT features computed with SIFTGPU (Wu 2007)....


Proceedings ArticleDOI
13 May 2019
TL;DR: Pixel-aligned Implicit Function (PIFu) locally aligns pixels of 2D images with the global context of their corresponding 3D object to produce high-resolution surfaces, including largely unseen regions such as the back of a person.
Abstract: We introduce Pixel-aligned Implicit Function (PIFu), an implicit representation that locally aligns pixels of 2D images with the global context of their corresponding 3D object. Using PIFu, we propose an end-to-end deep learning method for digitizing highly detailed clothed humans that can infer both 3D surface and texture from a single image, and optionally, multiple input images. Highly intricate shapes, such as hairstyles, clothing, as well as their variations and deformations can be digitized in a unified way. Compared to existing representations used for 3D deep learning, PIFu produces high-resolution surfaces including largely unseen regions such as the back of a person. In particular, it is memory efficient unlike the voxel representation, can handle arbitrary topology, and the resulting surface is spatially aligned with the input image. Furthermore, while previous techniques are designed to process either a single image or multiple views, PIFu extends naturally to arbitrary number of views. We demonstrate high-resolution and robust reconstructions on real world images from the DeepFashion dataset, which contains a variety of challenging clothing types. Our method achieves state-of-the-art performance on a public benchmark and outperforms the prior work for clothed human digitization from a single image.
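Conceptually, a pixel-aligned implicit function evaluates a network on an image feature sampled at each 3D point's projection, together with that point's depth. A schematic sketch; the `cam` and `mlp` callables here are hypothetical stand-ins (the real method uses bilinear feature sampling and a trained MLP):

```python
import numpy as np

def pifu_query(feature_map, mlp, points, cam):
    """Evaluate an implicit occupancy function at 3D points.

    feature_map: H x W x C array of image features
    mlp:         (feature_vector, depth) -> occupancy value
    cam:         3D point -> (u, v, z) image coordinates and depth
    """
    out = []
    for p in points:
        u, v, z = cam(p)              # project the point into the image
        feat = feature_map[v, u]      # pixel-aligned feature (nearest pixel)
        out.append(mlp(feat, z))      # query occupancy at this point
    return np.array(out)
```

The surface is then extracted as a level set of this function, which is why the representation is memory-efficient compared to dense voxel grids.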

907 citations

Proceedings ArticleDOI
23 Jun 2008
TL;DR: Opens the discussion on whether image-based 3D modelling techniques can replace LIDAR systems for outdoor 3D data acquisition; two main issues have to be addressed: camera calibration and dense multi-view stereo.
Abstract: In this paper we want to start the discussion on whether image based 3D modelling techniques can possibly be used to replace LIDAR systems for outdoor 3D data acquisition. Two main issues have to be addressed in this context: (i) camera calibration (internal and external) and (ii) dense multi-view stereo. To investigate both, we have acquired test data from outdoor scenes both with LIDAR and cameras. Using the LIDAR data as reference we estimated the ground-truth for several scenes. Evaluation sets are prepared to evaluate different aspects of 3D model building. These are: (i) pose estimation and multi-view stereo with known internal camera parameters; (ii) camera calibration and multi-view stereo with the raw images as the only input and (iii) multi-view stereo.

890 citations

References
Journal ArticleDOI
TL;DR: A taxonomy of dense, two-frame stereo methods and a comparative evaluation of existing algorithms, backed by a stand-alone, flexible C++ implementation that enables the evaluation of individual components and can easily be extended to include new algorithms.
Abstract: Stereo matching is one of the most active research areas in computer vision. While a large number of algorithms for stereo correspondence have been developed, relatively little work has been done on characterizing their performance. In this paper, we present a taxonomy of dense, two-frame stereo methods designed to assess the different components and design decisions made in individual stereo algorithms. Using this taxonomy, we compare existing stereo methods and present experiments evaluating the performance of many different variants. In order to establish a common software platform and a collection of data sets for easy evaluation, we have designed a stand-alone, flexible C++ implementation that enables the evaluation of individual components and that can be easily extended to include new algorithms. We have also produced several new multiframe stereo data sets with ground truth, and are making both the code and data sets available on the Web.
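The simplest baseline in such a taxonomy is winner-take-all matching with a per-pixel dissimilarity cost. A minimal sketch using a squared-difference cost, with none of the aggregation or refinement components the taxonomy evaluates:

```python
import numpy as np

def disparity_map(left, right, max_disp):
    """Winner-take-all stereo: per-pixel squared-difference cost."""
    h, w = left.shape
    disp = np.zeros((h, w), dtype=int)
    for y in range(h):
        for x in range(w):
            # candidate disparities that keep the match inside the right image
            costs = [(left[y, x] - right[y, x - d]) ** 2
                     for d in range(min(max_disp, x) + 1)]
            disp[y, x] = int(np.argmin(costs))  # pick the cheapest disparity
    return disp
```

Real methods add cost aggregation over windows, global optimization, and sub-pixel refinement; the taxonomy's point is that these components can be swapped independently.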

7,458 citations

Proceedings ArticleDOI
26 Jun 2006
TL;DR: Surface reconstruction from oriented points is cast as a spatial Poisson problem and solved by a spatially adaptive multiscale algorithm whose time and space complexities are proportional to the size of the reconstructed model and whose solution reduces to a well-conditioned sparse linear system.
Abstract: We show that surface reconstruction from oriented points can be cast as a spatial Poisson problem. This Poisson formulation considers all the points at once, without resorting to heuristic spatial partitioning or blending, and is therefore highly resilient to data noise. Unlike radial basis function schemes, our Poisson approach allows a hierarchy of locally supported basis functions, and therefore the solution reduces to a well conditioned sparse linear system. We describe a spatially adaptive multiscale algorithm whose time and space complexities are proportional to the size of the reconstructed model. Experimenting with publicly available scan data, we demonstrate reconstruction of surfaces with greater detail than previously achievable.
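The Poisson formulation seeks an indicator function whose gradient matches the vector field defined by the oriented point normals; minimizing the least-squares misfit leads to a Poisson equation (notation here is schematic, not copied from the paper):

```latex
\min_{\chi} \int \left\| \nabla \chi - \vec{V} \right\|^2 \, dx
\quad \Longrightarrow \quad
\Delta \chi = \nabla \cdot \vec{V}
```

Discretizing the Laplacian over a hierarchy of locally supported basis functions is what yields the well-conditioned sparse linear system mentioned in the abstract.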

2,712 citations


"Accurate, Dense, and Robust Multiview Stereopsis" refers methods in this paper

  • ...Table 1 lists the number of input images, their approximate size, the corresponding choice of parameters, the algorithm used to initialize a mesh model (either PSR software [29] or iterative snapping after visual hull construction, denoted as VH), and whether images contain obstacles (crowded scenes) or not....


  • ...Our first approach to mesh initialization is to simply use Poisson Surface Reconstruction (PSR) software [29] that directly converts a set of oriented points into a triangulated mesh model....


Proceedings ArticleDOI
17 Jun 2006
TL;DR: This paper first survey multi-view stereo algorithms and compare them qualitatively using a taxonomy that differentiates their key properties, then describes the process for acquiring and calibrating multiview image datasets with high-accuracy ground truth and introduces the evaluation methodology.
Abstract: This paper presents a quantitative comparison of several multi-view stereo reconstruction algorithms. Until now, the lack of suitable calibrated multi-view image datasets with known ground truth (3D shape models) has prevented such direct comparisons. In this paper, we first survey multi-view stereo algorithms and compare them qualitatively using a taxonomy that differentiates their key properties. We then describe our process for acquiring and calibrating multiview image datasets with high-accuracy ground truth and introduce our evaluation methodology. Finally, we present the results of our quantitative comparison of state-of-the-art multi-view stereo reconstruction algorithms on six benchmark datasets. The datasets, evaluation details, and instructions for submitting new models are available online at http://vision.middlebury.edu/mview.

2,556 citations


"Accurate, Dense, and Robust Multiview Stereopsis" refers background or methods in this paper

  • ...Quantitative evaluations provided at [2]....


  • ...Quantitative evaluations of state-of-the-art MVS algorithms are presented at [2] in terms of accuracy (distance d such that a given percentage of the reconstruction is within d from the ground truth model) and completeness (percentage of the ground truth model that is within a given distance from the reconstruction)....


  • ...10Rendered views of the reconstructions and all the quantitative evaluations can be found at [2]....


  • ...The patch generation algorithm is very efficient; in particular, it takes only a few minutes for temple and dino, in comparison to most other state-of-the-art techniques evaluated at [2], which take more than half an hour....


  • ...[2], state-of-the-art MVS algorithms achieve relative accuracy better than 1/200 (1mm for a 20cm wide object) from a set of low-resolution (640×480) images....

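The accuracy and completeness measures defined in the citation contexts above can be computed directly for point sets. A brute-force sketch, assuming N x 3 NumPy arrays (the actual benchmark samples mesh surfaces and uses spatial indices for the nearest-neighbour queries):

```python
import numpy as np

def accuracy_completeness(recon, gt, pct=90.0, tol=1.0):
    """Middlebury-style scores for two point sets.

    accuracy:     distance d such that `pct` percent of `recon`
                  is within d of the ground truth `gt`
    completeness: fraction of `gt` within `tol` of `recon`
    """
    def nn_dists(a, b):
        # nearest-neighbour distance from each point of a to the set b
        return np.sqrt(((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)).min(axis=1)

    acc = float(np.percentile(nn_dists(recon, gt), pct))
    comp = float((nn_dists(gt, recon) <= tol).mean())
    return acc, comp
```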

Journal ArticleDOI
TL;DR: A provably correct algorithm, Space Carving, computes the 3D shape of an unknown, arbitrarily-shaped scene from multiple photographs taken at known but arbitrarily-distributed viewpoints, capturing photorealistic shapes that accurately model scene appearance from a wide range of viewpoints.
Abstract: In this paper we consider the problem of computing the 3D shape of an unknown, arbitrarily-shaped scene from multiple photographs taken at known but arbitrarily-distributed viewpoints. By studying the equivalence class of all 3D shapes that reproduce the input photographs, we prove the existence of a special member of this class, the photo hull, that (1) can be computed directly from photographs of the scene, and (2) subsumes all other members of this class. We then give a provably-correct algorithm, called Space Carving, for computing this shape and present experimental results on complex real-world scenes. The approach is designed to (1) capture photorealistic shapes that accurately model scene appearance from a wide range of viewpoints, and (2) account for the complex interactions between occlusion, parallax, shading, and their view-dependent effects on scene-appearance.
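The carving idea can be caricatured as iterated removal of photo-inconsistent voxels until a fixed point is reached (the photo hull, in this toy setting). The `consistent` callable is a hypothetical stand-in for the paper's photo-consistency test; it takes the current shape because, in the real algorithm, consistency depends on visibility through the remaining voxels:

```python
def space_carve(voxels, consistent, max_sweeps=10):
    """Remove photo-inconsistent voxels until nothing more can be carved.

    voxels:     iterable of hashable voxel ids
    consistent: (voxel, current_shape) -> bool photo-consistency check
    """
    shape = set(voxels)
    for _ in range(max_sweeps):
        carve = {v for v in shape if not consistent(v, shape)}
        if not carve:
            break          # fixed point reached: no voxel fails the test
        shape -= carve     # carving may expose new voxels to the cameras
    return shape
```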

1,487 citations

Proceedings ArticleDOI
01 Sep 1993
TL;DR: In this article, the authors present a method for solving the following problem: given a set of data points scattered in three dimensions and an initial triangular mesh M0, produce a mesh M of the same topological type as M0 that fits the data well and has a small number of vertices.
Abstract: We present a method for solving the following problem: Given a set of data points scattered in three dimensions and an initial triangular mesh M0, produce a mesh M, of the same topological type as M0, that fits the data well and has a small number of vertices. Our approach is to minimize an energy function that explicitly models the competing desires of conciseness of representation and fidelity to the data. We show that mesh optimization can be effectively used in at least two applications: surface reconstruction from unorganized points, and mesh simplification (the reduction of the number of vertices in an initially dense mesh of triangles).
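The energy described above is commonly written as the sum of a data-fidelity term over the sample points, a representation-cost term penalizing the vertex count m, and a spring term over mesh edges (notation here is schematic, not copied from the paper):

```latex
E(K, V) \;=\; \sum_{i} \operatorname{dist}^2\!\big(x_i,\, M(K, V)\big)
\;+\; c_{\mathrm{rep}}\, m
\;+\; \kappa \sum_{\{j,k\} \in K} \| v_j - v_k \|^2
```

The first term pulls the mesh toward the data, the second rewards conciseness, and the spring term regularizes the optimization; trading them off is what enables both surface reconstruction and mesh simplification.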

1,424 citations