
Showing papers on "3D reconstruction published in 2014"


Proceedings ArticleDOI
29 Sep 2014
TL;DR: This work introduces the Imperial College London and National University of Ireland Maynooth (ICL-NUIM) dataset and presents a collection of handheld RGB-D camera sequences within synthetically generated environments to provide a method to benchmark the surface reconstruction accuracy.
Abstract: We introduce the Imperial College London and National University of Ireland Maynooth (ICL-NUIM) dataset for the evaluation of visual odometry, 3D reconstruction and SLAM algorithms that typically use RGB-D data. We present a collection of handheld RGB-D camera sequences within synthetically generated environments. RGB-D sequences with perfect ground truth poses are provided, as well as a ground truth surface model that enables quantitative evaluation of the final map or surface reconstruction accuracy. Care has been taken to simulate typically observed real-world artefacts in the synthetic imagery by modelling sensor noise in both RGB and depth data. While this dataset is useful for the evaluation of visual odometry and SLAM trajectory estimation, our main focus is on providing a method to benchmark surface reconstruction accuracy, which to date has been missing in the RGB-D community despite the plethora of ground truth RGB-D datasets available.

857 citations
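
The surface-accuracy benchmark the abstract describes comes down to comparing a reconstructed surface against the ground-truth model. As a minimal sketch (not the paper's evaluation code), one can sample both surfaces as point clouds and report nearest-neighbour distances; the .npy file names are hypothetical stand-ins.

```python
import numpy as np
from scipy.spatial import cKDTree

def reconstruction_error(recon_pts, gt_pts):
    """Distance from each reconstructed point to the nearest
    ground-truth surface sample (both arrays are (N, 3))."""
    dists, _ = cKDTree(gt_pts).query(recon_pts)
    return dists.mean(), np.median(dists)

recon = np.load("reconstruction_samples.npy")  # hypothetical sampled mesh
gt = np.load("ground_truth_samples.npy")       # hypothetical GT samples
mean_err, med_err = reconstruction_error(recon, gt)
print(f"mean error: {mean_err:.4f} m, median: {med_err:.4f} m")
```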


Book ChapterDOI
06 Sep 2014
TL;DR: Improved 3D structure and temporally consistent semantic segmentation are demonstrated for difficult, large-scale, forward-moving monocular image sequences.
Abstract: We present an approach for joint inference of 3D scene structure and semantic labeling for monocular video. Starting from a monocular image stream, our framework produces a 3D volumetric semantic + occupancy map, which is much more useful than the series of 2D semantic label images or the sparse point cloud produced by traditional semantic segmentation and Structure from Motion (SfM) pipelines, respectively. We derive a Conditional Random Field (CRF) model defined in 3D space that jointly infers the semantic category and occupancy of each voxel. Such joint inference in the 3D CRF paves the way for more informed priors and constraints, which would not be possible if the problems were solved separately in their traditional frameworks. We make use of class-specific semantic cues that constrain the 3D structure in areas where multiview constraints are weak. Our model comprises higher-order factors, which help when depth is unobservable. We also make use of class-specific semantic cues to reduce the degree of such higher-order factors, or to approximately model them with unaries where possible. We demonstrate improved 3D structure and temporally consistent semantic segmentation for difficult, large-scale, forward-moving monocular image sequences.

282 citations
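
To make the joint label space concrete: each voxel carries one state from a set such as {free} plus {occupied with class c}, with unary costs from image evidence and a smoothness term between neighbouring voxels. The sketch below is my own toy formulation, not the paper's model: it only evaluates such an energy with a Potts pairwise term, and actual CRF inference (graph cuts, mean field) is omitted.

```python
import numpy as np

def crf_energy(labels, unaries, pairwise_weight=0.5):
    """labels: (X, Y, Z) int state per voxel; unaries: (X, Y, Z, S) costs.
    States could be e.g. 0 = free space, 1..S-1 = occupied with class c."""
    e = np.take_along_axis(unaries, labels[..., None], axis=-1).sum()
    for axis in range(3):                 # 6-connected Potts smoothness term
        a = np.swapaxes(labels, 0, axis)
        e += pairwise_weight * np.sum(a[1:] != a[:-1])
    return e

rng = np.random.default_rng(11)
unaries = rng.uniform(size=(8, 8, 8, 4))  # toy costs: free + 3 semantic classes
labels = unaries.argmin(axis=-1)          # unary-only initialization
print("energy:", crf_energy(labels, unaries))
```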


Journal ArticleDOI
27 Jul 2014
TL;DR: This work presents a global optimization approach for mapping color images onto geometric reconstructions by optimizing camera poses in tandem with non-rigid correction functions for all images to maximize the photometric consistency of the reconstructed mapping.
Abstract: We present a global optimization approach for mapping color images onto geometric reconstructions. Range and color videos produced by consumer-grade RGB-D cameras suffer from noise and optical distortions, which impede accurate mapping of the acquired color data to the reconstructed geometry. Our approach addresses these sources of error by optimizing camera poses in tandem with non-rigid correction functions for all images. All parameters are optimized jointly to maximize the photometric consistency of the reconstructed mapping. We show that this optimization can be performed efficiently by an alternating optimization algorithm that interleaves analytical updates of the color map with decoupled parameter updates for all images. Experimental results demonstrate that our approach substantially improves color mapping fidelity.

251 citations
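
The alternation the abstract describes (an analytical color update interleaved with decoupled per-image parameter updates) can be illustrated on a toy version of the problem. In this sketch each image's distortion is collapsed to a single additive offset, which is far simpler than the paper's non-rigid correction functions but exhibits the same alternating structure.

```python
import numpy as np

rng = np.random.default_rng(0)
true_colors = rng.uniform(0, 1, size=(100, 3))        # per-vertex color map
true_offsets = rng.normal(0, 0.1, size=(8, 1, 3))     # per-image "distortion"
true_offsets -= true_offsets.mean(axis=0)             # centre: identifiable
obs = true_colors + true_offsets + rng.normal(0, 0.01, size=(8, 100, 3))

colors = obs.mean(axis=0)
offsets = np.zeros((8, 1, 3))
for _ in range(20):
    colors = (obs - offsets).mean(axis=0)                 # analytical update
    offsets = (obs - colors).mean(axis=1, keepdims=True)  # decoupled updates
    offsets -= offsets.mean(axis=0)                       # fix gauge freedom
print("mean abs color error:", np.abs(colors - true_colors).mean())
```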


Proceedings ArticleDOI
23 Jun 2014
TL;DR: This paper formulates the reconstruction task as a linear inverse problem on the transient response of a scene, acquired using an affordable setup consisting of a modulated light source and a time-of-flight image sensor, and achieves resolutions on the order of a few centimeters for object shape and albedo.
Abstract: The functional difference between a diffuse wall and a mirror is well understood: one scatters back into all directions, and the other one preserves the directionality of reflected light. The temporal structure of the light, however, is left intact by both: assuming simple surface reflection, photons that arrive first are reflected first. In this paper, we exploit this insight to recover objects outside the line of sight from second-order diffuse reflections, effectively turning walls into mirrors. We formulate the reconstruction task as a linear inverse problem on the transient response of a scene, which we acquire using an affordable setup consisting of a modulated light source and a time-of-flight image sensor. By exploiting sparsity in the reconstruction domain, we achieve resolutions on the order of a few centimeters for object shape (in depth and laterally) and albedo. Our method is robust to ambient light and works for large room-sized scenes. It is drastically faster and less expensive than previous approaches using femtosecond lasers and streak cameras, and does not require any moving parts.

184 citations
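
At its core this is compressed sensing: solve b ≈ A x with x sparse, where b stacks the transient measurements and A is the calibrated light-transport matrix. Below is a minimal sketch with a random stand-in for A (not a physical transport model), using plain ISTA for the l1-regularized least-squares problem.

```python
import numpy as np

def ista(A, b, lam=0.1, n_iters=500):
    """Iterative shrinkage-thresholding for min ||Ax - b||^2 / 2 + lam ||x||_1."""
    L = np.linalg.norm(A, 2) ** 2        # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iters):
        z = x - A.T @ (A @ x - b) / L
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
    return x

rng = np.random.default_rng(1)
A = rng.normal(size=(200, 400))          # stand-in transport matrix
x_true = np.zeros(400)
support = rng.choice(400, 10, replace=False)
x_true[support] = 1.0
b = A @ x_true + rng.normal(scale=0.01, size=200)

x_hat = ista(A, b)
print("true support:      ", np.sort(support))
print("recovered (top 10):", np.sort(np.argsort(-np.abs(x_hat))[:10]))
```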


Journal ArticleDOI
TL;DR: In seeking to develop a complete 3D reconstruction pipeline, this work comprehensively studies techniques related to the topic and divides the 3D digitization process into four major stages: image acquisition, view registration, mesh integration and texture generation.

179 citations


Proceedings ArticleDOI
06 Nov 2014
TL;DR: It is shown how a simple world model for AR applications can be derived from semi-dense depth maps, and the practical applicability is demonstrated in the context of an AR application in which simulated objects can collide with real geometry.
Abstract: We present a direct monocular visual odometry system which runs in real-time on a smartphone. Being a direct method, it tracks and maps on the images themselves instead of on extracted features such as keypoints. New images are tracked using direct image alignment, while geometry is represented in the form of a semi-dense depth map. Depth is estimated by filtering over many small-baseline, pixel-wise stereo comparisons. This leads to significantly fewer outliers and allows mapping and using all image regions with sufficient gradient, including edges. We show how a simple world model for AR applications can be derived from semi-dense depth maps, and demonstrate the practical applicability in the context of an AR application in which simulated objects can collide with real geometry.

169 citations
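
The per-pixel depth filter can be pictured as a product of Gaussians over inverse depth: every new small-baseline stereo comparison contributes one noisy observation, and the estimate's variance shrinks as observations accumulate. A toy sketch for a single pixel, with invented numbers:

```python
import numpy as np

mu, var = 0.5, 1.0          # prior inverse depth and variance for one pixel
rng = np.random.default_rng(2)
for _ in range(30):
    obs = 0.8 + rng.normal(scale=0.2)   # noisy stereo inverse-depth estimate
    obs_var = 0.2 ** 2
    # Standard Gaussian fusion (a Kalman update with an identity model).
    k = var / (var + obs_var)
    mu = mu + k * (obs - mu)
    var = (1 - k) * var
print(f"fused inverse depth {mu:.3f}, std {np.sqrt(var):.4f}")
```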


Proceedings ArticleDOI
23 Jun 2014
TL;DR: This paper presents a system to reconstruct piecewise planar and compact floorplans from images, which are then converted to high quality texture-mapped models for free-viewpoint visualization, and shows that the texture-mapped mesh models provide compelling free-viewpoint visualization experiences when compared against the state-of-the-art and ground truth.
Abstract: This paper presents a system to reconstruct piecewise planar and compact floorplans from images, which are then converted to high quality texture-mapped models for free-viewpoint visualization. There are two main challenges in image-based floorplan reconstruction. The first is the lack of 3D information that can be extracted from images by Structure from Motion and Multi-View Stereo, as indoor scenes abound with non-diffuse and homogeneous surfaces plus clutter. The second challenge is the need for a sophisticated regularization technique that enforces piecewise planarity, to suppress clutter and yield high quality texture-mapped models. Our technical contributions are twofold. First, we propose a novel structure classification technique to classify each pixel into three regions (floor, ceiling, and wall), which provide 3D cues even from a single image. Second, we cast floorplan reconstruction as a shortest path problem on a specially crafted graph, which enables us to enforce piecewise planarity. Besides producing compact piecewise planar models, this formulation allows us to directly control the number of vertices (i.e., density) of the output mesh. We evaluate our system on real indoor scenes, and show that our texture-mapped mesh models provide compelling free-viewpoint visualization experiences when compared against the state-of-the-art and ground truth.

134 citations
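
Casting the reconstruction as a shortest-path problem means the heavy lifting is done by a standard graph algorithm. The sketch below runs plain Dijkstra on a tiny hypothetical graph of corner candidates; the paper's actual graph construction and edge costs are far richer.

```python
import heapq

def dijkstra(adj, src):
    """adj: node -> list of (neighbour, weight). Returns distances and parents."""
    dist, prev = {src: 0.0}, {}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                       # stale heap entry
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    return dist, prev

# Tiny illustrative graph: corner candidates a..d with wall-segment costs.
adj = {"a": [("b", 1.0), ("c", 2.5)], "b": [("c", 1.2), ("d", 3.0)],
       "c": [("d", 1.1)]}
dist, prev = dijkstra(adj, "a")
print("cost of the best wall chain a -> d:", dist["d"])
```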


Journal ArticleDOI
TL;DR: Virtual finger (VF) is developed to generate 3D curves, points and regions-of-interest in the 3D space of a volumetric image with a single finger operation, such as a computer mouse stroke, click, or zoom on the 2D projection plane of an image as visualized on a computer.
Abstract: Three-dimensional (3D) bioimaging, visualization and data analysis are in strong need of powerful 3D exploration techniques. We develop virtual finger (VF) to generate 3D curves, points and regions-of-interest in the 3D space of a volumetric image with a single finger operation, such as a computer mouse stroke, click, or zoom on the 2D projection plane of an image as visualized on a computer. VF provides efficient methods for acquisition, visualization and analysis of 3D images of roundworm, fruitfly, dragonfly, mouse, rat and human samples. Specifically, VF enables instant 3D optical zoom-in imaging, 3D free-form optical microsurgery, and 3D visualization and annotation of terabytes of whole-brain image volumes. VF also leads to orders-of-magnitude better efficiency of automated 3D reconstruction of neurons and similar biostructures over our previous systems. We use VF to generate a projectome of a Drosophila brain from images of 1,107 Drosophila GAL4 lines.

119 citations
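
The essence of the one-stroke 3D interaction is mapping a 2D screen point to a 3D location by searching along the viewing ray through the volume. Below is a heavily simplified sketch under assumptions of my own (an axis-aligned ray and a brightest-voxel criterion); the actual VF method uses more robust criteria, such as mean-shift along the ray.

```python
import numpy as np

rng = np.random.default_rng(9)
volume = rng.uniform(0, 0.1, size=(64, 64, 64))   # (z, y, x) intensity volume
volume[40, 20, 30] = 1.0                          # a bright structure

def click_to_3d(volume, x, y):
    """Map a 2D click (x, y) to a 3D point by taking the brightest
    voxel along the z viewing ray through that pixel."""
    ray = volume[:, y, x]
    z = int(np.argmax(ray))
    return (x, y, z)

print(click_to_3d(volume, 30, 20))   # -> (30, 20, 40)
```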


Proceedings ArticleDOI
23 Jun 2014
TL;DR: It is discovered that 3D reconstruction can be achieved from a single still photographic capture due to accidental motions of the photographer, even while attempting to hold the camera still, and the possibility that depth maps of sufficient quality for RGB-D photography applications like perspective change, simulated aperture, and object segmentation, can come "for free" for a significant fraction of still photographs under reasonable conditions.
Abstract: We have discovered that 3D reconstruction can be achieved from a single still photographic capture due to accidental motions of the photographer, even while attempting to hold the camera still. Although these motions result in little baseline and therefore high depth uncertainty, in theory, we can combine many such measurements over the duration of the capture process (a few seconds) to achieve usable depth estimates. We present a novel 3D reconstruction system tailored for this problem that produces depth maps from short video sequences from standard cameras without the need for multi-lens optics, active sensors, or intentional motions by the photographer. This result leads to the possibility that depth maps of sufficient quality for RGB-D photography applications like perspective change, simulated aperture, and object segmentation, can come "for free" for a significant fraction of still photographs under reasonable conditions.

108 citations
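
The claim that many high-uncertainty measurements combine into a usable estimate is just the 1/sqrt(N) behaviour of averaging independent noise. A quick illustrative check, with invented numbers:

```python
import numpy as np

rng = np.random.default_rng(3)
true_depth = 2.0                  # metres
per_frame_std = 0.5               # large, due to the tiny accidental baseline
for n_frames in (1, 10, 100):
    estimates = true_depth + rng.normal(scale=per_frame_std, size=n_frames)
    print(n_frames, "frames -> abs error", abs(estimates.mean() - true_depth))
```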


Proceedings ArticleDOI
23 Jun 2014
TL;DR: A dense 3D reconstruction technique is proposed that jointly refines the shape and the camera parameters of a scene by minimizing the photometric reprojection error between a generated model and the observed images, hence considering all pixels in the original images.
Abstract: Motivated by a Bayesian vision of the 3D multi-view reconstruction from images problem, we propose a dense 3D reconstruction technique that jointly refines the shape and the camera parameters of a scene by minimizing the photometric reprojection error between a generated model and the observed images, hence considering all pixels in the original images. The minimization is performed using a gradient descent scheme coherent with the shape representation (here a triangular mesh), where we derive evolution equations in order to optimize both the shape and the camera parameters. This can be used as a last refinement step in 3D reconstruction pipelines and helps improve the quality of the 3D reconstruction by estimating the 3D shape and camera calibration more accurately. Examples are shown for multi-view stereo where the texture is also jointly optimized and improved, but the method could be used for any generative approach dealing with multi-view reconstruction settings (i.e., depth map fusion, multi-view photometric stereo).

107 citations


Journal ArticleDOI
TL;DR: The software package SPRING is presented, which combines Fourier-based symmetry analysis and real-space helical processing into a single workflow and enables the simultaneous exploration and evaluation of many symmetry combinations at low resolution.

Proceedings ArticleDOI
23 Jun 2014
TL;DR: It is demonstrated that the developed method can easily be integrated into a system for monocular interactive 3D modeling by substantially improving its accuracy while adding a negligible overhead to its performance and retaining its interactive potential.
Abstract: In this paper, we propose an efficient and accurate scheme for the integration of multiple stereo-based depth measurements. For each provided depth map a confidence-based weight is assigned to each depth estimate by evaluating local geometry orientation, underlying camera setting and photometric evidence. Subsequently, all hypotheses are fused together into a compact and consistent 3D model. Thereby, visibility conflicts are identified and resolved, and fitting measurements are averaged with regard to their confidence scores. The individual stages of the proposed approach are validated by comparing it to two alternative techniques which rely on a conceptually different fusion scheme and a different confidence inference, respectively. Pursuing live 3D reconstruction on mobile devices as a primary goal, we demonstrate that the developed method can easily be integrated into a system for monocular interactive 3D modeling by substantially improving its accuracy while adding a negligible overhead to its performance and retaining its interactive potential.
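
A minimal sketch of confidence-weighted depth fusion in this spirit (not the paper's actual scheme): per pixel, measurements consistent with a robust reference are averaged with their confidence weights, and strongly conflicting ones are discarded as visibility conflicts.

```python
import numpy as np

def fuse(depths, confidences, tol=0.05):
    """depths, confidences: (K, H, W) stacks of depth maps and weights."""
    ref = np.median(depths, axis=0)               # robust per-pixel reference
    consistent = np.abs(depths - ref) < tol       # resolve visibility conflicts
    w = confidences * consistent
    return (w * depths).sum(axis=0) / np.maximum(w.sum(axis=0), 1e-9)

rng = np.random.default_rng(4)
d = 1.0 + rng.normal(scale=0.01, size=(5, 4, 4))  # five toy depth hypotheses
d[0] += 0.5                                       # one conflicting measurement
c = rng.uniform(0.5, 1.0, size=(5, 4, 4))         # toy confidence scores
print(fuse(d, c).round(3))
```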

Journal ArticleDOI
TL;DR: Findings demonstrate that these two platforms, which integrate the scanning principle and image reconstruction methods, can supplement each other in terms of coverage, sensing resolution, and model accuracy to create high-quality 3D recordings and presentations.
Abstract: No single sensor can acquire complete information on a cultural object, even across one or several surveys. For instance, a terrestrial laser scanner (TLS) usually obtains information on building facades, whereas aerial photogrammetry is capable of providing the perspective for building roofs. In this study, a camera-equipped unmanned aerial vehicle (UAV) system and a TLS were used in an integrated design to capture 3D point clouds and thus facilitate the acquisition of complete information on an object of interest for cultural heritage. A camera network is proposed to modify the image-based 3D reconstruction or structure from motion (SfM) method by taking full advantage of the flight control data acquired by the UAV platform. The camera network improves SfM performance in terms of image matching efficiency and the reduction of mismatches. This camera-network-modified SfM is employed to process the overlapping UAV image sets and to recover the scene geometry. The SfM output covers most information on building roofs, but has sparse resolution. The dense multi-view 3D reconstruction algorithm is then applied to improve in-depth detail. The two groups of point clouds, from image reconstruction and TLS scanning, are registered from coarse to fine with the use of an iterative method. This methodology has been tested on one historical monument in Fujian Province, China. Results show a final point cloud with complete coverage and in-depth details. Moreover, findings demonstrate that these two platforms, which integrate the scanning principle and image reconstruction methods, can supplement each other in terms of coverage, sensing resolution, and model accuracy to create high-quality 3D recordings and presentations.
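
The coarse-to-fine registration of the UAV and TLS clouds is typically an ICP-style loop: find nearest-neighbour correspondences, estimate a rigid transform in closed form (Kabsch/SVD), apply it, and repeat. A compact sketch on synthetic stand-in clouds:

```python
import numpy as np
from scipy.spatial import cKDTree

def icp_step(src, dst, dst_tree):
    """One point-to-point ICP iteration: match, then closed-form rigid fit."""
    _, idx = dst_tree.query(src)
    matched = dst[idx]
    mu_s, mu_d = src.mean(axis=0), matched.mean(axis=0)
    H = (src - mu_s).T @ (matched - mu_d)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:        # guard against a reflection solution
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = mu_d - R @ mu_s
    return src @ R.T + t

rng = np.random.default_rng(5)
tls = rng.uniform(size=(2000, 3))             # stand-in "TLS" cloud
uav = tls + np.array([0.03, -0.02, 0.01])     # coarsely offset "UAV" cloud
tree = cKDTree(tls)
for _ in range(15):
    uav = icp_step(uav, tls, tree)
print("mean residual:", np.abs(uav - tls).mean())
```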

Journal ArticleDOI
TL;DR: The paper aims to overcome the problems related to the use of macro lenses in photogrammetry, showing how the camera calibration parameters of the sharp images can be retrieved using open-source Structure from Motion software.

Patent
27 Jan 2014
TL;DR: In this article, camera pose information for a first color image captured by a camera on an MS may be obtained, and a determination may be made whether to extend or update a first 3-Dimensional (3D) model of an environment being modeled by the MS based, in part, on the first color image and associated camera pose.
Abstract: Embodiments disclosed facilitate resource utilization efficiencies in Mobile Stations (MS) during 3D reconstruction. In some embodiments, camera pose information for a first color image captured by a camera on an MS may be obtained and a determination may be made whether to extend or update a first 3-Dimensional (3D) model of an environment being modeled by the MS based, in part, on the first color image and associated camera pose information. The depth sensor, which provides depth information for images captured by the camera, may be disabled, when the first 3D model is not extended or updated.

Journal ArticleDOI
TL;DR: The aim of this new pipeline, called Satellite Stereo Pipeline and abbreviated as s2p, is to use off-the-shelf computer vision tools while abstracting from the complexity associated with satellite imaging, and it is proved that the pushbroom geometry is very accurately approximated by the pinhole model.
Abstract: The increasing availability of high resolution stereo images from Earth observation satellites has boosted the development of tools for producing 3D elevation models. The objective of these tools is to produce digital elevation models of very large areas with minimal human intervention. The development of these tools has been shaped by the constraints of remote sensing acquisition, for example, using ad hoc stereo matching tools to deal with the pushbroom image geometry. However, this specialization has also created a gap with respect to the fields of computer vision and image processing, where these constraints are usually factored out. In this work we propose a fully automatic and modular stereo pipeline to produce digital elevation models from satellite images. The aim of this new pipeline, called Satellite Stereo Pipeline and abbreviated as s2p, is to use (and test) off-the-shelf computer vision tools while abstracting from the complexity associated with satellite imaging. To this aim, images are cut into small tiles, for which we proved that the pushbroom geometry is very accurately approximated by the pinhole model. These tiles are then processed with standard stereo image rectification and stereo matching tools. The specifics of satellite imaging such as pointing accuracy refinement, estimation of the initial elevation from SRTM data, and geodetic coordinate systems are handled transparently by s2p. We demonstrate the robustness of our approach on a large database of satellite images and by providing an online demo of s2p.

Proceedings ArticleDOI
01 Sep 2014
TL;DR: It is shown that plane-based prior models can be applied even though planes in 3D do not project to planes in the omnidirectional domain, and the obtained depth maps can be represented very compactly by a small number of image segments and plane parameters.
Abstract: This paper proposes a method for high-quality omnidirectional 3D reconstruction of augmented Manhattan worlds from catadioptric stereo video sequences. In contrast to existing works we do not rely on constructing virtual perspective views, but instead propose to optimize depth jointly in a unified omnidirectional space. Furthermore, we show that plane-based prior models can be applied even though planes in 3D do not project to planes in the omnidirectional domain. Towards this goal, we propose an omnidirectional slanted-plane Markov random field model which relies on plane hypotheses extracted using a novel voting scheme for 3D planes in omnidirectional space. To quantitatively evaluate our method we introduce a dataset captured using our autonomous driving platform AnnieWAY, which we equipped with two horizontally aligned catadioptric cameras and a Velodyne HDL-64E laser scanner for precise ground truth depth measurements. As evidenced by our experiments, the proposed method clearly benefits from the unified view and significantly outperforms existing stereo matching techniques both quantitatively and qualitatively. Furthermore, our method reduces noise, and the obtained depth maps can be represented very compactly by a small number of image segments and plane parameters.

Proceedings Article
08 Dec 2014
TL;DR: This paper proposes a method that combines dense optical flow tracking, motion trajectory clustering and NRSfM for 3D reconstruction of objects in videos, and is the first to extract dense object models from realistic videos, such as those found in YouTube or Hollywood movies, without object-specific priors.
Abstract: Extracting the 3D shape of deforming objects in monocular videos, a task known as non-rigid structure-from-motion (NRSfM), has so far been studied only on synthetic datasets and in controlled environments. Typically, the objects to reconstruct are pre-segmented, they exhibit limited rotations and occlusions, or full-length trajectories are assumed. In order to integrate NRSfM into current video analysis pipelines, one needs to consider as input realistic (and thus incomplete) tracking, and to perform spatio-temporal grouping to segment the objects from their surroundings. Furthermore, NRSfM needs to be robust to noise in both segmentation and tracking, e.g., drifting, segmentation "leaking", optical flow "bleeding", etc. In this paper, we make a first attempt towards this goal, and propose a method that combines dense optical flow tracking, motion trajectory clustering and NRSfM for 3D reconstruction of objects in videos. For each trajectory cluster, we compute multiple reconstructions by minimizing the reprojection error and the rank of the 3D shape under different rank bounds of the trajectory matrix. We show that dense 3D shape is extracted and trajectories are completed across occlusions and low-textured regions, even under mild relative motion between the object and the camera. We achieve competitive results on a public NRSfM benchmark while using fixed parameters across all sequences and handling incomplete trajectories, in contrast to existing approaches. We further test our approach on popular video segmentation datasets. To the best of our knowledge, our method is the first to extract dense object models from realistic videos, such as those found in YouTube or Hollywood movies, without object-specific priors.
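
The rank-bounded minimization rests on the classical observation that the 2F x P matrix of 2D point trajectories of a deforming object is approximately low-rank. A sketch of that building block in isolation: denoising a synthetic trajectory matrix by truncated SVD.

```python
import numpy as np

rng = np.random.default_rng(6)
F, P, rank = 30, 200, 3                  # frames, points, true rank
W = rng.normal(size=(2 * F, rank)) @ rng.normal(size=(rank, P))  # trajectories
W_noisy = W + rng.normal(scale=0.1, size=W.shape)

U, s, Vt = np.linalg.svd(W_noisy, full_matrices=False)
W_hat = (U[:, :rank] * s[:rank]) @ Vt[:rank]   # rank-bounded estimate
print("noisy error:   ", np.abs(W_noisy - W).mean())
print("low-rank error:", np.abs(W_hat - W).mean())
```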

Proceedings ArticleDOI
23 Jun 2014
TL;DR: This paper leverages occluding contours (aka "internal silhouettes") to improve the performance of multi-view stereo methods, yielding dramatic quality improvements both around object contours and in surface detail.
Abstract: This paper leverages occluding contours (aka "internal silhouettes") to improve the performance of multi-view stereo methods. The contributions are 1) a new technique to identify free-space regions arising from occluding contours, and 2) a new approach for incorporating the resulting free-space constraints into Poisson surface reconstruction. The proposed approach outperforms state-of-the-art MVS techniques on challenging Internet datasets, yielding dramatic quality improvements both around object contours and in surface detail.

Journal ArticleDOI
TL;DR: Objects in a railway environment such as the ground, railroads, buildings, high voltage powerlines, pylons and so on were reconstructed and visualized in real-life experiments in Kokemaki, Finland.
Abstract: This paper presents methods for 3D modeling of railway environments from airborne laser scanning (ALS) and mobile laser scanning (MLS). Conventionally, aerial data such as ALS and aerial images were utilized for 3D model reconstruction. However, 3D model reconstruction from aerial-view datasets alone cannot meet the requirements of advanced visualization (e.g., walk-through visualization). In this paper, objects in a railway environment such as the ground, railroads, buildings, high voltage powerlines, pylons and so on were reconstructed and visualized in real-life experiments in Kokemaki, Finland. Because of the complex terrain and scenes in railway environments, 3D modeling is challenging, especially for high resolution walk-through visualizations. However, MLS has flexible platforms and, combined with ALS data, makes it possible to acquire data of a complex environment in high detail to produce complete 3D scene modeling. A procedure from point cloud classification to 3D reconstruction and 3D visualization is introduced, and new solutions are proposed for object extraction, 3D reconstruction, model simplification and final 3D model visualization. Image processing technology is used for the classification, 3D randomized Hough transforms (RHT) are used for planar detection, and a quadtree approach is used for the ground model simplification. The results are visually analyzed by comparison with an orthophoto at 20 cm ground resolution.
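
The planar detection step can be sketched as a randomized sample-and-vote loop in the spirit of the 3D randomized Hough transform (written here in a RANSAC-like style): sample three points, hypothesize a plane, and score it by inlier count. The point clouds below are synthetic stand-ins.

```python
import numpy as np

def detect_plane(points, n_hypotheses=500, tol=0.02, rng=None):
    """Return the (normal, point) plane hypothesis with the most inliers."""
    rng = rng or np.random.default_rng()
    best_inliers, best_plane = 0, None
    for _ in range(n_hypotheses):
        p0, p1, p2 = points[rng.choice(len(points), 3, replace=False)]
        n = np.cross(p1 - p0, p2 - p0)
        norm = np.linalg.norm(n)
        if norm < 1e-9:
            continue                      # degenerate (collinear) sample
        n /= norm
        inliers = np.sum(np.abs((points - p0) @ n) < tol)
        if inliers > best_inliers:
            best_inliers, best_plane = inliers, (n, p0)
    return best_plane, best_inliers

rng = np.random.default_rng(7)
ground = np.c_[rng.uniform(size=(500, 2)), rng.normal(scale=0.005, size=500)]
clutter = rng.uniform(size=(100, 3))
(plane_n, _), count = detect_plane(np.vstack([ground, clutter]), rng=rng)
print("normal:", plane_n.round(2), "inliers:", count)
```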

Journal ArticleDOI
TL;DR: An empirical procedure is presented for differentiating between a reconstruction with well-aligned particles and a reconstruction with grossly misclassified particles.

Journal ArticleDOI
09 Jul 2014
TL;DR: A method for the 3D reconstruction of a piecewise-planar surface from range images, typically laser scans with millions of points, whose output is a watertight polygonal mesh that conforms to observations at a given scale in the visible planar parts of the scene and is plausible in hidden parts.
Abstract: This paper presents a method for the 3D reconstruction of a piecewise-planar surface from range images, typically laser scans with millions of points. The reconstructed surface is a watertight polygonal mesh that conforms to observations at a given scale in the visible planar parts of the scene, and that is plausible in hidden parts. We formulate surface reconstruction as a discrete optimization problem based on detected and hypothesized planes. One of our major contributions, besides a treatment of data anisotropy and novel surface hypotheses, is a regularization of the reconstructed surface w.r.t. the length of edges and the number of corners. Compared to classical area-based regularization, it better captures surface complexity and is therefore better suited to man-made environments, such as buildings. To handle the underlying higher-order potentials, which are problematic for MRF optimizers, we formulate minimization as a sparse mixed-integer linear programming problem and obtain an approximate solution using a simple relaxation. Experiments show that it is fast and reaches near-optimal solutions.
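
The "simple relaxation" of a mixed-integer program can be illustrated generically: drop the integrality constraints to [0, 1], solve the LP with an off-the-shelf solver, then round. The toy instance below (costs and constraints invented, nothing like the paper's surface/edge/corner terms) just shows the mechanics with scipy.

```python
import numpy as np
from scipy.optimize import linprog

c = np.array([2.0, 1.0, 3.0, 0.5])       # toy data + regularization costs
A_eq = np.array([[1, 1, 0, 0],            # toy consistency constraints:
                 [0, 0, 1, 1]])           # select one hypothesis per cell
b_eq = np.array([1, 1])

# Relaxed problem: binary variables loosened to the interval [0, 1].
res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=[(0, 1)] * 4)
selection = (res.x > 0.5).astype(int)     # simple rounding step
print(res.x.round(3), "->", selection)
```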

Proceedings ArticleDOI
23 Jun 2014
TL;DR: A depth-guided photometric 3D reconstruction method that works solely with a depth camera like the Kinect, and is believed to be the first method to use an IR depth camera system in this manner.
Abstract: In this paper we present a depth-guided photometric 3D reconstruction method that works solely with a depth camera like the Kinect. Existing methods that fuse depth with normal estimates use an external RGB camera to obtain photometric information and treat the depth camera as a black box that provides a low quality depth estimate. Our contributions to such methods are twofold. Firstly, instead of using an extra RGB camera, we use the infra-red (IR) camera of the depth camera system itself to directly obtain high resolution photometric information. We believe that ours is the first method to use an IR depth camera system in this manner. Secondly, photometric methods applied to complex objects result in numerous holes in the reconstructed surface due to shadows and self-occlusions. To mitigate this problem, we develop a simple and effective multiview reconstruction approach that fuses depth and normal information from multiple viewpoints to build a complete, consistent and accurate 3D surface representation. We demonstrate the efficacy of our method by generating high quality 3D surface reconstructions for some complex 3D figurines.
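
The photometric ingredient such a system builds on is classical Lambertian photometric stereo: per pixel, intensities under known lights satisfy I = albedo * (L n), so the albedo-scaled normal follows from a least-squares solve. A self-contained sketch with synthetic lights (not the paper's IR calibration):

```python
import numpy as np

rng = np.random.default_rng(8)
# Six light directions, all from "above", so no attached shadows occur here.
L = np.c_[rng.uniform(-0.3, 0.3, size=(6, 2)), np.ones(6)]
L /= np.linalg.norm(L, axis=1, keepdims=True)

n_true = np.array([0.2, 0.3, 0.93])
n_true /= np.linalg.norm(n_true)
albedo = 0.7
I = albedo * (L @ n_true)                    # observed intensities, one pixel

g, *_ = np.linalg.lstsq(L, I, rcond=None)    # g = albedo * normal
print("albedo:", np.linalg.norm(g))
print("normal:", g / np.linalg.norm(g))
```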

Journal ArticleDOI
TL;DR: The paper is the first to analyze and propose an optimal finger shape model; results obtained from different fingerprint feature correspondences are analyzed and compared to show which features are more suitable for generating 3D fingerprint images.

Proceedings ArticleDOI
23 Jun 2014
TL;DR: This work formulates an object-class-specific shape prior in the form of spatially varying anisotropic smoothness terms for volumetric multi-label reconstruction approaches, which allows a segmentation between the object and its supporting ground.
Abstract: Dense 3D reconstruction of real world objects containing textureless, reflective and specular parts is a challenging task. Using general smoothness priors such as surface area regularization can lead to defects in the form of disconnected parts or unwanted indentations. We argue that this problem can be solved by exploiting object-class-specific local surface orientations, e.g. a car is always close to horizontal in the roof area. Therefore, we formulate an object-class-specific shape prior in the form of spatially varying anisotropic smoothness terms. The parameters of the shape prior are extracted from training data. We detail how our shape prior formulation fits directly into recently proposed volumetric multi-label reconstruction approaches. This allows a segmentation between the object and its supporting ground. In our experimental evaluation we show reconstructions using our trained shape prior on several challenging datasets.

Journal ArticleDOI
TL;DR: The use of Gabor filters is explored to extract information about the orientation of the object edges that produce the events, therefore increasing the number of constraints applied to the matching algorithm, which provides more reliably matched pairs of events, improving the final 3D reconstruction.
Abstract: The recently developed Dynamic Vision Sensors (DVS) sense visual information asynchronously and code it into trains of events with sub-microsecond temporal resolution. This high temporal precision makes the output of these sensors especially suited for dynamic 3D visual reconstruction, by matching corresponding events generated by two different sensors in a stereo setup. This paper explores the use of Gabor filters to extract information about the orientation of the object edges that produce the events, thereby increasing the number of constraints applied to the matching algorithm. This strategy provides a larger number of correctly matched pairs of events, improving the final 3D reconstruction.
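
A sketch of the orientation cue: build a small Gabor filter bank, correlate each event's local patch with every orientation, and attach the argmax orientation to the event as an extra matching constraint. Everything below is a toy stand-in (hand-built kernels, a synthetic patch), not the paper's filter parameters.

```python
import numpy as np

def gabor_kernel(theta, size=9, sigma=2.0, freq=0.25):
    """Real Gabor kernel with wave vector along direction theta."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    return (np.exp(-(xr**2 + yr**2) / (2 * sigma**2))
            * np.cos(2 * np.pi * freq * xr))

def edge_orientation(patch, n_orientations=8):
    """Theta of the strongest response; theta points across the edge."""
    thetas = np.pi * np.arange(n_orientations) / n_orientations
    responses = [np.abs((gabor_kernel(t) * patch).sum()) for t in thetas]
    return thetas[int(np.argmax(responses))]

# A vertical edge in a 9x9 patch accumulated from events (toy data).
patch = np.zeros((9, 9))
patch[:, 4] = 1.0
print(np.degrees(edge_orientation(patch)))   # -> 0.0 (wave vector along x)
```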

Journal ArticleDOI
TL;DR: Evidence is presented showing that the ChESS feature detector, designed to respond exclusively to chess-board vertices, offers superior robustness, accuracy, and efficiency in comparison to other commonly used detectors, both under simulation and in experimental 3D reconstruction of flat plate and cylindrical objects.

Journal ArticleDOI
TL;DR: The properties of SIRT, TVM and DART reconstructions are studied with respect to having only a limited number of electrons available for imaging and to different angular sampling schemes. Overall, the SIRT algorithm is the most stable method and is insensitive to changes in angular sampling.

Proceedings ArticleDOI
08 Dec 2014
TL;DR: This work proposes a novel linear approximation of the nonlinear camera responses within its normal estimation algorithm, and uses a coarse shape prior that proves useful in two ways: resolving the shape-light ambiguity in uncalibrated photometric stereo and guiding the estimated normals to produce a high-quality 3D surface.
Abstract: Photometric stereo using unorganized Internet images is very challenging, because the input images are captured under unknown general illuminations, with uncontrolled cameras. We propose to solve this difficult problem by a simple yet effective approach that makes use of a coarse shape prior. The shape prior is obtained from multi-view stereo and is useful in two ways: resolving the shape-light ambiguity in uncalibrated photometric stereo and guiding the estimated normals to produce a high-quality 3D surface. By assuming the surface albedo is not highly contrasted, we also propose a novel linear approximation of the nonlinear camera responses within our normal estimation algorithm. We evaluate our method using synthetic data and demonstrate the surface improvement on real data over multi-view stereo results.
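
The way a coarse prior can resolve the linear shape-light ambiguity can be shown in isolation: uncalibrated photometric stereo recovers pseudo-normals B = N G for some unknown invertible 3x3 matrix G, and noisy prior normals pin the correction down via least squares. A fully synthetic sketch, not the paper's estimation pipeline:

```python
import numpy as np

rng = np.random.default_rng(10)
N = rng.normal(size=(500, 3))
N /= np.linalg.norm(N, axis=1, keepdims=True)        # true normals

G = rng.normal(size=(3, 3))                          # unknown ambiguity
B = N @ G                                            # pseudo-normals from PS
prior = N + rng.normal(scale=0.05, size=N.shape)     # coarse "MVS" normals

A, *_ = np.linalg.lstsq(B, prior, rcond=None)        # solve B A ~= prior
corrected = B @ A
corrected /= np.linalg.norm(corrected, axis=1, keepdims=True)
print("mean normal error:", np.abs(corrected - N).mean())
```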

Journal ArticleDOI
TL;DR: This paper evaluates some feature-based methods used to automatically extract the tie points necessary for calibration and orientation procedures, in order to better understand their performance for 3D reconstruction purposes.
Abstract: Every day new tools and algorithms for automated image processing and 3D reconstruction become available, giving the possibility to process large networks of unoriented and markerless images, delivering sparse 3D point clouds in reasonable processing time. In this paper we evaluate some feature-based methods used to automatically extract the tie points necessary for calibration and orientation procedures, in order to better understand their performance for 3D reconstruction purposes. The performed tests – based on the analysis of the SIFT algorithm and its most used variants – processed some datasets and analysed various parameters and outcomes (e.g. number of oriented cameras, average rays per 3D point, average intersection angles per 3D point, theoretical precision of the computed 3D object coordinates, etc.).
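
For reference, the kind of tie-point extraction being evaluated looks roughly like this with OpenCV's stock SIFT: detect, describe, and match with Lowe's ratio test. The image file names are placeholders.

```python
import cv2

img1 = cv2.imread("view1.jpg", cv2.IMREAD_GRAYSCALE)  # placeholder images
img2 = cv2.imread("view2.jpg", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# Brute-force matching with Lowe's ratio test to keep distinctive matches.
matcher = cv2.BFMatcher(cv2.NORM_L2)
matches = matcher.knnMatch(des1, des2, k=2)
tie_points = [m for m, n in matches if m.distance < 0.75 * n.distance]
print(len(tie_points), "tie points survive the ratio test")
```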