
Showing papers on "3D reconstruction published in 2012"


Journal ArticleDOI
TL;DR: This work shows that applying traditional multiview stereo methods to the extracted low-resolution views can result in reconstruction errors due to aliasing, and incorporates Lambertian and texture-preserving priors to reconstruct both scene depth and its superresolved texture in a variational Bayesian framework.
Abstract: Portable light field (LF) cameras have demonstrated capabilities beyond conventional cameras. In a single snapshot, they enable digital image refocusing and 3D reconstruction. We show that they obtain a larger depth of field but maintain the ability to reconstruct detail at high resolution. In fact, all depths are approximately focused, except for a thin slab where blur size is bounded, i.e., their depth of field is essentially inverted compared to regular cameras. Crucial to their success is the way they sample the LF, trading off spatial versus angular resolution, and how aliasing affects the LF. We show that applying traditional multiview stereo methods to the extracted low-resolution views can result in reconstruction errors due to aliasing. We address these challenges using an explicit image formation model, and incorporate Lambertian and texture preserving priors to reconstruct both scene depth and its superresolved texture in a variational Bayesian framework, eliminating aliasing by fusing multiview information. We demonstrate the method on synthetic and real images captured with our LF camera, and show that it can outperform other computational camera systems.
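
As a rough illustration of the spatial-versus-angular trade-off the abstract describes (not the authors' pipeline), the sketch below rearranges an idealized lenslet light-field image into low-resolution sub-aperture views. It assumes a perfect s × s pixel grid under each microlens, with no demosaicing or grid rotation; function name and layout are hypothetical.

```python
import numpy as np

def subaperture_views(lf_raw, s):
    """Split a lenslet light-field image into s*s low-resolution views.

    Assumes an ideal s x s block of pixels under each microlens
    (no demosaicing, vignetting or grid rotation -- a simplification).
    """
    h, w = lf_raw.shape
    assert h % s == 0 and w % s == 0
    # [y_macro, u, x_macro, v] -> [u, v, y_macro, x_macro]
    views = lf_raw.reshape(h // s, s, w // s, s).transpose(1, 3, 0, 2)
    return views  # views[u, v] is the (h//s, w//s) view at angle (u, v)
```

Each extracted view has 1/s² of the sensor resolution, which is why the multiview stereo step operates on aliased low-resolution images.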

434 citations


Journal ArticleDOI
TL;DR: This work proposes a new method for reconstruction of 3D fetal brain MRI from 2D slices that is interleaved with motion correction and shows excellent results for clinical and optimized data.

326 citations


Proceedings ArticleDOI
13 Oct 2012
TL;DR: In this article, an empirically derived noise model for the Kinect sensor is presented, where both lateral and axial noise distributions are measured as a function of both distance and angle of the Kinect to an observed surface.
Abstract: We contribute an empirically derived noise model for the Kinect sensor. We systematically measure both lateral and axial noise distributions, as a function of both distance and angle of the Kinect to an observed surface. The derived noise model can be used to filter Kinect depth maps for a variety of applications. Our second contribution applies our derived noise model to the KinectFusion system to extend filtering, volumetric fusion, and pose estimation within the pipeline. Qualitative results show our method allows reconstruction of finer details and the ability to reconstruct smaller objects and thinner surfaces. Quantitative results also show our method improves pose estimation accuracy.
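
As a sketch of how such a noise model can drive depth-map filtering: the quadratic axial form below is in the spirit of an empirically derived Kinect model, but the coefficients and the 3-sigma gating rule are illustrative, not the paper's fitted values.

```python
import numpy as np

def axial_noise_std(z):
    """Illustrative axial noise sigma_z(z) in metres: grows
    quadratically with distance. Coefficients are made up."""
    return 1.2e-3 + 1.9e-3 * (z - 0.4) ** 2

def filter_depth(depth, kernel=2):
    """Noise-aware smoothing: average only those neighbours whose
    depth differs from the centre by less than 3 sigma at that range."""
    h, w = depth.shape
    out = depth.copy()
    for y in range(kernel, h - kernel):
        for x in range(kernel, w - kernel):
            z = depth[y, x]
            if z <= 0:          # invalid (zero) Kinect pixel
                continue
            win = depth[y - kernel:y + kernel + 1,
                        x - kernel:x + kernel + 1]
            mask = np.abs(win - z) < 3 * axial_noise_std(z)
            out[y, x] = win[mask].mean()
    return out
```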

308 citations


Book
07 Nov 2012
TL;DR: This work is a technical introduction to TOF sensors, from architectural and design issues, to selected image processing and computer vision methods.
Abstract: Time-of-flight (TOF) cameras provide a depth value at each pixel, from which the 3D structure of the scene can be estimated. This new type of active sensor makes it possible to go beyond traditional 2D image processing, directly to depth-based and 3D scene processing. Many computer vision and graphics applications can benefit from TOF data, including 3D reconstruction, activity and gesture recognition, motion capture and face detection. It is already possible to use multiple TOF cameras, in order to increase the scene coverage, and to combine the depth data with images from several colour cameras. Mixed TOF and colour systems can be used for computational photography, including full 3D scene modelling, as well as for illumination and depth-of-field manipulations. This work is a technical introduction to TOF sensors, from architectural and design issues, to selected image processing and computer vision methods.
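
For reference, the standard continuous-wave TOF relation between measured phase shift and depth (a textbook formula, not tied to any particular sensor discussed in the book):

```python
import math

C = 299_792_458.0  # speed of light, m/s

def tof_depth(phase_shift_rad, mod_freq_hz):
    """Depth from phase: d = c * dphi / (4 * pi * f).

    The factor 4*pi (not 2*pi) accounts for the round trip; depth is
    unambiguous only up to c / (2 * f), e.g. about 5 m at 30 MHz.
    """
    return C * phase_shift_rad / (4.0 * math.pi * mod_freq_hz)
```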

229 citations


Journal ArticleDOI
TL;DR: A new reconstruction algorithm for electron tomography, which is based on compressive sensing, is applied and it is shown that missing wedge artefacts are reduced in the final reconstruction.

204 citations


Journal ArticleDOI
01 Jan 2012
TL;DR: This work proposes a fast, accurate, and robust method to extract laser stripes in industrial environments using an improved Split-and-Merge approach with different approximation functions including linear, quadratic, and Akima splines.
Abstract: The use of 3D reconstruction based on active laser triangulation techniques is very complex in industrial environments. The main problem is that most of these techniques are based on laser stripe extraction methods which are highly sensitive to noise, which is virtually inevitable in these conditions. In industrial environments, variable luminance, reflections which show up in the images as noise, and uneven surfaces are common. These factors modify the shape of the laser profile. This work proposes a fast, accurate, and robust method to extract laser stripes in industrial environments. Specific procedures are proposed to extract the laser stripe projected on the background, using a boundary linking process, and on the foreground, using an improved Split-and-Merge approach with different approximation functions including linear, quadratic, and Akima splines. Also, a novel procedure to automatically define the region of interest in the image is proposed. The real-time performance of the proposed method is analyzed by measuring the time taken by the tasks involved in its application. Finally, the proposed extraction method is applied to two real applications: 3D reconstruction of steel strips and weld seam tracking.
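
A minimal baseline for the extraction step is a per-column intensity-weighted centroid; the paper's boundary-linking and Split-and-Merge procedures are designed to be far more robust than this simple sketch, whose threshold and names are illustrative.

```python
import numpy as np

def stripe_centres(img, min_intensity=40):
    """Sub-pixel laser-stripe centre per image column, via the
    intensity-weighted centroid (centre of mass) of each column.

    A common baseline extractor, highly noise-sensitive -- exactly
    the weakness the paper's method is built to overcome.
    """
    h, w = img.shape
    rows = np.arange(h, dtype=np.float64)
    centres = np.full(w, np.nan)          # NaN where no stripe found
    for x in range(w):
        col = img[:, x].astype(np.float64)
        col[col < min_intensity] = 0.0    # suppress background noise
        s = col.sum()
        if s > 0:
            centres[x] = (rows * col).sum() / s
    return centres  # row coordinate of the stripe in each column
```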

120 citations


Journal ArticleDOI
TL;DR: Comparison of the models obtained using the presented method with those obtained using a precise laser scanner shows that multiview 3D reconstruction yields models whose root mean square error, relative to average linear dimensions, is between 0.11% and 0.68%.

102 citations


Journal ArticleDOI
TL;DR: In experiments on several real-world data sets, it is shown that exploiting a silhouette coherency criterion in a multiview setting allows for dramatic improvements of silhouette quality over independent 2D segmentations without any significant increase of computational efforts.
Abstract: We propose a probabilistic formulation of joint silhouette extraction and 3D reconstruction given a series of calibrated 2D images. Instead of segmenting each image separately in order to construct a 3D surface consistent with the estimated silhouettes, we compute the most probable 3D shape that gives rise to the observed color information. The probabilistic framework, based on Bayesian inference, enables robust 3D reconstruction by optimally taking into account the contribution of all views. We solve the arising maximum a posteriori shape inference in a globally optimal manner by convex relaxation techniques in a spatially continuous representation. For an interactively provided user input in the form of scribbles specifying foreground and background regions, we build corresponding color distributions as multivariate Gaussians and find a volume occupancy that best fits to this data in a variational sense. Compared to classical methods for silhouette-based multiview reconstruction, the proposed approach does not depend on initialization and enjoys significant resilience to violations of the model assumptions due to background clutter, specular reflections, and camera sensor perturbations. In experiments on several real-world data sets, we show that exploiting a silhouette coherency criterion in a multiview setting allows for dramatic improvements of silhouette quality over independent 2D segmentations without any significant increase of computational efforts. This results in more accurate visual hull estimation, needed by a multitude of image-based modeling approaches. We made use of recent advances in parallel computing with a GPU implementation of the proposed method generating reconstructions on volume grids of more than 20 million voxels in up to 4.41 seconds.
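
A sketch of the scribble-derived colour models the abstract mentions: foreground and background are each modelled as a multivariate Gaussian fitted to user-scribbled pixels, and per-pixel log-likelihoods feed the variational occupancy estimation. Function names and the regularisation constant are illustrative.

```python
import numpy as np

def fit_gaussian(scribble_pixels):
    """Multivariate Gaussian colour model from user-scribbled RGB
    pixels of shape (N, 3), for foreground or background."""
    mu = scribble_pixels.mean(axis=0)
    cov = np.cov(scribble_pixels.T) + 1e-6 * np.eye(3)  # regularise
    return mu, cov

def log_likelihood(colour, mu, cov):
    """Log-density of one RGB colour under the fitted Gaussian."""
    d = colour - mu
    _, logdet = np.linalg.slogdet(cov)
    maha = d @ np.linalg.solve(cov, d)
    return -0.5 * (maha + logdet + 3 * np.log(2 * np.pi))
```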

100 citations


Proceedings ArticleDOI
05 Nov 2012
TL;DR: A novel non-invasive system that uses arbitrary scene geometry as a light probe for photometric registration, and a general AR rendering pipeline supporting real-time global illumination techniques, based on state-of-the-art real-time geometric reconstruction.
Abstract: Visually coherent rendering for augmented reality is concerned with seamlessly blending the virtual world and the real world in real-time. One challenge in achieving this is the correct handling of lighting. We are interested in applying real-world light to virtual objects, and compute the interaction of light between virtual and real. This implies the measurement of the real-world lighting, also known as photometric registration. So far, photometric registration has mainly been done through capturing images with artificial light probes, such as mirror balls or planar markers, or by using high dynamic range cameras with fish-eye lenses. In this paper, we present a novel non-invasive system, using arbitrary scene geometry as a light probe for photometric registration, and a general AR rendering pipeline supporting real-time global illumination techniques. Based on state of the art real-time geometric reconstruction, we show how to robustly extract data for photometric registration to compute a realistic representation of the real-world diffuse lighting. Our approach estimates the light from observations of the reconstructed model and is based on spherical harmonics, enabling plausible illumination such as soft shadows, in a mixed virtual-real rendering pipeline.
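
A minimal sketch of the spherical-harmonics side of such a system: intensity observations of diffuse surface points with known normals are projected onto the first nine real SH basis functions by least squares. The basis constants are the standard band 0-2 values; the surrounding pipeline details are assumptions, not the authors' implementation.

```python
import numpy as np

def sh_basis(d):
    """First 9 real spherical harmonics (bands 0-2) at unit direction d."""
    x, y, z = d
    return np.array([
        0.282095,
        0.488603 * y, 0.488603 * z, 0.488603 * x,
        1.092548 * x * y, 1.092548 * y * z,
        0.315392 * (3.0 * z * z - 1.0),
        1.092548 * x * z, 0.546274 * (x * x - y * y),
    ])

def estimate_lighting(normals, intensities):
    """Least-squares SH lighting coefficients from intensities of
    (assumed diffuse) reconstructed surface points with known normals.

    normals: (N, 3) unit vectors; intensities: (N,) or (N, 3) RGB.
    Returns 9 coefficients per channel.
    """
    B = np.stack([sh_basis(n) for n in normals])      # (N, 9)
    coeffs, *_ = np.linalg.lstsq(B, intensities, rcond=None)
    return coeffs
```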

95 citations


Journal ArticleDOI
01 Jul 2012
TL;DR: This research proposes a passive, camera-based system that is robust against arbitrary motion since all data is acquired within the time period of a single exposure and can successfully reconstruct a variety of facial-hair styles together with the underlying skin surface.
Abstract: Although facial hair plays an important role in individual expression, facial-hair reconstruction is not addressed by current face-capture systems. Our research addresses this limitation with an algorithm that treats hair and skin surface capture together in a coupled fashion so that a high-quality representation of hair fibers as well as the underlying skin surface can be reconstructed. We propose a passive, camera-based system that is robust against arbitrary motion since all data is acquired within the time period of a single exposure. Our reconstruction algorithm detects and traces hairs in the captured images and reconstructs them in 3D using a multiview stereo approach. Our coupled skin-reconstruction algorithm uses information about the detected hairs to deliver a skin surface that lies underneath all hairs irrespective of occlusions. In dense regions like eyebrows, we employ a hair-synthesis method to create hair fibers that plausibly match the image data. We demonstrate our scanning system on a number of individuals and show that it can successfully reconstruct a variety of facial-hair styles together with the underlying skin surface.

94 citations


Journal ArticleDOI
TL;DR: Experimental results on various data sets for diverse robotic applications have demonstrated that the novel framework is accurate, robust, maintains the orthogonality of the vanishing points and can run in real-time.
Abstract: Rotation estimation is a fundamental step for various robotic applications such as automatic control of ground/aerial vehicles, motion estimation and 3D reconstruction. However, it is now well established that traditional navigation equipment, such as global positioning systems (GPSs) or inertial measurement units (IMUs), suffers from several disadvantages. Hence, some vision-based works have been proposed recently. Whereas interesting results can be obtained, the existing methods have non-negligible limitations such as difficult feature matching (e.g. with repeated textures, blur or illumination changes) and high computational cost (e.g. analysis in the frequency domain). Moreover, most of them utilize conventional perspective cameras and thus have a limited field of view. In order to overcome these limitations, in this paper we present a novel rotation estimation approach based on the extraction of vanishing points in omnidirectional images. The first advantage is that our rotation estimation is decoupled from the translation computation, which accelerates the execution time and results in a better control solution. This is made possible by our complete framework dedicated to omnidirectional vision, whereas conventional vision has a rotation/translation ambiguity. Second, we propose a top-down approach which maintains the important constraint of vanishing point orthogonality by inverting the problem: instead of performing a difficult line clustering preliminary step, we directly search for the orthogonal vanishing points. Finally, experimental results on various data sets for diverse robotic applications have demonstrated that our novel framework is accurate, robust, maintains the orthogonality of the vanishing points and can run in real-time.
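
The closing step such a framework relies on can be sketched in closed form: given three estimated (noisy) vanishing directions, the nearest proper rotation is obtained by orthogonal Procrustes projection, which enforces the orthogonality constraint. This is the standard SVD construction, not the paper's top-down search itself.

```python
import numpy as np

def rotation_from_vps(v1, v2, v3):
    """Nearest rotation matrix to three estimated vanishing-point
    directions (unit 3-vectors on the omnidirectional sphere),
    via orthogonal Procrustes."""
    M = np.column_stack([v1, v2, v3])
    U, _, Vt = np.linalg.svd(M)
    R = U @ Vt
    if np.linalg.det(R) < 0:      # keep a proper rotation (det = +1)
        U[:, -1] *= -1.0
        R = U @ Vt
    return R
```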

Proceedings ArticleDOI
05 Nov 2012
TL;DR: An algorithm for real-time surface light-field capture from a single hand-held camera, which is able to capture dense illumination information for general specular surfaces and is ideal for future combination with dense 3D reconstruction methods.
Abstract: A single hand-held camera provides an easily accessible but potentially extremely powerful setup for augmented reality. Capabilities which previously required expensive and complicated infrastructure have gradually become possible from a live monocular video feed, such as accurate camera tracking and, most recently, dense 3D scene reconstruction. A new frontier is to work towards recovering the reflectance properties of general surfaces and the lighting configuration in a scene without the need for probes, omni-directional cameras or specialised light-field cameras. Specular lighting phenomena cause effects in a video stream which can lead current tracking and reconstruction algorithms to fail. However, the potential exists to measure and use these effects to estimate deeper physical details about an environment, enabling advanced scene understanding and more convincing AR. In this paper we present an algorithm for real-time surface light-field capture from a single hand-held camera, which is able to capture dense illumination information for general specular surfaces. Our system incorporates a guidance mechanism to help the user interactively during capture. We then split the light-field into its diffuse and specular components, and show that the specular component can be used for estimation of an environment map. This enables the convincing placement of an augmentation on a specular surface such as a shiny book, with realistic synthesized shadow, reflection and occlusion of specularities as the viewpoint changes. Our method currently works for planar scenes, but the surface light-field representation makes it ideal for future combination with dense 3D reconstruction methods.
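
One common way to perform the diffuse/specular split the abstract mentions is to treat a robust minimum over views as the view-independent diffuse part and the leftover residual as specular. This is a sketch of the general idea under that assumption; the paper's actual estimator may differ.

```python
import numpy as np

def split_light_field(samples):
    """samples: (n_views, 3) RGB observations of one surface point
    from different viewpoints.

    The view-independent (diffuse) component is approximated by a
    low percentile over views (a robust minimum); the view-dependent
    residual is treated as the specular component.
    """
    diffuse = np.percentile(samples, 10, axis=0)
    specular = np.clip(samples - diffuse, 0.0, None)
    return diffuse, specular
```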

Journal ArticleDOI
TL;DR: An automated UAV based data acquisition and outdoor site reconstruction system using a coarse digital surface model (DSM) with minimal data preprocessing and a developed view planning heuristic that considers a coverage, a maximum view angle and an overlapping constraint imposed by multi-view stereo reconstruction techniques.
Abstract: Multi-view stereo algorithms are an attractive technique for the digital reconstruction of outdoor sites. Concerning the data acquisition process, a vertical take-off and landing UAV carrying a digital camera is a suitable platform in terms of mobility and flexibility in viewpoint placement. We introduce an automated UAV-based data acquisition and outdoor site reconstruction system. A special focus is set on the problem of model-based view planning using a coarse digital surface model (DSM) with minimal data preprocessing. The developed view planning heuristic considers a coverage constraint, a maximum view angle constraint and an overlapping constraint imposed by multi-view stereo reconstruction techniques. The time complexity of the algorithm is linear with respect to the size of the area of interest. We demonstrate the efficiency of the entire system in two scenarios, a building and a hillside.

Book ChapterDOI
05 Nov 2012
TL;DR: The solution is to learn nonlinear and probabilistic low dimensional latent spaces, using the Gaussian Process Latent Variable Models dimensionality reduction technique; these act as class or activity constraints on a simultaneous and variational segmentation --- recovery --- reconstruction process.
Abstract: We propose a novel framework for joint 2D segmentation and 3D pose and 3D shape recovery, for images coming from a single monocular source. In the past, integration of all three has proven difficult, largely because of the high degree of ambiguity in the 2D - 3D mapping. Our solution is to learn nonlinear and probabilistic low dimensional latent spaces, using the Gaussian Process Latent Variable Models dimensionality reduction technique. These act as class or activity constraints to a simultaneous and variational segmentation --- recovery --- reconstruction process. We define an image and level set based energy function, which we minimise with respect to 3D pose and shape, 2D segmentation resulting automatically as the projection of the recovered shape under the recovered pose. We represent 3D shapes as zero levels of 3D level set embedding functions, which we project down directly to probabilistic 2D occupancy maps, without the requirement of an intermediary explicit contour stage. Finally, we detail a fast, open-source, GPU-based implementation of our algorithm, which we use to produce results on both real and artificial video sequences.

Journal ArticleDOI
TL;DR: The recommended approach toward intraoperative surface reconstruction based on stereo endoscopic images is fast, robust, and accurate and can represent changes in the intraoperative environment and can be used to adapt a preoperative model within the surgical site by registration of these two models.
Abstract: Purpose In laparoscopic surgery, soft tissue deformations substantially change the surgical site, thus impeding the use of preoperative planning during intraoperative navigation. Extracting depth information from endoscopic images and building a surface model of the surgical field-of-view is one way to represent this constantly deforming environment. The information can then be used for intraoperative registration. Stereo reconstruction is a typical problem within computer vision. However, most of the available methods do not fulfill the specific requirements in a minimally invasive setting such as the need for real-time performance, the problem of view-dependent specular reflections, and large curved areas with partly homogeneous or periodic textures and occlusions. Methods In this paper, the authors present an approach toward intraoperative surface reconstruction based on stereo endoscopic images. They describe their answer to this problem through correspondence analysis, disparity correction and refinement, 3D reconstruction, point cloud smoothing and meshing. Real-time performance is achieved by implementing the algorithms on the GPU. The authors also present a new hybrid CPU-GPU algorithm that unifies the advantages of the CPU and the GPU versions. Results In a comprehensive evaluation using in vivo data, in silico data from the literature and virtual data from a newly developed simulation environment, the CPU, the GPU, and the hybrid CPU-GPU versions of the surface reconstruction are compared to a CPU and a GPU algorithm from the literature. The recommended approach toward intraoperative surface reconstruction can be conducted in real time depending on the image resolution (20 fps for the GPU and 14 fps for the hybrid CPU-GPU version at a resolution of 640 × 480). It is robust to homogeneous regions without texture, large image changes, noise or errors from camera calibration, and it reconstructs the surface down to submillimeter accuracy. In all the experiments within the simulation environment, the mean distance to ground truth data is between 0.05 and 0.6 mm for the hybrid CPU-GPU version. The hybrid CPU-GPU algorithm clearly outperforms its CPU and GPU counterparts (mean distance reduction of 26% and 45%, respectively, in the experiments within the simulation environment). Conclusions The recommended approach for surface reconstruction is fast, robust, and accurate. It can represent changes in the intraoperative environment and can be used to adapt a preoperative model within the surgical site by registration of these two models.
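
The 3D reconstruction step of any rectified stereo pipeline reduces to the pinhole relation z = f·B/d. A minimal back-projection sketch under that standard model (not the authors' GPU implementation; parameter names are generic):

```python
import numpy as np

def disparity_to_points(disp, f, baseline, cx, cy):
    """Back-project a rectified disparity map to 3D points.

    disp: (h, w) disparities in pixels; f: focal length in pixels;
    baseline: stereo baseline in metres; (cx, cy): principal point.
    Returns (h, w, 3) points and a validity mask.
    """
    h, w = disp.shape
    xs, ys = np.meshgrid(np.arange(w), np.arange(h))
    valid = disp > 0
    z = np.where(valid, f * baseline / np.maximum(disp, 1e-6), 0.0)
    x = (xs - cx) * z / f
    y = (ys - cy) * z / f
    return np.stack([x, y, z], axis=-1), valid
```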

Patent
18 Dec 2012
TL;DR: In this article, an automated three dimensional mapping and display system for a diagnostic ultrasound system is presented, which enables the ultrasound probe position and orientation to be continuously displayed over a body or body part diagram.
Abstract: An automated three dimensional mapping and display system for a diagnostic ultrasound system is presented. According to the invention, ultrasound probe position registration is automated, the position of each pixel in the ultrasound image in reference to selected anatomical references is calculated, and specified information is stored on command. The system, during real time ultrasound scanning, enables the ultrasound probe position and orientation to be continuously displayed over a body or body part diagram, thereby facilitating scanning and image interpretation of stored information. The system can then record single or multiple ultrasound freehand two-dimensional (also "2D") frames in a video sequence (clip) or cine loop, wherein multiple 2D frames of one or more video sequences corresponding to a scanned volume can be reconstructed into three-dimensional (also "3D") volume images corresponding to the scanned region, using known 3D reconstruction algorithms. In later examinations, the exact location and position of the transducer can be recreated along three dimensional or two dimensional axis points, enabling known targets to be viewed from an exact, known position.

Book ChapterDOI
05 Nov 2012
TL;DR: This work proposes to extract high level primitives---planes---from an RGB-D camera, in addition to low level image features, to better constrain the problem and help improve indoor 3D reconstruction, and demonstrates with real datasets that the method with plane constraints achieves more accurate and more appealing results compared with other state-of-the-art scene reconstruction algorithms.
Abstract: Given a hand-held RGB-D camera (e.g. Kinect), methods such as Structure from Motion (SfM) and Iterative Closest Point (ICP) perform poorly when reconstructing indoor scenes with few image features or little geometric structure information. In this paper, we propose to extract high level primitives---planes---from an RGB-D camera, in addition to low level image features (e.g. SIFT), to better constrain the problem and help improve indoor 3D reconstruction. Our work has two major contributions: first, for frame to frame matching, we propose a new scheme which takes into account both low-level appearance feature correspondences in the RGB image and high-level plane correspondences in the depth image. Second, in the global bundle adjustment step, we formulate a novel error measurement that not only takes into account the traditional 3D point re-projection errors, but also the planar surface alignment errors. We demonstrate with real datasets that our method with plane constraints achieves more accurate and more appealing results compared with other state-of-the-art scene reconstruction algorithms in the aforementioned challenging indoor scenarios.
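
A sketch of the two residual types such a bundle-adjustment objective combines: the standard point reprojection error plus a point-to-plane alignment error. The parameterisation and (unweighted) combination here are illustrative, not the paper's exact error measurement.

```python
import numpy as np

def reprojection_residual(K, R, t, X, uv):
    """Standard pinhole reprojection error of 3D point X (3,)
    against its observed pixel uv (2,)."""
    p = K @ (R @ X + t)
    return p[:2] / p[2] - uv

def plane_residual(n, d, X):
    """Signed distance of point X to the plane n.X + d = 0, with
    |n| = 1 -- one way to penalise planar surface alignment
    alongside the usual reprojection terms."""
    return float(n @ X + d)
```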

Book ChapterDOI
01 Oct 2012
TL;DR: The first solution to 3D reconstruction in monocular laparoscopy using methods based on Photometric Stereo (PS) is presented, which can compute 3D from a single image, does not require correspondence estimation, and computes absolute depth densely.
Abstract: In this paper we present the first solution to 3D reconstruction in monocular laparoscopy using methods based on Photometric Stereo (PS). Our main contributions are to provide the new theory and practical solutions to successfully apply PS in close-range imaging conditions. We are specifically motivated by a solution with minimal hardware modification to existing laparoscopes. In fact, the only physical modification we make is to adjust the colour of the laparoscope's illumination via three colour filters placed at its tip. Once calibrated, our approach can compute 3D from a single image, does not require correspondence estimation, and computes absolute depth densely. We demonstrate the potential of our approach with ground truth ex-vivo and in-vivo experimentation.
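
At the core of any such system is the classic Lambertian photometric-stereo relation I = ρ L n: with three known illumination directions (here, plausibly one per colour filter), albedo and normal can be solved per pixel in closed form. A generic textbook sketch, not the paper's close-range model:

```python
import numpy as np

def photometric_stereo_pixel(I, L):
    """Classic Lambertian photometric stereo at one pixel.

    I: (3,) intensities under 3 known lights; L: (3, 3) matrix whose
    rows are the unit light directions. Solves I = rho * L @ n.
    """
    g = np.linalg.solve(L, I)        # g = rho * n
    rho = np.linalg.norm(g)          # albedo
    n = g / rho if rho > 0 else np.array([0.0, 0.0, 1.0])
    return rho, n                    # albedo and unit surface normal
```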

Journal ArticleDOI
TL;DR: A hybrid 3D reconstruction framework that supplements projected pattern correspondence matching with texture information is presented that reduces measurement errors versus traditional structured light and phase matching methodologies while being insensitive to gamma distortion, projector flickering, and secondary reflections.
Abstract: Active stereo vision is a method of 3D surface scanning involving the projecting and capturing of a series of light patterns where depth is derived from correspondences between the observed and projected patterns. In contrast, passive stereo vision reveals depth through correspondences between textured images from two or more cameras. By employing a projector, active stereo vision systems find correspondences between two or more cameras, without ambiguity, independent of object texture. In this paper, we present a hybrid 3D reconstruction framework that supplements projected pattern correspondence matching with texture information. The proposed scheme consists of using projected pattern data to derive initial correspondences across cameras and then using texture data to eliminate ambiguities. Pattern modulation data are then used to estimate error models from which Kullback-Leibler divergence refinement is applied to reduce misregistration errors. Using only a small number of patterns, the presented approach reduces measurement errors versus traditional structured light and phase matching methodologies while being insensitive to gamma distortion, projector flickering, and secondary reflections. Experimental results demonstrate these advantages in terms of enhanced 3D reconstruction performance in the presence of noise, deterministic distortions, and conditions of texture and depth contrast.
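
For reference, the closed-form KL divergence between two univariate Gaussians — the kind of quantity a KL-divergence refinement step can compare when each correspondence carries a Gaussian error model. That pairing with the paper's modulation-derived models is an assumption here.

```python
import math

def kl_gauss(mu0, s0, mu1, s1):
    """KL( N(mu0, s0^2) || N(mu1, s1^2) ), closed form."""
    return (math.log(s1 / s0)
            + (s0 ** 2 + (mu0 - mu1) ** 2) / (2.0 * s1 ** 2)
            - 0.5)
```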

Journal ArticleDOI
TL;DR: Methods for three-dimensional reconstruction and measurement of bubble feature parameters in gas–liquid two-phase flow, based on virtual stereo vision, have been developed, and the optimized design method for the structure parameters is discussed in detail.

Proceedings ArticleDOI
05 Nov 2012
TL;DR: A dense solution to all three elements of this problem: depth estimation, motion label assignment and rigid transformation estimation directly from the raw video by optimizing a single cost function using a hill-climbing approach.
Abstract: Existing approaches to camera tracking and reconstruction from a single handheld camera for Augmented Reality (AR) focus on the reconstruction of static scenes. However, most real world scenarios are dynamic and contain multiple independently moving rigid objects. This paper addresses the problem of simultaneous segmentation, motion estimation and dense 3D reconstruction of dynamic scenes. We propose a dense solution to all three elements of this problem: depth estimation, motion label assignment and rigid transformation estimation directly from the raw video by optimizing a single cost function using a hill-climbing approach. We do not require prior knowledge of the number of objects present in the scene — the number of independent motion models and their parameters are automatically estimated. The resulting inference method combines the best techniques in discrete and continuous optimization: a state of the art variational approach is used to estimate the dense depth maps while the motion segmentation is achieved using discrete graph-cut based optimization. For the rigid motion estimation of the independently moving objects we propose a novel tracking approach designed to cope with the small fields of view they induce and agile motion. Our experimental results on real sequences show how accurate segmentations and dense depth maps can be obtained in a completely automated way and used in marker-free AR applications.

Book ChapterDOI
07 Oct 2012
TL;DR: Experiments show that the use of expensive global bundle adjustment can be reduced throughout the process, while the additional cost of propagation is essentially linear in the problem size.
Abstract: This paper examines the potential benefits of applying next best view planning to sequential 3D reconstruction from unordered image sequences. A standard sequential structure-and-motion pipeline is extended with active selection of the order in which cameras are resectioned. To this end, approximate covariance propagation is implemented throughout the system, providing running estimates of the uncertainties of the reconstruction, while also enhancing robustness and accuracy. Experiments show that the use of expensive global bundle adjustment can be reduced throughout the process, while the additional cost of propagation is essentially linear in the problem size.
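
The active selection can be sketched as a greedy rule over the propagated covariances: resection next the camera with the lowest predicted uncertainty. This is an illustrative stand-in criterion; the abstract does not spell out the paper's exact rule.

```python
import numpy as np

def pick_next_camera(candidate_ids, predicted_cov):
    """Greedy next-best-view: choose the unresectioned camera whose
    predicted covariance (e.g. of its pose estimate) has the
    smallest trace. Criterion is illustrative only."""
    return min(candidate_ids, key=lambda c: np.trace(predicted_cov[c]))
```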

Proceedings ArticleDOI
13 Oct 2012
TL;DR: The proposed method is a projector-camera system that reconstructs a shape from a single image in which a static pattern is cast by a projector; such a method is ideal for acquisition of moving objects at a high frame rate.
Abstract: In this paper, we propose a method to reconstruct the shapes of moving objects. The proposed method is a projector-camera system that reconstructs a shape from a single image in which a static pattern is cast by a projector; such a method is ideal for acquisition of moving objects at a high frame rate. The issues tackled in this paper are as follows: 1) realize one-shot 3D reconstruction with a single-colored pattern, and 2) obtain accurate shapes by finding correspondences with sub-pixel accuracy. To achieve these goals, we propose the following methods: 1) implicit encoding of projector information by a grid of wave lines, 2) grid-based stereo between the projector pattern and camera images to determine unique correspondences, and 3) (quasi-)pixel-wise interpolations and optimizations to reconstruct dense shapes; in addition, 4) the single-colored pattern simplifies pattern-projecting devices compared to color-coded methods. In the experiments, we show that the proposed method efficiently solves the issues above.

Journal ArticleDOI
TL;DR: In this article, an approach for fully automatic image-based 3D reconstruction of buildings using UAVs is described, and the results of applying the complete processing chain, from image acquisition to a dense surface mesh, are presented and discussed.
Abstract: Unmanned Aerial Vehicles (UAVs) offer several new possibilities in a wide range of applications. One example is the 3D reconstruction of buildings. Previously, this was restricted either to the reconstruction of facades, by earthbound vehicles, or to very coarse building models, by airborne sensors. This paper describes an approach for fully automatic image-based 3D reconstruction of buildings using UAVs. UAVs are able to observe the whole 3D scene and to capture images of the object of interest from completely different perspectives. The platform used in this work is a Falcon 8 octocopter from Ascending Technologies. A slightly modified high-resolution consumer camera serves as the sensor for data acquisition. The final 3D reconstruction is computed offline after image acquisition and follows a reconstruction process originally developed for image sequences obtained by earthbound vehicles. The performance of the described method is evaluated on benchmark datasets, showing that the achieved accuracy is high and even comparable with Light Detection and Ranging (LIDAR). Additionally, the results of applying the complete processing chain, starting at image acquisition and ending in a dense surface mesh, are presented and discussed.

Journal ArticleDOI
TL;DR: The experimental results reveal a rather surprising and useful yet overlooked fact that the SVP camera model with radial distortion correction and focal length adjustment can compensate for refraction and achieve high accuracy in multiview underwater 3D reconstruction compared with the results of land-based systems.
Abstract: In an underwater imaging system, a perspective camera is often placed outside a tank or in waterproof housing with a flat glass window. The refraction of light occurs when a light ray passes through the water-glass and air-glass interface, rendering the conventional multiple view geometry based on the single viewpoint (SVP) camera model invalid. While most recent underwater vision studies mainly focus on the challenging topic of calibrating such systems, no previous work has systematically studied the influence of refraction on underwater three-dimensional (3D) reconstruction. This paper demonstrates the possibility of using the SVP camera model in underwater 3D reconstruction through theoretical analysis of refractive distortion and simulations. Then, the performance of the SVP camera model in multiview underwater 3D reconstruction is quantitatively evaluated. The experimental results reveal a rather surprising and useful yet overlooked fact that the SVP camera model with radial distortion correction and focal length adjustment can compensate for refraction and achieve high accuracy in multiview underwater 3D reconstruction (within 0.7 mm for an object of dimension 200 mm) compared with the results of land-based systems. Such an observation justifies the use of the SVP camera model in underwater application for reconstructing reliable 3D scenes. Our results can be used to guide the selection of system parameters in the design of an underwater 3D imaging setup.
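
The refractive distortion analysed here can be simulated with the vector form of Snell's law at each flat interface (water-glass, glass-air). The formula is standard; the interface geometry and indices you feed it are up to the simulation.

```python
import numpy as np

def refract(d, n, n1, n2):
    """Vector form of Snell's law.

    d: unit incident direction; n: unit interface normal pointing
    toward the incident side (n . d < 0); n1 -> n2: refractive
    indices. Returns the refracted unit direction, or None on
    total internal reflection.
    """
    eta = n1 / n2
    cos_i = -np.dot(n, d)
    sin2_t = eta * eta * (1.0 - cos_i * cos_i)
    if sin2_t > 1.0:
        return None                       # total internal reflection
    cos_t = np.sqrt(1.0 - sin2_t)
    return eta * d + (eta * cos_i - cos_t) * n
```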

Journal ArticleDOI
TL;DR: A correlation-based stereo algorithm, suitable for parallel processing, is developed with the actual scene structure in mind; it uses multiple camera pairs to validate 3D locations and three different certainty metrics, and does not require extrinsic rectification of the images.
Abstract: This paper presents research using a correlation-based stereo vision approach to 3D blossom mapping for automated thinning of peach blossoms on perpendicular “V” architecture trees. To this end, a calibrated camera system is proposed, based upon three synchronized ten-megapixel cameras and flash illumination for nighttime image acquisition. A correlation-based stereo algorithm, suitable for parallel processing, is developed with the actual scene structure in mind; it uses multiple camera pairs to validate 3D locations and three different certainty metrics, and does not require extrinsic rectification of the images. Results show that mapping accuracy of less than half of a blossom width (~1 cm) is feasible, and validate the approach as the sensor part of an automated selective blossom thinning system. Furthermore, the effects of the different certainty metrics are examined. They effectively improve the accuracy of blossom positions when the visibility of blossoms is good, by removing insecure matches and through qualified selection of subsets of cameras for 3D triangulation. The proposed algorithm is compared to, and found superior to, a popular global optimization algorithm designed to perform well in another scene structure, demonstrating the quality of correlation-based stereo in practical applications.
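
The matching score at the heart of correlation-based stereo is the zero-mean normalised cross-correlation between image patches; a minimal version follows (the paper's multi-pair validation and certainty metrics sit on top of a score like this).

```python
import numpy as np

def ncc(a, b):
    """Zero-mean normalised cross-correlation of two equally sized
    patches; returns a score in [-1, 1], higher = better match."""
    a = a.astype(np.float64).ravel()
    b = b.astype(np.float64).ravel()
    a -= a.mean()
    b -= b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0
```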

Journal ArticleDOI
TL;DR: A novel approach to monocular deformable shape recovery that can operate under complex lighting and handle partially textured surfaces is proposed, using a learned mapping from intensity patterns to the shape of local surface patches and a principled approach to piecing together the resulting local shape estimates.
Abstract: Most recent approaches to monocular nonrigid 3D shape recovery rely on exploiting point correspondences and work best when the whole surface is well textured. The alternative is to rely on either contours or shading information, which has only been demonstrated in very restrictive settings. Here, we propose a novel approach to monocular deformable shape recovery that can operate under complex lighting and handle partially textured surfaces. At the heart of our algorithm are a learned mapping from intensity patterns to the shape of local surface patches and a principled approach to piecing together the resulting local shape estimates. We validate our approach quantitatively and qualitatively using both synthetic and real data.

Journal ArticleDOI
TL;DR: A methodology to automatically align 3D views has been developed by integrating a stereo vision system and a full field optical scanner and has been validated by experimental tests regarding both the evaluation of the measurement accuracy and the 3D reconstruction of an industrial shape.

Proceedings ArticleDOI
16 Jun 2012
TL;DR: This paper proposes a novel approach, called example-based 3D object reconstruction from line drawings, which is based on the observation that a natural or man-made complex3D object normally consists of a set of basic 3D objects.
Abstract: Recovering 3D geometry from a single 2D line drawing is an important and challenging problem in computer vision. It has wide applications in interactive 3D modeling from images, computer-aided design, and 3D object retrieval. Previous methods of 3D reconstruction from line drawings are mainly based on a set of heuristic rules. They are not robust to sketch errors and often fail for objects that do not satisfy the rules. In this paper, we propose a novel approach, called example-based 3D object reconstruction from line drawings, which is based on the observation that a natural or man-made complex 3D object normally consists of a set of basic 3D objects. Given a line drawing, a graphical model is built where each node denotes a basic object whose candidates are from a 3D model (example) database. The 3D reconstruction is solved using a maximum-a-posteriori (MAP) estimation such that the reconstructed result best fits the line drawing. Our experiments show that this approach achieves much better reconstruction accuracy and is more robust to imperfect line drawings than previous methods.

Book ChapterDOI
07 Oct 2012
TL;DR: This work proposes a novel dense depth estimation method which can automatically recover accurate and consistent depth maps from the synchronized video sequences taken by a few handheld cameras, and simultaneously solves bilayer segmentation and depth estimation in a unified energy minimization framework.
Abstract: Accurate dense 3D reconstruction of dynamic scenes from natural images is still very challenging. Most previous methods rely on a large number of fixed cameras to obtain good results. Some of these methods further require separation of static and dynamic points, which are usually restricted to scenes with known background. We propose a novel dense depth estimation method which can automatically recover accurate and consistent depth maps from the synchronized video sequences taken by a few handheld cameras. Unlike fixed camera arrays, our data capturing setup is much more flexible and easier to use. Our algorithm simultaneously solves bilayer segmentation and depth estimation in a unified energy minimization framework, which combines different spatio-temporal constraints for effective depth optimization and segmentation of static and dynamic points. A variety of examples demonstrate the effectiveness of the proposed framework.