
Showing papers on "3D reconstruction published in 2001"


01 Jan 2001
TL;DR: This entry is the book Multiple View Geometry in Computer Vision (Hartley and Zisserman), a standard reference covering projective geometry, camera models, epipolar geometry, and multi-view 3D reconstruction.

14,282 citations


Journal ArticleDOI
TL;DR: A survey on deformable surfaces identifies the main representations proposed in the literature and studies how the choice of representation influences the model's evolution behavior, revealing similarities between different approaches.

319 citations


Proceedings ArticleDOI
01 Dec 2001
TL;DR: A novel solution for flow-based tracking and 3D reconstruction of deforming objects in monocular image sequences: the object is modeled as a linear combination of 3D basis shapes, and the resulting rank constraint is used to achieve robust and precise low-level optical flow estimation.
Abstract: This paper presents a novel solution for flow-based tracking and 3D reconstruction of deforming objects in monocular image sequences. A non-rigid 3D object undergoing rotation and deformation can be effectively approximated using a linear combination of 3D basis shapes. This puts a bound on the rank of the tracking matrix. The rank constraint is used to achieve robust and precise low-level optical flow estimation without prior knowledge of the 3D shape of the object. The bound on the rank is also exploited to handle occlusion at the tracking level leading to the possibility of recovering the complete trajectories of occluded/disoccluded points. Following the same low-rank principle, the resulting flow matrix can be factored to get the 3D pose, configuration coefficients, and 3D basis shapes. The flow matrix is factored in an iterative manner, looping between solving for pose, configuration, and basis shapes. The flow-based tracking is applied to several video sequences and provides the input to the 3D non-rigid reconstruction task. Additional results on synthetic data and comparisons to ground truth complete the experiments.
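
The rank bound that drives this method can be illustrated with a minimal NumPy sketch: under a weak-perspective model with K basis shapes, the registered 2F x P tracking matrix has rank at most 3K, so projecting noisy tracks onto that subspace is a simple way to denoise them. The helper below is a hypothetical illustration under those assumptions, not the authors' iterative factorization into pose, configuration, and basis shapes.

```python
import numpy as np

def project_to_rank_3k(W, K):
    """Project a 2F x P tracking matrix W onto the rank-3K subspace implied by
    K deformation basis shapes (weak-perspective model). Illustrative sketch;
    the paper alternates between solving for pose, configuration coefficients,
    and basis shapes."""
    t = W.mean(axis=1, keepdims=True)            # per-frame image translation
    U, s, Vt = np.linalg.svd(W - t, full_matrices=False)
    r = min(3 * K, len(s))                       # rank bound from K basis shapes
    return (U[:, :r] * s[:r]) @ Vt[:r] + t
```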

290 citations


Journal ArticleDOI
TL;DR: The fast marching method, coupled with back-tracking via gradient descent along the reconstructed surface, is shown to solve the path-planning problem in robot navigation.
Abstract: An optimal algorithm for the reconstruction of a surface from its shading image is presented. The algorithm solves the problem of 3D reconstruction from a single shading image. The shading image is treated as a penalty function and the height of the reconstructed surface is a weighted distance. A consistent numerical scheme based on Sethian's fast marching method is used to compute the reconstructed surface. The surface is a viscosity solution of an Eikonal equation for the vertical light source case. For the oblique light source case, the reconstructed surface is the viscosity solution to a different partial differential equation. A modification of the fast marching method yields a numerically consistent, computationally optimal, and practically fast algorithm for the classical shape from shading problem. Next, the fast marching method coupled with back-tracking via gradient descent along the reconstructed surface is shown to solve the path-planning problem in robot navigation.
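
A minimal sketch of the vertical-light-source idea is given below: for a Lambertian surface, sqrt(1/I^2 - 1) acts as the local slant, and a Dijkstra-like fast-marching sweep from a known surface minimum produces the weighted-distance height map. The seed point, intensity clipping, and grid spacing h are assumptions of this sketch, not the paper's numerical scheme, which also treats oblique light sources.

```python
import heapq
import numpy as np

def shape_from_shading_fmm(I, seed, h=1.0):
    """Toy fast-marching solver for |grad z| = sqrt(1/I^2 - 1), swept outward
    from a known local minimum `seed` (a (row, col) tuple)."""
    F = np.sqrt(np.maximum(1.0 / np.clip(I, 1e-6, 1.0) ** 2 - 1.0, 0.0))
    z = np.full(I.shape, np.inf)
    done = np.zeros(I.shape, dtype=bool)
    z[seed] = 0.0
    heap = [(0.0, seed)]
    while heap:
        _, (i, j) = heapq.heappop(heap)
        if done[i, j]:
            continue
        done[i, j] = True
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            a, b = i + di, j + dj
            if not (0 <= a < I.shape[0] and 0 <= b < I.shape[1]) or done[a, b]:
                continue
            # upwind neighbours of (a, b) along rows and columns
            zr = min(z[a - 1, b] if a > 0 else np.inf,
                     z[a + 1, b] if a < I.shape[0] - 1 else np.inf)
            zc = min(z[a, b - 1] if b > 0 else np.inf,
                     z[a, b + 1] if b < I.shape[1] - 1 else np.inf)
            f = h * F[a, b]
            if np.isfinite(zr) and np.isfinite(zc) and abs(zr - zc) < f:
                znew = 0.5 * (zr + zc + np.sqrt(2 * f * f - (zr - zc) ** 2))
            else:
                znew = min(zr, zc) + f
            if znew < z[a, b]:
                z[a, b] = znew
                heapq.heappush(heap, (znew, (a, b)))
    return z
```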

232 citations


Proceedings ArticleDOI
07 Jul 2001
TL;DR: This framework allows parallelepipeds to be fully exploited, thus overcoming several limitations of calibration approaches based on cuboids, and is illustrated with an original and very efficient interactive method for 3D reconstruction from single images.
Abstract: In this paper parallelepipeds and their use in camera calibration and 3D reconstruction processes are studied. Parallelepipeds naturally characterize rigidity constraints present in a scene, such as parallelism and orthogonality. A subclass of parallelepipeds, the cuboids, has frequently been used in the past to partially calibrate cameras. However, the full potential of parallelepipeds, in camera calibration as well as in scene reconstruction, has never been clearly established. We propose a new framework for the use of parallelepipeds which is based on an extensive study of this potential. In particular, we exhibit the complete duality that exists between the intrinsic metric characteristics of a parallelepiped and the intrinsic parameters of a camera. Our framework allows parallelepipeds to be fully exploited and thus overcomes several limitations of calibration approaches based on cuboids. To illustrate this framework, we present an original and very efficient interactive method for 3D reconstruction from single images. This method allows a scene model to be built quickly from a single uncalibrated image.

114 citations


Proceedings ArticleDOI
07 Jul 2001
TL;DR: Two methods developed by re-examining the underlying image formation process are introduced; they make no assumptions about the object's shape, the presence or absence of shadowing, or the nature of the BRDF, which may vary over the surface.
Abstract: We address an open and hitherto neglected problem in computer vision, how to reconstruct the geometry of objects with arbitrary and possibly anisotropic bidirectional reflectance distribution functions (BRDFs). Present reconstruction techniques, whether stereo vision, structure from motion, laser range finding, etc. make explicit or implicit assumptions about the BRDF. Here, we introduce two methods that were developed by re-examining the underlying image formation process; the methods make no assumptions about the object's shape, the presence or absence of shadowing, or the nature of the BRDF which may vary over the surface. The first method takes advantage of Helmholtz reciprocity, while the second method exploits the fact that the radiance along a ray of light is constant. In particular, the first method uses stereo pairs of images in which point light sources are co-located at the centers of projection of the stereo cameras. The second method is based on double covering a scene's incident light field; the depths of surface points are estimated using a large collection of images in which the viewpoint remains fixed and a point light source illuminates the object. Results from our implementations lend empirical support to both techniques.

113 citations


Journal ArticleDOI
TL;DR: Some of the fundamental findings in the study of early vision are surveyed including basic visual anatomy and physiology, optical properties of the eye, light sensitivity and visual adaptation, and spatial vision.
Abstract: Visually based techniques in computer graphics have blossomed. Important advances in perceptually driven rendering, realistic image display, high-fidelity visualization, and appearance-preserving geometric simplification have all been realized by applying knowledge of the limitations and capabilities of human visual processing. Much of this work is grounded in the physiology and psychophysics of early vision, which focuses on how visual mechanisms transduce and code the patterns of light arriving at the eye. The article surveys some of the fundamental findings in the study of early vision including basic visual anatomy and physiology, optical properties of the eye, light sensitivity and visual adaptation, and spatial vision.

101 citations


Proceedings ArticleDOI
01 Dec 2001
TL;DR: This study shows that skin reflectance data can best be approximated by a linear combination of Gaussians or their first derivatives, which has a significant practical impact on optical acquisition devices: the entire visible spectrum of skin reflectances can now be captured with a few filters of optimally chosen central wavelengths and bandwidth.
Abstract: The automated detection of humans in computer vision as well as the realistic rendering of people in computer graphics necessitates improved modeling of the human skin color. We describe the acquisition and modeling of skin reflectance data densely sampled over the entire visible spectrum. The data collected through a spectrograph allows us to explain skin color (and its variations) and to discriminate between human skin and dyes designed to mimic human skin. We study the approximation of these data using several sets of basis functions. Our study shows that skin reflectance data can best be approximated by a linear combination of Gaussians or their first derivatives. This result has a significant practical impact on optical acquisition devices: the entire visible spectrum of skin reflectance can now be captured with a few filters of optimally chosen central wavelengths and bandwidth.
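
A linear least-squares fit of a sampled spectrum against a small Gaussian basis, as described above, might look like the sketch below. The basis centers and the shared width are placeholders of this sketch, whereas the paper optimizes the central wavelengths and bandwidths of the basis functions.

```python
import numpy as np

def fit_gaussian_basis(wavelengths, reflectance, centers, width):
    """Approximate a sampled reflectance spectrum by a linear combination of
    Gaussian basis functions via linear least squares. `centers` and `width`
    are hypothetical; the paper chooses them to minimize approximation error."""
    wl = np.asarray(wavelengths, dtype=float)
    c = np.asarray(centers, dtype=float)
    B = np.exp(-0.5 * ((wl[:, None] - c[None, :]) / width) ** 2)
    coeffs, *_ = np.linalg.lstsq(B, np.asarray(reflectance, dtype=float), rcond=None)
    return coeffs, B @ coeffs                    # basis weights and reconstruction
```

For example, a spectrum sampled at 1 nm steps over 400-700 nm could be fit with three assumed centers such as 450, 550, and 650 nm and a 60 nm width.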

90 citations


Journal ArticleDOI
TL;DR: A method is presented in which the registration of transmission electron microscope images is automated using conventional colloidal gold particles as reference markers between images; the results show not only the reliability of the suggested method but also a high level of accuracy in alignment.

78 citations


Proceedings ArticleDOI
01 Dec 2001
TL;DR: A dataset of several thousand calibrated, geo-referenced, high dynamic range color images, acquired under uncontrolled, variable illumination in an outdoor region spanning hundreds of meters is described.
Abstract: We describe a dataset of several thousand calibrated, geo-referenced, high dynamic range color images, acquired under uncontrolled, variable illumination in an outdoor region spanning hundreds of meters. All image, feature, calibration, and geo-referencing data are available at http://city.lcs.mit.edu/data. Calibrated imagery is of fundamental interest in a wide variety of applications. We have made this data available in the belief that researchers in computer graphics, computer vision, photogrammetry and digital cartography will find it useful in several ways: as a test set for their own algorithms; as a calibrated image set for applications such as image-based rendering, metric 3D reconstruction, and appearance recovery; and as controlled imagery for integration into existing GIS systems and applications. The Web-based interface to the data provides interactive viewing of high-dynamic-range images and mosaics; extracted edge and point features; intrinsic and extrinsic calibration, along with maps of the ground context in which the images were acquired; the spatial adjacency relationships among images; the epipolar geometry relating adjacent images; compass and absolute scale overlays; and quantitative consistency measures for the calibration data.

74 citations


Journal ArticleDOI
TL;DR: A strategy for quantitative comparison and optimization of 3D reconstruction algorithms is developed, and it is shown that the previously reported elongation artifacts can be strongly reduced.

Journal ArticleDOI
TL;DR: A novel non-intrusive technique able to reconstruct unconstrained motion is presented: from multiple-viewpoint images taken with an ordinary camera, a 3D reconstruction is computed with a technique known as volume intersection.

Proceedings ArticleDOI
21 May 2001
TL;DR: Key features of the user interface of a robot-assisted system for medical diagnostic ultrasound are presented; image servoing can be enabled to automatically compensate, through robot motions, for unwanted motions in the plane of the ultrasound beam.
Abstract: A robot-assisted system for medical diagnostic ultrasound has been developed by the authors. This paper presents key features of the user interface used in this system. While the ultrasound transducer is positioned by a robot, the operator, the robot controller, and an ultrasound image processor have shared control over its motion. Ultrasound image features that can be selected by the operator are recognized and tracked by a variety of techniques. Based on feature tracking, ultrasound image servoing in three axes has been incorporated in the interface and can be enabled to automatically compensate, through robot motions, for unwanted motions in the plane of the ultrasound beam. The stability and accuracy of the system are illustrated through a 3D reconstruction of an ultrasound phantom.

Journal ArticleDOI
07 Jul 2001
TL;DR: The algorithm is based on a stratified approach, starting with affine reconstruction from factorization, followed by rectification to metric structure using the articulated structure constraints, and shows promise as a means of creating 3D animations of dynamic activities such as sports events.
Abstract: We present an algorithm for 3D reconstruction of dynamic articulated structures, such as humans, from uncalibrated multiple views. The reconstruction exploits constraints associated with a dynamic articulated structure, specifically the conservation over time of length between rotational joints. These constraints admit metric reconstruction from at least two different images in each of two uncalibrated parallel projection cameras. The algorithm is based on a stratified approach, starting with affine reconstruction from factorization, followed by rectification to metric structure using the articulated structure constraints. The exploitation of these specific constraints allows reconstruction and self-calibration with fewer feature points and views compared to standard self-calibration. The method is extended to pairs of cameras that are zooming, where calibration of the cameras allows compensation for the changing scale factor in a scaled orthographic camera. Results are presented in the form of stick figures and animated 3D reconstructions using pairs of sequences from broadcast television. The technique shows promise as a means of creating 3D animations of dynamic activities such as sports events.

Journal ArticleDOI
TL;DR: The reconstruction problem can be framed as an optimization over a compact set with low dimension (no more than four) and can be solved efficiently by coupling standard nonlinear optimization techniques with a multistart method.
Abstract: This paper deals with the problem of recovering the dimensions of an object and its pose from a single image acquired with a camera of unknown focal length. It is assumed that the object in question can be modeled as a polyhedron where the coordinates of the vertices can be expressed as a linear function of a dimension vector. The reconstruction program takes as input a set of correspondences between features in the model and features in the image. From this information, the program determines an appropriate projection model for the camera, the dimensions of the object, its pose relative to the camera and, in the case of perspective projection, the focal length of the camera. This paper describes how the reconstruction problem can be framed as an optimization over a compact set with low dimension (no more than four). This optimization problem can be solved efficiently by coupling standard nonlinear optimization techniques with a multistart method. The result is an efficient, reliable solution system that does not require initial estimates for any of the parameters being estimated.
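
Because the search space is a compact set of dimension at most four, the multistart strategy mentioned above is cheap. A generic sketch using SciPy is shown below, with the objective (for example, a reprojection-error function of the dimension and pose parameters) and its box bounds left as placeholders; it is not the paper's solver.

```python
import numpy as np
from scipy.optimize import minimize

def multistart_minimize(objective, bounds, n_starts=20, seed=0):
    """Run a local optimizer from several random starting points inside a
    low-dimensional box and keep the best result. `objective` and `bounds`
    stand in for the paper's error function and its compact parameter domain."""
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(bounds, dtype=float).T   # bounds given as (low, high) pairs
    best = None
    for _ in range(n_starts):
        x0 = rng.uniform(lo, hi)                 # random start inside the box
        res = minimize(objective, x0, bounds=list(zip(lo, hi)))
        if best is None or res.fun < best.fun:
            best = res
    return best
```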

Proceedings ArticleDOI
01 Dec 2001
TL;DR: A novel probabilistic framework for 3D surface reconstruction from multiple stereo images using a discrete voxelized representation of the scene to model and recover surfaces which may be occluded in some views.
Abstract: The paper presents a novel probabilistic framework for 3D surface reconstruction from multiple stereo images. The method works on a discrete voxelized representation of the scene. An iterative scheme is used to estimate the probability that a scene point lies on the true 3D surface. The novelty of our approach lies in the ability to model and recover surfaces which may be occluded in some views. This is done by explicitly estimating the probabilities that a 3D scene point is visible in a particular view from the set of given images. This relies on the fact that for a point on a Lambertian surface, if the pixel intensities of its projection along two views differ, then the point is necessarily occluded in one of the views. We present results of surface reconstruction from both real and synthetic image sets.
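
The Lambertian argument above suggests a simple photo-consistency score: if the visible projections of a voxel disagree in intensity, the voxel is either off the surface or occluded in some views. The toy score below weights each view by its current visibility probability; it is only a hedged illustration of that idea with an assumed noise scale, not the paper's iterative estimator.

```python
import numpy as np

def photo_consistency(intensities, visibility_prob, sigma=10.0):
    """Toy surface score for one voxel: variance of its projected pixel
    intensities, weighted by per-view visibility probabilities, mapped through
    a Gaussian. `sigma` (in grey levels) is an assumed noise scale."""
    w = np.asarray(visibility_prob, dtype=float)
    c = np.asarray(intensities, dtype=float)
    mu = np.average(c, weights=w)                 # visibility-weighted mean colour
    var = np.average((c - mu) ** 2, weights=w)    # visibility-weighted variance
    return float(np.exp(-var / (2.0 * sigma ** 2)))
```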

Journal ArticleDOI
TL;DR: The 3D model thus achieved was useful for virtual preoperative planning and for simulation of the internal fixation of long bones.
Abstract: We present a new concept with mathematical background for the construction of a three-dimensional (3D) volumetric model of the human tibia based on two conventional orthogonal two-dimensional (2D) radiographic images. This approach is supported by a computer database containing a collection of 80 2D/3D image data sets of individual cadaveric tibiae. For each of these tibiae, the database contains digitized 2D orthogonal radiographic images in both anterior and lateral views, and the corresponding 3D CT data obtained by computerized tomography. To obtain a 3D model of a tibia for a given patient, shape matching is performed. The computer finds the most similar tibia to the patient's tibia among the 2D radiographic images in the database by applying a matching process. To improve accuracy, a 2D image warping procedure can be applied on the slices of the selected bone prior to 3D reconstruction. The warping process is controlled by the contour data of the two orthogonal views. We found that the 3D model thus achieved was useful for virtual preoperative planning and for simulation of the internal fixation of long bones.

Proceedings ArticleDOI
21 May 2001
TL;DR: This work presents an application of image-based visual servoing to computer graphics animation and applies the approach in two different contexts: highly reactive applications (virtual reality, video games) and the control of humanoid avatars.
Abstract: This paper presents an application of image-based visual servoing to computer graphics animation. Indeed, the control of a virtual camera in a virtual environment is not a trivial problem and usually requires skilled operators. Visual servoing, a now well-known technique in robotics and computer vision, consists of positioning a camera according to the information perceived in the images. Using this method within a computer graphics context leads to a very intuitive approach to animation. Furthermore, in that case full knowledge about the scene is available. This allows us to easily introduce constraints into the control law in order to react automatically to modifications of the environment. We apply this approach in two different contexts: highly reactive applications (virtual reality, video games) and the control of humanoid avatars.

Proceedings ArticleDOI
08 Dec 2001
TL;DR: This work investigates how scale fixing influences the accuracy of 3D reconstruction and determines what measurement should be made to maximize the shape accuracy.
Abstract: Computer vision techniques can estimate 3D shape from images, but usually only up to a scale factor. The scale factor must be obtained by a physical measurement of the scene or the camera motion. Using gauge theory, we show that how this scale factor is determined can significantly affect the accuracy of the estimated shape. And yet these considerations have been ignored in previous works where 3D shape accuracy is optimized. We investigate how scale fixing influences the accuracy of 3D reconstruction and determine what measurement should be made to maximize the shape accuracy.

Book ChapterDOI
14 Oct 2001
TL;DR: A new method is proposed to reconstruct and analyze the left ventricle from multiple acoustic window three-dimensional ultrasound acquired using a transthoracic 3-D rotational probe; results show that the new method agrees better with MRI measurements than the previous approach based on a single acoustic window.
Abstract: A new method is proposed to reconstruct and analyse the left ventricle (LV) from multiple acoustic windows 3D ultrasound acquired using a 3D rotational probe. Prior research in this area has been based on one acoustic window acquisition. However, the data suffers from several limitations that degrade the 3D reconstruction, such as motion of the probe during the acquisition and the presence of shadow due to bone (ribs) and air (in the lungs). In this paper we aim to overcome these limitations by automatically fusing information from multiple acoustic windows sparse-view acquisitions and using a position sensor to track the probe in real time. Geometric constraints of the object shape, and spatio-temporal information relating to the image acquisition process are used to propose new algorithms for (1) grouping endocardial edge cues from an initial image segmentation and (2) defining a novel reconstruction method that utilises information from multiple acoustic windows. We illustrate our new method on phantom and real heart data and compare its performance against our previous approach that is based on a single acoustic window.

Book ChapterDOI
28 May 2001
TL;DR: 3D-ultrasound can become a new, fast, non-radiative, non-invasive, and inexpensive tomographic medical imaging technique with unique advantages for the localization of vessels and tumors in soft tissue.
Abstract: 3D-ultrasound can become a new, fast, non-radiative, non-invasive, and inexpensive tomographic medical imaging technique with unique advantages for the localization of vessels and tumors in soft tissue (spleen, kidneys, liver, breast, etc.). In general, unlike the usual 2D-ultrasound, in the 3D case a complete volume is covered with a whole series of cuts, which would enable 3D reconstruction and visualization. In the last two decades, many researchers have attempted to produce systems that would allow the construction and visualization of three-dimensional (3-D) images from ultrasound data. There is a general agreement that this development represents a positive step forward in medical imaging, and clinical applications have been suggested in many different areas. However, it is clear that 3-D ultrasound has not yet gained widespread clinical acceptance, and that there are still important problems to solve before it becomes a common tool.

Proceedings ArticleDOI
03 Jul 2001
TL;DR: This paper implements Feldkamp CBR with multi-level acceleration on a commercially available PC based on hybrid computing (HC), utilizing single-instruction-multiple-data (SIMD) instructions and making the execution units (EU) in the processor work effectively.
Abstract: Cone beam reconstruction has attracted a great deal of attention in the medical imaging community. However, high-resolution cone beam reconstruction (CBR) involves a huge set of data and very time-consuming computation. It usually needs customized hardware or a large-scale computer to achieve acceptable speed. Although the Feldkamp algorithm is an approximate CBR algorithm, it is a practical and efficient 3D reconstruction algorithm and is a basic component in several exact cone-beam reconstruction algorithms (CBRA). In this paper, we present a practical implementation for high-speed CBR on a commercially available PC based on hybrid computing (HC). We implement Feldkamp CBR with multi-level acceleration. We use HC, utilizing single instruction multiple data (SIMD) and making the execution units (EU) in the processor work effectively. We also utilize the multi-thread and fiber support of the operating system, which automatically enables reconstruction parallelism in a multi-processor environment and makes data I/O to the hard disk more effective. Memory and cache access are optimized by proper data partitioning. This approach was tested on a 500 MHz Intel Pentium III computer and was compared to the traditional implementation. It reduces the filtering time for 288 projections by more than 75%, saves more than 60% of the reconstruction time for a 512³ volume, and maintains good precision with less than 0.08% average error. Our system is cost-effective and high-speed. An effective reconstruction engine can be built with a market-available symmetric multi-processor (SMP) computer. This is an easy and cheap upgrade and is compatible with newer PC processors.

Proceedings ArticleDOI
01 May 2001
TL;DR: An algorithm for the automatic construction of a 3D model of archaeological vessels is presented, and a rough model of the object is obtained quickly and is refined as the processed level of the octree increases.
Abstract: An algorithm for the automatic construction of a 3D model of archaeological vessels is presented. In archaeology the determination of the exact volume of arbitrary vessels is of importance since this provides information about the manufacturer and the usage of the vessel. Acquiring the shape of objects with handles in 3D is complicated, since occlusions of the object's surface are introduced by the handle and can only be resolved by taking multiple views. Therefore, the 3D reconstruction is based on a sequence of images of the object taken from different viewpoints. The object's silhouette is the only feature which is extracted from an input image. Images are acquired by rotating the object on a turntable in front of a stationary camera. The algorithm uses an octree representation of the model, and builds this model incrementally by performing limited processing of all input images for each level of the octree. Beginning from the root node at level 0, a rough model of the object is obtained quickly and is refined as the processed level of the octree increases. Results of the developed algorithm are presented for both synthetic and real input images.
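
The incremental, coarse-to-fine behaviour described above can be sketched as a recursive octree carve against the silhouettes. The test callable (does a 3D point project inside every silhouette?) and the corner-based cube classification below are simplifying assumptions of this sketch, not the authors' exact processing of the input images.

```python
import numpy as np

def octree_carve(inside_all, center, half, depth, max_depth, out):
    """Recursive visual-hull sketch: keep a cube whose corners all project
    inside every silhouette, discard one whose corners all fall outside, and
    subdivide ambiguous cubes until `max_depth`. `inside_all` is a hypothetical
    callable mapping an (8, 3) array of points to a boolean array of length 8."""
    center = np.asarray(center, dtype=float)
    offsets = half * np.array([[sx, sy, sz] for sx in (-1, 1)
                               for sy in (-1, 1) for sz in (-1, 1)], dtype=float)
    inside = np.asarray(inside_all(center + offsets))
    if inside.all():                              # consistent with every view: keep
        out.append((center, half))
    elif inside.any() and depth < max_depth:      # mixed: refine one octree level
        for off in offsets:                       # children at half-size offsets
            octree_carve(inside_all, center + off / 2.0, half / 2.0,
                         depth + 1, max_depth, out)
    elif inside.any():                            # mixed leaf at max depth: keep
        out.append((center, half))
```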

Journal ArticleDOI
TL;DR: A novel algorithm for volumetric reconstruction of objects from planar sections using Delaunay triangulation is presented, which solves the main problems posed to models defined by reconstruction, particularly from the viewpoint of producing meshes that are suitable for interaction and simulation tasks.
Abstract: This paper presents a novel algorithm for volumetric reconstruction of objects from planar sections using Delaunay triangulation, which solves the main problems posed to models defined by reconstruction, particularly from the viewpoint of producing meshes that are suitable for interaction and simulation tasks. The requirements for these applications are discussed here and the results of the method are presented. Additionally, it is compared to another commonly used reconstruction algorithm based on Delaunay triangulation, showing the advantages of the reconstructions obtained by our technique.

01 Jan 2001
TL;DR: An approach is proposed that can capture the 3D shape and appearance of objects, monuments, or sites from photographs or video by retrieving both the structure of the scene and the motion of the camera from an image sequence.
Abstract: In this contribution an approach is proposed that can capture the 3D shape and appearance of objects, monuments or sites from photographs or video. The flexibility of the approach allows us to deal with uncalibrated hand-held camera images. In addition, through the use of advanced computer vision algorithms the process is largely automated. Both these factors make the approach ideally suited to applications in archaeology. Not only does it become feasible to obtain photo-realistic virtual reconstructions of monuments and sites, but also stratigraphy layers and separate building blocks can be reconstructed. These can then be used as detailed records of the excavations or allow virtual re-assemblage of monuments. Since the motion of the camera is also computed, it also becomes possible to augment video streams of ancient remains with virtual reconstructions. The proposed approach retrieves both the structure of a scene and the motion of the camera from an image sequence. In a first step, features are extracted and matched over consecutive images. This step is followed by a structure-from-motion algorithm that yields a sparse 3D reconstruction (i.e. the matched 3D features) and the path of the camera. These results are enhanced through auto-calibration and bundle adjustment. To allow a full surface reconstruction of the observed scene, the images are rectified so that a standard stereo algorithm can be used to determine dense disparity maps. By combining several of these maps, accurate depth maps are computed. These can then be integrated together to yield a dense 3D surface model. By making use of texture mapping, photo-realistic models can be obtained.
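
A heavily condensed two-view version of such a pipeline, using OpenCV (4.4 or later) for feature matching, relative pose, and triangulation, is sketched below. It assumes a known intrinsic matrix K, whereas the approach described above self-calibrates from an uncalibrated hand-held sequence and adds bundle adjustment and dense stereo on top.

```python
import cv2
import numpy as np

def two_view_reconstruction(img1, img2, K):
    """Minimal two-view structure-from-motion sketch: SIFT matching, essential
    matrix with RANSAC, relative pose, and linear triangulation. K is an
    assumed 3x3 intrinsic matrix; images are greyscale NumPy arrays."""
    sift = cv2.SIFT_create()
    k1, d1 = sift.detectAndCompute(img1, None)
    k2, d2 = sift.detectAndCompute(img2, None)
    matches = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True).match(d1, d2)
    p1 = np.float32([k1[m.queryIdx].pt for m in matches])
    p2 = np.float32([k2[m.trainIdx].pt for m in matches])
    E, _ = cv2.findEssentialMat(p1, p2, K, method=cv2.RANSAC, threshold=1.0)
    _, R, t, _ = cv2.recoverPose(E, p1, p2, K)
    P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])   # first camera at origin
    P2 = K @ np.hstack([R, t])                           # second camera (unit baseline)
    X = cv2.triangulatePoints(P1, P2, p1.T, p2.T)        # homogeneous 4xN points
    return (X[:3] / X[3]).T, R, t
```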

Journal ArticleDOI
TL;DR: This paper reports on three parallel algorithms used for 3D reconstruction of asymmetric objects from their 2D projections and discusses their computational, communication, I/O, and space requirements and presents some performance data.
Abstract: The 3D electron‐density determination of viruses, from experimental data provided by electron microscopy, is a data‐intensive computation that requires the use of clusters of PCs or parallel computers. In this paper we report on three parallel algorithms used for 3D reconstruction of asymmetric objects from their 2D projections. We discuss their computational, communication, I/O, and space requirements and present some performance data. The algorithms are general and can be used for 3D reconstruction of asymmetric objects for applications other than structural biology. Copyright © 2001 John Wiley & Sons, Ltd.

Proceedings ArticleDOI
04 Nov 2001
TL;DR: The goal of this study was to determine if the image quality improvements predicted for the 3D row action maximum likelihood algorithm (RAMLA) over 2.5D RAMLA after Fourier rebinning (FORE) would be seen with clinical PET data.
Abstract: True three-dimensional (3D) reconstructions from fully 3D positron emission tomography (PET) data yield high-quality images but at a high computational cost. Image representation using three-dimensional spherically-symmetric basis functions on a body-centered cubic (BCC) grid, as opposed to a simple cubic (SC) grid, can reduce the computational demands of a 3D approach without compromising image quality by reducing the number of image elements to be reconstructed. The goal of this study was to determine if the image quality improvements predicted for the 3D row action maximum likelihood algorithm (RAMLA) over 2.5D RAMLA after Fourier rebinning (FORE) would be seen with clinical PET data. Torso phantom, whole-body patient, and brain patient studies were used in this analysis. Data were corrected for detector efficiency, scatter, and randoms prior to reconstruction. Attenuation effects were either incorporated into the system model or pre-corrected prior to reconstruction. Higher contrast at comparable noise levels (or lower noise for comparable contrast) is seen with 3D RAMLA (SC or BCC grid) for both phantom and patient data. The brain patient data show improved axial resolution with 3D RAMLA, where the degradation in resolution with FORE is eliminated. Application of a fully 3D reconstruction algorithm is possible in clinically reasonable times.
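
For readers unfamiliar with RAMLA, it is a row-action relative of the classic MLEM update sketched below; the dense system matrix A here is a didactic stand-in for the blob-based (spherically symmetric basis function) system model used in the paper, and no attenuation or scatter modelling is included.

```python
import numpy as np

def mlem(A, y, n_iter=20):
    """Plain MLEM for emission tomography: A maps image coefficients to
    expected projection counts, y holds measured counts. Illustrative sketch
    only; RAMLA processes subsets of rows with a relaxation schedule instead."""
    x = np.ones(A.shape[1])
    sens = np.maximum(A.sum(axis=0), 1e-12)      # sensitivity (back-projected ones)
    for _ in range(n_iter):
        ratio = y / np.maximum(A @ x, 1e-12)     # measured / current estimate
        x *= (A.T @ ratio) / sens                # multiplicative EM update
    return x
```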

Proceedings ArticleDOI
08 Feb 2001
TL;DR: The technique described in this paper provides a 3D reconstruction up to a scale factor; knowledge of the coordinates of one world point, or of the dimensions of a familiar object, then yields the true-to-scale point coordinates for the scene.
Abstract: Estimation of the spatial coordinates of object points of a 3D environment, determined from stereo images of that environment, is well documented. However, the current approaches to 3D reconstruction of a scene from stereo images require either the intrinsic camera parameters, the extrinsic parameters, or the spatial coordinates of at least five world points of that scene. In practice many more such points are usually required. This paper presents a method for 3D reconstruction when only two camera views are available and no other camera information is given. It is assumed that the optical axes of the right and left cameras intersect and their Y-axes are parallel. A derivation of the equations needed in order to solve for the unknown parameters, within the framework of the assumed stereo system, is presented. Next, a one-sixteenth-pixel interpolation over the camera pixel arrays is performed in order to improve the point correspondences, and thereby the accuracy of the parameter estimates. The technique described in this paper provides a 3D reconstruction up to a scale factor. This scaled reconstruction, along with knowledge of the coordinates of one world point or the dimensions of a familiar object, yields the true-to-scale point coordinates for that scene. Experiments on both synthetic and real stereo images yield very satisfactory results.
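
Once the two-view geometry has been recovered (here only up to scale), each correspondence can be triangulated linearly. The small DLT sketch below assumes 3x4 projection matrices P1 and P2 are already available; the resulting points can then be rescaled using one known world distance, as the abstract describes. It is a generic illustration, not the paper's derivation for the intersecting-axes camera setup.

```python
import numpy as np

def triangulate_dlt(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one point from two views. P1, P2 are
    3x4 projection matrices (known only up to a common scale here); x1, x2
    are the matched pixel coordinates in the left and right images."""
    A = np.vstack([x1[0] * P1[2] - P1[0],
                   x1[1] * P1[2] - P1[1],
                   x2[0] * P2[2] - P2[0],
                   x2[1] * P2[2] - P2[1]])
    _, _, Vt = np.linalg.svd(A)                  # null vector of the 4x4 system
    X = Vt[-1]
    return X[:3] / X[3]                          # inhomogeneous 3D point
```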

Proceedings ArticleDOI
01 Jan 2001
TL;DR: A knowledge-based approach for automatic 3D reconstruction of buildings from aerial images: by combining image analysis with information from GIS maps and specific knowledge of the buildings, the complexity of the reconstruction task can be greatly reduced.
Abstract: This paper presents a knowledge-based approach for automatic 3D reconstruction of buildings from aerial images. By combining the image analysis with information from GIS maps and specific knowledge of the buildings, the complexity of the building reconstruction task can be greatly reduced. The building reconstruction process is described as a tree search in the space of possible building hypotheses. Hypotheses derived from outlines of building footprints from the map are fitted against image pixel gradients. To guide the search of the tree, an evaluation function based on information-theory principles is defined. The proposed evaluation function defines the score of matching between a hypothesised building model and the image pixel gradients. It uses a mutual information measure and an MDL criterion to select the best fit to the image data in the tree search.

Journal ArticleDOI
TL;DR: In this article, the authors extended the TV-norm to take into account the third spatial dimension, and developed an iterative EM algorithm based on the three-dimensional (3D) TVnorm, which they call TV3D-EM.
Abstract: The total variation (TV) norm has been described in the literature as a method for reducing noise in two-dimensional (2D) images. At the same time, the TV-norm is very good at recovering edges in images, without introducing ringing or edge artefacts. It has also been proposed as a 2D regularisation function in Bayesian reconstruction, implemented in an expectation maximisation (EM) algorithm, and called TV-EM. TV-EM was developed for 2D SPECT imaging, and the algorithm is capable of smoothing noise while maintaining edges without introducing artefacts. We have extended the TV-norm to take into account the third spatial dimension, and developed an iterative EM algorithm based on the three-dimensional (3D) TV-norm, which we call TV3D-EM. This takes into account the correlation between transaxial sections in SPECT, due to system resolution. We have compared the 2D and 3D algorithms using reconstructed images from simulated projection data. The phantoms used were a homogeneous sphere and a 3D head phantom based on the Shepp-Logan phantom. The TV3D-EM algorithm yielded somewhat lower noise levels than TV-EM. The noise in TV3D-EM had similar correlation in transaxial and longitudinal sections, which was not the case for TV-EM or any 2D reconstruction method. In particular, longitudinal sections from TV3D-EM were perceived as less noisy when compared to TV-EM. The use of 3D reconstruction should also be advantageous if compensation for distance-dependent collimator blurring is incorporated in the iterative algorithm.
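
The regulariser at the heart of TV3D-EM is the total-variation norm evaluated over all three spatial dimensions. A minimal NumPy version of that norm (forward differences, isotropic form) is sketched below as an illustration under those assumptions, with the EM machinery and the penalty weight omitted.

```python
import numpy as np

def tv3d(volume):
    """Isotropic 3D total-variation norm of a volume using forward differences
    (the last difference along each axis is zero by edge replication). Sketch
    of the regulariser only, not the TV3D-EM update."""
    v = np.asarray(volume, dtype=float)
    dx = np.diff(v, axis=0, append=v[-1:, :, :])
    dy = np.diff(v, axis=1, append=v[:, -1:, :])
    dz = np.diff(v, axis=2, append=v[:, :, -1:])
    return float(np.sqrt(dx ** 2 + dy ** 2 + dz ** 2).sum())
```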