Journal Article

Multiple motion scene reconstruction with uncalibrated cameras

01 Jul 2003 - IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE Computer Society) - Vol. 25, Iss. 7, pp. 884-894
TL;DR: This paper describes a method for reconstructing multiple motion scenes (scenes containing multiple moving objects) from uncalibrated views. It first performs a projective reconstruction using a bilinear factorization algorithm, then converts the projective solution to a Euclidean one by enforcing metric constraints.
Abstract: In this paper, we describe a reconstruction method for multiple motion scenes, which are scenes containing multiple moving objects, from uncalibrated views. Assuming that the objects are moving with constant velocities, the method recovers the scene structure, the trajectories of the moving objects, the camera motion, and the camera intrinsic parameters (except skews) simultaneously. We focus on the case where the cameras have unknown and varying focal lengths while the other intrinsic parameters are known. The number of the moving objects is automatically detected without prior motion segmentation. The method is based on a unified geometrical representation of the static scene and the moving objects. It first performs a projective reconstruction using a bilinear factorization algorithm and, then, converts the projective solution to a Euclidean one by enforcing metric constraints. Experimental results on synthetic and real images are presented.
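
The "unified geometrical representation" of static and moving points mentioned in the abstract can be motivated with a small rank check. The sketch below is not the paper's bilinear factorization algorithm: it assumes the projective depths are already known, uses randomly generated cameras and points, and the function name `scaled_measurement_matrix` is hypothetical. Under those assumptions, a point at position s moving with constant velocity v projects in frame f as P_f [s + f v; 1], which is linear in the fixed 7-vector [s; v; 1], so the stacked matrix has rank at most 7 (and rank at most 4 when every point is static).

```python
import numpy as np

def scaled_measurement_matrix(cameras, positions, velocities):
    """Stack unnormalized homogeneous image points for constant-velocity points.

    cameras:    (F, 3, 4) projection matrices, one per frame.
    positions:  (N, 3) point positions at frame 0.
    velocities: (N, 3) per-frame displacements (zero rows for static points).
    Returns a (3F, N) matrix.
    """
    blocks = []
    for f, P in enumerate(cameras):
        X = positions + f * velocities                 # positions at frame f
        Xh = np.hstack([X, np.ones((len(X), 1))])      # homogeneous coordinates
        blocks.append(P @ Xh.T)                        # (3, N) block for frame f
    return np.vstack(blocks)

rng = np.random.default_rng(0)
F, N = 12, 40
cameras = rng.normal(size=(F, 3, 4))
positions = rng.normal(size=(N, 3))
velocities = rng.normal(size=(N, 3))
velocities[: N // 2] = 0.0                             # first half of the points is static

W_mixed = scaled_measurement_matrix(cameras, positions, velocities)
W_static = scaled_measurement_matrix(cameras, positions, np.zeros((N, 3)))

rank_mixed = np.linalg.matrix_rank(W_mixed)    # bounded by 7
rank_static = np.linalg.matrix_rank(W_static)  # bounded by 4
```

Because static points are simply the zero-velocity case of the same model, no prior motion segmentation is needed to build the matrix.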
Citations
Journal Article
TL;DR: This paper surveys contemporary progress in SLAM algorithms, especially those using computer vision as main sensing means, i.e., visual SLAM, and clearly identifies the inherent relationship between the state estimation via the KF versus PF and EM techniques, all of which are derivations of Bayes rule.
Abstract: Simultaneous localization and map-building (SLAM) continues to draw considerable attention in the robotics community due to the advantages it can offer in building autonomous robots. It examines the ability of an autonomous robot starting in an unknown environment to incrementally build an environment map and simultaneously localize itself within this map. Recent advances in computer vision have contributed a whole class of solutions for the challenge of SLAM. This paper surveys contemporary progress in SLAM algorithms, especially those using computer vision as main sensing means, i.e., visual SLAM. We categorize and introduce these visual SLAM techniques with four main frameworks: Kalman filter (KF)-based, particle filter (PF)-based, expectation-maximization (EM)-based and set membership-based schemes. Important topics of SLAM involving different frameworks are also presented. This article complements other surveys in this field by being current as well as reviewing a large body of research in the area o...

79 citations

Proceedings Article
14 Mar 2010
TL;DR: This paper proposes a method for reconstructing the 2D geometry of the surrounding environment based on the signals acquired by a fixed microphone, when a series of acoustic stimuli are produced in different positions in space.
Abstract: In this paper we propose a method for reconstructing the 2D geometry of the surrounding environment based on the signals acquired by a fixed microphone, when a series of acoustic stimuli are produced in different positions in space. After estimating the Times Of Arrival (TOAs) of the reflective paths, we turn each TOA into a projective geometric constraint that can be used for determining the locations of the reflectors. The result consists of a collection of planar surfaces that correspond to the reflectors' locations. In this paper we present the whole processing chain and prove its effectiveness through experimental results.

63 citations


Cites background from "Multiple motion scene reconstructio..."

  • ...Index Terms— Geometrical acoustics, acoustic environment, projective geometry, active sensing....

    [...]

01 Jan 2005
TL;DR: Wei et al. as discussed by the authors proposed an information and communication theory (ICT) based approach in the Delft University of Technology Delft, the Netherlands, which is based on the work of Assoc. Prof. Dr. E. A. Hendriks and Dr. P. Ir. Redert.
Abstract: Supervisors: Assoc. Prof. Dr. Ir. E. A. Hendriks, Dr. Ir. P. A. Redert. Information and Communication Theory Group (ICT), Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, the Netherlands. Qingqing Wei, Student Nr: 9936241, Email: weiqingqing@yahoo.com, December 2005.

58 citations


Cites methods from "Multiple motion scene reconstructio..."

  • ...Motion: Optical flow [2]; Factorization [10]; Kalman filter [11]. Defocus: Local image decomposition using the Hermite polynomial basis [4]; Inverse filtering [12]; S-Transform [13]...

    [...]

  • ...An example of the latter is the factorization algorithm [10], where the registered measurement matrix, containing entries of the normalized image point coordinates over several video frames, is converted into a product of a shape matrix and motion matrix....

    [...]

Journal Article
TL;DR: This paper presents a constraint-based factorization method for reconstructing 3-D structures registered to the patient, from 2-D endoscopic images, suitable for on-line surgical applications to provide surgeons with additional 3-D shape information, critical distance monitoring and warnings.
Abstract: The endoscope is a popular imaging modality used in many preevaluations and surgical treatments, and is also one of the essential tools in minimally invasive surgery. However, regular endoscopes provide only 2-D images. Even though stereoendoscopy systems can display 3-D images, the real anatomical structure of the observed lesion is unavailable and can only be judged by the surgeon's imagination. In this paper, we present a constraint-based factorization method for reconstructing 3-D structures registered to the patient, from 2-D endoscopic images. The proposed method incorporates the geometric constraints from the tracked surgical instrument into the traditional factorization method based on frame-to-frame feature motion on the endoscopically viewed scene. Experiments with real and synthetic data demonstrate good real-scale 3-D extraction, with greater accuracy than is available from traditional methods. The reconstruction process can also be accomplished in a few seconds, making it suitable for on-line surgical applications to provide surgeons with additional 3-D shape information, critical distance monitoring and warnings.

58 citations


Cites methods from "Multiple motion scene reconstructio..."

  • ...Subsequent researchers extended this method to include other camera projection models [21], and to deal with imperfect data [22] and uncalibrated cameras [23]....

    [...]

Proceedings Article
01 Jan 2006
TL;DR: It is shown that under the assumption that people walk with a constant velocity, calibration performance can be improved significantly and the incorporation of temporal data helps to take correlations between subsequent detections into consideration, which leads to an up-front reduction of the noise in the measurements and an overall improvement in auto-calibration performance.
Abstract: It has been shown that under a small number of assumptions, observations of people can be used to obtain metric calibration information of a camera, which is particularly useful for surveillance applications. However, previous work had to exclude the common critical configuration of the camera’s principal point falling on the horizon line and very long focal lengths, both of which occur commonly in practice. Due to noise, the quality of the calibration quickly degrades at and in the vicinity of these configurations. This paper provides a robust solution to this problem by incorporating information about the motion of people into the estimation process. It is shown that under the assumption that people walk with a constant velocity, calibration performance can be improved significantly. In addition to solving the above problem, the incorporation of temporal data also helps to take correlations between subsequent detections into consideration, which leads to an up-front reduction of the noise in the measurements and an overall improvement in auto-calibration performance.

43 citations


Cites methods from "Multiple motion scene reconstructio..."

  • ...A particularly relevant example is the work by Han and Kanade [7], which uses a second-order motion model in conjunction with a projective factorization method to simultaneously recover structure, camera motion and intrinsic parameters in a scene with multiple constant-speed rectilinear motions....

    [...]

References
Journal Article
TL;DR: New results are derived on the minimum number of landmarks needed to obtain a solution, and algorithms are presented for computing these minimum-landmark solutions in closed form that provide the basis for an automatic system that can solve the Location Determination Problem under difficult viewing.
Abstract: A new paradigm, Random Sample Consensus (RANSAC), for fitting a model to experimental data is introduced. RANSAC is capable of interpreting/smoothing data containing a significant percentage of gross errors, and is thus ideally suited for applications in automated image analysis where interpretation is based on the data provided by error-prone feature detectors. A major portion of this paper describes the application of RANSAC to the Location Determination Problem (LDP): Given an image depicting a set of landmarks with known locations, determine that point in space from which the image was obtained. In response to a RANSAC requirement, new results are derived on the minimum number of landmarks needed to obtain a solution, and algorithms are presented for computing these minimum-landmark solutions in closed form. These results provide the basis for an automatic system that can solve the LDP under difficult viewing

23,396 citations
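
The hypothesize-and-verify idea behind RANSAC can be sketched on a toy problem. The example below fits a 2D line rather than solving the Location Determination Problem treated in the paper. The function name `ransac_line`, the iteration count, and the inlier tolerance are illustrative assumptions, not values from the source.

```python
import numpy as np

def ransac_line(points, n_iters=200, inlier_tol=0.05, seed=0):
    """Fit a 2D line to points containing gross outliers.

    Returns (centroid, unit direction, boolean inlier mask).
    """
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(points), dtype=bool)
    for _ in range(n_iters):
        # Hypothesize: a line through a minimal sample of two distinct points.
        i, j = rng.choice(len(points), size=2, replace=False)
        d = points[j] - points[i]
        norm = np.hypot(d[0], d[1])
        if norm == 0.0:
            continue
        normal = np.array([-d[1], d[0]]) / norm
        # Verify: count points within inlier_tol of the hypothesized line.
        dist = np.abs((points - points[i]) @ normal)
        inliers = dist < inlier_tol
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    if best_inliers.sum() < 2:
        raise ValueError("no consensus set found")
    # Refit on the consensus set with total least squares.
    consensus = points[best_inliers]
    centroid = consensus.mean(axis=0)
    _, _, vt = np.linalg.svd(consensus - centroid)
    return centroid, vt[0], best_inliers

# Synthetic data: 80 noisy points on y = 2x + 1 plus 20 gross outliers.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=80)
inlier_pts = np.column_stack([x, 2.0 * x + 1.0 + rng.normal(scale=0.01, size=80)])
outlier_pts = rng.uniform(-3.0, 3.0, size=(20, 2))
points = np.vstack([inlier_pts, outlier_pts])

centroid, direction, inliers = ransac_line(points)
slope = direction[1] / direction[0]
```

Each hypothesis uses only the minimum number of points needed to define the model, so a single outlier-free sample is enough to find the consensus set; the final least-squares refit then uses only the points that agree with it.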

Book Chapter
21 Sep 1999
TL;DR: A survey of the theory and methods of photogrammetric bundle adjustment can be found in this article, with a focus on general robust cost functions rather than restricting attention to traditional nonlinear least squares.
Abstract: This paper is a survey of the theory and methods of photogrammetric bundle adjustment, aimed at potential implementors in the computer vision community. Bundle adjustment is the problem of refining a visual reconstruction to produce jointly optimal structure and viewing parameter estimates. Topics covered include: the choice of cost function and robustness; numerical optimization including sparse Newton methods, linearly convergent approximations, updating and recursive methods; gauge (datum) invariance; and quality control. The theory is developed for general robust cost functions rather than restricting attention to traditional nonlinear least squares.

3,521 citations

Journal Article
TL;DR: In this paper, the singular value decomposition (SVD) technique is used to factor the measurement matrix into two matrices which represent object shape and camera rotation respectively, and two of the three translation components are computed in a preprocessing stage.
Abstract: Inferring scene geometry and camera motion from a stream of images is possible in principle, but is an ill-conditioned problem when the objects are distant with respect to their size. We have developed a factorization method that can overcome this difficulty by recovering shape and motion under orthography without computing depth as an intermediate step. An image stream can be represented by the 2FxP measurement matrix of the image coordinates of P points tracked through F frames. We show that under orthographic projection this matrix is of rank 3. Based on this observation, the factorization method uses the singular-value decomposition technique to factor the measurement matrix into two matrices which represent object shape and camera rotation respectively. Two of the three translation components are computed in a preprocessing stage. The method can also handle and obtain a full solution from a partially filled-in measurement matrix that may result from occlusions or tracking failures. The method gives accurate results, and does not introduce smoothing in either shape or motion. We demonstrate this with a series of experiments on laboratory and outdoor image streams, with and without occlusions.

2,696 citations
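
The rank-3 property and the SVD factorization described in the abstract above can be illustrated as follows. This is a minimal sketch on synthetic, noise-free orthographic data with all points visible in all frames. It stops at the factorization up to an invertible 3x3 ambiguity and omits the metric upgrade and the handling of missing entries. The helper `random_rotation` and all variable names are hypothetical.

```python
import numpy as np

def random_rotation(rng):
    """Draw a 3x3 rotation matrix via QR of a random Gaussian matrix."""
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(q) < 0:
        q[:, 0] *= -1.0            # flip one column so that det(q) = +1
    return q

rng = np.random.default_rng(1)
F, P = 10, 30                      # frames, tracked points
shape = rng.normal(size=(3, P))    # 3D points, one per column

u_rows, v_rows = [], []
for _ in range(F):
    R = random_rotation(rng)
    t = rng.normal(size=2)         # image-plane translation for this frame
    u_rows.append(R[0] @ shape + t[0])   # horizontal image coordinates
    v_rows.append(R[1] @ shape + t[1])   # vertical image coordinates
W = np.vstack(u_rows + v_rows)     # 2F x P measurement matrix

# Registration: subtracting each row's mean removes the image-plane translation.
W_reg = W - W.mean(axis=1, keepdims=True)
rank = np.linalg.matrix_rank(W_reg)

# Factor the registered matrix into motion (2F x 3) and shape (3 x P) estimates.
U, s, Vt = np.linalg.svd(W_reg, full_matrices=False)
motion_hat = U[:, :3] * np.sqrt(s[:3])
shape_hat = np.sqrt(s[:3])[:, None] * Vt[:3]
```

With noisy tracks the registered matrix is only approximately rank 3, and keeping the three largest singular values gives the best rank-3 approximation in the least-squares sense.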

Book Chapter
19 May 1992
TL;DR: This paper addresses the problem of determining the kind of three-dimensional reconstructions that can be obtained from a binocular stereo rig for which no three-dimensional metric calibration data is available, and shows that even in this case some very rich non-metric reconstructions of the environment can nonetheless be obtained.
Abstract: This paper addresses the problem of determining the kind of three-dimensional reconstructions that can be obtained from a binocular stereo rig for which no three-dimensional metric calibration data is available. The only information at our disposal is a set of pixel correspondences between the two retinas which we assume are obtained by some correlation technique or any other means. We show that even in this case some very rich non-metric reconstructions of the environment can nonetheless be obtained.

998 citations


"Multiple motion scene reconstructio..." refers background in this paper

  • ...There has been considerable progress on uncalibrated static scene reconstruction [11], [22], [38], [26], [27], [4], [7], [17], [19], [40], [1], [14], [23]....

    [...]

Proceedings Article
15 Jun 2000
TL;DR: This paper proposes a novel technique based on a non-rigid model, where the 3D shape in each frame is a linear combination of a set of basis shapes, and can be factored in a three-step process to yield pose, configuration and shape.
Abstract: The paper addresses the problem of recovering 3D non-rigid shape models from image sequences. For example, given a video recording of a talking person, we would like to estimate a 3D model of the lips and the full face and its internal modes of variation. Many solutions that recover 3D shape from 2D image sequences have been proposed; these so-called structure-from-motion techniques usually assume that the 3D object is rigid. For example, C. Tomasi and T. Kanade's (1992) factorization technique is based on a rigid shape matrix, which produces a tracking matrix of rank 3 under orthographic projection. We propose a novel technique based on a non-rigid model, where the 3D shape in each frame is a linear combination of a set of basis shapes. Under this model, the tracking matrix is of higher rank, and can be factored in a three-step process to yield pose, configuration and shape. To the best of our knowledge, this is the first model-free approach that can recover non-rigid shape models from single-view video sequences. We demonstrate this new algorithm on several video sequences. We were able to recover 3D non-rigid human face and animal models with high accuracy.

902 citations
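
The claim that the tracking matrix is "of higher rank" under the basis-shape model can be checked numerically. The sketch below assumes noise-free orthographic projection with no translation, random basis shapes, and random per-frame weights. The function name `nonrigid_tracking_matrix` is hypothetical, and the code does not implement the three-step factorization itself. With K basis shapes, each frame's two rows are [w_1 R, ..., w_K R] times the stacked basis, so the rank is at most 3K.

```python
import numpy as np

def nonrigid_tracking_matrix(n_frames, n_points, n_basis, seed=2):
    """Build a 2F x P tracking matrix from a linear basis-shape model."""
    rng = np.random.default_rng(seed)
    basis = rng.normal(size=(n_basis, 3, n_points))    # K basis shapes, each 3 x P
    rows = []
    for _ in range(n_frames):
        q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
        R = q[:2]                                      # 2x3 orthographic camera (orthonormal rows)
        weights = rng.normal(size=n_basis)             # per-frame configuration weights
        shape_f = np.tensordot(weights, basis, axes=1) # 3 x P shape in this frame
        rows.append(R @ shape_f)                       # two image rows for this frame
    return np.vstack(rows)

W_rigid = nonrigid_tracking_matrix(n_frames=20, n_points=50, n_basis=1)
W_nonrigid = nonrigid_tracking_matrix(n_frames=20, n_points=50, n_basis=2)

rank_rigid = np.linalg.matrix_rank(W_rigid)        # bounded by 3 (rigid case)
rank_nonrigid = np.linalg.matrix_rank(W_nonrigid)  # bounded by 3K = 6
```

Setting `n_basis=1` recovers the rigid rank-3 case of the Tomasi and Kanade factorization mentioned in the abstract.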


"Multiple motion scene reconstructio..." refers methods in this paper

  • ...Bregler et al. describe a technique to recover nonrigid 3D models based on the representation of 3D shape as a linear combination of a set of basis shapes [6]....

    [...]