scispace - formally typeset
Search or ask a question
Book ChapterDOI

Recursive Structure from Motion

TL;DR: A technique that estimates the Structure from Motion (SFM) in a recursive fashion by recursively performing the SFM on the incoming set of images and updating the previously reconstructed structure with the structure estimated from the current set of photographs.
Abstract: In this paper we present a technique that estimates the Structure from Motion (SFM) in a recursive fashion. Traditionally successful SFM algorithms take the set of images and estimate the scene geometry and camera positions either using incremental algorithms or the global algorithms and do the refinement process [2] to reduce the reprojection error. In this work it is assumed that we don’t have complete image set at the start of the reconstruction process, unlike most of the traditional approaches present in the literature. It is assumed that the set of images come in at the regular intervals and we recursively perform the SFM on the incoming set of images and update the previously reconstructed structure with the structure estimated from the current set of images. The proposed system has been tested on two datasets which consist of 12 images and 60 images respectively and reconstructions obtained show the validity of our proposed technique.
References
More filters
Proceedings ArticleDOI
26 Dec 2007
TL;DR: This paper brings query expansion into the visual domain via two novel contributions: strong spatial constraints between the query image and each result allow us to accurately verify each return, suppressing the false positives which typically ruin text-based query expansion.
Abstract: Given a query image of an object, our objective is to retrieve all instances of that object in a large (1M+) image database. We adopt the bag-of-visual-words architecture which has proven successful in achieving high precision at low recall. Unfortunately, feature detection and quantization are noisy processes and this can result in variation in the particular visual words that appear in different images of the same object, leading to missed results. In the text retrieval literature a standard method for improving performance is query expansion. A number of the highly ranked documents from the original query are reissued as a new query. In this way, additional relevant terms can be added to the query. This is a form of blind rele- vance feedback and it can fail if 'outlier' (false positive) documents are included in the reissued query. In this paper we bring query expansion into the visual domain via two novel contributions. Firstly, strong spatial constraints between the query image and each result allow us to accurately verify each return, suppressing the false positives which typically ruin text-based query expansion. Secondly, the verified images can be used to learn a latent feature model to enable the controlled construction of expanded queries. We illustrate these ideas on the 5000 annotated image Oxford building database together with more than 1M Flickr images. We show that the precision is substantially boosted, achieving total recall in many cases.

966 citations

Journal ArticleDOI
TL;DR: A system for automatic, geo-registered, real-time 3D reconstruction from video of urban scenes that extends existing algorithms to meet the robustness and variability necessary to operate out of the lab and shows results on real video sequences comprising hundreds of thousands of frames.
Abstract: The paper presents a system for automatic, geo-registered, real-time 3D reconstruction from video of urban scenes. The system collects video streams, as well as GPS and inertia measurements in order to place the reconstructed models in geo-registered coordinates. It is designed using current state of the art real-time modules for all processing steps. It employs commodity graphics hardware and standard CPU's to achieve real-time performance. We present the main considerations in designing the system and the steps of the processing pipeline. Our system extends existing algorithms to meet the robustness and variability necessary to operate out of the lab. To account for the large dynamic range of outdoor videos the processing pipeline estimates global camera gain changes in the feature tracking stage and efficiently compensates for these in stereo estimation without impacting the real-time performance. The required accuracy for many applications is achieved with a two-step stereo reconstruction process exploiting the redundancy across frames. We show results on real video sequences comprising hundreds of thousands of frames.

846 citations

Proceedings ArticleDOI
17 Jun 2006
TL;DR: This work presents a monocular SLAM system that employs a particle filter and top-down search to allow realtime performance while mapping large numbers of landmarks, and is the first to apply this FastSLAM-type particle filter to single-camera SLAM.
Abstract: Localization and mapping in unknown environments becomes more difficult as the complexity of the environment increases. With conventional techniques, the cost of maintaining estimates rises rapidly with the number of landmarks mapped. We present a monocular SLAM system that employs a particle filter and top-down search to allow realtime performance while mapping large numbers of landmarks. To our knowledge, we are the first to apply this FastSLAM-type particle filter to single-camera SLAM. We also introduce a novel partial initialization procedure that efficiently determines the depth of new landmarks. Moreover, we use information available in observations of new landmarks to improve camera pose estimates. Results show the system operating in real-time on a standard workstation while mapping hundreds of landmarks.

480 citations

Book ChapterDOI
05 Sep 2010
TL;DR: The experiments show that truncated Newton methods, when paired with relatively simple preconditioners, offer state of the art performance for large-scale bundle adjustment.
Abstract: We present the design and implementation of a new inexact Newton type algorithm for solving large-scale bundle adjustment problems with tens of thousands of images. We explore the use of Conjugate Gradients for calculating the Newton step and its performance as a function of some simple and computationally efficient preconditioners. We show that the common Schur complement trick is not limited to factorization-based methods and that it can be interpreted as a form of preconditioning. Using photos from a street-side dataset and several community photo collections, we generate a variety of bundle adjustment problems and use them to evaluate the performance of six different bundle adjustment algorithms. Our experiments show that truncated Newton methods, when paired with relatively simple preconditioners, offer state of the art performance for large-scale bundle adjustment. The code, test problems and detailed performance data are available at http://grail.cs.washington.edu/projects/bal.

471 citations

Book ChapterDOI
06 Sep 2014
TL;DR: This work proposes a method for removing outliers from problem instances by solving simpler low-dimensional subproblems, which it refers to as 1DSfM problems.
Abstract: We present a simple, effective method for solving structure from motion problems by averaging epipolar geometries. Based on recent successes in solving for global camera rotations using averaging schemes, we focus on the problem of solving for 3D camera translations given a network of noisy pairwise camera translation directions (or 3D point observations). To do this well, we have two main insights. First, we propose a method for removing outliers from problem instances by solving simpler low-dimensional subproblems, which we refer to as 1DSfM problems. Second, we present a simple, principled averaging scheme. We demonstrate this new method in the wild on Internet photo collections.

391 citations