Author
Drew Steedly
Other affiliations: Hastings Entertainment
Bio: Drew Steedly is an academic researcher from Microsoft. The author has contributed to research on topics including display devices and image stitching. The author has an h-index of 27 and has co-authored 51 publications receiving 2,829 citations. Previous affiliations of Drew Steedly include Hastings Entertainment.
Papers
01 Dec 2008
TL;DR: An interactive system for generating photorealistic, textured, piecewise-planar 3D models of architectural structures and urban scenes from unordered sets of photographs; the approach enables polygonal faces to be modeled accurately from 2D interactions in a single image.
Abstract: We present an interactive system for generating photorealistic, textured, piecewise-planar 3D models of architectural structures and urban scenes from unordered sets of photographs. To reconstruct 3D geometry in our system, the user draws outlines overlaid on 2D photographs. The 3D structure is then automatically computed by combining the 2D interaction with the multi-view geometric information recovered by performing structure from motion analysis on the input photographs. We utilize vanishing point constraints at multiple stages during the reconstruction, which is particularly useful for architectural scenes where parallel lines are abundant. Our approach enables us to accurately model polygonal faces from 2D interactions in a single image. Our system also supports useful operations such as edge snapping and extrusions. Seamless texture maps are automatically generated by combining multiple input photographs using graph cut optimization and Poisson blending. The user can add brush strokes as hints during the texture generation stage to remove artifacts caused by unmodeled geometric structures. We build models for a variety of architectural scenes from collections of up to about a hundred photographs.
279 citations
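The vanishing point constraints mentioned above reduce, at their core, to intersecting image lines that are parallel in 3D. The sketch below is an illustrative ingredient only, not the paper's implementation; the function names and toy segments are assumptions.

```python
# Minimal sketch: estimating a vanishing point from image line segments
# assumed parallel in 3D, using homogeneous least squares.
import numpy as np

def line_from_segment(p1, p2):
    """Homogeneous line l = p1 x p2 through two image points (x, y)."""
    a = np.array([p1[0], p1[1], 1.0])
    b = np.array([p2[0], p2[1], 1.0])
    l = np.cross(a, b)
    return l / np.linalg.norm(l[:2])   # normalize so l . p is a point-line distance

def estimate_vanishing_point(segments):
    """Least-squares vanishing point: the point minimizing the sum of squared
    distances to all lines, i.e. the smallest singular vector of the stacked lines."""
    L = np.vstack([line_from_segment(p1, p2) for p1, p2 in segments])
    _, _, vt = np.linalg.svd(L)
    v = vt[-1]
    return v[:2] / v[2] if abs(v[2]) > 1e-12 else v[:2]   # may lie at infinity

# Toy segments that converge toward (1000, 500).
segs = [((0, 0), (500, 250)), ((0, 100), (500, 300)), ((0, 200), (600, 380))]
print(estimate_vanishing_point(segs))
```

Stacking normalized line equations and taking the smallest singular vector gives the point closest to all lines in a least-squares sense, and degrades gracefully to a direction at infinity for truly parallel image lines.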
01 Sep 2009
TL;DR: A novel multi-view stereo method designed for image-based rendering that generates piecewise planar depth maps from an unordered collection of photographs is presented.
Abstract: We present a novel multi-view stereo method designed for image-based rendering that generates piecewise planar depth maps from an unordered collection of photographs.
273 citations
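A common building block behind piecewise-planar depth maps is robust plane fitting to reconstructed 3D points. The sketch below shows RANSAC plane fitting on synthetic data; it is an assumed illustration of that building block, not the method from the paper, and the names, thresholds, and data are placeholders.

```python
# Minimal sketch: fit one dominant plane to noisy 3-D points with RANSAC.
import numpy as np

def fit_plane_ransac(points, iters=200, thresh=0.02, rng=np.random.default_rng(0)):
    """Return (normal, d) for the plane n.x + d = 0 with the most inliers."""
    best_inliers, best_plane = None, None
    for _ in range(iters):
        sample = points[rng.choice(len(points), 3, replace=False)]
        n = np.cross(sample[1] - sample[0], sample[2] - sample[0])
        norm = np.linalg.norm(n)
        if norm < 1e-9:            # degenerate (collinear) sample
            continue
        n /= norm
        d = -n @ sample[0]
        inliers = np.abs(points @ n + d) < thresh
        if best_inliers is None or inliers.sum() > best_inliers.sum():
            best_inliers, best_plane = inliers, (n, d)
    return best_plane, best_inliers

# Toy data: a noisy z = 1 plane plus random outliers.
rng = np.random.default_rng(1)
plane_pts = np.column_stack([rng.uniform(-1, 1, 500), rng.uniform(-1, 1, 500),
                             1 + rng.normal(0, 0.005, 500)])
outliers = rng.uniform(-1, 1, (50, 3))
(n, d), inliers = fit_plane_ransac(np.vstack([plane_pts, outliers]))
print(n, d, inliers.sum())
```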
Patent
28 Jul 2005
TL;DR: The Panoramic Viewfinder provides an intuitive interactive viewfinder display that operates on a digital camera display screen; it brushes a panorama from images captured in any order while providing visual feedback to the user for ensuring that desired scene elements will appear in the final panorama.
Abstract: A “Panoramic Viewfinder” provides an intuitive interactive viewfinder display which operates on a digital camera display screen. This interactive viewfinder provides real-time assistance in capturing images for constructing panoramic image mosaics. The Panoramic Viewfinder “brushes” a panorama from images captured in any order, while providing visual feedback to the user for ensuring that desired scene elements will appear in the final panorama. This visual feedback presents real-time stitched previews of the panorama while capturing images. In one embodiment, the viewfinder display of the Panoramic Viewfinder includes a “mosaic preview” which presents a stitched mosaic preview of the captured images; a live display window representing a “current content” of the camera viewfinder, which is mapped to a matching location within the mosaic preview; and an optional panoramic “cropping frame” overlaid onto the mosaic preview which illustrates a section of the mosaic which will survive a rectangular cropping of the mosaic.
243 citations
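Mapping the "current content" of the viewfinder to its matching location in the mosaic preview amounts to warping the live frame's outline with the transform estimated by the stitcher. A hedged sketch, assuming a planar homography between frame and mosaic; the matrix values and frame size are invented for illustration, and this is not the patent's implementation.

```python
# Minimal sketch: project the live viewfinder frame's corners into
# mosaic-preview coordinates so its outline can be drawn at the matching location.
import numpy as np

def warp_points(H, pts):
    """Apply a 3x3 homography to Nx2 points and dehomogenize."""
    pts_h = np.column_stack([pts, np.ones(len(pts))])
    mapped = (H @ pts_h.T).T
    return mapped[:, :2] / mapped[:, 2:3]

# Hypothetical homography from current-frame pixels to mosaic pixels
# (e.g. estimated by a real-time stitcher from tracked features).
H = np.array([[0.9, 0.05, 320.0],
              [-0.05, 0.9, 40.0],
              [1e-5, 0.0, 1.0]])

frame_w, frame_h = 640, 480
corners = np.array([[0, 0], [frame_w, 0], [frame_w, frame_h], [0, frame_h]], float)
print(warp_points(H, corners))   # outline of the live window inside the preview
```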
26 Dec 2007
TL;DR: This paper presents an extremely efficient, inherently out-of-core bundle adjustment algorithm that decouples the original problem into several submaps, each with its own local coordinate system, which can be optimized in parallel.
Abstract: Large-scale 3D reconstruction has recently received much attention from the computer vision community. Bundle adjustment is a key component of 3D reconstruction problems. However, traditional bundle adjustment algorithms require a considerable amount of memory and computational resources. In this paper, we present an extremely efficient, inherently out-of-core bundle adjustment algorithm. We decouple the original problem into several submaps that have their own local coordinate systems and can be optimized in parallel. A key contribution of our algorithm is that it makes as much progress as possible towards optimizing the global non-linear cost function using the fragments of the reconstruction that are currently in core memory. This allows us to converge with very few global sweeps (often only two) through the entire reconstruction. We present experimental results on large-scale 3D reconstruction datasets, both synthetic and real.
153 citations
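The decoupling idea, stripped to its essentials, is: optimize each submap in its own local coordinate frame, then stitch the submaps together with a cheap global step over their offsets. The 1-D toy below is an assumed, much-simplified stand-in for bundle adjustment that shows only this structure; the node counts, noise levels, and offset chaining are illustrative.

```python
# Toy sketch of the submap idea: nodes on a line are split into submaps, each
# optimized in its own local frame, then a cheap global sweep estimates only
# the submap offsets from the measurements that cross submap boundaries.
import numpy as np

rng = np.random.default_rng(0)
n, submap_size = 12, 4
truth = np.cumsum(rng.uniform(0.5, 1.5, n))            # ground-truth positions
edges = [(i, i + 1, truth[i + 1] - truth[i] + rng.normal(0, 0.01))
         for i in range(n - 1)]                         # noisy relative measurements

submaps = [list(range(s, min(s + submap_size, n))) for s in range(0, n, submap_size)]

def solve_relative(nodes, edges):
    """Linear least squares for node positions, first node pinned at 0 (gauge)."""
    idx = {v: k for k, v in enumerate(nodes)}
    A, b = [], []
    for i, j, z in edges:
        if i in idx and j in idx:
            row = np.zeros(len(nodes)); row[idx[j]] += 1; row[idx[i]] -= 1
            A.append(row); b.append(z)
    A.append(np.eye(len(nodes))[0]); b.append(0.0)
    x, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
    return x

# 1) Optimize each submap independently in its local coordinate frame.
local = [solve_relative(s, edges) for s in submaps]

# 2) Global sweep: chain the submap offsets using the boundary measurements.
offsets = [0.0]
for k in range(1, len(submaps)):
    i, j = submaps[k - 1][-1], submaps[k][0]
    z = next(m for (a, b_, m) in edges if (a, b_) == (i, j))
    offsets.append(offsets[-1] + local[k - 1][-1] + z)

estimate = np.concatenate([off + loc for off, loc in zip(offsets, local)])
print(np.abs(estimate - (truth - truth[0])).max())      # small residual error
```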
10 Apr 2007
TL;DR: A submap-based approach, Tectonic SAM, is proposed, in which the original optimization problem is solved using a divide-and-conquer scheme, and the linearization of the submaps can be cached and reused when they are combined into a global map.
Abstract: Simultaneous localization and mapping (SLAM) is a method that robots use to explore, navigate, and map an unknown environment. However, this method poses inherent problems with regard to cost and time. To lower computation costs, smoothing and mapping (SAM) approaches have shown some promise, and they also provide more accurate solutions than filtering approaches in realistic scenarios. However, in SAM approaches, updating the linearization is still the most time-consuming step. To mitigate this problem, we propose a submap-based approach, Tectonic SAM, in which the original optimization problem is solved by using a divide-and-conquer scheme. Submaps are optimized independently and parameterized relative to a local coordinate frame. During the optimization, the global position of the submap may change dramatically, but the positions of the nodes in the submap relative to the local coordinate frame do not change very much. The key contribution of this paper is to show that the linearization of the submaps can be cached and reused when they are combined into a global map. According to the results of both simulation and real experiments, Tectonic SAM drastically speeds up SAM in very large environments while still maintaining its global accuracy.
137 citations
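The caching argument rests on the observation that intra-submap linearizations are functions of local coordinates only, so they survive rigid motion of the whole submap during the global merge. The sketch below is an assumption-level illustration of such a cache, not the paper's code; the range residual, tolerance, and class structure are placeholders.

```python
# Minimal sketch: cache a submap's linearization and relinearize only when the
# local estimate drifts, not when the whole submap is moved rigidly.
import numpy as np

def range_residual_jacobian(xa, xb, z):
    """Residual and Jacobian of a 2-D range measurement ||xb - xa|| - z."""
    d = xb - xa
    r = np.linalg.norm(d) - z
    Jb = d / np.linalg.norm(d)                  # d r / d xb
    return r, np.hstack([-Jb, Jb])              # row w.r.t. [xa, xb]

class SubmapLinearization:
    """Cache (r, J) at the local linearization point; relinearize only if the
    local estimate moved more than `tol` since the cache was filled."""
    def __init__(self, tol=1e-2):
        self.tol, self.point, self.cached = tol, None, None

    def get(self, local_estimate, z):
        if self.point is None or np.linalg.norm(local_estimate - self.point) > self.tol:
            xa, xb = local_estimate[:2], local_estimate[2:]
            self.cached = range_residual_jacobian(xa, xb, z)
            self.point = local_estimate.copy()
        return self.cached

cache = SubmapLinearization()
local = np.array([0.0, 0.0, 1.0, 0.2])          # two nodes in the submap frame
r1, J1 = cache.get(local, z=1.0)
# Translating the whole submap does not change its local coordinates, so the
# cached linearization is reused verbatim during the global merge step.
r2, J2 = cache.get(local, z=1.0)
print(np.allclose(J1, J2), r1)
```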
Cited by
Book
30 Sep 2010
TL;DR: Computer Vision: Algorithms and Applications explores the variety of techniques commonly used to analyze and interpret images and takes a scientific approach to basic vision problems, formulating physical models of the imaging process before inverting them to produce descriptions of a scene.
Abstract: Humans perceive the three-dimensional structure of the world with apparent ease. However, despite all of the recent advances in computer vision research, the dream of having a computer interpret an image at the same level as a two-year-old remains elusive. Why is computer vision such a challenging problem and what is the current state of the art? Computer Vision: Algorithms and Applications explores the variety of techniques commonly used to analyze and interpret images. It also describes challenging real-world applications where vision is being successfully used, both for specialized applications such as medical imaging, and for fun, consumer-level tasks such as image editing and stitching, which students can apply to their own personal photos and videos. More than just a source of recipes, this exceptionally authoritative and comprehensive textbook/reference also takes a scientific approach to basic vision problems, formulating physical models of the imaging process before inverting them to produce descriptions of a scene. These problems are also analyzed using statistical models and solved using rigorous engineering techniques. Topics and features: structured to support active curricula and project-oriented courses, with tips in the Introduction for using the book in a variety of customized courses; presents exercises at the end of each chapter with a heavy emphasis on testing algorithms and containing numerous suggestions for small mid-term projects; provides additional material and more detailed mathematical topics in the Appendices, which cover linear algebra, numerical techniques, and Bayesian estimation theory; suggests additional reading at the end of each chapter, including the latest research in each sub-field, in addition to a full Bibliography at the end of the book; supplies supplementary course material for students at the associated website, http://szeliski.org/Book/. Suitable for an upper-level undergraduate or graduate-level course in computer science or engineering, this textbook focuses on basic techniques that work under real-world conditions and encourages students to push their creative boundaries. Its design and exposition also make it eminently suitable as a unique reference to the fundamental techniques and current research literature in computer vision.
4,146 citations
01 Jul 2006
TL;DR: This work presents a system for interactively browsing and exploring large unstructured collections of photographs of a scene using a novel 3D interface that consists of an image-based modeling front end that automatically computes the viewpoint of each photograph as well as a sparse 3D model of the scene and image-to-model correspondences.
Abstract: We present a system for interactively browsing and exploring large unstructured collections of photographs of a scene using a novel 3D interface. Our system consists of an image-based modeling front end that automatically computes the viewpoint of each photograph as well as a sparse 3D model of the scene and image-to-model correspondences. Our photo explorer uses image-based rendering techniques to smoothly transition between photographs, while also enabling full 3D navigation and exploration of the set of images and world geometry, along with auxiliary information such as overhead maps. Our system also makes it easy to construct photo tours of scenic or historic locations, and to annotate image details, which are automatically transferred to other relevant images. We demonstrate our system on several large personal photo collections as well as images gathered from Internet photo sharing sites.
3,398 citations
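One ingredient of the smooth photo-to-photo transitions described above is interpolating between the two recovered camera poses. The sketch below shows only that ingredient, not the system's renderer; the poses and step count are invented for illustration.

```python
# Minimal sketch: smooth path between two camera poses using quaternion slerp
# for rotation and linear interpolation for position.
import numpy as np
from scipy.spatial.transform import Rotation, Slerp

def interpolate_poses(R0, t0, R1, t1, steps=10):
    """Yield (R, t) pairs along a smooth path from camera 0 to camera 1."""
    key_rots = Rotation.from_matrix(np.stack([R0, R1]))
    slerp = Slerp([0.0, 1.0], key_rots)
    for a in np.linspace(0.0, 1.0, steps):
        yield slerp(a).as_matrix(), (1 - a) * t0 + a * t1

# Two hypothetical camera poses, standing in for viewpoints recovered by
# structure from motion.
R0, t0 = np.eye(3), np.array([0.0, 0.0, 0.0])
R1 = Rotation.from_euler('y', 30, degrees=True).as_matrix()
t1 = np.array([1.0, 0.0, 0.2])
for R, t in interpolate_poses(R0, t0, R1, t1, steps=5):
    print(np.round(t, 2), np.round(Rotation.from_matrix(R).as_euler('yxz', degrees=True)[0], 1))
```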
27 Jun 2016
TL;DR: This work proposes a new SfM technique that improves upon the state of the art to make a further step towards building a truly general-purpose pipeline.
Abstract: Incremental Structure-from-Motion is a prevalent strategy for 3D reconstruction from unordered image collections. While incremental reconstruction systems have tremendously advanced in all regards, robustness, accuracy, completeness, and scalability remain the key problems towards building a truly general-purpose pipeline. We propose a new SfM technique that improves upon the state of the art to make a further step towards this ultimate goal. The full reconstruction pipeline is released to the public as an open-source implementation.
3,050 citations
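The incremental strategy referred to above follows a standard pattern: initialize from a two-view geometry, triangulate, then register further images against the growing model. The toy below reproduces that pattern on synthetic data with OpenCV; it is a skeleton under stated assumptions, not the released pipeline, and it omits feature matching, outlier handling, and bundle adjustment.

```python
# Minimal sketch: two-view initialization, triangulation, then PnP registration
# of the next camera against the triangulated points.
import numpy as np
import cv2

rng = np.random.default_rng(0)
K = np.array([[500.0, 0, 320], [0, 500.0, 240], [0, 0, 1]])
X = rng.uniform([-1, -1, 4], [1, 1, 6], (60, 3))           # synthetic 3-D points

def project(R, t):
    """Project the world points into a camera with extrinsics [R|t]."""
    x = (K @ (R @ X.T + t.reshape(3, 1))).T
    return (x[:, :2] / x[:, 2:3]).astype(np.float64)

def rot(axis_deg):
    """Rotation matrix from an axis-angle vector given in degrees."""
    return cv2.Rodrigues(np.deg2rad(np.asarray(axis_deg, float)).reshape(3, 1))[0]

poses = [(np.eye(3), np.zeros(3)),
         (rot([0, 5, 0]), np.array([-0.5, 0.0, 0.0])),
         (rot([0, 10, 0]), np.array([-1.0, 0.05, 0.0]))]
obs = [project(R, t) for R, t in poses]

# 1) Two-view initialization: essential matrix + relative pose (translation up to scale).
E, _ = cv2.findEssentialMat(obs[0], obs[1], K)
_, R01, t01, _ = cv2.recoverPose(E, obs[0], obs[1], K)

# 2) Triangulate an initial point cloud in the first camera's frame.
P0 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P1 = K @ np.hstack([R01, t01])
Xh = cv2.triangulatePoints(P0, P1, obs[0].T, obs[1].T)
X_rec = (Xh[:3] / Xh[3]).T

# 3) Register the next image against the existing points with PnP + RANSAC.
ok, rvec, tvec, inliers = cv2.solvePnPRansac(X_rec, obs[2], K, None)
R2 = cv2.Rodrigues(rvec)[0]
# The reconstruction has an arbitrary global scale, but the recovered rotation
# should match the ground-truth rotation of the third camera (angle near zero).
cos_angle = np.clip((np.trace(R2 @ poses[2][0].T) - 1) / 2, -1, 1)
print(np.rad2deg(np.arccos(cos_angle)))
```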
09 May 2011
TL;DR: g2o, an open-source C++ framework for optimizing graph-based nonlinear error functions, is presented; the evaluations demonstrate that, while being general, g2o offers performance comparable to implementations of state-of-the-art approaches for the specific problems.
Abstract: Many popular problems in robotics and computer vision, including various types of simultaneous localization and mapping (SLAM) or bundle adjustment (BA), can be phrased as least squares optimization of an error function that can be represented by a graph. This paper describes the general structure of such problems and presents g2o, an open-source C++ framework for optimizing graph-based nonlinear error functions. Our system has been designed to be easily extensible to a wide range of problems and a new problem typically can be specified in a few lines of code. The current implementation provides solutions to several variants of SLAM and BA. We provide evaluations on a wide range of real-world and simulated datasets. The results demonstrate that, while being general, g2o offers performance comparable to implementations of state-of-the-art approaches for the specific problems.
2,192 citations
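The "error function represented by a graph" can be made concrete with a tiny pose-graph example: each edge contributes a residual and Jacobian blocks, which are accumulated into the normal equations and solved with Gauss-Newton. The sketch below is an assumption-level illustration of that problem class, not g2o's C++ API; the measurements and the anchoring prior are invented.

```python
# Minimal sketch: Gauss-Newton on a tiny 2-D pose graph where each edge
# contributes a relative-position error term.
import numpy as np

# Nodes are 2-D positions; edges are (i, j, measured displacement x_j - x_i).
edges = [(0, 1, np.array([1.0, 0.0])),
         (1, 2, np.array([1.0, 0.1])),
         (2, 3, np.array([0.0, 1.0])),
         (3, 0, np.array([-2.0, -1.05]))]      # loop closure, slightly inconsistent
n = 4
x = np.zeros((n, 2))                           # initial guess: all nodes at the origin

for _ in range(10):                            # Gauss-Newton iterations
    H = np.zeros((2 * n, 2 * n))
    b = np.zeros(2 * n)
    for i, j, z in edges:
        e = (x[j] - x[i]) - z                  # edge error
        # Jacobian blocks: d e / d x_i = -I, d e / d x_j = +I
        blocks = ((i, -np.eye(2)), (j, np.eye(2)))
        for (a, Ja) in blocks:
            b[2 * a:2 * a + 2] += Ja.T @ e
            for (c, Jc) in blocks:
                H[2 * a:2 * a + 2, 2 * c:2 * c + 2] += Ja.T @ Jc
    H[:2, :2] += np.eye(2) * 1e6               # gauge prior: anchor node 0
    dx = np.linalg.solve(H, -b)
    x += dx.reshape(n, 2)

print(x)   # the loop-closure inconsistency is distributed over the graph
```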
TL;DR: This survey presents what is now the de-facto standard formulation for SLAM and reviews a broad set of topics including robustness and scalability in long-term mapping, metric and semantic representations for mapping, theoretical performance guarantees, active SLAM and exploration, and other new frontiers.
Abstract: Simultaneous Localization and Mapping (SLAM) consists of the concurrent construction of a model of the environment (the map) and the estimation of the state of the robot moving within it. The SLAM community has made astonishing progress over the last 30 years, enabling large-scale real-world applications and witnessing a steady transition of this technology to industry. We survey the current state of SLAM. We start by presenting what is now the de-facto standard formulation for SLAM. We then review related work, covering a broad set of topics including robustness and scalability in long-term mapping, metric and semantic representations for mapping, theoretical performance guarantees, active SLAM and exploration, and other new frontiers. This paper simultaneously serves as a position paper and tutorial for those who are users of SLAM. By looking at the published research with a critical eye, we delineate open challenges and new research issues that still deserve careful scientific investigation. The paper also contains the authors' take on two questions that often animate discussions during robotics conferences: Do robots need SLAM? and Is SLAM solved?
1,828 citations