Author

Oleg Naroditsky

Other affiliations: Apple Inc., SRI International, University of Minnesota
Bio: Oleg Naroditsky is an academic researcher from the University of Pennsylvania. The author has contributed to research in topics: Visual odometry & Structure from motion. The author has an h-index of 16 and has co-authored 27 publications receiving 3380 citations. Previous affiliations of Oleg Naroditsky include Apple Inc. & SRI International.

Papers
Proceedings Article
01 Jan 2004
TL;DR: A system that estimates the motion of a stereo head or a single moving camera from video input, in real time and with low delay; the motion estimates are used for navigation.
Abstract: We present a system that estimates the motion of a stereo head or a single moving camera based on video input. The system operates in real time with low delay and the motion estimates are used for navigational purposes. The front end of the system is a feature tracker. Point features are matched between pairs of frames and linked into image trajectories at video rate. Robust estimates of the camera motion are then produced from the feature tracks using a geometric hypothesize-and-test architecture. This generates what we call visual odometry, i.e. motion estimates from visual input alone. No prior knowledge of the scene or the motion is necessary. The visual odometry can also be used in conjunction with information from other sources such as GPS, inertial sensors, wheel encoders, etc. The pose estimation method has been applied successfully to video from aerial, automotive and handheld platforms. We focus on results with an autonomous ground vehicle. We give examples of camera trajectories estimated purely from images over previously unseen distances and periods of time.

1,786 citations
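
The abstract above mentions a "geometric hypothesize-and-test architecture" for robust motion estimation. As a rough illustration of that general idea only, the sketch below fits a toy 2D translation to point matches that contain outliers. The translation model, the function name `hypothesize_and_test`, and all parameters are assumptions made for this example; they are not taken from the paper, which estimates full camera pose.

```python
import random


def hypothesize_and_test(matches, n_hypotheses=50, inlier_tol=1.0, seed=0):
    """Estimate a 2D translation from point matches (toy motion model).

    matches: non-empty list of ((x0, y0), (x1, y1)) feature correspondences.
    Returns (translation, inlier_indices).
    """
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(n_hypotheses):
        # Hypothesize: a single match is a minimal sample for a translation.
        (x0, y0), (x1, y1) = rng.choice(matches)
        tx, ty = x1 - x0, y1 - y0
        # Test: collect the matches whose displacement agrees with the hypothesis.
        inliers = [
            i for i, ((ax, ay), (bx, by)) in enumerate(matches)
            if abs(bx - ax - tx) <= inlier_tol and abs(by - ay - ty) <= inlier_tol
        ]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Refine: average the displacement over the largest consistent set.
    n = len(best_inliers)
    dx = sum(matches[i][1][0] - matches[i][0][0] for i in best_inliers) / n
    dy = sum(matches[i][1][1] - matches[i][0][1] for i in best_inliers) / n
    return (dx, dy), best_inliers
```

Sampling a minimal set keeps each hypothesis cheap, and scoring by inlier count keeps gross mismatches from influencing the final estimate; a real visual odometry front end would replace the translation with a geometric pose solver.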

Journal ArticleDOI
TL;DR: A system that estimates the motion of a stereo head or a single moving camera from video input, in real time and with low delay; the motion estimates are used for navigation.
Abstract: We present a system that estimates the motion of a stereo head, or a single moving camera, based on video input. The system operates in real time with low delay, and the motion estimates are used for navigational purposes. The front end of the system is a feature tracker. Point features are matched between pairs of frames and linked into image trajectories at video rate. Robust estimates of the camera motion are then produced from the feature tracks using a geometric hypothesize-and-test architecture. This generates motion estimates from visual input alone. No prior knowledge of the scene or the motion is necessary. The visual estimates can also be used in conjunction with information from other sources, such as a global positioning system, inertial sensors, wheel encoders, etc. The pose estimation method has been applied successfully to video from aerial, automotive, and handheld platforms. We focus on results obtained with a stereo head mounted on an autonomous ground vehicle. We give examples of camera trajectories estimated in real time purely from images over previously unseen distances (600 m) and periods of time. © 2006 Wiley Periodicals, Inc.

704 citations

Journal ArticleDOI
01 Nov 2006
TL;DR: The Iris on the Move (IOM) system is the first system to enable capture of iris images of sufficient quality for iris recognition while the subject is moving at a normal walking pace through a minimally confining portal.
Abstract: Iris recognition is one of the most powerful techniques for biometric identification ever developed. Commercial systems based on the algorithms developed by John Daugman have been available since 1995 and have been used in a variety of practical applications. However, all currently available systems impose substantial constraints on subject position and motion during the recognition process. These constraints are largely driven by the image acquisition process, rather than the particular pattern-matching algorithm used for the recognition process. In this paper we present results of our efforts to substantially reduce constraints on position and motion by means of a new image acquisition system based on high-resolution cameras, video-synchronized strobed illumination, and specularity-based image segmentation. We discuss the design tradeoffs we made in developing the system and the performance we have been able to achieve when the image acquisition system is combined with a standard iris recognition algorithm. The Iris on the Move (IOM) system is the first system to enable capture of iris images of sufficient quality for iris recognition while the subject is moving at a normal walking pace through a minimally confining portal.

336 citations

Patent
27 Oct 2009
TL;DR: A system and method for generating a mixed-reality environment, in which a user-worn sub-system is communicatively connected to a synthetic object computer module.
Abstract: A system and method for generating a mixed-reality environment is provided. The system and method provide a user-worn sub-system communicatively connected to a synthetic object computer module. The user-worn sub-system may utilize a plurality of user-worn sensors to capture and process data regarding a user's pose and location. The synthetic object computer module may generate and provide to the user-worn sub-system synthetic objects based on information defining a user's real world life scene or environment indicating a user's pose and location. The synthetic objects may then be rendered on a user-worn display, thereby inserting the synthetic objects into a user's field of view. Rendering the synthetic objects on the user-worn display creates the virtual effect for the user that the synthetic objects are present in the real world.

126 citations

Patent
03 Dec 2007
TL;DR: A system and method for efficiently locating in 3D an object of interest in a target scene using video captured by a plurality of cameras, with multi-camera visual odometry in which pose estimates are generated for each camera by all of the cameras in the configuration.
Abstract: A system and method for efficiently locating in 3D an object of interest in a target scene using video information captured by a plurality of cameras. The system and method provide for multi-camera visual odometry wherein pose estimates are generated for each camera by all of the cameras in the multi-camera configuration. Furthermore, the system and method can locate and identify salient landmarks in the target scene using any of the cameras in the multi-camera configuration and compare the identified landmark against a database of previously identified landmarks. In addition, the system and method provide for the integration of video-based pose estimations with position measurement data captured by one or more secondary measurement sensors, such as, for example, Inertial Measurement Units (IMUs) and Global Positioning System (GPS) units.

124 citations


Cited by
Book
30 Sep 2010
TL;DR: Computer Vision: Algorithms and Applications explores the variety of techniques commonly used to analyze and interpret images and takes a scientific approach to basic vision problems, formulating physical models of the imaging process before inverting them to produce descriptions of a scene.
Abstract: Humans perceive the three-dimensional structure of the world with apparent ease. However, despite all of the recent advances in computer vision research, the dream of having a computer interpret an image at the same level as a two-year-old remains elusive. Why is computer vision such a challenging problem and what is the current state of the art? Computer Vision: Algorithms and Applications explores the variety of techniques commonly used to analyze and interpret images. It also describes challenging real-world applications where vision is being successfully used, both for specialized applications such as medical imaging, and for fun, consumer-level tasks such as image editing and stitching, which students can apply to their own personal photos and videos. More than just a source of recipes, this exceptionally authoritative and comprehensive textbook/reference also takes a scientific approach to basic vision problems, formulating physical models of the imaging process before inverting them to produce descriptions of a scene. These problems are also analyzed using statistical models and solved using rigorous engineering techniques. Topics and features: structured to support active curricula and project-oriented courses, with tips in the Introduction for using the book in a variety of customized courses; presents exercises at the end of each chapter with a heavy emphasis on testing algorithms and containing numerous suggestions for small mid-term projects; provides additional material and more detailed mathematical topics in the Appendices, which cover linear algebra, numerical techniques, and Bayesian estimation theory; suggests additional reading at the end of each chapter, including the latest research in each sub-field, in addition to a full Bibliography at the end of the book; supplies supplementary course material for students at the associated website, http://szeliski.org/Book/.
Suitable for an upper-level undergraduate or graduate-level course in computer science or engineering, this textbook focuses on basic techniques that work under real-world conditions and encourages students to push their creative boundaries. Its design and exposition also make it eminently suitable as a unique reference to the fundamental techniques and current research literature in computer vision.

4,146 citations

Proceedings ArticleDOI
13 Nov 2007
TL;DR: A system specifically designed to track a hand-held camera in a small AR workspace, processed in parallel threads on a dual-core computer, that produces detailed maps with thousands of landmarks which can be tracked at frame-rate with accuracy and robustness rivalling that of state-of-the-art model-based systems.
Abstract: This paper presents a method of estimating camera pose in an unknown scene. While this has previously been attempted by adapting SLAM algorithms developed for robotic exploration, we propose a system specifically designed to track a hand-held camera in a small AR workspace. We propose to split tracking and mapping into two separate tasks, processed in parallel threads on a dual-core computer: one thread deals with the task of robustly tracking erratic hand-held motion, while the other produces a 3D map of point features from previously observed video frames. This allows the use of computationally expensive batch optimisation techniques not usually associated with real-time operation: The result is a system that produces detailed maps with thousands of landmarks which can be tracked at frame-rate, with an accuracy and robustness rivalling that of state-of-the-art model-based systems.

4,091 citations
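
The abstract above describes splitting tracking and mapping into two tasks that run in parallel threads. The following is a minimal sketch of that division of labour under stated assumptions: the function name `run_tracking_and_mapping`, the keyframe interval, and the placeholder "pose" and "map" values are all hypothetical and stand in for the real tracking and batch optimisation steps, which the abstract does not detail.

```python
import queue
import threading


def run_tracking_and_mapping(frames, keyframe_every=5):
    """Run a per-frame tracker and a slower mapper in parallel threads."""
    keyframes = queue.Queue()   # hand-off from tracker to mapper
    map_lock = threading.Lock()
    landmark_map = []           # shared map, written only by the mapper
    poses = []                  # one entry per frame, written only by the tracker

    def mapper():
        while True:
            keyframe = keyframes.get()
            if keyframe is None:  # sentinel: the tracker has finished
                break
            # Placeholder for expensive batch optimisation over keyframes.
            with map_lock:
                landmark_map.append(keyframe)

    def tracker():
        for index, frame in enumerate(frames):
            # Track against whatever map is available right now.
            with map_lock:
                landmarks_seen = len(landmark_map)
            poses.append((frame, landmarks_seen))
            # Hand selected frames to the mapper without waiting for it.
            if index % keyframe_every == 0:
                keyframes.put(frame)
        keyframes.put(None)

    threads = [threading.Thread(target=mapper), threading.Thread(target=tracker)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return poses, landmark_map
```

Because the tracker only enqueues keyframes and never waits on the mapper, the per-frame loop stays fast even when the mapping step is slow, which is the property the abstract attributes to the split.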

Journal ArticleDOI
TL;DR: The first successful application of the SLAM methodology from mobile robotics to the "pure vision" domain of a single uncontrolled camera, achieving real time but drift-free performance inaccessible to structure from motion approaches is presented.
Abstract: We present a real-time algorithm which can recover the 3D trajectory of a monocular camera, moving rapidly through a previously unknown scene. Our system, which we dub MonoSLAM, is the first successful application of the SLAM methodology from mobile robotics to the "pure vision" domain of a single uncontrolled camera, achieving real time but drift-free performance inaccessible to structure from motion approaches. The core of the approach is the online creation of a sparse but persistent map of natural landmarks within a probabilistic framework. Our key novel contributions include an active approach to mapping and measurement, the use of a general motion model for smooth camera movement, and solutions for monocular feature initialization and feature orientation estimation. Together, these add up to an extremely efficient and robust algorithm which runs at 30 Hz with standard PC and camera hardware. This work extends the range of robotic systems in which SLAM can be usefully applied, but also opens up new areas. We present applications of MonoSLAM to real-time 3D localization and mapping for a high-performance full-size humanoid robot and live augmented reality with a hand-held camera.

3,772 citations

Proceedings ArticleDOI
12 Jul 2014
TL;DR: The method achieves both low-drift and low-computational complexity without the need for high accuracy ranging or inertial measurements and can achieve accuracy at the level of state of the art offline batch methods.
Abstract: We propose a real-time method for odometry and mapping using range measurements from a 2-axis lidar moving in 6-DOF. The problem is hard because the range measurements are received at different times, and errors in motion estimation can cause mis-registration of the resulting point cloud. To date, coherent 3D maps can be built by off-line batch methods, often using loop closure to correct for drift over time. Our method achieves both low-drift and low-computational complexity without the need for high accuracy ranging or inertial measurements. The key idea in obtaining this level of performance is the division of the complex problem of simultaneous localization and mapping, which seeks to optimize a large number of variables simultaneously, by two algorithms. One algorithm performs odometry at a high frequency but low fidelity to estimate velocity of the lidar. Another algorithm runs at a frequency of an order of magnitude lower for fine matching and registration of the point cloud. Combination of the two algorithms allows the method to map in real-time. The method has been evaluated by a large set of experiments as well as on the KITTI odometry benchmark. The results indicate that the method can achieve accuracy at the level of state of the art offline batch methods.

1,879 citations

Journal ArticleDOI
TL;DR: What is now the de-facto standard formulation for SLAM is presented, covering a broad set of topics including robustness and scalability in long-term mapping, metric and semantic representations for mapping, theoretical performance guarantees, active SLAM and exploration, and other new frontiers.
Abstract: Simultaneous Localization and Mapping (SLAM) consists in the concurrent construction of a model of the environment (the map), and the estimation of the state of the robot moving within it. The SLAM community has made astonishing progress over the last 30 years, enabling large-scale real-world applications, and witnessing a steady transition of this technology to industry. We survey the current state of SLAM. We start by presenting what is now the de-facto standard formulation for SLAM. We then review related work, covering a broad set of topics including robustness and scalability in long-term mapping, metric and semantic representations for mapping, theoretical performance guarantees, active SLAM and exploration, and other new frontiers. This paper simultaneously serves as a position paper and tutorial to those who are users of SLAM. By looking at the published research with a critical eye, we delineate open challenges and new research issues that still deserve careful scientific investigation. The paper also contains the authors' take on two questions that often animate discussions during robotics conferences: Do robots need SLAM? and Is SLAM solved?

1,828 citations