Showing papers by "Larry Matthies" published in 1989


Journal Article
TL;DR: A new, pixel-based (iconic) algorithm estimates depth and depth uncertainty at each pixel and incrementally refines these estimates over time; combined with Kalman filtering, it can serve as a useful and general framework for low-level dynamic vision.
Abstract: Using known camera motion to estimate depth from image sequences is an important problem in robot vision. Many applications of depth-from-motion, including navigation and manipulation, require algorithms that can estimate depth in an on-line, incremental fashion. This requires a representation that records the uncertainty in depth estimates and a mechanism that integrates new measurements with existing depth estimates to reduce the uncertainty over time. Kalman filtering provides this mechanism. Previous applications of Kalman filtering to depth-from-motion have been limited to estimating depth at the location of a sparse set of features. In this paper, we introduce a new, pixel-based (iconic) algorithm that estimates depth and depth uncertainty at each pixel and incrementally refines these estimates over time. We describe the algorithm and contrast its formulation and performance to that of a feature-based Kalman filtering algorithm. We compare the performance of the two approaches by analyzing their theoretical convergence rates, by conducting quantitative experiments with images of a flat poster, and by conducting qualitative experiments with images of a realistic outdoor-scene model. The results show that the new method is an effective way to extract depth from lateral camera translations. This approach can be extended to incorporate general motion and to integrate other sources of information, such as stereo. The algorithms we have developed, which combine Kalman filtering with iconic descriptions of depth, therefore can serve as a useful and general framework for low-level dynamic vision.
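
The measurement-integration mechanism described above reduces, at each pixel, to a scalar Kalman update of a (depth, variance) pair. The sketch below illustrates that step under assumed inputs (a per-pixel depth measurement map with known variance); it is a minimal illustration, not the paper's implementation, which works with disparity measurements and adds prediction and smoothing stages.

```python
import numpy as np

def kalman_update(depth, variance, z, r):
    """Fuse a new per-pixel depth measurement into the running estimates.

    depth, variance : current per-pixel estimates (2-D arrays)
    z, r            : new measurement and its variance (same shape)

    Standard scalar Kalman measurement update, applied independently
    at every pixel; all names here are illustrative.
    """
    gain = variance / (variance + r)        # Kalman gain, per pixel
    depth = depth + gain * (z - depth)      # pull estimate toward z
    variance = (1.0 - gain) * variance      # uncertainty only shrinks
    return depth, variance
```

Iterating this update as new frames arrive drives the variance down roughly as 1/k after k comparable measurements, which is the kind of convergence behavior the paper analyzes.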

780 citations


01 Jan 1989
TL;DR: This work uses cameras on board a robot vehicle to estimate the vehicle's motion by tracking 3-D feature points or "landmarks"; it develops sequential methods for estimating the motion and updating the landmark model, and implements a system that successfully tracks landmarks through stereo image sequences.
Abstract: Sensing 3-D shape and motion is an important problem in autonomous navigation and manipulation. Stereo vision is an attractive approach to this problem in several domains. In this thesis, I address fundamental components of this problem by using stereo vision to estimate the 3-D structure or "depth" of objects visible to a robot, as well as to estimate the motion of the robot as it travels through an unknown environment. I begin by using cameras on board a robot vehicle to estimate the motion of the vehicle by tracking 3-D feature points or "landmarks". I formulate this task as a statistical estimation problem, develop sequential methods for estimating the vehicle motion and updating the landmark model, and implement a system that successfully tracks landmarks through stereo image sequences. In laboratory experiments, this system has achieved an accuracy of 2% of distance over 5.5 meters and 55 stereo image pairs. These results establish the importance of statistical modelling in this problem and demonstrate the feasibility of visual motion estimation in unknown environments. This work embodies a successful paradigm for feature-based depth and motion estimation, but the feature-based approach results in a very limited 3-D model of the environment. To extend this aspect of the system, I address the problem of estimating "depth maps" from stereo images. Depth maps specify scene depth for each pixel in the image. I propose a system architecture in which exploratory camera motion is used to acquire a narrow-baseline image pair by moving one camera of the stereo system. Depth estimates obtained from this image pair are used to "bootstrap" matching of a wide-baseline image pair acquired with both cameras of the system. I formulate the bootstrap operation statistically by modelling depth maps as random fields and developing Bayesian matching algorithms in which depth information from the narrow-baseline image pair determines the prior density for matching the wide-baseline image pair. This leads to efficient, area-based matching algorithms that are applied independently for each pixel or each scanline of the image. Experimental results with images of complex, outdoor scene models demonstrate the power of the approach.
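
As a rough illustration of the geometric core of the landmark-tracking component: given matched 3-D landmark positions before and after a vehicle move, the rigid motion that best aligns them has a closed-form least-squares solution. The SVD-based alignment below (the standard Kabsch/Horn construction) is offered only as a sketch of that subproblem; it is not the thesis's sequential estimator, which also weights each landmark by its stereo triangulation uncertainty.

```python
import numpy as np

def rigid_motion(P, Q):
    """Least-squares rotation R and translation t with R @ p + t ~ q
    for matched N-by-3 landmark arrays P and Q (Kabsch/Horn method).
    Equal weighting of landmarks is an illustrative simplification.
    """
    cp, cq = P.mean(axis=0), Q.mean(axis=0)   # centroids
    H = (P - cp).T @ (Q - cq)                 # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))    # guard against reflection
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T   # proper rotation, det = +1
    t = cq - R @ cp
    return R, t
```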

156 citations


Proceedings Article
05 Jan 1989
TL;DR: In this paper, the authors propose a probabilistic scheme for updating estimates of spatial occupancy from a model of uncertainty in sonar range measurements; this representation can be used in conjunction with stereo depth estimates to build occupancy maps from both sonar and stereo range measurements.
Abstract: Two fundamental issues in sensor fusion are (1) the definition of model spaces for representing objects of interest and (2) the definition of estimation procedures for instantiating representations, with descriptions of uncertainty, from noisy observations. In 3-D perception, model spaces frequently are defined by contour and surface descriptions, such as line segments and planar patches. These models impose strong geometric limitations on the class of scenes that can be modelled and involve segmentation decisions that make model updating difficult. In this paper, we show that random field models provide attractive, alternative representations for the problem of creating spatial descriptions from stereo and sonar range measurements. For stereo ranging, we model the depth at every pixel in the image as a random variable. Maximum likelihood or Bayesian formulations of the matching problem allow us to express the uncertainty in depth at each pixel that results from matching in noisy images. For sonar ranging, we describe a tessellated spatial representation that encodes spatial occupancy probability at each cell. We derive a probabilistic scheme for updating estimates of spatial occupancy from a model of uncertainty in sonar range measurements. These representations can be used in conjunction to build occupancy maps from both sonar and stereo range measurements. We show preliminary results from sonar and single-scanline stereo that illustrate the potential of this approach. We conclude with a discussion of the advantages of the representations and estimation procedures used in this paper over approaches based on contour and surface models.
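
The spatial-occupancy update described in the abstract can be sketched in the standard log-odds form of the Bayesian occupancy-grid recursion. The inverse sensor model p_occ, which the paper derives from a model of sonar range uncertainty, is treated here as a given input; the function names are illustrative, and a uniform (0.5) occupancy prior is assumed so the prior term vanishes.

```python
import numpy as np

def update_log_odds(log_odds, p_occ):
    """Bayesian update of per-cell occupancy in log-odds form.

    log_odds : accumulated evidence for each grid cell (array)
    p_occ    : inverse sensor model output for the current reading,
               i.e. P(cell occupied | measurement), per cell

    Assumes a uniform occupancy prior, so no prior term is subtracted.
    """
    return log_odds + np.log(p_occ / (1.0 - p_occ))

def occupancy_probability(log_odds):
    """Recover occupancy probability from accumulated log-odds."""
    return 1.0 / (1.0 + np.exp(-log_odds))
```

The log-odds form makes each repeated update a simple addition per cell, which is why it is the usual implementation choice for tessellated occupancy representations.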

16 citations


Book Chapter
01 Jan 1989
TL;DR: A Bayesian approach to processing stereo image sequences that is based on representing, predicting, and updating depth and depth variance at every pixel in the image is proposed.
Abstract: Stereo vision is an attractive approach to depth sensing in many robotics applications. To date, most research in stereo has concentrated on the analysis of a single stereo image pair. However, many applications will involve moving robot systems that can acquire sequences of stereo pairs from successive robot positions. Such systems can achieve greatly improved stereo performance by appropriately controlling the motion of the cameras and by using depth information obtained from early images to guide the interpretation of later images. This requires a representation for the depth model at any point in time, methods for using the model to influence matching in subsequent images, and methods for controlling the motion of the cameras that take into account the degree of uncertainty in the depth model. In this paper, we propose a Bayesian approach to processing stereo image sequences that serves these requirements. The approach is based on representing, predicting, and updating depth and depth variance at every pixel in the image. We describe a vision system under development for a robot vehicle that incorporates this approach and summarize implementation results for parts of the system.
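
The predict half of the per-pixel cycle can be sketched for the simplest assumed case, a known camera translation tz along the optical axis: every depth estimate decreases by tz and process noise inflates its variance. A full system must also warp each estimate to its new image location, which is omitted here; this sketch pairs with a measurement update like the one shown for the first paper above.

```python
def predict(depth, variance, tz, q):
    """Per-pixel prediction between frames for an assumed known forward
    translation tz: points get closer by tz, and process noise q grows
    the variance to reflect prediction error. Image-plane reindexing
    (warping estimates to their new pixel locations) is omitted.
    """
    return depth - tz, variance + q
```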

4 citations