
Showing papers by "David G. Lowe published in 1997"


Proceedings ArticleDOI
17 Jun 1997
TL;DR: This paper shows that a new variant of the k-d tree search algorithm makes indexing in higher-dimensional spaces practical, and is integrated into a fully developed recognition system, which is able to detect complex objects in real, cluttered scenes in just a few seconds.
Abstract: Shape indexing is a way of making rapid associations between features detected in an image and object models that could have produced them. When model databases are large, the use of high-dimensional features is critical, due to the improved level of discrimination they can provide. Unfortunately, finding the nearest neighbour to a query point rapidly becomes inefficient as the dimensionality of the feature space increases. Past indexing methods have used hash tables for hypothesis recovery, but only in low-dimensional situations. In this paper we show that a new variant of the k-d tree search algorithm makes indexing in higher-dimensional spaces practical. This Best-Bin-First (BBF) search is an approximate algorithm which finds the nearest neighbour for a large fraction of the queries, and a very close neighbour in the remaining cases. The technique has been integrated into a fully developed recognition system, which is able to detect complex objects in real, cluttered scenes in just a few seconds.
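The abstract describes Best-Bin-First only at a high level. A minimal sketch of the general idea, assuming a simple list-of-points k-d tree and a fixed budget of node examinations (all names and parameters here are illustrative, not taken from the paper), might look like:

```python
import heapq

def build_kdtree(points, depth=0):
    """Recursively build a k-d tree; each node stores a point, its split axis,
    and left/right subtrees."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }

def bbf_search(root, query, max_checks=200):
    """Best-Bin-First: visit tree bins in order of a lower bound on their
    distance to the query, stopping after max_checks node examinations,
    which yields an approximate nearest neighbour."""
    best = (float("inf"), None)   # (squared distance, point)
    heap = [(0.0, 0, root)]       # (bin distance bound, tiebreak, node)
    counter, checks = 1, 0
    while heap and checks < max_checks:
        _, _, node = heapq.heappop(heap)
        while node is not None:
            checks += 1
            d = sum((a - b) ** 2 for a, b in zip(node["point"], query))
            if d < best[0]:
                best = (d, node["point"])
            axis = node["axis"]
            diff = query[axis] - node["point"][axis]
            near, far = ((node["left"], node["right"]) if diff < 0
                         else (node["right"], node["left"]))
            if far is not None:
                # the far bin is at least diff**2 away along the split axis
                heapq.heappush(heap, (diff * diff, counter, far))
                counter += 1
            node = near
    return best[1]
```

With a budget at least as large as the number of stored points the search degenerates to an exhaustive (and therefore exact) traversal; the approximation, and the speedup, come from cutting the budget short.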

1,044 citations


Proceedings ArticleDOI
20 Apr 1997
TL;DR: Describes an implemented model-based telerobotic system designed to investigate assembly and other tasks involving contact and manipulation of known objects, built around a task-centric operator interface; experimental results include performing assembly-like tasks over the Internet.
Abstract: We describe an implemented model-based telerobotic system designed to investigate assembly and other tasks involving contact and manipulation of known objects. Key features of our system include ease of maintaining a world model at the operator site and a task-centric operator interface. Our system incorporates gray-scale model-based vision to assist in building and maintaining the local model. The local model is used to provide a task-centric operator interface, emphasizing the natural and direct manipulation of objects, with the robot's presence indicated in a more abstract fashion. The operator interface is designed to work with widely available and inexpensive desktop computers with low DOF input devices (such as a mouse). We also describe experimental results to date, which include performing assembly-like tasks over the Internet.

36 citations


Proceedings ArticleDOI
20 Apr 1997
TL;DR: The idea of temporally extending the results of a stereo algorithm in order to improve the algorithm's performance is introduced and speedups of up to 400% are achieved without significant errors.
Abstract: This paper introduces the idea of temporally extending the results of a stereo algorithm in order to improve the algorithm's performance. This approach anticipates the changes between two consecutive depth maps resulting from the motion of the cameras. Uncertainties in motion are accounted for by computation of an ambiguity area and a resulting disparity range for each pixel. The computation is used to verify and refine the anticipated values, rather than calculate them without prior knowledge. The paper compares the performance of the algorithm under different constraints on motion. Speedups of up to 400% are achieved without significant errors.
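The prediction step can be sketched as follows. This is an illustrative reconstruction, not the paper's implementation; the disparity shift model, window sizes, and names are assumptions:

```python
def predicted_disparity_range(prev_disp, expected_shift, ambiguity, d_max=63):
    """Anticipate the new disparity from the previous frame's value and the
    expected change due to camera motion; the motion uncertainty (ambiguity)
    widens the window that must actually be searched and verified."""
    centre = prev_disp + expected_shift
    lo = max(0, centre - ambiguity)
    hi = min(d_max, centre + ambiguity)
    return lo, hi

def search_speedup(ambiguity, d_max=63):
    """Ratio of the full disparity search range to the predicted window."""
    return (d_max + 1) / (2 * ambiguity + 1)
```

With a full range of 64 disparities and an uncertainty of plus or minus 7 pixels, only 15 candidates need verification per pixel, a little over a 4x reduction, which is the same ballpark as the reported speedups of up to 400%.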

7 citations


01 Jan 1997
TL;DR: In this paper, a method for determining the correspondence between sets of point features extracted from a pair of images taken of a static scene from disparate viewpoints is proposed, where the relative position and orientation between the viewpoints as well as the structure of the scene is assumed to be unknown.
Abstract: A solution is proposed for the problem of determining the correspondence between sets of point features extracted from a pair of images taken of a static scene from disparate viewpoints. The relative position and orientation between the viewpoints, as well as the structure of the scene, are assumed to be unknown. Point features from a pair of views are deemed to be in correspondence if they are projectively determined by the same scene points. The determination of correspondences is a critical sub-task for recovering the structure of the world from a set of images taken by a moving camera, a task usually referred to as structure-from-motion, or for determining the relative motion between the scene and the observer. A key property of a static world, assumed by the proposed method, is rigidity. Rigidity of the world and knowledge of the intrinsic camera parameters determine a powerful constraint on point correspondences. The main contribution of this thesis is the rigidity checking method. Rigidity checking is a tractable and robust algorithm for verifying the potential rigidity of a set of hypothesized three-dimensional correspondences from a pair of images under perspective projection. The rigidity checking method, which is based on a set of structure-from-motion constraints, is uniquely designed to answer the question, "Could these corresponding points from two views be the projection of a rigid configuration?" The rigidity constraint proposed in this thesis embodies the recovery of the extrinsic (relative orientation) camera parameters which determine the epipolar geometry, the only available geometric constraint for matching images. The implemented solution combines radiometric and geometric constraints to determine the correct set of correspondences. The radiometric constraint consists of a set of grey-level differential invariants due to Schmid and Mohr.
Several enhancements are made to the grey-level differential invariant matching scheme which improve the robustness and speed of the method. The specification of differential invariants for grey-scale images is extended to color images, and experimental results for matching point features with color differential invariants are reported.
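The epipolar constraint at the heart of rigidity checking can be illustrated with a small self-contained sketch. The motion, points, and thresholds below are invented for the demonstration, and this is not the thesis's algorithm (which must also recover the unknown relative orientation); it only shows that for a rigid scene with known intrinsics, normalized correspondences x, x' satisfy x'ᵀ E x = 0 with E = [t]× R, while a mismatched pair does not:

```python
import math

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def skew(t):
    """Cross-product matrix [t]x, so that matvec(skew(t), v) == t x v."""
    return [[0, -t[2], t[1]], [t[2], 0, -t[0]], [-t[1], t[0], 0]]

def rot_y(a):
    """Rotation about the Y axis by angle a (radians)."""
    c, s = math.cos(a), math.sin(a)
    return [[c, 0, s], [0, 1, 0], [-s, 0, c]]

def normalize(X):
    """Perspective projection to normalized homogeneous image coordinates."""
    return [X[0] / X[2], X[1] / X[2], 1.0]

def epipolar_residual(E, x1, x2):
    """|x2^T E x1|: zero exactly when the pair satisfies the epipolar constraint."""
    Ex1 = matvec(E, x1)
    return abs(sum(x2[i] * Ex1[i] for i in range(3)))

# Invented rigid motion and scene (illustrative values only).
R = rot_y(0.1)
t = [0.2, 0.0, 0.05]
E = matmul(skew(t), R)
scene = [[0.3, -0.2, 4.0], [1.0, 0.5, 5.0], [-0.8, 0.1, 3.0]]
pairs = [(normalize(X),
          normalize([a + b for a, b in zip(matvec(R, X), t)]))
         for X in scene]

# Correct (rigid) correspondences: residuals vanish up to floating point.
resid = [epipolar_residual(E, x1, x2) for x1, x2 in pairs]
# A mismatched correspondence violates the constraint measurably.
bad = epipolar_residual(E, pairs[0][0], pairs[1][1])
```

The residual test is what makes rigidity a usable filter: a hypothesized match set whose points cannot be made consistent with any such E could not have come from a rigid configuration.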

4 citations


DOI
01 Jan 1997
TL;DR: The indexing algorithm has been embedded within a fully functional automatic recognition system that typically requires only a few seconds to recognize objects in standard sized images and an incremental learning procedure has been introduced which extracts model grouping information from real images as the system performs recognition, and adds it into the index to improve indexing accuracy.
Abstract: This thesis presents a method to efficiently recognize 3D objects from single 2D images by the use of a novel, probabilistic indexing technique. Indexing is a two-stage process that includes an offline training stage and a runtime lookup stage. During training, feature vectors representing object appearance are acquired from several points of view about each object and stored in the index. At runtime, for each image feature vector detected, a small set of the closest model vectors is recovered from the index and used to form match hypotheses. This set of nearest neighbours provides interpolation between the nearby training views of the objects, and is used to compute probability estimates that proposed matches are correct. The overall recognition process becomes extremely efficient when hypotheses are verified in order of their probabilities. Contributions of this thesis include the use of an indexing data structure (the kd-tree) and search algorithm (Best-Bin-First search) which, unlike the standard hash table methods, remain efficient at higher index space dimensionalities. This behavior is critical to provide discrimination between models in large databases. In addition, the repertoire of 3D objects that can be recognized has been significantly expanded from that in most previous indexing work, by explicitly avoiding the requirement for special-case invariant features. Finally, an incremental learning procedure has been introduced which extracts model grouping information from real images as the system performs recognition, and adds it into the index to improve indexing accuracy. A new clustering algorithm (Weighted Vector Quantization) is used to limit the memory requirements of this continual learning process. The indexing algorithm has been embedded within a fully functional automatic recognition system that typically requires only a few seconds to recognize objects in standard sized images.
Experiments with real and synthetic images are presented, using indexing features derived from groupings of line segments. Indexing accuracy is shown to be high, as indicated by the rankings assigned to correct hypotheses. Experiments with the Best-Bin-First search algorithm show that, if it is acceptable to miss a small fraction of the exact closest neighbours, the regime in which kd-tree search remains efficient can be extended, roughly from 5-dimensional to 20-dimensional spaces, and that this efficiency holds for very large numbers of stored points. Finally, experiments with the Weighted Vector Quantization algorithm show that it is possible to incorporate real image data into the index via incremental learning so that indexing performance is improved without increasing the memory requirements of the system.
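Weighted Vector Quantization is only named in the abstract. As a hedged illustration of the general idea (bounding memory by folding each new training vector into its nearest cluster centre as a weighted mean), a sketch might look like this; `merge_radius` and all names are assumptions, not from the thesis:

```python
def wvq_update(centres, weights, new_vec, merge_radius):
    """Fold a new training vector into its nearest centre as a weighted mean
    if it lies within merge_radius, otherwise start a new centre. Merging is
    what keeps memory bounded as learning continues."""
    if centres:
        dists = [sum((a - b) ** 2 for a, b in zip(c, new_vec)) for c in centres]
        i = min(range(len(dists)), key=dists.__getitem__)
        if dists[i] <= merge_radius ** 2:
            w = weights[i]
            # weighted mean of the existing centre (weight w) and the new vector
            centres[i] = tuple((w * a + b) / (w + 1)
                               for a, b in zip(centres[i], new_vec))
            weights[i] = w + 1
            return centres, weights
    centres.append(tuple(new_vec))
    weights.append(1)
    return centres, weights
```

The weight records how many vectors each centre summarizes, so later merges move the centre less, and the index grows only when a genuinely novel appearance arrives.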

3 citations