scispace - formally typeset
Search or ask a question
Journal ArticleDOI

Model-based recognition in robot vision

01 Mar 1986-ACM Computing Surveys (ACM)-Vol. 18, Iss: 1, pp 67-108
TL;DR: This paper presents a comparative study and survey of model-based object-recognition algorithms for robot vision, and an evaluation and comparison of existing industrial part- recognition systems and algorithms is given, providing insights for progress toward future robot vision systems.
Abstract: This paper presents a comparative study and survey of model-based object-recognition algorithms for robot vision. The goal of these algorithms is to recognize the identity, position, and orientation of randomly oriented industrial parts. In one form this is commonly referred to as the "bin-picking" problem, in which the parts to be recognized are presented in a jumbled bin. The paper is organized according to 2-D, 2½-D, and 3-D object representations, which are used as the basis for the recognition algorithms. Three central issues common to each category, namely, feature extraction, modeling, and matching, are examined in detail. An evaluation and comparison of existing industrial part-recognition systems and algorithms is given, providing insights for progress toward future robot vision systems.
Citations
More filters
Journal ArticleDOI
Paul J. Besl1, H.D. McKay1
TL;DR: In this paper, the authors describe a general-purpose representation-independent method for the accurate and computationally efficient registration of 3D shapes including free-form curves and surfaces, based on the iterative closest point (ICP) algorithm, which requires only a procedure to find the closest point on a geometric entity to a given point.
Abstract: The authors describe a general-purpose, representation-independent method for the accurate and computationally efficient registration of 3-D shapes including free-form curves and surfaces. The method handles the full six degrees of freedom and is based on the iterative closest point (ICP) algorithm, which requires only a procedure to find the closest point on a geometric entity to a given point. The ICP algorithm always converges monotonically to the nearest local minimum of a mean-square distance metric, and the rate of convergence is rapid during the first few iterations. Therefore, given an adequate set of initial rotations and translations for a particular class of objects with a certain level of 'shape complexity', one can globally minimize the mean-square distance metric over all six degrees of freedom by testing each initial registration. One important application of this method is to register sensed data from unfixtured rigid objects with an ideal geometric model, prior to shape inspection. Experimental results show the capabilities of the registration algorithm on point sets, curves, and surfaces. >

17,598 citations

Journal ArticleDOI
TL;DR: Efficient algorithms for computing the Hausdorff distance between all possible relative positions of a binary image and a model are presented and it is shown that the method extends naturally to the problem of comparing a portion of a model against an image.
Abstract: The Hausdorff distance measures the extent to which each point of a model set lies near some point of an image set and vice versa. Thus, this distance can be used to determine the degree of resemblance between two objects that are superimposed on one another. Efficient algorithms for computing the Hausdorff distance between all possible relative positions of a binary image and a model are presented. The focus is primarily on the case in which the model is only allowed to translate with respect to the image. The techniques are extended to rigid motion. The Hausdorff distance computation differs from many other shape comparison methods in that no correspondence between the model and the image is derived. The method is quite tolerant of small position errors such as those that occur with edge detectors and other feature extraction methods. It is shown that the method extends naturally to the problem of comparing a portion of a model against an image. >

4,194 citations

Journal ArticleDOI
TL;DR: A neural network-based upright frontal face detection system that arbitrates between multiple networks to improve performance over a single network, and a straightforward procedure for aligning positive face examples for training.
Abstract: We present a neural network-based upright frontal face detection system. A retinally connected neural network examines small windows of an image and decides whether each window contains a face. The system arbitrates between multiple networks to improve performance over a single network. We present a straightforward procedure for aligning positive face examples for training. To collect negative examples, we use a bootstrap algorithm, which adds false detections into the training set as training progresses. This eliminates the difficult task of manually selecting nonface training examples, which must be chosen to span the entire space of nonface images. Simple heuristics, such as using the fact that faces rarely overlap in images, can further improve the accuracy. Comparisons with several other state-of-the-art face detection systems are presented, showing that our system has comparable performance in terms of detection and false-positive rates.

4,105 citations

Journal ArticleDOI
TL;DR: A new information-theoretic approach is presented for finding the pose of an object in an image that works well in domains where edge or gradient-magnitude based methods have difficulty, yet it is more robust than traditional correlation.
Abstract: A new information-theoretic approach is presented for finding the pose of an object in an image. The technique does not require information about the surface properties of the object, besides its shape, and is robust with respect to variations of illumination. In our derivation few assumptions are made about the nature of the imaging process. As a result the algorithms are quite general and may foreseeably be used in a wide variety of imaging situations. Experiments are presented that demonstrate the approach registering magnetic resonance (MR) images, aligning a complex 3D object model to real scenes including clutter and occlusion, tracking a human head in a video sequence and aligning a view-based 2D object model to real images. The method is based on a formulation of the mutual information between the model and the image. As applied here the technique is intensity-based, rather than feature-based. It works well in domains where edge or gradient-magnitude based methods have difficulty, yet it is more robust than traditional correlation. Additionally, it has an efficient implementation that is based on stochastic approximation.

3,584 citations

Proceedings ArticleDOI
30 Apr 1992
TL;DR: In this paper, the authors describe a general purpose representation independent method for the accurate and computationally efficient registration of 3D shapes including free-form curves and surfaces, based on the iterative closest point (ICP) algorithm, which requires only a procedure to find the closest point on a geometric entity to a given point.
Abstract: This paper describes a general purpose, representation independent method for the accurate and computationally efficient registration of 3-D shapes including free-form curves and surfaces. The method handles the full six-degrees of freedom and is based on the iterative closest point (ICP) algorithm, which requires only a procedure to find the closest point on a geometric entity to a given point. The ICP algorithm always converges monotonically to the nearest local minimum of a mean-square distance metric, and experience shows that the rate of convergence is rapid during the first few iterations. Therefore, given an adequate set of initial rotations and translations for a particular class of objects with a certain level of 'shape complexity', one can globally minimize the mean-square distance metric over all six degrees of freedom by testing each initial registration. For examples, a given 'model' shape and a sensed 'data' shape that represents a major portion of the model shape can be registered in minutes by testing one initial translation and a relatively small set of rotations to allow for the given level of model complexity. One important application of this method is to register sensed data from unfixtured rigid objects with an ideal geometric model prior to shape inspection. The described method is also useful for deciding fundamental issues such as the congruence (shape equivalence) of different geometric representations as well as for estimating the motion between point sets where the correspondences are not known. Experimental results show the capabilities of the registration algorithm on point sets, curves, and surfaces.

2,377 citations


Cites background from "Model-based recognition in robot vi..."

  • ...The reader may consult [6][14] for pre-1985 work in these areas....

    [...]

References
More filters
Book
01 Jan 1982

5,834 citations

Journal ArticleDOI
TL;DR: It is shown how the boundaries of an arbitrary non-analytic shape can be used to construct a mapping between image space and Hough transform space, which makes the generalized Houghtransform a kind of universal transform which can beused to find arbitrarily complex shapes.

4,310 citations


"Model-based recognition in robot vi..." refers methods in this paper

  • ...Recognition is based on template matching between the model edge template and the edge image in the generalized Hough transform space [Ballard 1981a]....

    [...]

Book
01 Mar 1986
TL;DR: Robot Vision as discussed by the authors is a broad overview of the field of computer vision, using a consistent notation based on a detailed understanding of the image formation process, which can provide a useful and current reference for professionals working in the fields of machine vision, image processing, and pattern recognition.
Abstract: From the Publisher: This book presents a coherent approach to the fast-moving field of computer vision, using a consistent notation based on a detailed understanding of the image formation process. It covers even the most recent research and will provide a useful and current reference for professionals working in the fields of machine vision, image processing, and pattern recognition. An outgrowth of the author's course at MIT, Robot Vision presents a solid framework for understanding existing work and planning future research. Its coverage includes a great deal of material that is important to engineers applying machine vision methods in the real world. The chapters on binary image processing, for example, help explain and suggest how to improve the many commercial devices now available. And the material on photometric stereo and the extended Gaussian image points the way to what may be the next thrust in commercialization of the results in this area. Chapters in the first part of the book emphasize the development of simple symbolic descriptions from images, while the remaining chapters deal with methods that exploit these descriptions. The final chapter offers a detailed description of how to integrate a vision system into an overall robotics system, in this case one designed to pick parts out of a bin. The many exercises complement and extend the material in the text, and an extensive bibliography will serve as a useful guide to current research. Errata (164k PDF)

3,783 citations

Journal ArticleDOI
TL;DR: It is established that the Fourier series expansion is optimal and unique with respect to obtaining coefficients insensitive to starting point and the amplitudes are pure form invariants as well as are certain simple functions of phase angles.
Abstract: A method for the analysis and synthesis of closed curves in the plane is developed using the Fourier descriptors FD's of Cosgriff [1]. A curve is represented parametrically as a function of arc length by the accumulated change in direction of the curve since the starting point. This function is expanded in a Fourier series and the coefficients are arranged in the amplitude/phase-angle form. It is shown that the amplitudes are pure form invariants as well as are certain simple functions of phase angles. Rotational and axial symmetry are related directly to simple properties of the Fourier descriptors. An analysis of shape similarity or symmetry can be based on these relationships; also closed symmetric curves can be synthesized from almost arbitrary Fourier descriptors. It is established that the Fourier series expansion is optimal and unique with respect to obtaining coefficients insensitive to starting point. Several examples are provided to indicate the usefulness of Fourier descriptors as features for shape discrimination and a number of interesting symmetric curves are generated by computer and plotted out.

1,973 citations

Journal ArticleDOI

1,747 citations