Model-based recognition in robot vision

doi:10.1145/6462.6464

Journal Article•DOI•

Model-based recognition in robot vision

Roland T. Chin¹, Charles R. Dyer¹•Institutions (1)

University of Wisconsin-Madison¹

01 Mar 1986-ACM Computing Surveys (ACM)-Vol. 18, Iss: 1, pp 67-108

TL;DR: This paper surveys and compares model-based object-recognition algorithms for robot vision, and evaluates existing industrial part-recognition systems and algorithms, providing insights for progress toward future robot vision systems.

Abstract: This paper presents a comparative study and survey of model-based object-recognition algorithms for robot vision. The goal of these algorithms is to recognize the identity, position, and orientation of randomly oriented industrial parts. In one form this is commonly referred to as the "bin-picking" problem, in which the parts to be recognized are presented in a jumbled bin. The paper is organized according to 2-D, 2½-D, and 3-D object representations, which are used as the basis for the recognition algorithms. Three central issues common to each category, namely, feature extraction, modeling, and matching, are examined in detail. An evaluation and comparison of existing industrial part-recognition systems and algorithms is given, providing insights for progress toward future robot vision systems.

Citations

Journal Article•DOI•

A method for registration of 3-D shapes

Paul J. Besl¹, H.D. McKay¹•Institutions (1)

General Motors¹

01 Feb 1992-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: In this paper, the authors describe a general-purpose representation-independent method for the accurate and computationally efficient registration of 3D shapes including free-form curves and surfaces, based on the iterative closest point (ICP) algorithm, which requires only a procedure to find the closest point on a geometric entity to a given point.

Abstract: The authors describe a general-purpose, representation-independent method for the accurate and computationally efficient registration of 3-D shapes including free-form curves and surfaces. The method handles the full six degrees of freedom and is based on the iterative closest point (ICP) algorithm, which requires only a procedure to find the closest point on a geometric entity to a given point. The ICP algorithm always converges monotonically to the nearest local minimum of a mean-square distance metric, and the rate of convergence is rapid during the first few iterations. Therefore, given an adequate set of initial rotations and translations for a particular class of objects with a certain level of 'shape complexity', one can globally minimize the mean-square distance metric over all six degrees of freedom by testing each initial registration. One important application of this method is to register sensed data from unfixtured rigid objects with an ideal geometric model, prior to shape inspection. Experimental results show the capabilities of the registration algorithm on point sets, curves, and surfaces.

17,598 citations
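
The ICP loop described in the abstract above (find closest points, fit a rigid motion, repeat) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes 2-D point sets rather than the six-degree-of-freedom 3-D case, uses a brute-force closest-point search, and all function names (`closest_point`, `best_rigid_2d`, `icp_2d`) are hypothetical. The recorded error sequence should be non-increasing, which mirrors the monotonic convergence claim.

```python
import math

def closest_point(p, model):
    """Return the model point nearest to p (brute-force search)."""
    return min(model, key=lambda q: (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)

def mean_square_error(data, model):
    """Mean squared distance from each data point to its closest model point."""
    total = 0.0
    for p in data:
        q = closest_point(p, model)
        total += (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2
    return total / len(data)

def best_rigid_2d(src, dst):
    """Least-squares rotation angle and translation taking paired src points onto dst."""
    n = len(src)
    sx = sum(p[0] for p in src) / n
    sy = sum(p[1] for p in src) / n
    dx = sum(q[0] for q in dst) / n
    dy = sum(q[1] for q in dst) / n
    dot = cross = 0.0
    for (px, py), (qx, qy) in zip(src, dst):
        ax, ay = px - sx, py - sy          # centered source point
        bx, by = qx - dx, qy - dy          # centered target point
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    return theta, dx - (c * sx - s * sy), dy - (s * sx + c * sy)

def transform(points, theta, tx, ty):
    """Rotate points by theta about the origin, then translate by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]

def icp_2d(data, model, max_iterations=20, tol=1e-12):
    """Alternate closest-point matching and rigid fitting until the error stalls."""
    current = list(data)
    errors = [mean_square_error(current, model)]
    for _ in range(max_iterations):
        matches = [closest_point(p, model) for p in current]
        theta, tx, ty = best_rigid_2d(current, matches)
        current = transform(current, theta, tx, ty)
        errors.append(mean_square_error(current, model))
        if errors[-2] - errors[-1] < tol:
            break
    return current, errors

# Illustrative data: an L-shaped model and a slightly rotated, shifted copy.
model = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (2.0, 1.0), (2.0, 2.0)]
data = transform(model, 0.1, 0.2, -0.1)
aligned, errors = icp_2d(data, model)
```

Because the loop only reaches the nearest local minimum, a practical use would restart `icp_2d` from several initial rotations and translations and keep the lowest final error, as the abstract suggests.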

Journal Article•DOI•

Comparing images using the Hausdorff distance

Daniel P. Huttenlocher¹, G.A. Klanderman¹, William J. Rucklidge¹•Institutions (1)

Cornell University¹

01 Sep 1993-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: Efficient algorithms for computing the Hausdorff distance between all possible relative positions of a binary image and a model are presented and it is shown that the method extends naturally to the problem of comparing a portion of a model against an image.

Abstract: The Hausdorff distance measures the extent to which each point of a model set lies near some point of an image set and vice versa. Thus, this distance can be used to determine the degree of resemblance between two objects that are superimposed on one another. Efficient algorithms for computing the Hausdorff distance between all possible relative positions of a binary image and a model are presented. The focus is primarily on the case in which the model is only allowed to translate with respect to the image. The techniques are extended to rigid motion. The Hausdorff distance computation differs from many other shape comparison methods in that no correspondence between the model and the image is derived. The method is quite tolerant of small position errors such as those that occur with edge detectors and other feature extraction methods. It is shown that the method extends naturally to the problem of comparing a portion of a model against an image.

4,194 citations
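
The Hausdorff distance defined in the abstract above can be written down directly for small point sets. The sketch below is a naive illustration under stated assumptions, not the efficient algorithms of the paper: it uses brute-force nearest-point search, and the translation search simply tries every shift in a caller-supplied list. The function names are hypothetical.

```python
import math

def directed_hausdorff(a, b):
    """Largest distance from a point of a to its nearest point of b."""
    return max(min(math.dist(p, q) for q in b) for p in a)

def hausdorff(a, b):
    """Symmetric Hausdorff distance: the larger of the two directed distances."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))

def best_translation(model, image, shifts):
    """Try each (dx, dy) shift of the model and return (distance, shift) for the best one."""
    best = None
    for dx, dy in shifts:
        moved = [(x + dx, y + dy) for x, y in model]
        d = hausdorff(moved, image)
        if best is None or d < best[0]:
            best = (d, (dx, dy))
    return best

# Illustrative data: the image is the model shifted by (3, 2).
model = [(0, 0), (1, 0), (0, 1)]
image = [(3, 2), (4, 2), (3, 3)]
shifts = [(dx, dy) for dx in range(-5, 6) for dy in range(-5, 6)]
distance, shift = best_translation(model, image, shifts)
```

Note that no point-to-point correspondence is ever stored: each point only contributes the distance to its nearest neighbour in the other set, which is the property the abstract highlights.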

Journal Article•DOI•

Neural network-based face detection

Henry Allan Rowley¹, Shumeet Baluja², Takeo Kanade¹•Institutions (2)

Carnegie Mellon University¹, Justsystem Pittsburgh Research Center²

01 Jan 1998-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: A neural network-based upright frontal face detection system is presented that arbitrates between multiple networks to improve performance over a single network, along with a straightforward procedure for aligning positive face examples for training.

Abstract: We present a neural network-based upright frontal face detection system. A retinally connected neural network examines small windows of an image and decides whether each window contains a face. The system arbitrates between multiple networks to improve performance over a single network. We present a straightforward procedure for aligning positive face examples for training. To collect negative examples, we use a bootstrap algorithm, which adds false detections into the training set as training progresses. This eliminates the difficult task of manually selecting nonface training examples, which must be chosen to span the entire space of nonface images. Simple heuristics, such as using the fact that faces rarely overlap in images, can further improve the accuracy. Comparisons with several other state-of-the-art face detection systems are presented, showing that our system has comparable performance in terms of detection and false-positive rates.

4,105 citations

Journal Article•DOI•

Alignment by Maximization of Mutual Information

Paul A. Viola¹, William M. Wells¹•Institutions (1)

Massachusetts Institute of Technology¹

01 Sep 1997-International Journal of Computer Vision

TL;DR: A new information-theoretic approach is presented for finding the pose of an object in an image that works well in domains where edge or gradient-magnitude based methods have difficulty, yet it is more robust than traditional correlation.

Abstract: A new information-theoretic approach is presented for finding the pose of an object in an image. The technique does not require information about the surface properties of the object, besides its shape, and is robust with respect to variations of illumination. In our derivation few assumptions are made about the nature of the imaging process. As a result the algorithms are quite general and may foreseeably be used in a wide variety of imaging situations. Experiments are presented that demonstrate the approach registering magnetic resonance (MR) images, aligning a complex 3D object model to real scenes including clutter and occlusion, tracking a human head in a video sequence and aligning a view-based 2D object model to real images. The method is based on a formulation of the mutual information between the model and the image. As applied here the technique is intensity-based, rather than feature-based. It works well in domains where edge or gradient-magnitude based methods have difficulty, yet it is more robust than traditional correlation. Additionally, it has an efficient implementation that is based on stochastic approximation.

3,584 citations
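
The mutual-information criterion in the abstract above can be illustrated on discrete intensity samples. The sketch below is a simplification and not the paper's estimator: it assumes a plain joint histogram over paired samples instead of the stochastic-approximation scheme the abstract mentions, and `mutual_information` is a hypothetical name. It shows why the measure tolerates a change in intensity mapping: a consistent relabeling of intensities leaves the score unchanged.

```python
import math
from collections import Counter

def mutual_information(u, v):
    """Mutual information (in bits) between two equally long sequences of discrete values."""
    n = len(u)
    joint = Counter(zip(u, v))      # counts of co-occurring value pairs
    count_u = Counter(u)            # marginal counts for u
    count_v = Counter(v)            # marginal counts for v
    mi = 0.0
    for (a, b), c in joint.items():
        p_ab = c / n
        p_a = count_u[a] / n
        p_b = count_v[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi

# Model intensities and an image whose intensities are a consistent remapping of them.
model_samples = [0, 0, 1, 1]
remapped_image = [5, 5, 9, 9]
unrelated_image = [0, 1, 0, 1]

aligned_score = mutual_information(model_samples, remapped_image)
unrelated_score = mutual_information(model_samples, unrelated_image)
```

A pose search built on this sketch would resample the image under each candidate pose and keep the pose with the highest score.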

Proceedings Article•DOI•

Method for registration of 3-D shapes

Paul J. Besl¹, Neil David Mckay¹•Institutions (1)

General Motors¹

30 Apr 1992

TL;DR: In this paper, the authors describe a general-purpose, representation-independent method for the accurate and computationally efficient registration of 3-D shapes including free-form curves and surfaces, based on the iterative closest point (ICP) algorithm, which requires only a procedure to find the closest point on a geometric entity to a given point.

Abstract: This paper describes a general-purpose, representation-independent method for the accurate and computationally efficient registration of 3-D shapes including free-form curves and surfaces. The method handles the full six degrees of freedom and is based on the iterative closest point (ICP) algorithm, which requires only a procedure to find the closest point on a geometric entity to a given point. The ICP algorithm always converges monotonically to the nearest local minimum of a mean-square distance metric, and experience shows that the rate of convergence is rapid during the first few iterations. Therefore, given an adequate set of initial rotations and translations for a particular class of objects with a certain level of 'shape complexity', one can globally minimize the mean-square distance metric over all six degrees of freedom by testing each initial registration. For example, a given 'model' shape and a sensed 'data' shape that represents a major portion of the model shape can be registered in minutes by testing one initial translation and a relatively small set of rotations to allow for the given level of model complexity. One important application of this method is to register sensed data from unfixtured rigid objects with an ideal geometric model prior to shape inspection. The described method is also useful for deciding fundamental issues such as the congruence (shape equivalence) of different geometric representations as well as for estimating the motion between point sets where the correspondences are not known. Experimental results show the capabilities of the registration algorithm on point sets, curves, and surfaces.

2,377 citations

Cites background from "Model-based recognition in robot vi..."

...The reader may consult [6][14] for pre-1985 work in these areas....

References

Book•

Computer vision

Dana H. Ballard, Chris Brown

01 Jan 1982

5,834 citations

Journal Article•DOI•

Generalizing the hough transform to detect arbitrary shapes

Dana H. Ballard¹•Institutions (1)

University of Rochester¹

01 Jan 1987-Pattern Recognition

TL;DR: It is shown how the boundaries of an arbitrary non-analytic shape can be used to construct a mapping between image space and Hough transform space, which makes the generalized Hough transform a kind of universal transform which can be used to find arbitrarily complex shapes.

4,310 citations

"Model-based recognition in robot vi..." refers methods in this paper

...Recognition is based on template matching between the model edge template and the edge image in the generalized Hough transform space [Ballard 1981a]....
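
The mapping from boundary points to Hough space mentioned for this entry can be sketched as a lookup table plus a voting step. The code below is a minimal illustration under strong assumptions: translation only, edge orientations already quantized into integer bins, and edge points supplied as `(x, y, orientation_bin)` tuples. The names `build_r_table` and `vote` are hypothetical and not taken from the paper.

```python
from collections import Counter, defaultdict

def build_r_table(model_edges, reference):
    """Map each orientation bin to the offsets from model edge points to the reference point."""
    table = defaultdict(list)
    for x, y, orientation in model_edges:
        table[orientation].append((reference[0] - x, reference[1] - y))
    return table

def vote(image_edges, table):
    """Each image edge point votes for every reference location consistent with its orientation."""
    accumulator = Counter()
    for x, y, orientation in image_edges:
        for dx, dy in table.get(orientation, ()):
            accumulator[(x + dx, y + dy)] += 1
    return accumulator

# Illustrative model: four edge points around a reference point at (1, 1).
model_edges = [(0, 0, 0), (2, 0, 1), (2, 2, 2), (0, 2, 3)]
table = build_r_table(model_edges, reference=(1, 1))

# Image: the model shifted by (5, 4), plus two clutter points.
image_edges = [(5, 4, 0), (7, 4, 1), (7, 6, 2), (5, 6, 3), (9, 9, 0), (1, 8, 2)]
accumulator = vote(image_edges, table)
peak_location, peak_votes = accumulator.most_common(1)[0]
```

The accumulator peak marks the most likely position of the model's reference point; clutter points scatter single votes elsewhere and do not disturb it in this example.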

Book•

Robot Vision

Berthold K. P. Horn¹•Institutions (1)

Massachusetts Institute of Technology¹

01 Mar 1986

TL;DR: Robot Vision gives a broad overview of the field of computer vision, using a consistent notation based on a detailed understanding of the image formation process, and is intended as a useful and current reference for professionals working in machine vision, image processing, and pattern recognition.

Abstract: From the Publisher: This book presents a coherent approach to the fast-moving field of computer vision, using a consistent notation based on a detailed understanding of the image formation process. It covers even the most recent research and will provide a useful and current reference for professionals working in the fields of machine vision, image processing, and pattern recognition. An outgrowth of the author's course at MIT, Robot Vision presents a solid framework for understanding existing work and planning future research. Its coverage includes a great deal of material that is important to engineers applying machine vision methods in the real world. The chapters on binary image processing, for example, help explain and suggest how to improve the many commercial devices now available. And the material on photometric stereo and the extended Gaussian image points the way to what may be the next thrust in commercialization of the results in this area. Chapters in the first part of the book emphasize the development of simple symbolic descriptions from images, while the remaining chapters deal with methods that exploit these descriptions. The final chapter offers a detailed description of how to integrate a vision system into an overall robotics system, in this case one designed to pick parts out of a bin. The many exercises complement and extend the material in the text, and an extensive bibliography will serve as a useful guide to current research.

3,783 citations

Journal Article•DOI•

Fourier Descriptors for Plane Closed Curves

Charles T. Zahn¹, Ralph Roskies²•Institutions (2)

Stanford University¹, Yale University²

01 Mar 1972-IEEE Transactions on Computers

TL;DR: It is established that the Fourier series expansion is optimal and unique with respect to obtaining coefficients insensitive to starting point, and that the amplitudes, along with certain simple functions of the phase angles, are pure form invariants.

Abstract: A method for the analysis and synthesis of closed curves in the plane is developed using the Fourier descriptors (FD's) of Cosgriff [1]. A curve is represented parametrically as a function of arc length by the accumulated change in direction of the curve since the starting point. This function is expanded in a Fourier series and the coefficients are arranged in the amplitude/phase-angle form. It is shown that the amplitudes are pure form invariants, as are certain simple functions of phase angles. Rotational and axial symmetry are related directly to simple properties of the Fourier descriptors. An analysis of shape similarity or symmetry can be based on these relationships; also closed symmetric curves can be synthesized from almost arbitrary Fourier descriptors. It is established that the Fourier series expansion is optimal and unique with respect to obtaining coefficients insensitive to starting point. Several examples are provided to indicate the usefulness of Fourier descriptors as features for shape discrimination and a number of interesting symmetric curves are generated by computer and plotted out.

1,973 citations
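
The invariance of the harmonic amplitudes described in the abstract above can be checked numerically. The sketch below assumes the closed curve is a simple polygon, so the accumulated change in direction is a step function that jumps by the turning angle at each vertex. Under that assumption, integrating the Fourier integrals by parts gives the amplitude of harmonic n as |Σ_k Δφ_k · exp(i·2π·n·l_k/L)| / (nπ), where Δφ_k is the turn at vertex k, l_k its arc length from the start, and L the perimeter. The function name `fourier_amplitudes` is hypothetical, and phase angles are not computed.

```python
import cmath
import math

def fourier_amplitudes(polygon, n_harmonics):
    """Harmonic amplitudes of the cumulative turning function of a closed polygon."""
    m = len(polygon)
    edges = [(polygon[(k + 1) % m][0] - polygon[k][0],
              polygon[(k + 1) % m][1] - polygon[k][1]) for k in range(m)]
    lengths = [math.hypot(ex, ey) for ex, ey in edges]
    perimeter = sum(lengths)

    # Record (arc length, turning angle) at the vertex that ends each edge.
    turns = []
    arc = 0.0
    for k in range(m):
        arc += lengths[k]
        (ax, ay), (bx, by) = edges[k], edges[(k + 1) % m]
        turn = math.atan2(ax * by - ay * bx, ax * bx + ay * by)
        turns.append((arc, turn))

    amplitudes = []
    for n in range(1, n_harmonics + 1):
        total = sum(turn * cmath.exp(1j * 2 * math.pi * n * arc / perimeter)
                    for arc, turn in turns)
        amplitudes.append(abs(total) / (n * math.pi))
    return amplitudes

# An L-shaped polygon and a scaled, rotated, translated copy traced from another vertex.
shape = [(0, 0), (4, 0), (4, 1), (1, 1), (1, 3), (0, 3)]
c, s = math.cos(0.7), math.sin(0.7)
moved = [(2.5 * (c * x - s * y) + 3.0, 2.5 * (s * x + c * y) - 2.0) for x, y in shape]
moved = moved[2:] + moved[:2]

original_amplitudes = fourier_amplitudes(shape, 5)
moved_amplitudes = fourier_amplitudes(moved, 5)
square_amplitudes = fourier_amplitudes([(0, 0), (1, 0), (1, 1), (0, 1)], 4)
```

The two amplitude lists agree up to rounding, and for the square only the fourth harmonic is non-zero, which illustrates the link between rotational symmetry and the descriptors that the abstract mentions.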

Journal Article•DOI•

The psychology of computer vision

Patrick Henry Winston

01 Jul 1976-Pattern Recognition

1,747 citations