Showing papers on "3D single-object recognition" published in 1991


Book
04 Jan 1991
TL;DR: This book describes an extended series of experiments into the role of geometry in the critical area of object recognition, providing precise definitions of the recognition and localization problems, the methods used to address them, the solutions to these problems, and the implications of this analysis.
Abstract: With contributions from Tomas Lozano-Perez and Daniel P. Huttenlocher. An intelligent system must know "what" the objects are and "where" they are in its environment. Examples of this ubiquitous problem in computer vision arise in tasks involving hand-eye coordination (such as assembling or sorting), inspection tasks, gauging operations, and in navigation and localization of mobile robots. This book describes an extended series of experiments into the role of geometry in the critical area of object recognition. It provides precise definitions of the recognition and localization problems, describes the methods used to address them, analyzes the solutions to these problems, and addresses the implications of this analysis. Solutions to problems of object recognition are of fundamental importance in many real applications, and versions of the techniques described here are already being used in industrial settings. Although a number of questions remain to be solved, the authors provide a valuable framework for understanding both the strengths and limitations of using object shape to guide recognition. W. Eric L. Grimson is Matsushita Associate Professor in the Department of Electrical Engineering and Computer Science at MIT. Contents: Introduction. Recognition as a Search Problem. Searching for Correspondences. Two-Dimensional Constraints. Three-Dimensional Constraints. Verifying Hypotheses. Controlling the Search Explosion. Selecting Subspaces of the Search Space. Empirical Testing. The Combinatorics of the Matching Process. The Combinatorics of Hough Transforms. The Combinatorics of Verification. The Combinatorics of Indexing. Evaluating the Methods. Recognition from Libraries. Parameterized Objects. The Role of Grouping. Sensing Strategies. Applications. The Next Steps.

896 citations


Journal ArticleDOI
TL;DR: It is argued that a system to infer automatically a model appropriate for vision tasks from the manufacturing model is needed to efficiently create a large database (more than 100 objects) of 3-D models to evaluate matching strategies.
Abstract: The topic of model-building for 3-D objects is examined. Most 3-D object recognition systems construct models either manually or by training. Neither approach has been very satisfactory, particularly in designing object recognition systems which can handle a large number of objects. Recent interest in integrating mechanical CAD systems and vision systems has led to a third type of model building for vision: adaptation of preexisting CAD models of objects for recognition. If a solid model of an object to be recognized is already available in a manufacturing database, then it should be possible to infer automatically a model appropriate for vision tasks from the manufacturing model. Such a system has been developed. It uses 3-D object descriptions created on a commercial CAD system and expressed in both the industry-standard IGES form and a polyhedral approximation and performs geometric inferencing to obtain a relational graph representation of the object which can be stored in a database of models for object recognition. Relational graph models contain both view-independent information extracted from the IGES description and view-dependent information (patch areas) extracted from synthetic views of the object. It is argued that such a system is needed to efficiently create a large database (more than 100 objects) of 3-D models to evaluate matching strategies.

158 citations


Journal ArticleDOI
TL;DR: BONSAI identifies and localizes 3D objects in range images of one or more parts that have been designed on a computer-aided-design (CAD) system via constrained search of the interpretation tree, using unary and binary constraints to prune the search space.
Abstract: BONSAI, a model-based 3D object recognition system, is described. It identifies and localizes 3D objects in range images of one or more parts that have been designed on a computer-aided-design (CAD) system. Recognition is performed via constrained search of the interpretation tree, using unary and binary constraints (derived automatically from the CAD models) to prune the search space. Attention is focused on the recognition procedure, but the model-building, image acquisition, and segmentation procedures are also outlined. Experiments with over 200 images demonstrate that the constrained search approach to 3D object recognition has an accuracy comparable to that of previous systems.

147 citations
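
The constrained interpretation-tree search mentioned in the BONSAI abstract can be illustrated with a minimal sketch. This is not BONSAI's implementation: the patch features (area and unit normal), the tolerances, and the names `interpretation_tree`, `unary_ok`, and `binary_ok` are all assumptions chosen for illustration, and the wildcard branch that full interpretation-tree methods use for spurious scene data is omitted.

```python
import math

def angle(u, v):
    """Angle in radians between two unit vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return math.acos(max(-1.0, min(1.0, dot)))

def unary_ok(s, m, tol=0.5):
    """Unary constraint (assumed): patch areas must roughly agree."""
    return abs(s["area"] - m["area"]) <= tol

def binary_ok(s1, s2, m1, m2, tol=0.1):
    """Binary constraint (assumed): angles between normals must roughly agree."""
    scene_angle = angle(s1["normal"], s2["normal"])
    model_angle = angle(m1["normal"], m2["normal"])
    return abs(scene_angle - model_angle) <= tol

def interpretation_tree(scene, model):
    """Depth-first search; level i of the tree assigns scene[i] to a model patch."""
    results = []

    def extend(partial):
        i = len(partial)
        if i == len(scene):
            results.append(tuple(partial))
            return
        for j, m in enumerate(model):
            if not unary_ok(scene[i], m):
                continue  # prune: this single pairing is already implausible
            if all(binary_ok(scene[k], scene[i], model[partial[k]], m)
                   for k in range(i)):
                partial.append(j)
                extend(partial)
                partial.pop()

    extend([])
    return results

# Hypothetical model with four planar patches.
model = [
    {"area": 4.0, "normal": (0.0, 0.0, 1.0)},
    {"area": 6.0, "normal": (1.0, 0.0, 0.0)},
    {"area": 9.0, "normal": (0.0, 1.0, 0.0)},
    {"area": 6.0, "normal": (0.0, 0.0, -1.0)},
]
# Two patches sensed in a rotated view; their normals are opposed.
scene = [
    {"area": 6.1, "normal": (0.0, 1.0, 0.0)},
    {"area": 3.9, "normal": (0.0, -1.0, 0.0)},
]

matches = interpretation_tree(scene, model)
print(matches)
```

In this toy case the unary constraint leaves two candidates for the first scene patch (the two model patches of area 6), and the binary constraint on the angle between normals rejects one of them, so a single interpretation survives.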


Journal ArticleDOI
01 Nov 1991
TL;DR: A cooperative feature-matching technique is proposed that is implemented by a Hopfield neural network to globally match all the objects in the input scene against all the object models in the model-database at the same time.
Abstract: A two-dimensional model-based object recognition technique is introduced to identify and locate isolated or overlapping 2-D objects in any position and orientation. A cooperative feature-matching technique is proposed that is implemented by a Hopfield neural network. The proposed matching technique uses the parallelism of the neural network to globally match all the objects in the input scene against all the object models in the model-database at the same time. A global model graph representing all the object models is constructed where each node in the graph represents a feature that has a numerical feature value and is connected to other nodes by an arc representing the relationship or compatibility between them. Object recognition is formulated as matching this global model with an input scene graph representing a single object or several overlapping objects. The performance of the proposed technique is compared with that of a relaxation technique.

110 citations


Proceedings ArticleDOI
03 Jun 1991
TL;DR: A method based on the extended Gaussian image (EGI) which can be used to determine the pose of a 3-D object is presented, which decouples the orientation and translation determination into two distinct least-squares problems.
Abstract: A method based on the extended Gaussian image (EGI) which can be used to determine the pose of a 3-D object is presented. In this scheme, the weight associated with each outward surface normal is a complex weight. The normal distance of the surface from the predefined origin is encoded as the phase of the weight, while the magnitude of the weight is the visible area of the surface. This approach decouples the orientation and translation determination into two distinct least-squares problems. Experiments involving synthetic data of two polyhedral and two smooth objects as well as real range data of the same smooth objects indicate the feasibility of this method.

74 citations
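
The complex weighting described in the EGI abstract can be illustrated with a small sketch of the translation step only. It assumes the orientation has already been aligned, that every face is visible, and that distances are scaled so phases stay within (-pi, pi). The names `complex_egi`, `translate`, and `recover_translation` and the box example are hypothetical, and the unweighted least-squares fit is an illustrative choice rather than the paper's formulation.

```python
import cmath

def complex_egi(faces):
    """Map (unit normal, area, normal distance) to weight area * exp(i * distance)."""
    return {n: area * cmath.exp(1j * dist) for n, area, dist in faces}

def translate(faces, t):
    """Shifting the object by t adds n . t to each face's normal distance."""
    return [(n, a, d + sum(ni * ti for ni, ti in zip(n, t))) for n, a, d in faces]

def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def recover_translation(egi_model, egi_scene):
    """Least-squares solution of n_k . t = phase(w'_k / w_k) over all normals."""
    ata = [[0.0] * 3 for _ in range(3)]
    atb = [0.0] * 3
    for n, w in egi_model.items():
        delta = cmath.phase(egi_scene[n] / w)  # phase shift caused by translation
        for i in range(3):
            atb[i] += n[i] * delta
            for j in range(3):
                ata[i][j] += n[i] * n[j]
    d = det3(ata)
    t = []
    for col in range(3):  # Cramer's rule on the 3x3 normal equations
        m = [row[:] for row in ata]
        for r in range(3):
            m[r][col] = atb[r]
        t.append(det3(m) / d)
    return t

# Hypothetical box with half-extents (1, 0.5, 0.5), centred on the origin.
box = [
    ((1.0, 0.0, 0.0), 1.0, 1.0), ((-1.0, 0.0, 0.0), 1.0, 1.0),
    ((0.0, 1.0, 0.0), 2.0, 0.5), ((0.0, -1.0, 0.0), 2.0, 0.5),
    ((0.0, 0.0, 1.0), 2.0, 0.5), ((0.0, 0.0, -1.0), 2.0, 0.5),
]
egi_model = complex_egi(box)
egi_scene = complex_egi(translate(box, (0.3, -0.2, 0.1)))
t_est = recover_translation(egi_model, egi_scene)
print([round(x, 6) for x in t_est])
```

Translation leaves the magnitudes (areas) untouched and changes only the phases, which is why orientation can be estimated from magnitudes and translation from phases as two separate problems.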


Patent
14 Feb 1991
TL;DR: A system for recognizing numerical characters which appear as both numerical characters and words on a document includes an image capture device (12) for capturing the image of the document and a recognition circuit (14) that performs recognition of the numerical characters.
Abstract: A system (10) for recognizing numerical characters which appear as both numerical characters and words on a document includes an image capture device (12) for capturing the image of a document. A recognition circuit (14) performs recognition of the numerical characters. A recognition circuit (16) performs recognition of the words corresponding to the numerical characters. A comparator circuit (20) compares the recognition signals generated by the recognition circuit (16) to the recognition signals generated by the recognition circuit (14) to determine if the numerical characters recognized by the recognition circuit (14) are correct.

63 citations


Journal ArticleDOI
TL;DR: Two algorithms for invariant object recognition are compared, using a neural network and two statistical classifiers to classify features taken from noiseless and noisy images; Fourier-Mellin descriptors are found to perform as well as moment-based features for noiseless images and significantly better when noise is added.

53 citations


Journal ArticleDOI
TL;DR: This system, INGEN (INference engine for GENeric object recognition), uses a data-driven approach to determine the pose and size of objects with generic shapes such as parallelepipeds and cylinders and has been used successfully to guide a robot in removing postal objects from a pile.
Abstract: Generic shape recognition is the problem of determining the pose and dimensions of objects for which only shape models are available and the object's size is unknown. One application domain for generic object recognition is the handling and sorting of postal objects. Because metrical information relating object features to one another is not available, the more common feature-based approaches are inadequate. Our system, INGEN (INference engine for GENeric object recognition), uses a data-driven approach to determine the pose and size of objects with generic shapes such as parallelepipeds and cylinders. This system successfully recognizes occluded objects in heaps. It also handles scenes which have irregularities in surfaces and edges (such irregularities are common to postal objects) as well as shadows and irregularities in the range data itself. The three most important parts of INGEN are (1) the procedures for constructing object hypotheses, computing their attributes, and evaluating how well they fit the data, (2) the geometric reasoning process which determines the size of object hypotheses by finding points of contact with other object hypotheses and also detects geometric inconsistencies in the scene interpretation, and (3) the recognition process which allows backtracking when object hypotheses are rejected due to insufficient support or geometric conflict with other object hypotheses. INGEN has been used successfully to guide a robot in removing postal objects from a pile. We show the results of these experiments.

38 citations


Proceedings Article
24 Aug 1991
TL;DR: This paper extends the technique of learning from examples to deal with real objects that suffer from noise and occlusions and to exploit negative examples during the learning phase, and compares different versions of the multi-layer networks corresponding to the technique among themselves and with a standard Nearest Neighbor classifier.
Abstract: Even if represented in a way which is invariant to illumination conditions, a 3D object gives rise to an infinite number of 2D views, depending on its pose. It has been recently shown ([13]) that it is possible to synthesize a module that can recognize a specific 3D object from any viewpoint, by using a new technique of learning from examples, which are, in this case, a small set of 2D views of the object. In this paper we extend the technique, a) to deal with real objects (isolated paper clips) that suffer from noise and occlusions and b) to exploit negative examples during the learning phase. We also compare different versions of the multi-layer networks corresponding to our technique among themselves and with a standard Nearest Neighbor classifier. The simplest version, which is a Radial Basis Functions network, performs less well than a Nearest Neighbor classifier. The more powerful versions, trained with positive and negative examples, perform significantly better. Our results, which may have interesting implications for computer vision despite the relative simplicity of the task studied, are especially interesting for understanding the process of object recognition in biological vision.
1 Introduction. Shape-based visual recognition of 3D objects may be solved by first hypothesizing the viewpoint (e.g., using information on feature correspondences between the image and a 3D model), then computing the appearance of the model of the object to be recognized from that viewpoint and comparing it with the actual image ([6; 20; 9; 11; 21]). Most recognition schemes developed in computer vision over the last few years employ 3D models of objects. Automatic learning of 3D models, however, is in itself a difficult problem that has not been much addressed in the past and which presents difficulties, especially for any theory that wants to account for human ability in visual recognition. Recently, recognition schemes have been suggested that, relying on a set of 2D views of the object instead of a 3D model ([2; 5; 13]), offer a natural solution to the problem of model acquisition. In particular, Poggio and Edelman ([13]) have argued that for each object there exists a smooth function mapping any perspective view into a "standard" view of the object and that this multivariate function may be approximately synthesized from a small number of views of the object. Such a function would be object specific, with different functions corresponding to different 3D objects. …

35 citations


Proceedings ArticleDOI
03 Jun 1991
TL;DR: The authors construct a definition of a generic object category in terms of the function required of the object, based on qualitative reasoning about 3-D shape, which has the potential to lead to recognition systems of much greater generality than current CAD-based or model-based approaches.
Abstract: Work that demonstrates the feasibility of a different approach to 3-D object recognition is described. The authors construct a definition of a generic object category, such as a chair, in terms of the function required of the object. This definition is based on qualitative reasoning about 3-D shape, and does not imply any particular geometric or structural model for an object. Thus, this approach has the potential to lead to recognition systems of much greater generality than current CAD-based or model-based approaches.

33 citations


Proceedings ArticleDOI
A. Pizano, M.-I. Tan, N. Gambo
11 Sep 1991
TL;DR: A pattern recognition system is described that classifies digitized images of business forms according to a predefined set of templates and its performance has been proven to be satisfactory.
Abstract: A pattern recognition system is described that classifies digitized images of business forms according to a predefined set of templates. The process involves a training phase, where images of the template forms are scanned, analyzed and stored in a data dictionary; and a recognition phase, during which scanned form images are compared to templates in a dictionary to determine their class membership. The system has been tested under a variety of conditions and its performance has been proven to be satisfactory.

Proceedings ArticleDOI
08 Jul 1991
TL;DR: Preliminary experiments with computer simulation show that this approach is promising for both of the applications of the proposed model of selective attention, which has a function of segmenting patterns, as well as the function of recognizing patterns.
Abstract: Selective attention is one of the most essential mechanisms for visual pattern recognition. One of the authors had previously proposed a model of selective attention, which has a function of segmenting patterns, as well as the function of recognizing patterns. The idea of this selective attention model can be extended to be used for several applications. The structure of the model used for connected character recognition is discussed. The authors offer two examples of its applications. One is the recognition and segmentation of connected characters in cursive handwriting of English words. Another example is the recognition of Chinese characters. Preliminary experiments with computer simulation, in which only a small number of characters have been taught to the models, show that this approach is promising for both of the applications.

Journal ArticleDOI
TL;DR: A two-dimensional (2D) transform is proposed for the classification of planar objects with a centroid referenced polar representation that samples the multiple intersections of N radii with the object using the mass center and is made invariant to scaling.

Proceedings ArticleDOI
03 Jun 1991
TL;DR: A polynomial time algorithm is presented (pruned correspondence search) with good average case complexity for solving a wide class of geometric maximal matching problems, including the problem of recognizing 3-D objects from a single 2-D image.
Abstract: A polynomial time algorithm is presented (pruned correspondence search, PCS) with good average case complexity for solving a wide class of geometric maximal matching problems, including the problem of recognizing 3-D objects from a single 2-D image. The PCS algorithm is connected with the geometry of the underlying recognition problem only through calls to a verification algorithm. Efficient verification algorithms are given for the case of affine transformations among vector spaces and for the case of rigid 2-D and 3-D transformations with scale. Among the known algorithms that solve the bounded error recognition problem exactly and completely, the PCS algorithm currently has the lowest complexity. Some preliminary experiments suggest that PCS is a practical algorithm.

Proceedings ArticleDOI
01 Mar 1991
TL;DR: An object classifier based on a 2-D object model is discussed; it is implemented on a multi-processor system, tested in real-world experiments, and works reliably under favorable conditions.
Abstract: Object recognition is necessary for any mobile robot operating autonomously in the real world. This paper discusses an object classifier based on a 2-D object model. Obstacle candidates are tracked and analyzed, and false alarms generated by the object detector are recognized and rejected. The methods have been implemented on a multi-processor system and tested in real-world experiments. They work reliably under favorable conditions, but sometimes problems occur, e.g. when objects contain many features (edges) or move in front of structured background. Introduction: An autonomous vehicle participating in road traffic has to master various traffic situations. Avoiding a collision with a moving or static object is an important subtask. Such objects will be called obstacles in the sequel, regardless of their nature. Obstacles may appear in the environment of the autonomous vehicle at any time. Due to reasons of safety, it is necessary to reliably detect, locate and analyze all obstacles without exception. In spite of the required high reliability each of these processes must be performed quickly, because the maximum permissible speed of the vehicle depends on the time required for recognizing an obstacle and initiating an avoid-

Journal ArticleDOI
TL;DR: The mathematical characteristics of these operators are investigated in order to achieve a formal theory and the object recognition, localization, and corner and circle detection algorithms are developed.
Abstract: The integration of representation and recognition of rigid solid objects is becoming increasingly important in computer-aided design (CAD), computer-aided manufacturing (CAM), computer graphics, computer vision, and other fields that deal with spatial phenomena. The mathematical framework used for modeling solid objects is mathematical morphology, which is based on set-theoretic concepts. The mathematical characteristics of these operators are investigated in order to achieve a formal theory. Using mathematical morphology as a tool, our theoretical research aims at studying the representation schemes for the dimension and tolerance of the geometric structure. Object features can also be extracted by using the mathematical morphology approach. Through a distance transformation, we can obtain the shape number, significant points database, and skeleton. We have also developed object recognition, localization, and corner and circle detection algorithms.

Proceedings ArticleDOI
02 Jun 1991
TL;DR: The authors present an approach to the recovery and recognition of 3-D objects from a single 2-D image based on grouping the regions into aspects and using the aspect hierarchy to infer a set of volumetric primitives and their connectivity.
Abstract: The authors present an approach to the recovery and recognition of 3-D objects from a single 2-D image. Given a recognition domain consisting of a database of objects, they select a set of object-centered 3-D volumetric modeling primitives that can be used to construct the objects. They take the set of primitives and generate a hierarchical aspect representation based on their projected surfaces; conditional probabilities capture the ambiguity of mappings between levels of the hierarchy. From a region segmentation of the input image, they present a novel formulation of the recovery problem based on grouping the regions into aspects. Once the aspects are recovered, they use the aspect hierarchy to infer a set of volumetric primitives and their connectivity. The recovered primitives are then used as indices into the object database for recognition.

Proceedings ArticleDOI
02 Jun 1991
TL;DR: The PREMIO system combines techniques of analytic graphics and computer vision to predict how features of the object will appear in images under various assumptions of lighting, viewpoint, sensor, and image processing operators.
Abstract: A model-based vision system attempts to find a correspondence between features of an object model and features detected in an image. Most feature-based matching schemes assume that all the features that are potentially visible in a view of an object will appear with equal probability. The resultant matching algorithms have to allow for 'errors' without really understanding what they mean. PREMIO is an object recognition/localization system under construction at the University of Washington that attempts to model some of the physical processes that can cause these 'errors'. PREMIO combines techniques of analytic graphics and computer vision to predict how features of the object will appear in images under various assumptions of lighting, viewpoint, sensor, and image processing operators. These analytic predictions are used in a probabilistic matching algorithm to guide the search and to greatly reduce the search space.

Proceedings ArticleDOI
09 Apr 1991
TL;DR: An approach to generalize the hypothesis and test recognition paradigm for multisensory environments and fairly generic object models based on a generic representation of feature accuracy performs fusion both at the numeric (geometric) and at the symbolic (recognition) levels.
Abstract: The authors propose an approach to generalize the hypothesis and test recognition paradigm for multisensory environments and fairly generic object models. Matching, prediction and localization procedures are based on a generic representation of feature accuracy. This generic approach performs fusion both at the numeric (geometric) and at the symbolic (recognition) levels. Its reliability is illustrated by several real-world examples demonstrating recognition of real objects in complex cluttered environments using four types of sensory data: contour images (two viewpoints), stereovision 3-D line segments, range 3-D faces, and color images.

Book ChapterDOI
21 Mar 1991
TL;DR: The algorithm presented is an extension of Weinberg's algorithm for determining isomorphisms of planar triply connected graphs and can be utilized for various purposes in artificial intelligence, robotics, assembly planning and machine vision.
Abstract: In this paper we present a simple and efficient algorithm for determining the rotational symmetries of polyhedral objects in O(m^2) time using O(m) space where m represents the number of edges of the object. Our algorithm is an extension of Weinberg's algorithm for determining isomorphisms of planar triply connected graphs. The symmetry information detected by our algorithm can be utilized for various purposes in artificial intelligence, robotics, assembly planning and machine vision. In particular, an application of symmetry analysis to object recognition will be described in some detail.



Proceedings ArticleDOI
19 Jun 1991
TL;DR: The authors propose the use of proximity sensors on a robot hand and non-contact guarded and compliant motions to sense these features and compute geometric constraints by explicitly considering uncertainty in the acquired features.
Abstract: In a recognition and localization process, active sensing techniques allow one to exploit the discriminant power of geometric features even when they are partially occluded from some sensors. The authors propose the use of proximity sensors on a robot hand and non-contact guarded and compliant motions to sense these features. To carry out recognition, the authors compute geometric constraints by explicitly considering uncertainty in the acquired features.


Patent
31 Oct 1991
TL;DR: The object is displayed as white on a black background, the contour data take the form of points indicating object surface edges, and successive points are joined to form a closed outline of the object.
Abstract: The method involves collecting object (10) data using a scanning system (11), e.g. video camera or ultrasonic, and converting the data into a binary format. The object is then displayed as white on a black background. The monitor image is scanned to detect contrast boundaries, to form contour data. The contour data is in the form of points indicating object surface edges. Consecutive points are joined to form a closed outline of the object. A computer (12) performs the necessary processing involved. ADVANTAGE - Improved recognition of closed contours.
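
The steps in the patent abstract (binarize, detect contrast boundaries, join points into a closed outline) can be sketched as follows. This is a simplified illustration and not the patented procedure: the threshold value, the 4-neighbour boundary test, and ordering points by angle about their centroid (which only suits roughly star-shaped objects) are assumptions, as are the function names.

```python
import math

def binarize(gray, threshold=128):
    """Object pixels become 1 (white), everything else 0 (black)."""
    return [[1 if v >= threshold else 0 for v in row] for row in gray]

def boundary_points(binary):
    """Object pixels that touch background, or the image edge, in a 4-neighbourhood."""
    rows, cols = len(binary), len(binary[0])
    points = []
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] == 0:
                continue
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if not (0 <= rr < rows and 0 <= cc < cols) or binary[rr][cc] == 0:
                    points.append((r, c))
                    break
    return points

def closed_outline(points):
    """Order contour points by angle about their centroid and close the loop."""
    cr = sum(r for r, _ in points) / len(points)
    cc = sum(c for _, c in points) / len(points)
    ordered = sorted(points, key=lambda p: math.atan2(p[0] - cr, p[1] - cc))
    return ordered + ordered[:1]  # repeat the first point to close the outline

# Hypothetical 5x5 grey-level image with a bright 3x3 object in the middle.
gray = [[200 if 1 <= r <= 3 and 1 <= c <= 3 else 10 for c in range(5)]
        for r in range(5)]
outline = closed_outline(boundary_points(binarize(gray)))
print(outline)
```

For concave or multiply connected objects a border-following scheme would be needed in place of the angular sort.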

Proceedings ArticleDOI
01 Feb 1991
TL;DR: This paper is concerned with choosing the type of operators used to combine or accrue functional evidence in a recognition system that has been implemented and tested using the category "chair" as a case study.
Abstract: A recognition system which represents object categories by properties which can be deduced by analysis of 3-D shape has been implemented and tested using the category "chair" as a case study. Functional description is used to recognize classes and identify subclasses of known categories of objects even if the specific object has never been encountered previously. Interpretation of the functionality of an object is accomplished through qualitative reasoning about its shape. This is to our knowledge the first implemented system to explore the use of purely function-based representation (that is, no geometric or structural object model) to recognize 3-D objects. During the recognition process evidence is gathered as to how well the functional requirements are met by the input structure. This paper is concerned with choosing the type of operators that will be used in the combination or accrual of the functional evidence. Three pairs of conjunctive and disjunctive operators are evaluated. Each pair is used in the recognition process of the 100+ test objects. The results of all tests run are compared and differences are discussed.
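
The abstract does not name the three operator pairs it evaluates. As an illustration only, the sketch below applies three standard conjunctive/disjunctive pairs from fuzzy logic to a made-up evidence rule; the pairs, the rule in `chair_evidence`, and the evidence values are all assumptions and are not taken from the paper.

```python
# Three common (conjunctive, disjunctive) operator pairs on values in [0, 1].
PAIRS = {
    "min/max": (min, max),
    "product/probabilistic sum": (
        lambda a, b: a * b,
        lambda a, b: a + b - a * b,
    ),
    "bounded difference/bounded sum": (
        lambda a, b: max(0.0, a + b - 1.0),
        lambda a, b: min(1.0, a + b),
    ),
}

def chair_evidence(conj, disj, sittable, stable, back_support, arm_support):
    """Hypothetical rule: (sittable AND stable) AND (back_support OR arm_support)."""
    required = conj(sittable, stable)
    optional = disj(back_support, arm_support)
    return conj(required, optional)

# Made-up evidence measures for one candidate object.
scores = {
    name: chair_evidence(conj, disj, 0.9, 0.8, 0.6, 0.3)
    for name, (conj, disj) in PAIRS.items()
}
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

With these inputs the product pair gives the lowest combined score, because every multiplication by a value below 1 lowers the result, while the min/max pair depends only on the weakest required measure and the strongest optional one.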

Journal ArticleDOI
TL;DR: Experimental results demonstrate that the integrated system can successfully provide the 3D motion description and the identification of the vehicle in all the stereo image pairs.
Abstract: An integrated system for 3D motion estimation and object recognition with outdoor stereo image sequences as inputs is presented. The goals are to obtain from input stereo images the 3D motion description and the identification of a vehicle moving against a stationary background. In order to accomplish the desired goals, the system consists of four stages: motion estimation, distinctive feature extraction, model database, and object recognition. The results of 3D motion estimation are used to narrow down considerably the search space in the stage of object recognition. The system is applied to eight sets of stereo image pairs. Experimental results demonstrate that the system can successfully provide the 3D motion description and the identification of the vehicle in all the stereo image pairs.

Proceedings ArticleDOI
18 Nov 1991
TL;DR: A model-based approach using B-splines for curve representation and a backpropagation neural network for text/marking recognition is adopted to keep the object recognizer fast and computationally simple.
Abstract: Recognizing three-dimensional (3D) shape based on cues extracted from object curves is discussed. For fast recognition it is desirable that the 3D curve representation is inherently simple and invariant to affine and projective transformations, and that the object recognizer is fast and computationally simple. These goals are achieved by adopting a model-based approach using B-splines for curve representation and a backpropagation neural network for text/marking recognition. The object shape was computed from the image curves using stereo imaging, and the object type was identified by recognizing or reading the text/markings on the object based on features that are invariant to the object shape, to rotation, to scaling, and to translation.

Proceedings ArticleDOI
09 Apr 1991
TL;DR: The authors present a hierarchical problem-solving technique to match models to objects in the image, and find that its storage requirements are less than those of other systems and that its time complexity is linear in the number of feature points.
Abstract: Since occlusion is present in all but the most constrained environments, the recognition of partially occluded objects is of prime importance. However, almost all of the existing algorithms which resolve the occlusion problem are complicated and require extensive computation. An attempt is thus made to develop a recognition system with moderate cost and speed, and high rate of recognition. The authors present a hierarchical problem-solving technique to match models to objects in the image. They evaluate the system and find that the storage requirements are less than those of other systems. Furthermore, the time complexity of the system is merely a linear function of the number of feature points.

Proceedings ArticleDOI
10 Nov 1991
TL;DR: A skeletonization algorithm based on the Euclidean distance function using the sequential maxima-tracking method is described which, when applied to a connected image, generates a connected skeleton composed of simple digital arcs and can be readily applied to shape recognition.
Abstract: A skeletonization algorithm based on the Euclidean distance function using the sequential maxima-tracking method is described which, when applied to a connected image, generates a connected skeleton composed of simple digital arcs. With a slight modification, the algorithm can preserve the more important features in the skeletal branches which touch the object boundary at corners. Therefore its application to shape recognition can be easily achieved.
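
The role of the Euclidean distance function in skeletonization can be shown with a deliberately naive sketch: compute each object pixel's distance to the nearest background pixel, then keep the local maxima. This is not the sequential maxima-tracking method of the abstract and does not guarantee a connected skeleton. The brute-force transform, the 8-neighbour maximum test, and the function names are assumptions for illustration, and the image must contain at least one background pixel.

```python
import math

def distance_transform(img):
    """Brute-force Euclidean distance from each object pixel to the nearest 0 pixel."""
    background = [(r, c) for r, row in enumerate(img)
                  for c, v in enumerate(row) if v == 0]
    return [[0.0 if v == 0 else
             min(math.hypot(r - br, c - bc) for br, bc in background)
             for c, v in enumerate(row)]
            for r, row in enumerate(img)]

def ridge_pixels(img):
    """Object pixels whose distance value is not exceeded by any 8-neighbour."""
    dist = distance_transform(img)
    rows, cols = len(img), len(img[0])
    ridge = set()
    for r in range(rows):
        for c in range(cols):
            if img[r][c] == 0:
                continue
            neighbours = [dist[r + dr][c + dc]
                          for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                          if (dr or dc)
                          and 0 <= r + dr < rows and 0 <= c + dc < cols]
            if all(dist[r][c] >= n for n in neighbours):
                ridge.add((r, c))
    return ridge

# Hypothetical 5x7 rectangle of object pixels inside a 7x9 image.
image = [[1 if 1 <= r <= 5 and 1 <= c <= 7 else 0 for c in range(9)]
         for r in range(7)]
skeleton = ridge_pixels(image)
print(sorted(skeleton))
```

On this rectangle the local-maximum test recovers only the central ridge and misses the diagonal branches that run toward the corners.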