
Showing papers on "3D single-object recognition" published in 1990


Journal ArticleDOI
01 Oct 1990
TL;DR: An efficient matching algorithm, which assumes an affine approximation to the perspective viewing transformation, is proposed and was successfully tested in recognition of industrial objects appearing in composite occluded scenes.
Abstract: New techniques are described for model-based recognition of objects in 3-D space. The recognition is performed from single gray-scale images taken from unknown viewpoints. The objects in the scene may be overlapping and partially occluded. An efficient matching algorithm, which assumes an affine approximation to the perspective viewing transformation, is proposed. The algorithm has an offline model preprocessing (shape representation) phase, which is independent of the scene information, and a recognition phase based on efficient indexing. It has a straightforward parallel implementation. The algorithm was successfully tested in recognition of industrial objects appearing in composite occluded scenes.
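
To make the indexing idea concrete, here is a minimal Python sketch (assumed details, not the paper's code) of the kind of affine-invariant coordinates such schemes rely on: a point expressed in the frame of three non-collinear basis points keeps the same coordinates under any affine transformation of the whole point set.

```python
import numpy as np

def affine_coords(p, basis):
    """Coordinates of p in the affine frame of three non-collinear basis
    points; unchanged when the same affine map is applied to p and the basis."""
    e0, e1, e2 = (np.asarray(b, dtype=float) for b in basis)
    M = np.column_stack((e1 - e0, e2 - e0))          # 2x2 frame matrix
    return np.linalg.solve(M, np.asarray(p, dtype=float) - e0)

# Sanity check: the coordinates survive an arbitrary affine transform (A, t).
rng = np.random.default_rng(0)
A, t = rng.normal(size=(2, 2)), rng.normal(size=2)
pts = rng.normal(size=(4, 2))                        # 3 basis points + 1 feature
before = affine_coords(pts[3], pts[:3])
after = affine_coords(A @ pts[3] + t, [A @ b + t for b in pts[:3]])
print(np.allclose(before, after))                    # True
```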

348 citations



Journal ArticleDOI
TL;DR: The authors found that the human visual reference frame is tied to egocentric coordinates and that people recognize an object across orientations without mental rotation only when the object can be identified uniquely by the arrangement of its parts along a single dimension.
Abstract: How do people recognize an object in different orientations? One theory is that the visual system describes the object relative to a reference frame centered on the object, resulting in a representation that is invariant across orientations. Chronometric data show that this is true only when an object can be identified uniquely by the arrangement of its parts along a single dimension. When an object can only be distinguished by an arrangement of its parts along more than one dimension, people mentally rotate it to a familiar orientation. This finding suggests that the human visual reference frame is tied to egocentric coordinates.

199 citations


Journal ArticleDOI
TL;DR: The data management aspects of this approach are emphasized: the insertion and deletion of object models, a performance/space trade-off which can be used to improve its recognition capabilities, and a secondary-memory implementation of the approach.
Abstract: We have been involved with formulating query processing strategies for an object-oriented, integrated, textual/iconic database management system. For this task, model-based representations of images by content are being used, as well as a new, efficient object recognition approach, called data-driven indexed hypotheses, which also permits the efficient insertion and deletion of object models. This paper emphasizes the data management aspects of this approach: the insertion and deletion of object models, a performance/space trade-off which can be used to improve the recognition capabilities of our approach, and a secondary memory implementation of our approach.

165 citations


Book ChapterDOI
01 Apr 1990
TL;DR: Extensions of the basic paradigm which reduce its worst-case recognition complexity are discussed, and Geometric Hashing is compared with the Hough Transform and alignment techniques.
Abstract: The Geometric Hashing paradigm for model-based recognition of objects in cluttered scenes is discussed. This paradigm enables a unified approach to rigid object recognition under different viewing transformation assumptions both for 2-D and 3-D objects obtained by different sensors, e.g. vision, range, tactile. It is based on an intensive off-line model preprocessing (learning) stage, where model information is indexed into a hash table using minimal, transformation-invariant features. This enables the on-line recognition algorithm to be particularly efficient. The algorithm is straightforwardly parallelizable. Initial experimentation with the technique has led to successful recognition of both 2-D and 3-D objects in cluttered scenes from an arbitrary viewpoint. We also compare Geometric Hashing with the Hough Transform and alignment techniques. Extensions of the basic paradigm which reduce its worst-case recognition complexity are discussed.
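
A compact sketch of the two Geometric Hashing phases under a similarity-transformation assumption (the paradigm also covers affine and 3-D cases); the point models, basis choice, and quantization step below are illustrative assumptions, not taken from the paper.

```python
import numpy as np
from collections import defaultdict
from itertools import permutations

def basis_coords(p, p0, p1):
    """Coordinates of p in the frame with origin p0 and x-axis p0->p1,
    scaled so |p0p1| = 1; invariant to translation, rotation and scale."""
    v = p1 - p0
    n = np.array([-v[1], v[0]])
    q = p - p0
    return np.array([q @ v, q @ n]) / (v @ v)

def quantize(c, step=0.25):
    return tuple(np.round(c / step).astype(int))

def preprocess(models, step=0.25):
    """Offline phase: index every model point under every ordered basis pair."""
    table = defaultdict(list)
    for name, pts in models.items():
        pts = np.asarray(pts, float)
        for i, j in permutations(range(len(pts)), 2):
            for k in range(len(pts)):
                if k not in (i, j):
                    key = quantize(basis_coords(pts[k], pts[i], pts[j]), step)
                    table[key].append((name, (i, j)))
    return table

def recognize(scene, table, step=0.25):
    """Online phase: pick one scene basis pair and vote for (model, basis) entries."""
    scene = np.asarray(scene, float)
    votes = defaultdict(int)
    p0, p1 = scene[0], scene[1]            # in practice, several pairs are tried
    for k in range(2, len(scene)):
        key = quantize(basis_coords(scene[k], p0, p1), step)
        for entry in table.get(key, []):
            votes[entry] += 1
    return max(votes.items(), key=lambda kv: kv[1]) if votes else None

# Toy demo: recognize a rotated, scaled, translated copy of one model.
models = {"wedge": [(0, 0), (2, 0), (2, 1), (0, 3)],
          "bar":   [(0, 0), (4, 0), (4, 1), (0, 1)]}
theta, s, t = 0.7, 2.0, np.array([5.0, -3.0])
R = np.array([[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]])
scene = [s * R @ np.array(p, float) + t for p in models["wedge"]]
print(recognize(scene, preprocess(models)))    # (('wedge', (0, 1)), 2)
```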

148 citations


Proceedings ArticleDOI
04 Dec 1990
TL;DR: A model-based recognition system was benchmarked on a series of aerial reconnaissance images to evaluate recognition performance on the task of airfield monitoring.
Abstract: A benchmark evaluation test of a model-based recognition system is discussed. The system was tested on a series of aerial reconnaissance images to evaluate recognition performance on the task of airfield monitoring. The effectiveness of the model pose constraint for recognition is discussed as well as an approach for selection of model features. The use of distance transforms for model hypothesis confirmation is also discussed.
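
A hedged sketch of how a distance transform can confirm a model hypothesis (chamfer-style scoring): every pixel stores its distance to the nearest image edge, and the projected model outline is scored by its mean distance. The edge map and hypothesis points below are toy assumptions, not the paper's data.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def chamfer_score(edge_map, model_points):
    """Mean distance from projected model outline points to the nearest image
    edge pixel; small values support the hypothesis.  edge_map is a binary
    image (True at detected edges), model_points an (N, 2) array of (row, col)."""
    dist = distance_transform_edt(~edge_map.astype(bool))   # distance to nearest edge
    rows = np.clip(np.round(model_points[:, 0]).astype(int), 0, edge_map.shape[0] - 1)
    cols = np.clip(np.round(model_points[:, 1]).astype(int), 0, edge_map.shape[1] - 1)
    return dist[rows, cols].mean()

# Toy example: a square of edge pixels and a hypothesis shifted by two rows.
edges = np.zeros((64, 64), dtype=bool)
edges[20, 20:40] = edges[40, 20:40] = edges[20:40, 20] = edges[20:41, 40] = True
hypothesis = np.array([[22, 20 + c] for c in range(20)])    # top edge, 2 rows off
print(round(chamfer_score(edges, hypothesis), 2))           # 1.8 for this offset
```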

39 citations


Proceedings ArticleDOI
04 Dec 1990
TL;DR: The authors address the problem of learning object-specific recognition strategies from object descriptions and sets of interpreted training images, to build a strategy that minimizes the expected cost of recognition, subject to accuracy constraints imposed by the user.
Abstract: The problem of automatically learning knowledge-directed control strategies is considered. In particular, the authors address the problem of learning object-specific recognition strategies from object descriptions and sets of interpreted training images. A separate recognition strategy is developed for every object in the domain. The goal of each recognition strategy is to identify any and all instances of the object in an image, and give the 3-D position (relative to the camera) of each instance. The goal of the learning process is to build a strategy that minimizes the expected cost of recognition, subject to accuracy constraints imposed by the user.

32 citations


Book ChapterDOI
01 Jan 1990
TL;DR: In the context of computer vision, the recognition of three-dimensional objects typically consists of image capture, feature extraction, and object model matching.
Abstract: In the context of computer vision, the recognition of three-dimensional objects typically consists of image capture, feature extraction, and object model matching. During the image capture phase, a camera senses the brightness at regularly spaced points, or pixels, in the image. The brightness at these points is quantized into discrete values; the two-dimensional array of quantized values forms a digital image, the input to the computer vision system. During the feature extraction phase, various algorithms are applied to the digital image to extract salient features such as lines, curves, or regions. The set of these features, represented by a data structure, is then compared to the database of object model data structures in an attempt to identify the object. Clearly, the type of features that need to be extracted from the image depends on the representation of objects in the database.

29 citations


Book ChapterDOI
23 Apr 1990
TL;DR: It is shown that correct termination procedures can reduce the exponential search to quartic, which agrees with empirical data for cluttered object recognition and implies that one must select subsets of the data likely to have come from one object, before finding a correspondence between data and model features.
Abstract: Earlier work on using constrained search to locate objects in cluttered scenes showed that the expected search is quadratic in the number of features, if all the data comes from one object, but is exponential if spurious data is included. Consequently, many methods terminate search once a “good” interpretation is found. Here, we show that correct termination procedures can reduce the exponential search to quartic. This analysis agrees with empirical data for cluttered object recognition. These results imply that one must select subsets of the data likely to have come from one object, before finding a correspondence between data and model features.

28 citations


Proceedings ArticleDOI
04 Dec 1990
TL;DR: BONSAI identifies and localizes 3-D objects in range images of one or more CAD-designed parts via constrained search of the interpretation tree, using unary and binary constraints to prune the search space.
Abstract: A description is presented of BONSAI, a model-based 3-D object recognition system, which identifies and localizes 3-D objects in range images of one or more parts which have been designed on a CAD system. Recognition is performed via constrained search of the interpretation tree, using unary and binary constraints (derived automatically from the CAD models) to prune the search space. Experiments with over 200 images of 20 different parts demonstrate that the constrained search approach to 3-D object recognition has comparable accuracy to other existing systems.
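
A simplified sketch of constrained interpretation-tree search in the spirit described above: a unary constraint checks a scene patch against a model face on its own, a binary constraint checks a pairwise relation, and inconsistent branches are pruned during depth-first search. The feature attributes (area, normal) and tolerances are illustrative, not BONSAI's actual constraints.

```python
import numpy as np

def unary_ok(scene_f, model_f, area_tol=0.2):
    """Unary constraint: the patch areas must roughly agree."""
    return abs(scene_f["area"] - model_f["area"]) <= area_tol * model_f["area"]

def binary_ok(sa, sb, ma, mb, ang_tol=0.15):
    """Binary constraint: the angle between two scene patch normals must roughly
    match the angle between the corresponding model face normals."""
    ang = lambda a, b: np.arccos(np.clip(np.dot(a["normal"], b["normal"]), -1, 1))
    return abs(ang(sa, sb) - ang(ma, mb)) <= ang_tol

def interpret(scene, model, partial=None):
    """Depth-first search of the interpretation tree, pruned by the unary and
    binary constraints; returns one consistent assignment of scene features
    to model features, or None."""
    partial = partial or []
    if len(partial) == len(scene):
        return partial
    s = scene[len(partial)]
    for m_idx, m in enumerate(model):
        if m_idx in partial or not unary_ok(s, m):
            continue
        if all(binary_ok(scene[i], s, model[p], m) for i, p in enumerate(partial)):
            result = interpret(scene, model, partial + [m_idx])
            if result is not None:
                return result
    return None

model = [{"area": 1.0, "normal": np.array([1.0, 0, 0])},
         {"area": 2.0, "normal": np.array([0, 1.0, 0])},
         {"area": 3.0, "normal": np.array([0, 0, 1.0])}]
scene = [model[2], model[0], model[1]]   # same part, patches found in another order
print(interpret(scene, model))           # [2, 0, 1]
```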

28 citations


Proceedings ArticleDOI
03 Jul 1990
TL;DR: Dynamic vision may be utilized for detecting and classifying objects that could be obstacles for a mobile robot; under fairly favorable conditions, obstacles are reliably recognized and false alarms are rejected.
Abstract: Dynamic vision may be utilized for detecting and classifying objects that could be obstacles for a mobile robot. Methods for accomplishing this are introduced. They have been implemented on a multiprocessor vision system and tested in outdoor environments. If conditions are fairly favorable, obstacles are reliably recognized and false alarms (e.g. caused by shadows) are rejected. Among the main problems which have not yet been completely solved are the tracking of the road at a great distance, the recognition of the contours of an object when many features are visible on the object's surface, and the separation of the object from the background.

Journal ArticleDOI
TL;DR: In this article, a new approach to 3D object recognition using multiple 2D camera views is proposed; the recognition system includes a turntable, a top camera, and a lateral camera.
Abstract: A new approach to 3D object recognition using multiple 2D camera views is proposed. The recognition system includes a turntable, a top camera, and a lateral camera. Objects are placed on the turntable for translation and rotation in the recognition process. 3D object recognition is accomplished by matching sequentially input 2D silhouette shape features against those of model shapes taken from a set of fixed camera views. This is made possible through the use of top-view shape centroids and principal axes for shape registration, as well as the use of a decision tree for feature comparison. The process is simple and efficient, involving no complicated 3D surface data computation and 3D object representation. The learning process can also be performed automatically. Good experimental results and fast recognition speed prove the feasibility of the proposed approach.
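
A minimal sketch of the registration step's ingredients: the centroid and principal-axis orientation of a binary top-view silhouette, computed from first- and second-order image moments (details assumed, not taken from the paper).

```python
import numpy as np

def centroid_and_principal_axis(mask):
    """Centroid and principal-axis angle of a binary silhouette, from
    first- and second-order central image moments."""
    ys, xs = np.nonzero(mask)
    cx, cy = xs.mean(), ys.mean()
    x, y = xs - cx, ys - cy
    mu20, mu02, mu11 = (x * x).mean(), (y * y).mean(), (x * y).mean()
    theta = 0.5 * np.arctan2(2 * mu11, mu20 - mu02)   # orientation of major axis
    return (cx, cy), theta

# Toy silhouette: a filled rectangle elongated along x.
mask = np.zeros((50, 80), dtype=bool)
mask[20:30, 10:70] = True
c, theta = centroid_and_principal_axis(mask)
print(np.round(c, 1), round(np.degrees(theta), 1))    # centroid (39.5, 24.5), 0.0 deg
```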

Proceedings ArticleDOI
16 Jun 1990
TL;DR: The authors describe the geometrical criteria which define viewpoint-invariant features to be extracted from 2-D line drawings of 3-D objects and discuss the extraction of these features, which forms the initial stage of a generic object recognition system, the Primal Access Recognition of Visual Objects (PARVO) system.
Abstract: The authors describe the geometrical criteria which define viewpoint-invariant features to be extracted from 2-D line drawings of 3-D objects. They also discuss the extraction of these features, which forms the initial stage of a generic object recognition system, the Primal Access Recognition of Visual Objects (PARVO) system. In this system, part-based qualitative descriptions are built and matched to coarse 3-D object models for recognition. The segmentation and labeling of the constituent parts of an object rely on the 3-D properties inferred from the presence of its 2-D features. The original motivation for PARVO its recognition by components, a theory of human image understanding from the field of psychology. Definitions of the geometrical criteria defining the viewpoint-invariant features are introduced. Examples of results obtained by applying these criteria to a typical line drawing are shown. >

Journal ArticleDOI
TL;DR: A new data-driven technique for part recognition that models an object or a scene as a composition of several components and efficiently identifies a component of the unanalyzed portion of the scene using a k-d tree-based component-index.
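
A hedged sketch of a k-d tree component index, here using scipy's KDTree over hypothetical fixed-length component descriptors; the feature choice and labels are purely illustrative.

```python
import numpy as np
from scipy.spatial import KDTree

# Hypothetical component descriptors: (area, elongation, mean curvature).
component_features = np.array([
    [4.0, 1.2, 0.1],    # "cap"
    [9.0, 3.5, 0.0],    # "shaft"
    [2.5, 1.0, 0.8],    # "knob"
])
component_labels = ["cap", "shaft", "knob"]

index = KDTree(component_features)          # the component index

# A component measured in the unanalyzed part of the scene (noisy).
query = np.array([8.6, 3.3, 0.05])
dist, i = index.query(query)
print(component_labels[i], round(dist, 2))  # -> shaft, the nearest stored component
```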

Proceedings ArticleDOI
16 Jun 1990
TL;DR: A technique for recognizing a 2-D unoccluded polygonal object by combining the alignment method with efficient string matching algorithms is presented, based on a single anchor point: the gravitation center of the contour (GCC).
Abstract: A technique for recognizing a 2-D unoccluded polygonal object by combining the alignment method with efficient string matching algorithms is presented. The approach is based on a single anchor point: the gravitation center of the contour (GCC) of the object. The GCC is stable and insensitive to digitization errors, and it can always be found and calculated very efficiently. Additionally, it is universal and does not depend on the specific library of known objects. In this approach, most of the work is done in the preprocessing stage, when all of the library objects are transformed into a GCC canonical representation form. For the recognition stage, the canonical representation of the unknown object concatenated to itself is considered as a 'text' and the canonical representation of the known library objects as a set of 'words'. The recognition problem is then reduced to finding all of the occurrences of the 'words' in the 'text', for which an efficient O(n) algorithm is introduced, where n is the order of the polygon being recognized. This approach was evaluated with several sets of objects from different fields, and very satisfactory results were obtained.
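
A toy sketch of the cyclic matching step: the unknown polygon's canonical string concatenated to itself is searched for each library 'word', so a match is found regardless of the starting vertex. The canonical symbols below are a deliberately crude stand-in for the paper's GCC representation, and a plain substring scan replaces the O(n) matcher.

```python
import math

def make_canonical(vertices):
    """Very simplified canonical string: one (edge length, turn angle) symbol
    per vertex.  The paper's GCC representation is richer; this is a stand-in."""
    n = len(vertices)
    symbols = []
    for i in range(n):
        p0, p1, p2 = vertices[i - 1], vertices[i], vertices[(i + 1) % n]
        length = math.dist(p1, p2)
        turn = (math.atan2(p2[1] - p1[1], p2[0] - p1[0])
                - math.atan2(p1[1] - p0[1], p1[0] - p0[0]))
        symbols.append((round(length, 1), round(turn % (2 * math.pi), 2)))
    return symbols

def match(unknown, library):
    """Cyclic matching: any rotation of a library 'word' appears as an ordinary
    substring of the unknown string concatenated to itself."""
    text = unknown + unknown
    return [name for name, word in library.items()
            if len(word) == len(unknown)
            and any(text[i:i + len(word)] == word for i in range(len(unknown)))]

rect = [(0, 0), (3, 0), (3, 1), (0, 1)]
library = {"rect": make_canonical(rect)}
# The same rectangle traversed from a different starting vertex still matches.
print(match(make_canonical(rect[1:] + rect[:1]), library))   # ['rect']
```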

Proceedings ArticleDOI
16 Jun 1990
TL;DR: A 3-D moment method of object identification and positioning is proposed; moment features of range data can be used in view-independent object recognition when a three-layer perceptron encodes the feature-space distribution of the object in the weights of the network.
Abstract: A 3-D moment method of object identification and positioning is proposed. Moments are computed from 3-D CAT image functions, 2.5-D range data, space curves, and discrete 3-D points. Objects are recognized by their shapes via moment invariants. Using an algebraic method, scalars and vectors are extracted from a compound of moments using Clebsch-Gordan expansion. The vectors are used to estimate position parameters of the object. Moment features of range data can be used in view-independent object recognition when the three-layer perceptron encodes the feature space distribution of the object in the weights of the network. Objects are recognized from an arbitrary viewpoint by the trained network.
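
A small illustration of the moment-invariant idea (not the paper's Clebsch-Gordan construction): the eigenvalues of the second-order central moment matrix of a 3-D point set are unchanged by rotation and translation, so they can serve as simple shape descriptors.

```python
import numpy as np

def second_order_invariants(points):
    """Sorted eigenvalues of the 3x3 central second-moment matrix of a 3-D
    point set; invariant to rotation and translation of the points."""
    pts = np.asarray(points, float)
    centered = pts - pts.mean(axis=0)
    M = centered.T @ centered / len(pts)
    return np.sort(np.linalg.eigvalsh(M))

# The invariants survive an arbitrary rigid motion of the point set.
rng = np.random.default_rng(2)
pts = rng.normal(size=(50, 3)) * [3.0, 1.0, 0.5]
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))       # random orthogonal matrix
moved = pts @ Q.T + np.array([7.0, -2.0, 4.0])
print(np.allclose(second_order_invariants(pts), second_order_invariants(moved)))  # True
```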

Proceedings ArticleDOI
01 Mar 1990
TL;DR: A fast method for detecting the presence of known multi-colored objects in a scene based on the assumption that the color histogram of an image can contain object "signatures" which are invariant over a wide range of scenes and object poses is presented.
Abstract: Fast object recognition is critical for robots in the real world. However, geometry-based object recognition methods calculate the pose of the object as part of the recognition process and hence are inherently slow. As a result, they are not suitable for tasks such as searching for an object in a room. If pose calculation is eliminated from the process and a scheme is used that simply detects the likely presence of the object in a scene, considerable efficiency can be gained. This paper contains a discussion of the requirements of any searching task and presents a fast method for detecting the presence of known multi-colored objects in a scene. The method is based on the assumption that the color histogram of an image can contain object "signatures" which are invariant over a wide range of scenes and object poses. The resulting algorithm has been easily implemented in off-the-shelf hardware and used to build a robot system which can sweep its gaze over a room searching for an object.
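
A hedged sketch of the color-signature idea in the spirit of histogram comparison; the bin count, toy images, and intersection score are assumptions rather than the paper's exact procedure.

```python
import numpy as np

def color_histogram(image_rgb, bins=8):
    """Joint RGB histogram (bins^3 cells), normalized to sum to 1."""
    pixels = image_rgb.reshape(-1, 3)
    idx = (pixels // (256 // bins)).astype(int)
    flat = idx[:, 0] * bins * bins + idx[:, 1] * bins + idx[:, 2]
    hist = np.bincount(flat, minlength=bins ** 3).astype(float)
    return hist / hist.sum()

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; larger means the color distributions overlap more."""
    return np.minimum(h1, h2).sum()

# Toy images: a mostly-red "object" and two scenes, one containing it.
rng = np.random.default_rng(1)
red_object = np.tile([200, 30, 30], (32, 32, 1)).astype(np.uint8)
scene_a = rng.integers(0, 256, (32, 32, 3), dtype=np.uint8)
scene_a[8:24, 8:24] = red_object[8:24, 8:24]                   # object present
scene_b = rng.integers(0, 256, (32, 32, 3), dtype=np.uint8)    # object absent

model = color_histogram(red_object)
print(round(histogram_intersection(model, color_histogram(scene_a)), 2))  # higher score
print(round(histogram_intersection(model, color_histogram(scene_b)), 2))  # near zero
```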

Proceedings ArticleDOI
04 Dec 1990
TL;DR: A vision system is presented which automatically generates an object recognition strategy from a 3D model and recognizes the object using this strategy, comparing the line representation generated from the 3D model with the image features to localize the object.
Abstract: A vision system is presented which automatically generates an object recognition strategy from a 3D model, and recognizes the object using this strategy. In this system, the appearances of an object from various viewpoints are described with visible 2D features, such as parallel lines and ellipses. Then, the features in the appearances are ranked according to the number of viewpoints from which they are visible. The rank and the feature extraction cost for each feature are considered to generate a tree-like strategy graph. It shows an efficient feature search order when the viewpoint is unknown, starting with commonly occurring features and ending with features specific to a certain viewpoint. The system searches for features in the order indicated by the graph. After detection, the system compares the line representation generated from the 3D model and the image features to localize the object.
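
A toy sketch of the ranking idea: order candidate features by how many viewpoints they are visible from, weighed against extraction cost, so the recognizer looks for common, cheap features first. The feature names, counts, costs, and the simple ratio score are illustrative; the paper builds a tree-like strategy graph rather than a flat ordering.

```python
# Hypothetical features: visible-viewpoint count (out of, say, 24 tessellated
# views) and an estimated extraction cost in milliseconds.
features = {
    "pair of long parallel lines": {"views_visible": 20, "cost_ms": 5},
    "large ellipse":               {"views_visible": 14, "cost_ms": 12},
    "small ellipse":               {"views_visible": 6,  "cost_ms": 12},
    "hexagonal outline":           {"views_visible": 3,  "cost_ms": 25},
}

def search_order(features):
    """Order features from commonly-visible-and-cheap to rare-and-expensive,
    giving the sequence in which the recognizer should look for them."""
    score = lambda f: features[f]["views_visible"] / features[f]["cost_ms"]
    return sorted(features, key=score, reverse=True)

print(search_order(features))
```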

Proceedings ArticleDOI
04 Dec 1990
TL;DR: A model-based technique is presented for recognizing 3-D objects using a novel object representation and a novel generic matching engine; the approach is at the heart of an automatic target recognition system built by the authors.
Abstract: A model-based technique is presented for recognizing 3-D objects using a novel object representation and a novel generic matching engine. Each object to be recognized is defined by an appearance model (AM) describing the expected appearances of the object over a range of aspects in a specific type of imagery. The AMs for all the objects to be recognized are organized in an AM hierarchy (AMH) defining object classes. This AMH, together with an image event extracted from the input imagery and primitives obtained by decomposing the event, constitute the input to the matching engine. This engine determines which object in the AMH most likely corresponds to the event and its primitives. The approach described is at the heart of an automatic target recognition system built by the authors.

Proceedings ArticleDOI
13 May 1990
TL;DR: A model-based vision system is proposed in which a commercial CAD system has been used for object modeling and certain features of the model are extracted, while others are precalculated and stored.
Abstract: A model-based vision system is proposed in which a commercial CAD system has been used for object modeling. Assuming that the model is known, the corresponding object in the scene is located. Given the CAD model of an object, certain features of the model are extracted, while others are precalculated and stored. The given dense 3-D range image is segmented into a set of homogeneous surface patches using a segmentation procedure. Properties such as curvature, surface normal, and surface area are approximated for each surface patch. For each extracted surface patch, three filters are applied to the previously obtained model features to find the best match. Then, a global consistency filter is applied to remove ambiguities and to find the best matched model.

Book ChapterDOI
01 Apr 1990
TL;DR: A new algorithm is presented for recognising 3D polyhedral objects in a 2D segmented image using local geometric constraints between 2D line segments that is potentially very fast, and contributes to its robustness.
Abstract: A new algorithm is presented for recognising 3D polyhedral objects in a 2D segmented image using local geometric constraints between 2D line segments. Results demonstrate the success of the algorithm at coping with poorly segmented images that would cause substantial problems for many current algorithms. The algorithm adapts to use with either 3D line data or 2D polygonal objects; either case increases its efficiency. The conventional approach of searching an interpretation tree and pruning it using local constraints is discarded; the new approach accumulates the information available from the local constraints and forms match hypotheses subject to two global constraints that are enforced using the competitive paradigm. All stages of processing consist of many extremely simple and intrinsically parallel operations. This parallelism means that the algorithm is potentially very fast, and contributes to its robustness. It also means that the computation can be guaranteed to complete after a known time.

Proceedings ArticleDOI
16 Jun 1990
TL;DR: A novel approach to 3-D object recognition based on fuzzy subset theory using Zernike moment invariants of the silhouette of the unknown object to form a set of fuzzy-weighted quantities called fuzzy quaternions, which are matched against those of known objects at predetermined viewpoints.
Abstract: A novel approach to 3-D object recognition based on fuzzy subset theory is described. This method uses Zernike moment invariants of the silhouette of the unknown object to form a set of fuzzy-weighted quantities called fuzzy quaternions. These are matched against those of known objects at predetermined viewpoints. The determination of the Zernike moment invariants can be faster if the equivalent contour integrals are calculated instead. By employing a novel ρ-correction scheme, errors due to the digitization are reduced. To speed up the recognition process, a modified Nelder-Mead simplex method is used. Preliminary results demonstrate the potential of the fuzzy quaternion as a viable basis for discrimination. It is concluded that the primary merits of this approach are the ease of model formation, the simplicity of the recognition scheme, and the speed of object recognition. Its disadvantages include the inability to recognize occluded objects and a poor object recognition rate for high perspective distortion of objects.

Proceedings ArticleDOI
06 Nov 1990
TL;DR: It is shown that HONNs are superior to ID3 with respect to recognition accuracy, whereas, on a sequential machine, ID3 classifies examples faster once trained.
Abstract: The authors present results of experiments comparing the performance of the ID3 symbolic learning algorithm with a higher-order neural network (HONN) in the distortion invariant object recognition domain. In this domain, the classification algorithm needs to be able to distinguish between two objects regardless of their position in the input field, their in-plane rotation, or their scale. It is shown that HONNs are superior to ID3 with respect to recognition accuracy, whereas, on a sequential machine, ID3 classifies examples faster once trained. A further advantage of HONNs is the small training set required. HONNs can be trained on just one view of each object, whereas ID3 needs an exhaustive training set.

Proceedings ArticleDOI
13 May 1990
TL;DR: A method is presented for using the high-level descriptions of objects (i.e. their models) to recognize them in an image, including partially occluded and camouflaged objects.
Abstract: A method is presented for using the high-level descriptions of objects (i.e. their models) to recognize them in an image. A complex object is viewed as a congregation of a set of component parts with simple shapes. The model of an object, therefore, describes the shapes of its component parts and states the geometrical relationships among those parts. This method also includes a recognition strategy which is a simple high-level description of how that object must be recognized. The shape descriptions of the parts are first used to extract a set of candidates for each part from the image. An object candidate is formed whenever a group of part candidates satisfy the model's geometrical relationships. A model-based prediction and verification scheme is used to verify (or refute) the existence of the object candidates with low certainty. The scheme not only substantially increases the accuracy of recognition, but also makes it possible to detect and recognize partially occluded and camouflaged objects. Another advantage of the approach is that to recognize a new object, one only needs to define its model, and thus no programming is required. The user's task is further simplified by the fact that each newly defined model is sufficient for recognizing a new category of objects.

Proceedings ArticleDOI
04 Dec 1990
TL;DR: A novel approach is presented for pruning the amount of search needed to match image features to object models, and the authors show a dramatic reduction in search provided by activation nets.
Abstract: A novel approach is presented for pruning the amount of search needed to match image features to object models. The technique relies on active networks which capture various visibility and geometric constraints between features of a model to prune these features from search space during matching. The networks, which can be efficiently implemented in Boolean logic, integrate harmoniously with the previous work in feature recognition and object matching. A method is proposed for clustering model features (vsets) and four types of constraints which assist in building the networks. The authors show, both analytically and empirically, the dramatic reduction in search provided by activation nets.

Proceedings ArticleDOI
27 Nov 1990
TL;DR: A vision system for color object recognition is presented that is low in cost and sufficiently fast and reliable for simple industrial applications; recognition is based on closest-path, thresholding, and mathematical modeling algorithms.
Abstract: A vision system for color object recognition is presented. The proposed system is low in cost and sufficiently fast and reliable for simple industrial applications. The system comprises a fiber-optic illumination system, lenses, camera, frame grabber and IBM PC/AT microcomputer with a color monitor. The recognition algorithms are based on RGB and HSI modes. The approach and development of these algorithms are elaborated. The accuracy, reliability, and speed of recognition have been very good. The recognition is based on closest path, thresholding, and mathematical modeling algorithms.
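
For reference, one common RGB-to-HSI conversion (the paper does not specify its exact variant), which underlies the HSI-mode classification mentioned above.

```python
import math

def rgb_to_hsi(r, g, b):
    """Convert r, g, b in [0, 1] to (hue in degrees, saturation, intensity)
    using one common HSI formulation."""
    intensity = (r + g + b) / 3.0
    saturation = 0.0 if intensity == 0 else 1.0 - min(r, g, b) / intensity
    num = 0.5 * ((r - g) + (r - b))
    den = math.sqrt((r - g) ** 2 + (r - b) * (g - b))
    if den == 0:
        hue = 0.0                           # achromatic: hue undefined
    else:
        hue = math.degrees(math.acos(max(-1.0, min(1.0, num / den))))
        if b > g:
            hue = 360.0 - hue
    return hue, saturation, intensity

print(rgb_to_hsi(1.0, 0.0, 0.0))   # pure red  -> (0.0, 1.0, ~0.33)
print(rgb_to_hsi(0.2, 0.6, 0.2))   # greenish  -> hue of 120 degrees
```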

Proceedings ArticleDOI
Y. Okamoto, Yoshinori Kuno, S. Okada
27 Nov 1990
TL;DR: A vision system is presented that automatically generates an object recognition strategy from a 3D model and recognizes the object by this strategy; the strategy specifies an efficient feature search order when the viewing direction is unknown.
Abstract: A vision system that automatically generates an object recognition strategy from a 3D model and recognizes the object by this strategy is presented. In this system, the appearances of an object from various view directions are described with 2D features, such as parallel lines and ellipses. These appearances are then ranked, and a tree-like strategy graph is generated. It shows an efficient feature search order when the viewer direction is unknown. The object is recognized by feature detection guided by the strategy. After the features are detected, the system compares the line representation generated from a 3D model and the image features to localize the object. Perspective projection is used in the localization process to obtain the precise position and attitude of the object, while orthographic projection is used in the strategy generation process to allow symbolic manipulation.

Proceedings ArticleDOI
01 Feb 1990
TL;DR: This paper presents a method of recognizing planar objects in 3-D space from a single image by representing each object by its dominant points, and introduces a measure, known as sphericity, derived from an affine transform to indicate the quality of match among dominant points.
Abstract: Object recognition is a major theme in computer vision. In this paper, we present a method of recognizing planar objects in 3-D space from a single image. Objects in a scene may be occluded, and the orientation of the objects is arbitrary. We represent each object by its dominant points, and pose the recognition problem as a dominant-point matching problem. We introduce a measure, known as sphericity, derived from an affine transform to indicate the quality of match among dominant points. A clustering algorithm, probe-and-block, is used to guide the matching. We use a least squares fit among dominant points to estimate object location in the scene. A heuristic measure is finally computed to verify the match.
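
A brief sketch of the least-squares step: with dominant-point correspondences in hand, the 2-D affine transform locating the object can be estimated by linear least squares. The sphericity measure and probe-and-block clustering are specific to the paper and not reproduced here; all names below are illustrative.

```python
import numpy as np

def fit_affine(model_pts, scene_pts):
    """Least-squares 2-D affine transform (A, t) with scene ≈ A @ model + t,
    estimated from matched dominant points (N >= 3, shape (N, 2) each)."""
    model_pts = np.asarray(model_pts, float)
    scene_pts = np.asarray(scene_pts, float)
    X = np.hstack([model_pts, np.ones((len(model_pts), 1))])       # N x 3 design matrix
    params, *_ = np.linalg.lstsq(X, scene_pts, rcond=None)          # 3 x 2 solution
    return params[:2].T, params[2]                                  # A, t

# Toy check: recover a known affine map from noisy correspondences.
rng = np.random.default_rng(3)
A_true = np.array([[1.1, -0.4], [0.3, 0.9]])
t_true = np.array([10.0, -5.0])
model = rng.uniform(0, 100, (8, 2))
scene = model @ A_true.T + t_true + rng.normal(0, 0.2, (8, 2))
A_est, t_est = fit_affine(model, scene)
print(np.round(A_est, 2))     # close to A_true
print(np.round(t_est, 1))     # close to t_true
```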

Proceedings ArticleDOI
17 Jun 1990
TL;DR: An ART-1 neural network is applied to the problem of 3-D object recognition using a multiple-view modeling scheme and is used at the coarse search stage to reduce the search space in the model database.
Abstract: An ART-1 neural network is applied to the problem of 3-D object recognition using a multiple-view modeling scheme. In this scheme, the 3-D object model database consists of sets of features extracted from 2-D projections rendered by a number of predetermined viewpoints on a view sphere enclosing the object. To recognize the object, a coarse-to-fine search strategy is adopted. ART-1 is used at the coarse search stage to reduce the search space in the model database. Experiments carried out to corroborate the proposed scheme are discussed.

Proceedings ArticleDOI
01 Oct 1990
TL;DR: The authors address the problem of industrial scene object recognition for the purposes of sensor-based robot assembly with a technique in which the representation of the models is performed by using a finite set of primitives and relations between them, characterized by a proper set of parameters.
Abstract: The authors address the problem of industrial scene object recognition for the purposes of sensor-based robot assembly. They propose a technique in which the representation of the models (training phase) is performed by using a finite set of primitives and relations between them (object elements), characterized by a proper set of parameters. They describe a new index building mechanism, using both the recognized object primitives and the relations between the primitives. They obtain the characteristic set of primitives and relations for each model of a given industrial object set by eliminating the common (similar) object primitives and relations. The characteristic set is referred to as the global index. The real scene object analysis (recognition phase) includes the recognition of scene primitives and relations between them, as well as their location in the global index. The proposed new index organization gives direct access to the particular object elements. significantly speeding up the hypothesis generation for the recognized scene object. The generated hypothesis is verified using the whole object representation obtained during the training phase. >