
Showing papers on "3D single-object recognition published in 1988"


Proceedings ArticleDOI
05 Dec 1988
TL;DR: A general method for model-based object recognition in occluded scenes, based on geometric hashing, is presented; the method stands out for its efficiency and is applied to recognition problems in both 3-D and 2-D.
Abstract: A general method for model-based object recognition in occluded scenes is presented. It is based on geometric hashing. The method stands out for its efficiency. We describe the general framework of the method and illustrate its applications for various recognition problems both in 3-D and 2-D. Special attention is given to the recognition of 3-D objects in occluded scenes from 2-D gray scale images. New experimental results are included for this important case.
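As a rough illustration of the geometric-hashing idea (a sketch under our own assumptions, not the authors' implementation), the following Python code handles 2-D point features under similarity transformations; the two-point basis, the quantization, and all function names are assumptions:

```python
from collections import defaultdict
from itertools import permutations
import numpy as np

def basis_coords(p, b0, b1):
    """Coordinates of p in the similarity-invariant frame of basis (b0, b1):
    origin at the midpoint, unit x-axis along b1 - b0."""
    origin = (b0 + b1) / 2.0
    vx = b1 - b0
    scale = np.linalg.norm(vx)
    vx = vx / scale
    vy = np.array([-vx[1], vx[0]])            # perpendicular axis
    d = (p - origin) / scale
    # quantize so nearly-equal invariants hash to the same bucket
    return (round(float(d @ vx), 2), round(float(d @ vy), 2))

def build_table(model_pts):
    """Preprocessing: hash every point's invariant coordinates
    under every ordered basis pair of model points."""
    table = defaultdict(list)
    for i, j in permutations(range(len(model_pts)), 2):
        for k, p in enumerate(model_pts):
            if k not in (i, j):
                table[basis_coords(p, model_pts[i], model_pts[j])].append((i, j))
    return table

def recognize(table, scene_pts):
    """Recognition: each scene basis pair votes for compatible model bases;
    a strong vote peak pairs a model basis with a scene basis."""
    votes = defaultdict(int)
    for i, j in permutations(range(len(scene_pts)), 2):
        for k, p in enumerate(scene_pts):
            if k in (i, j):
                continue
            for mb in table.get(basis_coords(p, scene_pts[i], scene_pts[j]), ()):
                votes[(mb, (i, j))] += 1
    return max(votes.items(), key=lambda kv: kv[1]) if votes else None
```

Because votes come from individual feature matches, scene points belonging to other, occluding objects simply fail to vote, which is the intuition behind the method's robustness to occlusion.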

933 citations


Journal ArticleDOI
01 Aug 1988
TL;DR: Issues and techniques are discussed to automatically compile object and sensor models into a visual recognition strategy for recognizing and locating an object in three-dimensional space from visual data.
Abstract: Issues and techniques are discussed to automatically compile object and sensor models into a visual recognition strategy for recognizing and locating an object in three-dimensional space from visual data. Automatic generation of recognition programs by compilation, in an attempt to automate this process, is described. An object model describes geometric and photometric properties of an object to be recognized. A sensor model specifies the sensor characteristics in predicting object appearances and variations of feature values. It is emphasized that the sensors, as well as objects, must be explicitly modeled to achieve the goal of automatic generation of reliable and efficient recognition programs. Actual creation of interpretation trees for two objects and their execution for recognition from a bin of parts are demonstrated.

203 citations


Proceedings ArticleDOI
05 Jun 1988
TL;DR: The author starts from classical theories for differential and algebraic invariants not previously used in image understanding, and studies general projective transformations, which include both perspective and orthographic projections as special cases.
Abstract: A major goal of computer vision is object recognition, which involves matching of images of an object, obtained from different, unknown points of view. Since there are infinitely many points of view, one is faced with the problem of a search in a multidimensional parameter space. A related problem is the stereo reconstruction of 3-D surfaces from multiple 2-D images. The author proposes to solve these fundamental problems by using geometrical properties of the visible shape that are invariant to a change in the point of view. To obtain such invariants, he starts from classical theories for differential and algebraic invariants not previously used in image understanding. As they stand, these theories are not directly applicable to vision. He suggests extensions and adaptations of these methods to the needs of machine vision. He then studies general projective transformations, which include both perspective and orthographic projections as special cases.

182 citations


Journal ArticleDOI
01 Aug 1988
TL;DR: The author provides a general introduction to computer vision by focusing on two-dimensional object recognition, i.e. recognition of an object whose spatial orientation, relative to the viewing direction, is known.
Abstract: The author provides a general introduction to computer vision. He discusses basic techniques and computer implementations, and also indicates areas in which further research is needed. He focuses on two-dimensional object recognition, i.e. recognition of an object whose spatial orientation, relative to the viewing direction, is known.

106 citations


Proceedings ArticleDOI
05 Jun 1988
TL;DR: The approach presented is to develop an object shape representation that incorporates a component subpart hierarchy, to allow for efficient and correct indexing into an automatically generated model library as well as for relative parametrization among subparts, and a scale hierarchy, to allow for a general-to-specific recognition procedure.
Abstract: A description is given of the development of a model-based vision system that utilizes hierarchies of both object structure and object scale. The focus of the research is to use these hierarchies to achieve robust recognition based on effective organization and indexing schemes for model libraries. The goal of the system is to recognize parameterized instances of nonrigid model objects contained in a large knowledge base, despite the presence of noise and occlusion. The approach presented is to develop an object shape representation that incorporates a component subpart hierarchy, to allow for efficient and correct indexing into an automatically generated model library as well as for relative parametrization among subparts, and a scale hierarchy, to allow for a general-to-specific recognition procedure. The implemented system uses a representation based on significant contour curvature changes and a recognition engine based on geometric constraints of feature properties. Examples of the system's performance are given, followed by an analysis of the results.

104 citations


Proceedings ArticleDOI
01 Feb 1988
TL;DR: In this paper, the authors established formal bounds on the combinatorics of this approach and showed that the expected complexity of recognizing isolated objects is quadratic in the number of model and sensory fragments, but exponential in the size of the correct interpretation.
Abstract: The problem of recognizing rigid objects from noisy sensory data has been successfully attacked in previous work by using a constrained search approach. Empirical investigations have shown the method to be very effective when recognizing and localizing isolated objects, but less effective when dealing with occluded objects where much of the sensory data arises from objects other than the one of interest. When clustering techniques such as the Hough transform are used to isolate likely subspaces of the search space, empirical performance in cluttered scenes improves considerably. In this note, we establish formal bounds on the combinatorics of this approach. Under some simple assumptions, we show that the expected complexity of recognizing isolated objects is quadratic in the number of model and sensory fragments, but that the expected complexity of recognizing objects in cluttered environments is exponential in the size of the correct interpretation. We also provide formal bounds on the efficacy of using the Hough transform to preselect likely subspaces, showing that the problem remains exponential, but that in practical terms the size of the problem is significantly decreased.

58 citations


Proceedings ArticleDOI
05 Dec 1988
TL;DR: This paper concentrates on sensor modeling and its relationship with strategy generation, because it is regarded as the bottleneck to automatic generation of object recognition programs.
Abstract: One of the most important and systematic methods to build model-based vision systems is to generate object recognition programs automatically from given geometric models. Automatic generation of object recognition programs requires several key components to be developed: object models to describe the geometric and photometric properties of an object to be recognized, sensor models to predict object appearances from the object model under a given sensor, strategy generation using the predicted appearances to produce a recognition strategy, and program generation converting the recognition strategy into executable code. This paper concentrates on sensor modeling and its relationship with strategy generation, because we regard it as the bottleneck to automatic generation of object recognition programs. We consider two aspects of sensor characteristics: sensor detectability and sensor reliability. Sensor detectability specifies what kinds of features can be detected and in what condition the features are detected; sensor reliability is a confidence for the detected features. We define the configuration space to represent sensor characteristics. We propose a representation method for sensor detectability and reliability in the configuration space. Finally, we investigate how to use the proposed sensor model in automatic generation of an object recognition program.

34 citations


01 Jan 1988
TL;DR: One major emphasis of this paper is that sensors, as well as objects, must be explicitly modeled in order to achieve the goal of automatic generation of reliable and efficient recognition programs.
Abstract: This paper discusses issues and techniques to automatically compile object and sensor models into a visual recognition strategy for recognizing and locating an object in three-dimensional space from visual data. Historically, and even today, most successful model-based vision programs are handwritten; relevant knowledge of objects for recognition is extracted from examples of the object, tailored for the particular environment, and coded into the program by the implementors. If this is done properly, the resulting program is effective and efficient, but it requires long development time and many vision experts. Automatic generation of recognition programs by compilation attempts to automate this process. In particular, it extracts from the object and sensor models those features that are useful for recognition, and the control sequence which must be applied to deal with possible variations of the object appearances. The key components in automatic generation are: object modeling, sensor modeling, prediction of appearances, strategy generation, and program generation. An object model describes geometric and photometric properties of an object to be recognized. A sensor model specifies the sensor characteristics in predicting object appearances and variations of feature values. The appearances can be systematically grouped into aspects, where aspects are topologically equivalent classes with respect to the object features "visible" to the sensor. Once aspects are obtained, a recognition strategy is generated in the form of an interpretation tree from the aspects and their predicted feature values. An interpretation tree consists of two parts: a part which classifies an unknown region into one of the aspects, and a part which determines its precise attitude (position and orientation) within the classified aspect. Finally, the strategy is converted into an executable program by using object-oriented programming. 
One major emphasis of this paper is that sensors, as well as objects, must be explicitly modeled in order to achieve the goal of automatic generation of reliable and efficient recognition programs. Actual creation of interpretation trees for two toy objects and their execution for recognition from a bin of parts are demonstrated.

34 citations


Proceedings ArticleDOI
05 Dec 1988
TL;DR: An approach for using the perspective projection aspect graph representation to alleviate the problems encountered by previous researchers is outlined, a particular implementation of this general approach is described, and data are presented to illustrate its effectiveness.
Abstract: Several researchers have previously described approaches to 3-D object recognition which use nonlinear optimization to control the matching of features of a 3-D object model to features found in an image. Recognition, in this context, includes estimating the parameters of translation and orientation of the object. A major problem acknowledged by previous researchers is how to efficiently choose a set of starting parameter estimates which will avoid recognition errors due to local minima. The unique contribution of this paper is that it outlines an approach for using the perspective projection aspect graph representation to alleviate the problems encountered by previous researchers, describes a particular implementation of this general approach, and presents data to illustrate the effectiveness of the approach.

33 citations


Proceedings ArticleDOI
14 Nov 1988
TL;DR: After a brief summary of range acquisition techniques, some of the progress made with three-dimensional data in the field of computer vision is examined.
Abstract: After a brief summary of range acquisition techniques, some of the progress made with three-dimensional data in the field of computer vision is examined. The present state of three-dimensional object recognition using range data is surveyed. Some work involving three-dimensional object recognition using intensity images is also included when it is applicable to, or extendable to, recognition with range data.

33 citations


Proceedings ArticleDOI
G.B. Dunn1, J. Segen1
24 Apr 1988
TL;DR: A method that enables a robot system to learn how to grasp an object is presented; it makes fewer assumptions, and requires less prior information about objects, than nonlearning grasp-determination methods.
Abstract: A method that enables a robot system to learn how to grasp an object is presented. The method combines an automatic grasp discovery process with a visual recognition technique. When an object is seen for the first time, the system experiments with it, seeking a way of grasping and lifting the object by trial and error, using visual information and input from the robot gripper. A discovered grasp configuration is saved along with the object's shape. When the same object is presented again in a different position and orientation, and the system recognizes its shape, the grasp information is retrieved and transformed to match the position and orientation of the object, so it can be picked up on the first trial. The approach makes fewer assumptions, and requires less prior information about objects, than nonlearning grasp-determination methods. The presented method was implemented in a system with a robot and a servo-controlled two-finger gripper. Several examples of its operation are reported.

Proceedings ArticleDOI
05 Jun 1988
TL;DR: An approach to 2D model-based object recognition is developed, suitable for implementation on a highly parallel SIMD (single-instruction, multiple data stream) computer, and is extremely robust in the presence of occlusion.
Abstract: An approach to 2D model-based object recognition is developed, suitable for implementation on a highly parallel SIMD (single-instruction, multiple data stream) computer. Object models and image data are represented as contour features. Transformation sampling is used to determine the optimal model-feature-to-image-feature transformation by sampling the space of possible transformations. Only a small part of this space need actually be sampled due to the constraints placed on transformations by individual matches of image features to model features. The procedure requires O(Kmn) processors and O(log^2(Kmn)) time, where m is the number of model features, n is the number of image features, and K depends on the size of the image. The procedure works well and is extremely robust in the presence of occlusion. An implementation of the procedure on the Connection Machine is described, and some experimental results are given.
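The transformation-sampling idea can be sketched in sequential Python (the paper's implementation is SIMD-parallel; the feature format, bin sizes, and names below are our assumptions): each model-feature-to-image-feature match determines one rigid transform, and only those constrained transforms are sampled and voted on.

```python
from collections import defaultdict
import numpy as np

def vote_transforms(model_feats, image_feats, rot_bin=0.1, t_bin=1.0):
    """Each feature is (x, y, theta): a contour point with its tangent
    orientation.  Every model/image match determines one rotation +
    translation; quantized transforms accumulate votes, and the
    best-supported bin is returned as (rot, tx, ty, votes)."""
    votes = defaultdict(int)
    for mx, my, mth in model_feats:
        for sx, sy, sth in image_feats:
            rot = (sth - mth) % (2.0 * np.pi)
            c, s = np.cos(rot), np.sin(rot)
            # translation that carries the rotated model point onto the image point
            tx = sx - (c * mx - s * my)
            ty = sy - (s * mx + c * my)
            key = (round(rot / rot_bin), round(tx / t_bin), round(ty / t_bin))
            votes[key] += 1
    (rk, xk, yk), n = max(votes.items(), key=lambda kv: kv[1])
    return rk * rot_bin, xk * t_bin, yk * t_bin, n
```

Occluded features simply contribute no votes to the correct bin, so a clear peak can survive substantial occlusion; only transforms actually induced by matches are ever examined.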

Journal ArticleDOI
TL;DR: An application of possibility theory to object recognition is presented: an object-oriented knowledge representation and fuzzy pattern-matching procedures derived from possibility theory, embodied in a system called Classic.

Proceedings ArticleDOI
05 Dec 1988
TL;DR: The canonical form of a shape is introduced, along with a linear algorithm for transforming a shape to its canonical form; this approach reduces similarity recognition under affine transformation from a search problem in a six-dimensional space to a one-dimensional search problem.
Abstract: This paper presents new techniques for recognition of similarity between shapes under affine transformation. We introduce the canonical form of a shape and give a linear algorithm for transforming a shape to its canonical form. The recognition of similarity between shapes under affine transformation can be done by computing the difference between their canonical forms. Our approach reduces the similarity recognition under affine transformation from a search problem in six-dimensional space into a one-dimensional search problem. Furthermore, we apply these techniques to recognition of objects under similarity transformation and to the problem of point matching. We obtain a linear algorithm for object recognition under similarity transformation and a linear average-time algorithm for point matching under affine transformation. Some experimental results are reported.
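The paper's exact canonical form is not reproduced here, but a standard moment-whitening construction (our assumption, not the authors' algorithm) shows the flavor of the dimensionality reduction: after normalizing the centroid and second moments, two affinely related shapes differ only by an orthogonal transform, leaving a one-dimensional rotation search.

```python
import numpy as np

def canonical_form(pts):
    """Translate the centroid to the origin and whiten the second-moment
    matrix to the identity.  Shapes related by a nonsingular affine map
    then agree up to an orthogonal transform (one rotation parameter)."""
    pts = np.asarray(pts, dtype=float)
    centered = pts - pts.mean(axis=0)
    cov = centered.T @ centered / len(pts)
    vals, vecs = np.linalg.eigh(cov)
    whiten = vecs @ np.diag(vals ** -0.5) @ vecs.T   # symmetric cov^(-1/2)
    return centered @ whiten.T
```

Since orthogonal maps preserve pairwise distances, the canonical forms of two affinely related shapes have identical distance multisets, which gives a quick rotation-free consistency check.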

Proceedings ArticleDOI
05 Dec 1988
TL;DR: It is proved that combinations of three co-planar points and lines can determine the alignment transformation uniquely or almost uniquely.
Abstract: The ability to determine the geometrical transformation which aligns a 3D object with its 2D image plays an important role in recognition and other visual tasks. We suggest that, since the possible transformations are usually constrained, they can be determined by using a small number of features. We prove that combinations of three co-planar points and lines can determine the alignment transformation uniquely or almost uniquely.

Proceedings ArticleDOI
14 Nov 1988
TL;DR: First, feature-point correspondence is extracted between an input pattern and a reference pattern by elastic matching and a deformation vector field (DVF) is generated.
Abstract: The problem of extraction, description, and estimation of handwriting deformation in online character recognition is discussed. First, feature-point correspondence is extracted between an input pattern and a reference pattern by elastic matching and a deformation vector field (DVF) is generated. Second, the DVF is expanded into an infinite series by applying iterative local affine transformations. Finally, the interpattern distance is calculated between the input pattern and the reference pattern superposed by low-order components of local affine transformations. Recognition tests made on cursive kanji character data have revealed the high discrimination ability of this method.
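A minimal dynamic-programming sketch of elastic matching between two point sequences (a toy stand-in for the paper's method; the cost function and backtracking rule are our assumptions) that also emits per-point deformation vectors, the raw material of a deformation vector field:

```python
import numpy as np

def elastic_match(inp, ref):
    """Align two 2-D point sequences by dynamic programming and return
    (index pairs, deformation vectors input_point - reference_point)."""
    inp, ref = np.asarray(inp, float), np.asarray(ref, float)
    n, m = len(inp), len(ref)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(inp[i - 1] - ref[j - 1])
            cost[i, j] = d + min(cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1])
    pairs, i, j = [], n, m
    while i > 0 and j > 0:                      # backtrack the cheapest path
        pairs.append((i - 1, j - 1))
        step = int(np.argmin([cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1]]))
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    vectors = [inp[a] - ref[b] for a, b in pairs]
    return pairs, vectors
```

In the paper's pipeline the vectors at matched reference points would then be analyzed further (e.g. by local affine fits) rather than used directly.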

Journal ArticleDOI
TL;DR: The maximum a posteriori (MAP) estimation concept is applied to the problem of object recognition with several distributed sensors and it is shown that in binary object recognition the MAP object recognition also minimizes the mean-square error.
Abstract: The maximum a posteriori (MAP) estimation concept is applied to the problem of object recognition with several distributed sensors. It is shown that in binary object recognition the MAP object recognition also minimizes the mean-square error. Simulation results show that the performance of the MAP object recognition is, in general, at least as good as the best performance by the sensors used.
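For the binary case, the MAP fusion rule reduces to comparing posteriors built from independent sensor likelihoods. The sensor model and parameter names below are illustrative, not taken from the paper:

```python
import numpy as np

def map_decide(prior_h1, sensor_reports, p_detect, p_false):
    """MAP decision for binary recognition from independent sensor
    reports (1 = 'object seen').  p_detect[i] = P(report 1 | object),
    p_false[i] = P(report 1 | no object); all names are illustrative."""
    log_post1 = np.log(prior_h1)        # hypothesis H1: object present
    log_post0 = np.log(1.0 - prior_h1)  # hypothesis H0: object absent
    for r, pd, pf in zip(sensor_reports, p_detect, p_false):
        log_post1 += np.log(pd if r else 1.0 - pd)
        log_post0 += np.log(pf if r else 1.0 - pf)
    return 1 if log_post1 > log_post0 else 0
```

Because the decision weights each sensor by its own reliability, a poor sensor cannot outvote a good one, which matches the paper's observation that fused performance is at least as good as the best single sensor.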

Proceedings ArticleDOI
22 Aug 1988
TL;DR: The first version of the recognition program has been written and applied to the recognition of a jet airplane in synthetic aperture radar (SAR) images and has used a SAR simulator as a sensor model, so that it can predict those object features which are reliably detectable by the sensors.
Abstract: This paper presents a model-based object recognition method which combines a bottom-up evidence accumulation process and a top-down hypothesis verification process. The hypothesize-and-test paradigm is fundamental in model-based vision. However, research issues remain on how the bottom-up process gathers pieces of evidence and when the top-down process should take the lead. To accumulate pieces of evidence, we use a configuration space whose points represent a configuration of an object (i.e., position and orientation of an object in an image). If a feature is found which matches a part of an object model, the configuration space is updated to reflect the possible configurations of the object. A region in the configuration space where multiple pieces of evidence from such feature-part matches overlap suggests a high probability that the object exists in an image with a configuration in that region. The cost of the bottom-up process to further accumulate evidence for localization, and that of the top-down process to recognize the object by verification, are compared by considering the size of the search region and the probability of success of verification. If the cost of the top-down process becomes lower, hypotheses are generated and their verification processes are started. The first version of the recognition program has been written and applied to the recognition of a jet airplane in synthetic aperture radar (SAR) images. In creating a model of an object, we have used a SAR simulator as a sensor model, so that we can predict those object features which are reliably detectable by the sensors. The program is being tested with simulated SAR images, and shows promising performance.

01 Jan 1988
TL;DR: Range image interpretation with a view of obtaining low-level features to guide mid-level and high-level segmentation and recognition processes is described, and various applications of surface curvatures in mid and high level recognition processes are discussed.
Abstract: Three dimensional scene analysis in an unconstrained and uncontrolled environment is the ultimate goal of computer vision. Explicit depth information about the scene is of tremendous help in segmentation and recognition of objects. Range image interpretation with a view of obtaining low-level features to guide mid-level and high-level segmentation and recognition processes is described. No assumptions about the scene are made and algorithms are applicable to any general single viewpoint range image. Low-level features like step edges and surface characteristics are extracted from the images and segmentation is performed based on individual features as well as combination of features. A high level recognition process based on superquadric fitting is described to demonstrate the usefulness of initial segmentation based on edges. A classification algorithm based on surface curvatures is used to obtain initial segmentation of the scene. Objects segmented using edge information are then classified using surface curvatures. Various applications of surface curvatures in mid and high level recognition processes are discussed. These include surface reconstruction, segmentation into convex patches and detection of smooth edges. Algorithms are run on real range images and results are discussed in detail.

ReportDOI
01 Feb 1988
TL;DR: This paper generates a recognition program from an interpretation tree that classifies an object into an appropriate attitude group, which has a similar appearance, and converts each feature extracting or matching operation into an individual processing entity, called an object.
Abstract: This paper presents an approach to using object-oriented programming for the generation of an object recognition program that recognizes a complex 3-D object within a jumbled pile. We generate a recognition program from an interpretation tree that classifies an object into an appropriate attitude group, which has a similar appearance. Each node of an interpretation tree represents a feature matching. We convert each feature extracting or matching operation into an individual processing entity, called an object. Two kinds of objects have been prepared: data objects and event objects. A data object is used for representing geometric objects (such as edges and regions) and extracting features from geometric objects. An event object is used for feature matching and attitude determination. A library of prototypical objects is prepared and an executable program is constructed by properly selecting and instantiating modules from it. The object-oriented programming paradigm provides modularity and extensibility. This method has been applied to the generation of a recognition program for a toy wagon. The generated program has been tested with real scenes and has recognized the wagon in a pile. Keywords: Robotics; Libraries.

Proceedings ArticleDOI
22 Aug 1988
TL;DR: This paper presents a framework relating an object model to the object's appearances, considers two aspects of sensor characteristics, sensor detectability and sensor reliability, and defines the configuration space to represent sensor characteristics.
Abstract: A model-based vision system requires models in order to predict object appearances. How an object appears in the image is the result of interaction between the object properties and the sensor characteristics. Thus in model-based vision, we ought to model the sensor as well as the object. Previously, the sensor model was not used in model-based vision or, at least, was contained in the object model implicitly. This paper presents a framework relating an object model to the object's appearances. We consider two aspects of sensor characteristics: sensor detectability and sensor reliability. Sensor detectability specifies what kinds of features can be detected and in what condition the features are detected; sensor reliability is a confidence for the detected features. We define the configuration space to represent sensor characteristics. We propose a representation method for sensor detectability and reliability in the configuration space. Finally, we investigate how to apply the sensor model to a model-based vision system, in particular, automatic generation of an object recognition program from a given model.

Journal ArticleDOI
TL;DR: It is shown that geometric modeling can be used to provide an object description (visible surface orientation referred to as a needle map) suited to the problem of 3-D object recognition and attitude evaluation.
Abstract: A vision system to recognize 3-D objects in a scene and to evaluate their attitude is presented. This system uses a model-based approach to determine the attitude of 3-D industrial parts in the context of bin packing. It is based on the matching of the extended Gaussian image (EGI) extracted from the surface normal map (needle map) of an observed object with the EGI generated from a 3-D model using geometric modeling. It is shown that geometric modeling can be used to provide an object description (visible surface orientation referred to as a needle map) suited to the problem of 3-D object recognition and attitude evaluation.
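A much-simplified illustration of EGI construction and comparison (octant binning instead of a proper sphere tessellation, and a plain histogram distance instead of the paper's attitude-resolving match; both simplifications are our assumptions): face normals, weighted by area, are histogrammed on the unit sphere, and two EGIs are compared bin by bin.

```python
import numpy as np

def egi_histogram(normals, areas):
    """Crude extended Gaussian image: accumulate face areas into the 8
    octants of the unit sphere, indexed by the signs of the (unit)
    normal components.  A real EGI uses a far finer tessellation."""
    hist = np.zeros(8)
    for nrm, a in zip(normals, areas):
        nrm = np.asarray(nrm, float)
        nrm = nrm / np.linalg.norm(nrm)
        idx = (nrm[0] > 0) * 4 + (nrm[1] > 0) * 2 + (nrm[2] > 0) * 1
        hist[int(idx)] += a
    return hist / hist.sum()

def egi_distance(h1, h2):
    """L1 distance between two normalized EGI histograms."""
    return float(np.abs(np.asarray(h1) - np.asarray(h2)).sum())
```

Recovering attitude, as in the paper, additionally requires searching over rotations of one EGI against the other; the distance above only scores a fixed alignment.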

Proceedings ArticleDOI
14 Nov 1988
TL;DR: A prototype 3D object recognition system is described that is composed of three major sections: an object representation module, a feature extraction and matching module, and a recognition control strategy module.
Abstract: A prototype 3D object recognition system is described that is composed of three major sections: an object representation module, a feature extraction and matching module, and a recognition control strategy module. The object representation module uses an algorithm which constructs perspective projection aspect graphs of convex polyhedra. The feature extraction and matching module uses Fourier descriptors to characterize the complete 2D projection of an object. The recognition control strategy module uses the aspect graph object representation to control the application of a constrained optimization algorithm. The system is implemented in C on a Sun workstation, and some simple recognition experiments are reported that demonstrate the validity of the overall concept.

Proceedings ArticleDOI
14 Nov 1988
TL;DR: A method is described for the recognition of partially occluded 2-D objects that considers a set of corners, parallel lines, and so on as typical features of an object.
Abstract: A method is described for the recognition of partially occluded 2-D objects. This method considers a set of corners, parallel lines, and so on as typical features of an object. Possible candidate models are estimated from these features, and structural matching is performed between these models and features obtained from a picture by checking the combinations of various features. Even if the whole structure is not obtained due to a partial occlusion, the system can infer an object if some unique features of the object are obtained. Partial shapes and extracted lines are matched in detail with model candidates when they are limited to one or a few.

Proceedings ArticleDOI
19 Feb 1988
TL;DR: An object recognition system is presented to handle the computational complexity posed by a large model base, an unconstrained viewpoint, and the structural complexity and detail inherent in the projection of an object.
Abstract: An object recognition system is presented to handle the computational complexity posed by a large model base, an unconstrained viewpoint, and the structural complexity and detail inherent in the projection of an object. The design is based on two ideas. The first is to compute descriptions of what the objects should look like in the image, called predictions, before the recognition task begins. This reduces actual recognition to a 2D matching process, speeding up recognition time for 3D objects. The second is to represent all the predictions by a single, combined IS-A and PART-OF hierarchy called a prediction hierarchy. The nodes in this hierarchy are partial descriptions that are common to views and hence constitute shared processing subgoals during matching. The recognition time and storage demands of large model bases and complex models are substantially reduced by subgoal sharing: projections with similarities explicitly share the recognition and representation of their common aspects. A prototype system for the automatic compilation of a prediction hierarchy from a 3D model base is demonstrated using a set of polyhedral objects and projections from an unconstrained range of viewpoints. In addition, the adaptation of prediction hierarchies for use on the UMass Image Understanding Architecture is considered. Object recognition using prediction hierarchies can naturally exploit the hierarchical parallelism of this machine.

Proceedings ArticleDOI
14 Nov 1988
TL;DR: A model-based vision system which recognizes 3D objects in an image is presented, and a candidate transformation is quantitatively hypothesized for each object by initially matching a few corresponding points in the primitive.
Abstract: A model-based vision system which recognizes 3D objects in an image is presented. The procedure is divided into two phases: qualitative and quantitative. First, component primitives of objects are qualitatively detected in the image, so as to invoke efficiently as few candidate object models as necessary from a number of models and to get corresponding points between models and data. Then, a candidate transformation is quantitatively hypothesized for each object by initially matching a few corresponding points in the primitive; the match is tested and adjusted for verification by matching all the points in the model.

Proceedings ArticleDOI
22 Aug 1988
TL;DR: This paper discusses the approach to recognizing objects in range images using CAD databases, and formally defines the features used in the approach and discusses their strengths and weaknesses.
Abstract: The marriage of machine vision systems and CAD databases will be useful in solving many industrial problems. In this paper, we discuss our approach to recognizing objects in range images using CAD databases. The models in the databases are used to generate recognition strategies and to test hypotheses generated by the early recognition system. The role of features has been recognized since the early days of pattern recognition research. Here we discuss the desirable characteristics of features in a 3-D object recognition system. We formally define the features used in our approach and discuss their strengths and weaknesses.

Proceedings ArticleDOI
14 Nov 1988
TL;DR: The fuzzy-weighted distance presented shows that every feature of every sample has a different effect on distance in pattern recognition and classification, and is determined by human thinking in object recognition.
Abstract: The fuzzy-weighted distance presented shows that every feature of every sample has a different effect on distance in pattern recognition and classification. The difference is described by using the concept of feature odds defined by the authors, which is derived from fuzzy set theory and is determined by human thinking in object recognition. The reasonableness of the proposed distance is discussed. Experimental results are presented that show the advantages of this distance.

01 Aug 1988
TL;DR: A model-based approach for computing geometric features is presented, in which knowledge about objects and the imaging system is used to estimate the orientation of objects with respect to the line of sight.
Abstract: Geometric and intensity features are very useful in object recognition. An intensity feature is a measure of contrast between object pixels and background pixels. Geometric features provide shape and size information. A model-based approach is presented for computing geometric features. Knowledge about the objects and the imaging system is used to estimate the orientation of objects with respect to the line of sight.

Proceedings ArticleDOI
14 Nov 1988
TL;DR: A local feature-aggregation method for recognizing two-dimensional objects based on their CAD models that can handle cases in which the objects are translated, rotated, scaled and occluded, and it is well suited for parallel implementation.
Abstract: A local feature-aggregation method for recognizing two-dimensional objects based on their CAD models is presented. The method can handle cases in which the objects are translated, rotated, scaled and occluded, and it is well suited for parallel implementation. Two types of local features, the L structures and the U structures, are extracted from the input image and matched with those of a model to search for an object similar to the model. Each of the matches hypothesizes a location of the object in the input image, and a score (similarity measure) is computed and associated with the hypothesized location to indicate the probability of the match. Matches that hypothesize the same location have the score associated with that location incremented. A cluster of hypothesized locations with high scores indicates the probable existence of the object in the input image.