
Showing papers on "3D single-object recognition published in 1987"


Journal ArticleDOI
TL;DR: Recognition-by-components (RBC) provides a principled account of the heretofore undecided relation between the classic principles of perceptual organization and pattern recognition.
Abstract: The perceptual recognition of objects is conceptualized to be a process in which the image of the input is segmented at regions of deep concavity into an arrangement of simple geometric components, such as blocks, cylinders, wedges, and cones. The fundamental assumption of the proposed theory, recognition-by-components (RBC), is that a modest set of generalized-cone components, called geons (N ≤ 36), can be derived from contrasts of five readily detectable properties of edges in a two-dimensional image: curvature, collinearity, symmetry, parallelism, and cotermination. The detection of these properties is generally invariant over viewing position and image quality and consequently allows robust object perception when the image is projected from a novel viewpoint or is degraded. RBC thus provides a principled account of the heretofore undecided relation between the classic principles of perceptual organization and pattern recognition: The constraints toward regularization (Pragnanz) characterize not the complete object but the object's components. Representational power derives from an allowance of free combinations of the geons. A Principle of Componential Recovery can account for the major phenomena of object recognition: If an arrangement of two or three geons can be recovered from the input, objects can be quickly recognized even when they are occluded, novel, rotated in depth, or extensively degraded. The results from experiments on the perception of briefly presented pictures by human observers provide empirical support for the theory. Any single object can project an infinity of image configurations to the retina. The orientation of the object to the viewer can vary continuously, each giving rise to a different two-dimensional projection. The object can be occluded by other objects or texture fields, as when viewed behind foliage. The object need not be presented as a full-colored textured image but instead can be a simplified line drawing.
Moreover, the object can even be missing some of its parts or be a novel exemplar of its particular category. But it is only with rare exceptions that an image fails to be rapidly and readily classified, either as an instance of a familiar object category or as an instance that cannot be so classified (itself a form of classification).

5,464 citations


Proceedings ArticleDOI
01 Mar 1987
TL;DR: It is demonstrated that the affine viewing transformation is a reasonable approximation to perspective, and a clustering approach, which produces a set of consistent assignments between vertex-pairs in the model and in the image, is described.
Abstract: It is demonstrated that the affine viewing transformation is a reasonable approximation to perspective. A group of image vertices and edges, called the vertex-pair, which fully determines the affine transformation between a three-dimensional model and a two-dimensional image is defined. A clustering approach, which produces a set of consistent assignments between vertex-pairs in the model and in the image is described. A number of experimental results on outdoor images are presented.

271 citations
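As a sketch of the transform-recovery step (the function names and sample points here are invented; the paper estimates the affine map from a vertex-pair rather than a generic least-squares fit), a 2-D affine transformation can be recovered from point correspondences:

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares 2-D affine transform M (2x3) with dst ~= src @ M[:, :2].T + M[:, 2].
    Three non-collinear correspondences determine it exactly."""
    src, dst = np.asarray(src, float), np.asarray(dst, float)
    A = np.hstack([src, np.ones((len(src), 1))])   # rows [x, y, 1]
    M, *_ = np.linalg.lstsq(A, dst, rcond=None)    # solves A @ M = dst
    return M.T                                     # 2x3 parameter matrix

def apply_affine(M, pts):
    pts = np.asarray(pts, float)
    return pts @ M[:, :2].T + M[:, 2]
```

With exact correspondences the six parameters are recovered exactly; with noisy image vertices the same call gives the least-squares estimate that a clustering scheme could then vote on.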


Journal ArticleDOI
TL;DR: A set of rules to find out what appropriate features are to be used in what order to generate an efficient and reliable interpretation tree are developed and applied in a task for bin-picking objects that include both planar and cylindrical surfaces.
Abstract: This article describes a method to generate 3D-object recognition algorithms from a geometrical model for bin-picking tasks. Given a 3D solid model of an object, we first generate apparent shapes of an object under various viewer directions. Those apparent shapes are then classified into groups (representative attitudes) based on dominant visible faces and other features. Based on the grouping, recognition algorithms are generated in the form of an interpretation tree. The interpretation tree consists of two parts: the first part for classifying a target region in an image into one of the shape groups, and the second part for determining the precise attitude of the object within that group. We have developed a set of rules to find out what appropriate features are to be used in what order to generate an efficient and reliable interpretation tree. Features used in the interpretation tree include inertia of a region, relationship to the neighboring regions, position and orientation of edges, and extended Gaussian images. This method has been applied in a task for bin-picking objects that include both planar and cylindrical surfaces. As sensory data, we have used surface orientations from photometric stereo, depth from binocular stereo using oriented-region matching, and edges from an intensity image.

193 citations


Journal Article
TL;DR: It is explained that model-based recognition, programming, and control of manipulators is one of the key paradigms in computer vision and robotics and the authors concern themselves only with expected objects in the task environment of a robot.
Abstract: This article explains that model-based recognition, programming, and control of manipulators is one of the key paradigms in computer vision and robotics. To recognize and manipulate objects, the authors concern themselves only with expected objects in the task environment of a robot. They use data from multiple sensors (such as television, range, force, torque, touch, etc.) and a priori knowledge (object models, strategies for sensing, recognition and manipulation, etc.) to equip the robot with intelligence so that it can sense, plan, and manipulate objects in its environment. As shown, the basic computational model for object recognition and manipulation is strongly goal-directed. Their goal is the capability to acquire a variety of 3D object models automatically with all the desired information for recognition and manipulation. CAD has provided new opportunities and challenges for the use of models of 3D objects. Using the available 3D models of objects, one can plan recognition and manipulation strategies during the off-line phase and do efficient real-time recognition and manipulation during the runtime phase.

50 citations


Journal ArticleDOI
TL;DR: In this article, the authors present a system for object representation and recognition from dense range maps, which addresses three problems, namely (i) object representation from single view dense range map, (ii) integrating data or descriptions from multiple views for model construction, and (iii) matching descriptions from a single unknown view to models.
Abstract: In this paper, we present a system for object representation and recognition from dense range maps. The system addresses three problems, namely (i) object representation from single view dense range map, (ii) integrating data or descriptions from multiple views for model construction, and (iii) matching descriptions from a single unknown view to models. Although the main goal of this paper is to develop an algorithm for solving the problem (iii) stated above, to give a complete overview of our system, we will briefly outline our solution techniques with the aid of examples, for problems (i) and (ii) as well. The objects and models are represented by regions that are a collection of surface patches homogeneous in certain intrinsic surface properties. The recognition scheme is based on matching object surface descriptions with model surface descriptions. The recognition task includes both locating the overall object and identifying its features. Location is achieved by finding a geometrical registration function that correctly superimposes an arbitrary instance of the known model and the model. A localization technique is presented which requires that correspondence be established exactly, between one point on the object surface and one on the model surface. Once the single point correspondence is specified, closed-form solutions are given for determining the attitude of the unknown view of the object in 3 space with respect to the model.

47 citations
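The paper's closed-form attitude solution builds on a single point correspondence plus surface descriptions; as a hedged stand-in, the following is the standard SVD-based closed form (orthogonal Procrustes / Kabsch) for the attitude given several matched points, not the authors' formulation:

```python
import numpy as np

def rigid_align(P, Q):
    """Closed-form rotation R and translation t with Q ~= P @ R.T + t,
    given matched 3-D point sets (Kabsch / orthogonal Procrustes)."""
    P, Q = np.asarray(P, float), np.asarray(Q, float)
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cp).T @ (Q - cq)                 # cross-covariance of centred sets
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.linalg.det(Vt.T @ U.T)])  # guard against reflections
    R = Vt.T @ D @ U.T
    return R, cq - R @ cp
```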


Journal ArticleDOI
TL;DR: A new method is presented for the recognition of polyhedra in range data based on a hypothesis accumulation scheme which allows parallel implementations and creates a compact cluster in the transformation space.
Abstract: A new method is presented for the recognition of polyhedra in range data. The method is based on a hypothesis accumulation scheme which allows parallel implementations. The different objects to be recognized are modeled by a set of local geometrical patterns. Local patterns of the same nature are extracted from the scene. For the recognition of an object, local scene and model patterns having the same geometrical characteristics are matched. For each of the possible matches, the geometric transformations (i.e., rotations and translations) are computed, which allows the overlapping of the model elements with those from the scene. This transformation permits the establishment of a hypothesis on the location of the object in the scene and the determination of a point in the transformation space. The presence of an object similar to a model involves the generation of several compatible hypotheses and creates a compact cluster in the transformation space. The recognition of the object is based on the detection of this cluster. The cluster coordinates give the values of the rotations and the translations to be applied to the model such that it corresponds to the object in the scene. The exact location of this object is given by the transformed model.

35 citations
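A 2-D toy of the hypothesis-accumulation scheme (the point sets, tolerances, and the crude radius-count clustering are all invented for this sketch; the paper matches 3-D local geometrical patterns):

```python
import numpy as np

def rigid_from_pair(m1, m2, s1, s2):
    """Rotation + translation taking model segment (m1, m2) onto scene (s1, s2),
    encoded as (cos a, sin a, tx, ty) to avoid angle wrap-around."""
    a = np.arctan2(*(s2 - s1)[::-1]) - np.arctan2(*(m2 - m1)[::-1])
    R = np.array([[np.cos(a), -np.sin(a)], [np.sin(a), np.cos(a)]])
    return np.array([np.cos(a), np.sin(a), *(s1 - R @ m1)])

def best_pose(model, scene, tol=1e-6, radius=1e-3):
    """Accumulate a pose hypothesis for every length-compatible segment match
    and return a member of the densest cluster in transformation space."""
    hyps = []
    for i in range(len(model)):
        for j in range(len(model)):
            if i == j:
                continue
            for k in range(len(scene)):
                for l in range(len(scene)):
                    if k == l:
                        continue
                    # local patterns must share geometric characteristics:
                    # here, simply the segment length
                    if abs(np.linalg.norm(model[j] - model[i])
                           - np.linalg.norm(scene[l] - scene[k])) > tol:
                        continue
                    hyps.append(rigid_from_pair(model[i], model[j],
                                                scene[k], scene[l]))
    hyps = np.array(hyps)
    d = np.linalg.norm(hyps[:, None, :] - hyps[None, :, :], axis=2)
    return hyps[(d < radius).sum(axis=1).argmax()]
```

Compatible (correct) matches all land on the same point of transformation space, while spurious matches scatter, which is exactly why the compact cluster identifies both the object and its pose.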


Journal ArticleDOI
TL;DR: An object representation system motivated by object recognition instead of object depiction is defined, representing the strongly visible features and relationships of non-polyhedral man-made objects.

22 citations


Proceedings ArticleDOI
01 Apr 1987
TL;DR: The properties of the log-polar visibility suggest the use of partial information to implement some forms of "reduced template matching", which imply a substantial reduction in the computational cost.

Abstract: The technique of template matching in the log-polar visibility (magnitude of the Fourier transform in the logarithmic-radius and azimuth plane) is proposed as a promising tool for many problems of isolated object recognition, irrespective of position, scale, and rotation. The properties of the log-polar visibility suggest the use of partial information to implement some forms of "reduced template matching". These techniques imply a substantial reduction in the computational cost.

18 citations
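A minimal numpy sketch of the "log-polar visibility" idea (the function name and sampling grid are assumptions, not the authors' code):

```python
import numpy as np

def logpolar_spectrum(img, n_r=32, n_theta=64):
    """Sample the Fourier magnitude on a log-radius / azimuth grid.
    Translating the image leaves this map unchanged (|FFT| ignores shifts);
    rotating or scaling the image shifts the map along its two axes."""
    F = np.abs(np.fft.fftshift(np.fft.fft2(img)))
    cy, cx = np.array(F.shape) // 2
    radii = np.exp(np.linspace(0.0, np.log(min(cy, cx) - 1), n_r))
    thetas = np.linspace(0.0, 2 * np.pi, n_theta, endpoint=False)
    ys = np.clip((cy + radii[:, None] * np.sin(thetas)).round().astype(int),
                 0, F.shape[0] - 1)
    xs = np.clip((cx + radii[:, None] * np.cos(thetas)).round().astype(int),
                 0, F.shape[1] - 1)
    return F[ys, xs]
```

Matching two such maps reduces to locating the peak of their cross-correlation; the peak offset along the two axes gives the relative rotation and log-scale, and "reduced template matching" corresponds to correlating only a subset of rows or columns.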


Proceedings ArticleDOI
21 Aug 1987
TL;DR: Results of the study are presented to determine which different types of surface descriptors can be reliably recovered from biquadratic equation models of various surfaces.
Abstract: We are studying the classification of symbolic surface descriptors into classes that will allow fast approaches for 3-D object recognition. In our approach for object recognition, we will use features to hypothesize objects using a parallel distributed approach, and then use models of objects to find objects that are present in a scene. Symbolic surface descriptors represent global features of an object and do not change when the object is partially occluded, while local features (such as corners or edges) may disappear entirely. We have developed a technique to segment surfaces and compute their polynomial surface descriptors. In this paper we present results of our study to determine which different types of surface descriptors (such as cylindrical, spherical, elliptical, hyperbolic, etc.) can be reliably recovered from biquadratic equation models of various surfaces.

13 citations
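A hedged illustration of the classification step (the explicit height-function form z = ax² + bxy + cy² + dx + ey + f and the discriminant rule are simplifications of the paper's biquadratic models; all names are invented):

```python
import numpy as np

def classify_patch(x, y, z, tol=1e-8):
    """Least-squares fit of a biquadratic height function
    z = a x^2 + b xy + c y^2 + d x + e y + f, then classify the patch by the
    discriminant b^2 - 4ac of its quadratic part."""
    A = np.stack([x * x, x * y, y * y, x, y, np.ones_like(x)], axis=1)
    (a, b, c, d, e, f), *_ = np.linalg.lstsq(A, z, rcond=None)
    if max(abs(a), abs(b), abs(c)) < tol:
        return "planar"
    disc = b * b - 4.0 * a * c
    if disc < -tol:
        return "elliptic"      # bowl-like (spherical when a == c and b == 0)
    if disc > tol:
        return "hyperbolic"    # saddle
    return "cylindrical"       # parabolic: curved in one direction only
```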


Journal ArticleDOI
TL;DR: A computational method is presented for recovering 3-D shapes by exploiting multiple views of an object by providing a volumetric representation of the object starting from the occluding contours, i.e., the contours which separate the object image from the background.
Abstract: A computational method is presented for recovering 3-D shapes by exploiting multiple views of an object. The method provides a volumetric representation of the object starting from the occluding contours, i.e., the contours which separate the object image from the background. A detailed description of the geometrical transformations applied to build the volumetric representation of objects from multiple occluding contours of real images is reported. The only input parameters necessary are the orientations of the TV cameras used. The volumetric representation obtained is recognized by matching it with a library of models. Object recognition is performed in two steps. In the first step, some 3-D global structure parameters are used. If these are not sufficient to unambiguously recognize the object, in the second step the candidates chosen among the models are projected into the image planes of the TV cameras, and some 2-D features are used to perform the recognition. The models can be built either by means of a CAD technique or via the same method used to reconstruct the volumetric representation of the object.

13 citations
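The volumetric step can be caricatured as silhouette carving under an orthographic toy camera (the camera model and all names are invented; the paper uses calibrated TV cameras and perspective geometry):

```python
import numpy as np

def carve_volume(silhouettes, rotations, xs):
    """Keep a voxel iff its orthographic projection into every view lies
    inside that view's occluding contour. Each silhouette is a function
    sil(u, v) -> bool over image coordinates."""
    X, Y, Z = np.meshgrid(xs, xs, xs, indexing="ij")
    pts = np.stack([X, Y, Z], axis=-1)            # voxel centres
    occ = np.ones(X.shape, dtype=bool)
    for sil, R in zip(silhouettes, rotations):
        uv = pts @ R.T                            # rotate into the camera frame
        occ &= sil(uv[..., 0], uv[..., 1])        # carve away outside voxels
    return occ
```

Each extra view can only remove volume, so the result converges from above onto the visual hull of the object, which is exactly the representation then matched against the model library.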


Journal ArticleDOI
TL;DR: A simple boundary-based shape descriptor, the normal contour distance (NCD) signature, is introduced, which does not require knowledge of the complete boundary and is suitable for recognition of partially occluded objects.

01 Jan 1987
TL;DR: In the paper the authors present a new representation for polyhedral objects which is a continuous generalization of multiview representations, and show how to represent views or images of an object continuously over all viewpoints.
Abstract: In the paper the authors present a new representation for polyhedral objects which is a continuous generalization of multiview representations. That is, they show how to represent views or images of an object (i.e., the appearance of the object) continuously over all viewpoints. The authors then show how this representation overcomes some of the drawbacks of multiview representation for object recognition, namely, that the image to be recognized will usually not match one of the stored images exactly, that the representation contains no information about how to interpolate between views, and that the representation is large.


Book ChapterDOI
01 Jan 1987
TL;DR: This paper describes an approach to the recognition of stacked objects with planar and curved surfaces by a combination of data-driven and model-driven search processes.
Abstract: This paper describes an approach to the recognition of stacked objects with planar and curved surfaces. The system works in two phases. In the learning phase, scenes each containing a single object are shown one at a time. The range data of a scene are obtained by a range finder. The data points are grouped into many surface elements consisting of several points. The surface elements are merged together into regions. The regions are classified into three classes: plane, curved and undefined. The program extends the curved regions by merging adjacent curved and undefined regions. Thus the scene is represented by plane regions and smoothly curved regions. The description of each scene is built in terms of properties of regions and relations between them. This description is stored as an object model. In the recognition phase, an unknown scene is described in the same way as in the learning phase. Then the description is matched to the object models so that the stacked objects are recognized sequentially. Efficient matching is achieved by a combination of data-driven and model-driven search processes. Experimental results for blocks and machine parts are shown.

Journal ArticleDOI
TL;DR: A novel approach to object recognition and classification for robotic applications using the automated decision tree generation technique, which relies upon simple statistical measurements extracted from object classes and represented in the form of a distance matrix ‘D’ to form a decision tree.
Abstract: The objective of this article is to present a novel approach to object recognition and classification for robotic applications using the automated decision tree generation technique. The method developed relies upon simple statistical measurements extracted from object classes and represented in the form of a distance matrix ‘D’ to form a decision tree. The algorithms presented here are computationally efficient and simple to implement. The effectiveness of the features is automatically assessed, allowing for the automatic selection of only those features needed to accomplish object recognition and classification. The performance of the algorithms is successfully tested and demonstrated.
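The article's distance-matrix construction is not reproduced here; as a loose sketch of automated decision-tree generation (the split rule and every name are invented), classes can be split recursively on the feature that best separates two class means:

```python
import numpy as np
from itertools import combinations

def build_tree(means):
    """Recursively split a set of classes: at each node pick the single
    feature with the largest gap between two class means and threshold at
    their midpoint. `means` maps class label -> mean feature vector."""
    labels = sorted(means)
    if len(labels) == 1:
        return labels[0]
    f, a, b = max(((f, a, b) for a, b in combinations(labels, 2)
                   for f in range(len(means[a]))),
                  key=lambda t: abs(means[t[1]][t[0]] - means[t[2]][t[0]]))
    thr = (means[a][f] + means[b][f]) / 2.0
    left = {c: m for c, m in means.items() if m[f] <= thr}
    right = {c: m for c, m in means.items() if m[f] > thr}
    return (f, thr, build_tree(left), build_tree(right))

def classify(tree, x):
    """Walk the tree: internal nodes are (feature, threshold, left, right)."""
    while not isinstance(tree, str):
        f, thr, lo, hi = tree
        tree = lo if x[f] <= thr else hi
    return tree
```

Only the features actually chosen at split nodes end up being measured at run time, which mirrors the article's point about automatic feature selection.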

Journal ArticleDOI
01 Aug 1987
TL;DR: A robotic vision system is being developed which uses three-dimensional laser range data to sense its environment and the use of theorem proving techniques to hypothesize object identities and recognize the viewed object as an instance of the appropriate viewpoint independent model descriptor.
Abstract: A robotic vision system is being developed which uses three-dimensional laser range data to sense its environment. The recognition subsystem incorporates topological as well as geometric information to identify viewed objects. Theorem-proving techniques are used to produce symbolic pattern matches. The major contributions of the recognition subsystem are 1) the use of viewpoint independent descriptors as the basis for representing known object models and 2) the use of theorem proving techniques to hypothesize object identities and recognize the viewed object as an instance of the appropriate viewpoint independent model descriptor. The representation scheme permits describing objects at a variety of topological and geometric levels. Furthermore, the use of viewpoint independent descriptors facilitates object recognition from a single arbitrary view despite missing information or the inclusion of viewpoint dependent artifacts. The theorem-proving approach establishes a symbolic correspondence between viewpoint independent features in the (recognized) model and features in the observed data. The recognition process uses a three-phase approach. First, hypotheses are generated which correspond to model descriptors that are likely to match the data. Evidence is applied to viable hypotheses to produce a partial match. The partial match is then used to constrain the full recognition process, which leads to object identification. This strategy has been found to strongly constrain the search space of possible matches and leads to large reductions in recognition times. Results of the recognition process on synthetic and actual laser range data are presented for several objects. The system is shown to operate with robustness and alacrity.

Proceedings Article
01 Jan 1987
TL;DR: In this paper, a distributed associative memory is used to classify an object, reconstruct the memorized version of the object, and estimate the magnitude of changes in scale or rotation.
Abstract: This paper describes an approach to 2-dimensional object recognition. Complex-log conformal mapping is combined with a distributed associative memory to create a system which recognizes objects regardless of changes in rotation or scale. Recalled information from the memorized database is used to classify an object, reconstruct the memorized version of the object, and estimate the magnitude of changes in scale or rotation. The system response is resistant to moderate amounts of noise and occlusion. Several experiments, using real, gray scale images, are presented to show the feasibility of our approach.

Proceedings ArticleDOI
01 Apr 1987
TL;DR: An algorithm for object recognition which is based on translation and rotation invariant signatures is presented, and good classification results were obtained even with objects having similar shapes.

Abstract: An algorithm for object recognition which is based on translation and rotation invariant signatures is presented. The Radon transform is used as a tool for efficient implementation of the algorithm. Good classification results were obtained even with objects having similar shapes.
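The Radon-transform implementation is not reproduced here; a simpler stand-in with the same two invariances (centroid centring removes translation, radius-only binning removes rotation) is a radial intensity histogram, with all names and bin counts illustrative:

```python
import numpy as np

def radial_signature(img, n_bins=8, r_max=10.0):
    """Normalized histogram of intensity versus distance from the intensity
    centroid: unchanged under translation and rotation of the object."""
    ys, xs = np.indices(img.shape)
    m = img.sum()
    cy, cx = (ys * img).sum() / m, (xs * img).sum() / m
    r = np.hypot(ys - cy, xs - cx)
    bins = np.minimum((r / r_max * n_bins).astype(int), n_bins - 1)
    sig = np.bincount(bins.ravel(), weights=img.ravel(), minlength=n_bins)
    return sig / sig.sum()
```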

Proceedings ArticleDOI
08 Jan 1987
TL;DR: The algorithm that enables a machine vision system to recognize a three-dimensional object from a series of multiple views uses pseudo reflection tomography to map N images of the object into a single signature image that incorporates information from all N views.
Abstract: We discuss the algorithm that enables a machine vision system to recognize a three-dimensional object from a series of multiple views. The algorithm uses pseudo reflection tomography to map N images of the object into a single signature image that incorporates information from all N views. Rotation, scale, and brightness-invariant features are extracted from the signature image that enable robust recognition of the object.

Journal ArticleDOI
TL;DR: A method of three-dimensional motion analysis for multiple moving objects which may include an object for which a unique interpretation of the motion is difficult, and the displacement vector can be determined accurately by iterating the estimation of the object motion.
Abstract: Optical flow analysis is a powerful means to extract an object from the dynamic image. Not only can it extract an object, but it can also recover the three-dimensional structure. However, in practice, there are many cases where the optical flow is difficult to obtain or the motion of the object does not permit a unique interpretation. Based on the gradient method, this paper proposes a method of three-dimensional motion analysis for multiple moving objects which may include an object for which a unique interpretation of the motion is difficult. By the gradient method, the three-dimensional motion parameters of a rigid object can be estimated without using the optical flow as a least-square-error solution for the system of linear equations, if the three-dimensional structure of the object is already given. Based on the residual square-sum error in the estimation, the image plane is segmented if it contains regions of different motions. If the motions are recognized as being similar, the regions are merged. Thus, the object is extracted by iterating the segmentation and merging on the image plane. The uniqueness of the motion is evaluated by the condition number for the coefficient matrix of the system equations. If the motion of the extracted object can be interpreted uniquely, the displacement vector can be determined accurately by iterating the estimation of the object motion.
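A drastically reduced sketch of the gradient method (pure 2-D translation instead of the paper's 3-D rigid motion; the smooth test pattern and names are invented): stack the brightness-constancy equations Ix·u + Iy·v + It = 0 over the image and solve them as a least-square-error system:

```python
import numpy as np

def estimate_translation(I0, I1):
    """Least-squares motion estimate without computing optical flow:
    one unknown (u, v) shared by every pixel's constraint equation."""
    Iy, Ix = np.gradient(I0)                  # spatial brightness gradients
    It = I1 - I0                              # temporal difference
    A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)
    b = -It.ravel()
    (u, v), *_ = np.linalg.lstsq(A, b, rcond=None)
    return u, v
```

The same least-squares machinery extends to the paper's setting: with known 3-D structure each pixel contributes one linear constraint on the six rigid-motion parameters, and the condition number of the stacked coefficient matrix flags motions that lack a unique interpretation.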

Proceedings ArticleDOI
01 Mar 1987
TL;DR: The particular concentration of this paper is the development of an active model matching approach called a procedural model that uses the surface geometry available in a range image to limit the amount of searching that must be done.
Abstract: A general three dimensional image analysis environment is presented. The environment proved extremely flexible, and a three dimensional object recognition system was designed using this environment. The recognition system addresses all the issues associated with view independent object recognition. The particular concentration of this paper is the development of an active model matching approach called a procedural model. The procedural model uses the surface geometry available in a range image to limit the amount of searching that must be done. The surface geometry is determined by solving an eigensystem, and utilized to determine the pose of the object.

Proceedings ArticleDOI
01 Dec 1987
TL;DR: This paper presents efficient techniques for object representation and recognition from dense range (depth) maps, where the objects and models are represented by regions that are a collection of surface patches homogeneous in curvature-based surface properties.
Abstract: Object manipulation by a robot requires some degree of sensing, and of all possible forms of sensory perception, visual sensing is one of the most attractive because of its non-contact nature, high speed and accuracy. Visual data can be obtained in two or three-dimensional (depth map) form from commercially available sensors. Interpreting visual data has been one of the major themes of computer vision researchers in the past decade. In this paper, we present efficient techniques for object representation and recognition from dense range (depth) maps. The objects and models are represented by regions that are a collection of surface patches homogeneous in curvature-based surface properties. The recognition scheme is based on matching object surface descriptions with model surface descriptions. The recognition task includes both locating the overall object and identifying each of its features. Location is achieved by finding a geometrical “registration” function that correctly superimposes an arbitrary instance of the known model and the model. A localization technique is presented which requires that correspondence be established exactly, between one point on the object surface and one on the model surface. Once the single point correspondence is specified, closed form solutions are given for determining the attitude of the unknown view of the object in 3 space with respect to the model.

Proceedings ArticleDOI
21 Aug 1987
TL;DR: A review of optical pattern recognition algorithms and techniques for various levels of computer vision, reaching the recent upper levels of artificial intelligence, is presented and briefly summarized.

Abstract: A review of optical pattern recognition algorithms and techniques for various levels of computer vision, reaching the recent upper levels of artificial intelligence, is presented and briefly summarized.

01 Jul 1987
TL;DR: Algorithms developed under NASA sponsorship for Space Station applications to demonstrate the value of a hypothesized architecture for a Video Image Processor (VIP) are presented and the potential for deployment of highly-parallel multi-processor systems for these algorithms are discussed.
Abstract: Computer vision, especially color image analysis and understanding, has much to offer in the area of the automation of Space Station tasks such as construction, satellite servicing, rendezvous and proximity operations, inspection, experiment monitoring, data management and training. Knowledge-based techniques improve the performance of vision algorithms for unstructured environments because of their ability to deal with imprecise a priori information or inaccurately estimated feature data and still produce useful results. Conventional techniques using statistical and purely model-based approaches lack flexibility in dealing with the variabilities anticipated in the unstructured viewing environment of space. Algorithms developed under NASA sponsorship for Space Station applications to demonstrate the value of a hypothesized architecture for a Video Image Processor (VIP) are presented. Approaches to the enhancement of the performance of these algorithms with knowledge-based techniques and the potential for deployment of highly-parallel multi-processor systems for these algorithms are discussed.

Proceedings ArticleDOI
27 Mar 1987
TL;DR: In this article, the usefulness of applying the complex moment features into the tactile image for object recognition has been explored and some complex moment invariants have been derived and implementation of those features has been conducted.
Abstract: Complex moments have been considered as useful features for object recognition in general. In this paper, the usefulness of applying the complex moment features to the tactile image for object recognition has been explored. Some complex moment invariants have been derived and implementation of those features has been conducted. With those moment invariants, we can eliminate the effect of lateral displacement and rotation from the tactile images. Through the generation of a decision tree and the utilization of the complex moment features, the shape of the objects from the tactile sensor can easily be recognized.
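For intuition about why such invariants work (a generic complex-moment modulus, not the specific invariants derived in the paper): with centred coordinates z = (x − x̄) + i(y − ȳ), a rotation by φ multiplies c_pq = Σ z^p z̄^q I by e^{i(p−q)φ}, so |c_pq| is unaffected by lateral displacement and rotation:

```python
import numpy as np

def complex_moment_invariant(img, p, q):
    """|c_pq| with c_pq = sum over pixels of z^p * conj(z)^q * intensity,
    where z is the pixel position relative to the intensity centroid."""
    ys, xs = np.indices(img.shape)
    m = img.sum()
    z = (xs - (xs * img).sum() / m) + 1j * (ys - (ys * img).sum() / m)
    return abs((z**p * np.conj(z)**q * img).sum())
```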

Proceedings ArticleDOI
30 Apr 1987
TL;DR: The thesis is that there are no shortcuts in recognition, and a recognition methodology must pay substantial attention to each of the following five steps: conditioning, labeling, grouping, extracting, and matching.
Abstract: Computer recognition and inspection of objects is, in general, a complex procedure requiring a variety of steps that successively transform the iconic data into recognition information. We hypothesize that the difficulty today's computer vision and recognition technology has in handling unconstrained environments is due to the fact that the existing algorithms are specialized and do not develop one or more of the necessary steps to a high enough degree. Our thesis is that there are no shortcuts: a recognition methodology must pay substantial attention to each of the following five steps: conditioning, labeling, grouping, extracting, and matching.

Book ChapterDOI
01 Jan 1987
TL;DR: Following the human example, any artificial vision system should process information such that the results are invariant to the vagaries of the data acquisition process.
Abstract: Artificial Intelligence (AI) deals with the types of problem solving and decision making that humans continuously face in dealing with the world. Such activity involves by its very nature complexity, uncertainty, and ambiguity which can “distort” the phenomena (e.g., imagery) under observation. However, following the human example, any artificial vision system should process information such that the results are invariant to the vagaries of the data acquisition process.

Proceedings ArticleDOI
01 Jan 1987
TL;DR: The method used by the authors involves computing the Mellin transform of the CHF coefficients, generating an appropriate feature vector based on these objects, and comparing these vectors with a reference; if the test feature vector is near enough to the reference, a match is declared.
Abstract: INTRODUCTION Rotation invariant recognition of two-dimensional objects (e.g., images) can be done robustly using circular harmonic function (CHF) expansion coefficients.1-4 Scale (size) invariant recognition can be done using Mellin transforms.5 The two activities together can be combined into a single algorithm that efficiently enables scale and rotation invariant recognition. Such algorithms, whether based on CHFs or other methods, are described in the literature.6-7 The method used by the authors involves computing the Mellin transform of the CHF coefficients, generating an appropriate feature vector based on these objects, and comparing these vectors with a reference. If the test feature vector is near enough the reference then a match is declared. The procedure seems to be quite robust. Let us then begin with the premise that we can recognize an object in an image regardless of the orientation and scale of the object. How is this useful to us in recognizing real (i.e., three-dimensional) objects? A three-dimensional (3-D) object is much more complex than a two-dimensional
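A bare-bones sketch of the CHF side only (ring radius, sample count, and nearest-neighbour sampling are invented; the Mellin stage over log-radius is omitted): sample the intensity on a ring about the centroid and keep the magnitudes of its angular Fourier coefficients, which a rotation merely phase-shifts:

```python
import numpy as np

def chf_magnitudes(img, radius=8.0, n_theta=64):
    """Magnitudes of circular-harmonic (angular Fourier) coefficients of the
    intensity sampled on a ring about the intensity centroid. Centring
    removes translation; the modulus removes the rotation phase factor."""
    ys, xs = np.indices(img.shape)
    m = img.sum()
    cy, cx = (ys * img).sum() / m, (xs * img).sum() / m
    th = np.linspace(0.0, 2 * np.pi, n_theta, endpoint=False)
    ry = np.clip(np.round(cy + radius * np.sin(th)).astype(int), 0, img.shape[0] - 1)
    rx = np.clip(np.round(cx + radius * np.cos(th)).astype(int), 0, img.shape[1] - 1)
    return np.abs(np.fft.fft(img[ry, rx]))
```

Sampling several radii on a logarithmic scale and transforming along the log-radius axis as well would add the scale invariance that the Mellin step supplies in the paper.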

Proceedings ArticleDOI
27 Mar 1987
TL;DR: This paper uses a special approach to recognize the block pictures of Chinese characters by comparing their stochastic sectionalgrams, which are obtained from original samples, using the Markovian dynamic programming algorithm.

Abstract: This paper uses a special approach to recognize the block pictures of Chinese characters by comparing their stochastic sectionalgrams, which are obtained from original samples. To calculate the risk, the absolute value of the difference between the image-occurrence probabilities of corresponding quanta in two sectionalgrams is summed. One of these two sectionalgrams is derived from the input pattern and the other from the prototype pattern. The input pattern recognition rate is inversely proportional to the value of the risk. Markovian dynamic programming is used in this paper to check the risk. Furthermore, the circular layer code approach is used in the recognition system so that the recognition of the input object is independent of the object's input direction. Following the different types of quanta expression, two Markovian dynamic programming algorithms are presented in this paper to recognize the circular layer code pattern.

Proceedings ArticleDOI
27 Mar 1987
TL;DR: Some solutions for a simple domain of 2-D sticklike objects are offered, along with some aspects of the implementation that might be useful for shape indexing in other domains.

Abstract: The visual features most important for object recognition are those having to do with the shape of an object. In one approach to recognition, the image is decomposed into a set of relatively simple shapes (parts) and predicates that describe these parts and the relationships between them. Recognition is a matter of matching this parts-relations description to some parts-relations model in a large database of models. Matching is computationally intensive; unless the parts-relations representation is organized for efficient indexing, recognition times get intractably long. We review general requirements for such a representation as proposed by Marr and Nishihara[1]. Their prescription leaves open general questions that must be answered in any particular implementation. We offer some solutions for a simple domain of 2-D sticklike objects and point out some aspects of the implementation that might be useful for shape indexing in other domains.