
Showing papers on "Pose" published in 1990


Proceedings ArticleDOI
Homer H. Chen
04 Dec 1990
TL;DR: The author describes a polynomial method that, unlike previous methods, does not require prior knowledge about the location of the object.
Abstract: Consideration is given to a specific pose determination problem in which the sensory features are lines and the matched reference features are planes. The lines discussed are different from edge lines of an object in that they are not the intersection of boundary faces of the object. The author describes a polynomial method that, unlike previous methods, does not require prior knowledge about the location of the object. Closed-form solutions for orthogonal, coplanar, and parallel feature configurations of critical importance in real applications are derived. Basic findings concerning the necessary and sufficient conditions under which the pose determination problem can be solved are presented.
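
The constraints at the heart of such line-to-plane pose problems are easy to state. Below is a minimal numerical sketch of them, not the paper's closed-form polynomial method: a model plane (normal n, offset c) must contain a sensed line (point p, unit direction d) once the pose (R, t) is applied. All names and the rotation-vector parametrization are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

def residuals(x, lines, planes):
    """x = (rotation vector, translation); lines = [(p, d)] in the sensor
    frame; planes = [(n, c)] in the model frame, where n.X = c on the plane."""
    R, t = Rotation.from_rotvec(x[:3]).as_matrix(), x[3:]
    res = []
    for (p, d), (n, c) in zip(lines, planes):
        m = R @ n                      # plane normal mapped to the sensor frame
        res.append(m @ d)              # line direction must lie in the plane
        res.append(m @ p - c - m @ t)  # line point must lie on the plane
    return np.array(res)

# With enough line/plane pairs the six pose parameters are pinned down:
# sol = least_squares(residuals, np.zeros(6), args=(lines, planes))
```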

42 citations


Proceedings ArticleDOI
13 May 1990
TL;DR: An algorithm is proposed for pose estimation based on the volume measurement of tetrahedra composed of feature-point triplets extracted from an arbitrary quadrangular target and the lens center of the vision system that makes it a potential candidate for real-time robotic tasks.
Abstract: An algorithm is proposed for pose estimation based on the volume measurement of tetrahedra composed of feature-point triplets extracted from an arbitrary quadrangular target and the lens center of the vision system. Using a pinhole model (lens distortion is taken into account separately) and a quadrangular target, for which only the six distance measurements between all pairs of feature points are known, the complete pose is determined using an all-geometric closed-form solution for the six parameters of the pose (three translation components and three rotation components). This method has been tested using synthetic and real data and shown to be efficient, accurate, and robust. Its speed, in particular, makes it a potential candidate for real-time robotic tasks.
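
The quantity the method is built on is simple to compute. A hedged sketch follows, showing only the tetrahedron-volume measurement for one feature-point triplet under a pinhole model; the closed-form recovery of the six pose parameters from such volumes is the paper's contribution and is not reproduced here.

```python
import numpy as np

def ray(px, py, f):
    """Unit ray through pixel (px, py) for a pinhole camera of focal length f."""
    v = np.array([px, py, f], dtype=float)
    return v / np.linalg.norm(v)

def tetra_volume(rays, ranges):
    """Volume of the tetrahedron spanned by the lens center and the three
    points r_i * u_i: V = r1*r2*r3 * |det[u1 u2 u3]| / 6."""
    return np.prod(ranges) * abs(np.linalg.det(np.column_stack(rays))) / 6.0

# The six known inter-point distances of the quadrangular target constrain
# the unknown ranges r_i, which is what the closed-form solution exploits.
```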

39 citations


Proceedings ArticleDOI
04 Dec 1990
TL;DR: It is shown that curved planar objects have shape descriptors that are unaffected by the position, orientation and intrinsic parameters of the camera, which means that the pose of an object can be determined by backprojecting known conics.
Abstract: It is shown that curved planar objects have shape descriptors that are unaffected by the position, orientation and intrinsic parameters of the camera. These shape descriptors can be used to index quickly and efficiently into a large model base of curved planar objects, because their value is independent of pose and unaffected by perspective. Thus, recognition can proceed independent of calculating pose. Object curves are represented using conics, fitted with a technique that commutes with projection. This means that the pose of an object can be determined by backprojecting known conics. The authors show examples of recognition and pose determination using real image data.
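
To make the conic representation concrete, here is a hedged sketch of the two standard pieces of machinery: an algebraic least-squares conic fit (the paper's fitting technique is specifically chosen to commute with projection; a plain algebraic fit is shown only for illustration) and the rule by which a conic matrix transforms under a plane projective map.

```python
import numpy as np

def fit_conic(pts):
    """Fit ax^2 + bxy + cy^2 + dx + ey + f = 0 to Nx2 points; returns the
    symmetric 3x3 matrix C with X^T C X = 0 for X = (x, y, 1)."""
    x, y = pts[:, 0], pts[:, 1]
    A = np.column_stack([x*x, x*y, y*y, x, y, np.ones_like(x)])
    _, _, Vt = np.linalg.svd(A)            # smallest right singular vector
    a, b, c, d, e, f = Vt[-1]
    return np.array([[a, b/2, d/2], [b/2, c, e/2], [d/2, e/2, f]])

def transform_conic(C, H):
    """If points map as X' = H X, the conic matrix maps as C' = H^-T C H^-1;
    backprojection applies the inverse map to an image conic."""
    Hi = np.linalg.inv(H)
    return Hi.T @ C @ Hi
```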

32 citations


Book ChapterDOI
01 Jan 1990
TL;DR: In the context of computer vision, the recognition of three-dimensional objects typically consists of image capture, feature extraction, and object model matching.
Abstract: In the context of computer vision, the recognition of three-dimensional objects typically consists of image capture, feature extraction, and object model matching. During the image capture phase, a camera senses the brightness at regularly spaced points, or pixels, in the image. The brightness at these points is quantized into discrete values; the two-dimensional array of quantized values forms a digital image, the input to the computer vision system. During the feature extraction phase, various algorithms are applied to the digital image to extract salient features such as lines, curves, or regions. The set of these features, represented by a data structure, is then compared to the database of object model data structures in an attempt to identify the object. Clearly, the type of features that need to be extracted from the image depends on the representation of objects in the database.
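
As a hedged illustration of the data flow just described, the skeleton below wires the phases together; the feature extractor, its threshold, and the scoring function are all hypothetical stand-ins, not any particular system's design.

```python
import numpy as np

def extract_features(image):
    """Stand-in feature extractor: a crude gradient-magnitude edge map."""
    gy, gx = np.gradient(image.astype(float))
    return np.hypot(gx, gy) > 25.0          # arbitrary illustrative threshold

def recognize(image, model_db, score):
    """Compare the extracted feature set against every object model in the
    database and return the best-matching model."""
    feats = extract_features(image)
    return max(model_db, key=lambda model: score(feats, model))

# image: the 2-D array of quantized brightness values from the capture phase;
# model_db: the database of object model data structures; score: match quality.
```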

29 citations


Proceedings ArticleDOI
13 May 1990
TL;DR: It is shown that the approximate model can be used to build lookup tables for each of the triangles of a model and that they speed up the estimation of an object pose.
Abstract: An exact method for computing the position of a triangle in space from its image is presented. Also presented is an approximate method based on orthoperspective, an approximation of perspective which produces lower errors for off-center triangle images than scaled orthographic projection. A comparison is made of exact and approximate solutions for the triangle pose. This comparison identifies the combinations of image and triangle characteristics that are likely to generate the largest errors. Model-based pose estimation techniques which match image and model triangles require large numbers of matching operations in real-world applications. It is shown that the approximate model can be used to build lookup tables for each of the triangles of a model and that they speed up the estimation of an object pose.
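
For orientation, here is a hedged sketch of triangle pose under scaled orthographic projection, the baseline the paper improves on with orthoperspective. The algebra is the standard planar weak-perspective decomposition, not the authors' derivation, and the sign choice leaves the usual two-fold depth ambiguity.

```python
import numpy as np

def triangle_pose_weak_perspective(model_2d, image_2d):
    """model_2d: 3x2 triangle vertices in the model plane (z = 0);
    image_2d: 3x2 matching image points. Returns (scale, R), with the
    two-fold Necker ambiguity resolved toward the positive root."""
    # Affine map p = A @ P + b from the three correspondences.
    M = np.column_stack([model_2d, np.ones(3)])
    A = np.linalg.solve(M, image_2d)[:2].T      # 2x2 linear part
    a1, a2 = A[0], A[1]
    # Rows of the scaled rotation are s*(a_i/s, h_i) with |a_i|^2/s^2 + h_i^2 = 1
    # and orthogonality; eliminating h gives a quadratic in u = 1/s^2.
    q = np.roots([(a1 @ a1) * (a2 @ a2) - (a1 @ a2) ** 2,
                  -(a1 @ a1 + a2 @ a2), 1.0])
    u = min(r for r in q.real if r > 0)         # physical root
    s = 1.0 / np.sqrt(u)
    h1 = np.sqrt(max(0.0, 1 - (a1 @ a1) * u))
    h2 = np.sqrt(max(0.0, 1 - (a2 @ a2) * u))
    if a1 @ a2 > 0:                             # enforce h1*h2 = -(a1.a2)*u
        h2 = -h2
    r1 = np.append(a1 * np.sqrt(u), h1)
    r2 = np.append(a2 * np.sqrt(u), h2)
    return s, np.vstack([r1, r2, np.cross(r1, r2)])
```

Precomputing such solutions over sampled image configurations is, in spirit, what the per-triangle lookup tables amortize.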

23 citations


Journal ArticleDOI
01 Nov 1990
TL;DR: The multisensory information integration approach represents sensory information in a sensor-independent form and formulates an optimization problem whose solution minimizes the estimation error.
Abstract: A general approach is presented for the integration of vision, range, proximity, and touch sensory data to derive a better estimate of the position and orientation (pose) of an object appearing in the work space. Efficient and robust methods for analyzing vision and range data to derive an interpretation of input images are discussed. Vision information analysis includes a model-based object recognition module and an image-to-world coordinate transformation module to identify the three-dimensional (3-D) coordinates of the recognized objects. The range information processing includes modules for preprocessing, segmentation, and 3-D primitive extraction. The multisensory information integration approach represents sensory information in a sensor-independent form and formulates an optimization problem to find a minimum-error solution. The capabilities of a multisensor robotic system are demonstrated by performing a number of experiments using an industrial robot equipped with several sensors of differing types.
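
A hedged sketch of the fusion principle only: once each sensor's estimate is expressed in a sensor-independent form with an uncertainty, the minimum-error combination is the inverse-covariance-weighted mean. The paper's optimization is more general; this shows the idea for a vector quantity such as position.

```python
import numpy as np

def fuse(estimates, covariances):
    """Minimum-variance fusion of vector estimates x_i with covariances S_i:
    x* = (sum S_i^-1)^-1 (sum S_i^-1 x_i)."""
    info = sum(np.linalg.inv(S) for S in covariances)
    vec = sum(np.linalg.inv(S) @ x for x, S in zip(estimates, covariances))
    return np.linalg.solve(info, vec)

# e.g. vision, range, and touch each supply (position, covariance); the
# tighter sensors dominate the fused pose wherever they are confident.
```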

14 citations


Proceedings ArticleDOI
01 Mar 1990
TL;DR: A fast method for detecting the presence of known multi-colored objects in a scene based on the assumption that the color histogram of an image can contain object "signatures" which are invariant over a wide range of scenes and object poses is presented.
Abstract: Fast object recognition is critical for robots in the real world. However, geometry-based object recognition methods calculate the pose of the object as part of the recognition process and hence are inherently slow. As a result, they are not suitable for tasks such as searching for an object in a room. If pose calculation is eliminated from the process and a scheme is used that simply detects the likely presence of the object in a scene, considerable efficiency can be gained. This paper contains a discussion of the requirements of any searching task and presents a fast method for detecting the presence of known multi-colored objects in a scene. The method is based on the assumption that the color histogram of an image can contain object "signatures" which are invariant over a wide range of scenes and object poses. The resulting algorithm has been easily implemented in off-the-shelf hardware and used to build a robot system which can sweep its gaze over a room searching for an object.
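
The color-signature idea is commonly realized as histogram matching; the sketch below shows one plausible form, a 3-D color histogram compared by histogram intersection. The bin count and normalization are illustrative choices, not the paper's exact parameters.

```python
import numpy as np

def color_histogram(image, bins=8):
    """3-D color histogram of an HxWx3 uint8 image."""
    h, _ = np.histogramdd(image.reshape(-1, 3),
                          bins=(bins, bins, bins), range=[(0, 256)] * 3)
    return h

def intersection(image_hist, model_hist):
    """Match score in [0, 1]: the fraction of the model histogram that is
    also present in the image histogram."""
    return np.minimum(image_hist, model_hist).sum() / model_hist.sum()
```

Because the score needs no pose calculation, it can be evaluated over a sweep of views at frame rate, which is the efficiency argument the abstract makes.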

12 citations


Proceedings ArticleDOI
13 May 1990
TL;DR: In this article, the use of one or more monocular images to estimate the three-dimensional position of objects is investigated, where the identities of the objects are known, and geometric models are assumed to be available.
Abstract: The use of one or more monocular images to estimate the three-dimensional position of objects is investigated. The identities of the objects are known, and geometric models are assumed to be available. Linear features extracted from sensor data are interpreted as corresponding with model features by search of an interpretation tree built using prior position estimates. Object positions are updated by maximum-likelihood estimation. Position estimation results from an implemented system are presented, demonstrating the location of partially occluded objects in a cluttered scene.
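
A hedged sketch of interpretation-tree search in its generic form: image features are assigned to model features depth-first, pruning branches that fail a consistency test, with a wildcard branch for features left unmatched (e.g. by occlusion). The consistency test, the prior-position seeding, and the maximum-likelihood pose update are abstracted into stand-in callables.

```python
def interpret(image_feats, model_feats, consistent, pairing=()):
    """Depth-first enumeration of consistent image-to-model assignments.
    consistent(pairing, new_pair) stands in for the geometric tests; None
    plays the wildcard role for an unmatched image feature."""
    if len(pairing) == len(image_feats):
        yield pairing
        return
    feat = image_feats[len(pairing)]
    for m in list(model_feats) + [None]:
        pair = (feat, m)
        if m is None or consistent(pairing, pair):
            yield from interpret(image_feats, model_feats, consistent,
                                 pairing + (pair,))
```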

12 citations


Proceedings ArticleDOI
13 May 1990
TL;DR: The results of experiments which use the INGEN (inference engine for generic object recognition) system to guide a robot in the manipulation of postal objects are presented.
Abstract: Generic shape recognition is the problem of determining the pose and dimensions of objects for which only topological models are available (that is, the shape of the object is known but the size is unknown). One such application domain is the handling and sorting of postal objects. Because metrical information relating object features to one another is not available, the more common model-based approaches are inadequate. The INGEN (inference engine for generic object recognition) system uses a data-driven approach to recognize objects with generic shapes, such as parallelepipeds and cylinders, when the dimensions of the objects are unknown. Size and pose estimation are facilitated by a geometric reasoning process which extends objects in the direction away from the sensor until they physically contact other objects in the scene. The results of experiments which use the INGEN system to guide a robot in the manipulation of postal objects are presented.

8 citations


Proceedings ArticleDOI
01 Jan 1990
TL;DR: For complex planar objects, pose determination can be reduced to the simpler problem of pose determination for a pair of known planar conics, for which the available information does in fact determine the pose of the model.
Abstract: Projectively invariant shape descriptors efficiently identify instances of object models in images without reference to object pose. These descriptions rely on frame-independent representations of planar curves, using plane conics. We show that object pose can be determined from coplanar curves, given such a frame-independent representation. This result is demonstrated for real image data. The shape of objects in images changes as the camera is moved around. This extremely simple observation represents the dominant problem in model-based vision. Nielsen [4, 5] first suggested using projectively invariant labels as landmarks for navigation. Recent papers [1, 2] have shown that it is possible to compute shape descriptors of arbitrary plane objects that are unaffected by camera position. These descriptors are known as transformational invariants. At no stage in this process, however, is the pose of the model determined. In this paper, we show that the available information does in fact determine the pose of the model. In particular, for complex planar objects, pose determination can be reduced to the simpler problem of pose determination for a pair of known planar conics.
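
For concreteness, a hedged sketch of the classical pair-of-conics projective invariants this line of work builds on: after each 3x3 conic matrix is scaled to unit determinant, the traces of C1^-1 C2 and C2^-1 C1 are unchanged by any projective transformation of the plane.

```python
import numpy as np

def normalize(C):
    """Scale a conic matrix to unit determinant (cbrt keeps it real)."""
    return C / np.cbrt(np.linalg.det(C))

def conic_pair_invariants(C1, C2):
    """Projective invariants of a conic pair: under X' = H X the matrices map
    as C' = H^-T C H^-1, so C1^-1 C2 changes by a similarity transform and
    its trace is preserved once both conics are det-normalized."""
    C1, C2 = normalize(C1), normalize(C2)
    return (np.trace(np.linalg.inv(C1) @ C2),
            np.trace(np.linalg.inv(C2) @ C1))
```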

7 citations


Proceedings ArticleDOI
01 Feb 1990
TL;DR: This paper presents a method of recognizing planar objects in 3-D space from a single image by representing each object by its dominant points, and introduces a measure, known as sphericity, derived from an affine transform to indicate the quality of match among dominant points.
Abstract: Object recognition is a major theme in computer vision. In this paper, we present a method of recognizing planar objects in 3-D space from a single image. Objects in a scene may be occluded, and the orientation of the objects is arbitrary. We represent each object by its dominant points, and pose the recognition problem as a dominant-point matching problem. We introduce a measure, known as sphericity, derived from an affine transform to indicate the quality of match among dominant points. A clustering algorithm, probe-and-block, is used to guide the matching. We use a least squares fit among dominant points to estimate object location in the scene. A heuristic measure is finally computed to verify the match.
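
A hedged sketch of the two ingredients named in the abstract: the affine transform determined by a triplet of dominant-point correspondences, and a sphericity-style score. The exact definition of sphericity in the paper may differ; here it is taken as the singular-value ratio of the affine map's linear part, which equals 1 exactly when the map is a similarity.

```python
import numpy as np

def affine_from_triplet(src, dst):
    """Affine map (A, b) with dst_i = A @ src_i + b, from three 2-D pairs."""
    M = np.column_stack([src, np.ones(3)])
    X = np.linalg.solve(M, dst)
    return X[:2].T, X[2]

def sphericity(A):
    """1 for a pure similarity; smaller as the map grows more anisotropic."""
    s = np.linalg.svd(A, compute_uv=False)
    return s[-1] / s[0]
```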

Proceedings ArticleDOI
03 Jul 1990
TL;DR: In this paper, a method for determining the six degrees-of-freedom geometric relation between two objects is described based on utilizing a combination of a simple fibre-optic proximity sensor and robot motions.
Abstract: A method for determining the six degrees-of-freedom geometric relation between two objects is described. It is based on utilizing a combination of a simple fibre-optic proximity sensor and robot motions. The basic principle is to estimate the six translational and orientational parameters describing the geometric relation between two objects with Newton's method. Experimental tests show that the method is well suited to correcting robot motions for gripping errors.
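
The numerical core is ordinary Newton iteration on a residual vector built from the sensor readings. A hedged, generic sketch with a forward-difference Jacobian follows; the proximity-sensor model is application-specific and enters only through the residual function.

```python
import numpy as np

def newton(residual, x0, tol=1e-9, max_iter=50, eps=1e-6):
    """Solve residual(x) ~ 0 for the six pose parameters x by Newton steps.
    residual(x) must return a NumPy vector of sensor-prediction errors."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        r = residual(x)
        # Forward-difference Jacobian, one column per pose parameter.
        J = np.column_stack([(residual(x + eps * e) - r) / eps
                             for e in np.eye(len(x))])
        step = np.linalg.lstsq(J, -r, rcond=None)[0]
        x = x + step
        if np.linalg.norm(step) < tol:
            break
    return x
```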

Journal ArticleDOI
03 Jan 1990
TL;DR: An algorithm is presented which uses an enclosing object and the central moments for an initial estimate; a refined estimate is then obtained by a nonlinear least squares technique.
Abstract: This paper is concerned with determining the position and orientation of a three-dimensional object using computer vision. An algorithm is presented which uses an enclosing object and the central moments for an initial estimate. A refined estimate is then obtained by a nonlinear least squares technique.
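
A hedged sketch of the moment-based initial estimate (the enclosing-object construction and the nonlinear least squares refinement are not shown): the centroid comes from the first moments, the in-plane orientation from the second central moments.

```python
import numpy as np

def moment_estimate(mask):
    """mask: binary image of the object region. Returns the centroid and the
    principal-axis angle theta = 0.5 * atan2(2*mu11, mu20 - mu02)."""
    ys, xs = np.nonzero(mask)
    cx, cy = xs.mean(), ys.mean()
    mu20 = ((xs - cx) ** 2).mean()
    mu02 = ((ys - cy) ** 2).mean()
    mu11 = ((xs - cx) * (ys - cy)).mean()
    theta = 0.5 * np.arctan2(2 * mu11, mu20 - mu02)
    return (cx, cy), theta
```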

Journal Article
TL;DR: An approach for estimating the pose and motion of a known moving object in three dimensions from a sequence of monocular images is considered, with the ultimate goal of using the obtained estimates for controlling the movements of a robot arm.
Abstract: An approach for estimating the pose and motion of a known moving object in three dimensions from a sequence of monocular images is considered. The principle is to obtain initial estimates of the pose and motion parameters and to update them by using feature location measurements made from subsequent monocular image frames. The ultimate goal is to use the obtained estimates for controlling the movements of a robot arm.
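
The abstract does not name its estimator; a Kalman-style predict/update loop is a common realization of the same initialize-then-update principle, so the hedged sketch below uses a linear Kalman filter with an assumed motion model F and measurement model H.

```python
import numpy as np

def track(x, P, frames, measure, F, H, Q, Rm):
    """x: pose+motion state; P: its covariance; measure(frame) returns the
    feature-location measurements; Q, Rm: process and measurement noise."""
    for frame in frames:
        x, P = F @ x, F @ P @ F.T + Q              # predict with motion model
        z = measure(frame)                         # feature measurements
        S = H @ P @ H.T + Rm
        K = P @ H.T @ np.linalg.inv(S)             # Kalman gain
        x = x + K @ (z - H @ x)                    # correct with innovation
        P = (np.eye(len(x)) - K @ H) @ P
        yield x
```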

Dissertation
01 Jan 1990
TL;DR: The thesis discusses the difficulties that arise when an operator controls a remote robot to perform manipulation tasks, and how object localization techniques can be used together with other techniques such as predictor display and time desynchronization to help overcome these difficulties.
Abstract: Object localization and its applications in tele-autonomous systems are studied in this thesis. The thesis first gives a thorough investigation of the object localization problem and then presents two object localization algorithms together with the methods of extracting several important types of object features. The first algorithm is based on line-segment to line-segment matching. Line range sensors are used to extract line-segment features from an object; the features may be boundary edges of planar surfaces, the axes of cylindrical surfaces, conic surfaces, or other types of surfaces of revolution. The extracted features are matched to corresponding model features to compute the location of the object. The second algorithm is more general. Its inputs are not limited to line features: featured points (point-to-point matching) and featured unit direction vectors (vector-to-vector matching) can also be used, and there is no upper limit on the number of features input. The algorithm allows the use of redundant features to find a better solution. It uses dual number quaternions to represent the position and orientation of an object and uses the least squares optimization method to find an optimal solution for the object's location. The advantage of this representation is that the method solves for the location estimate by minimizing a single cost function associated with the sum of the orientation and position errors, and thus performs better, in both accuracy and speed, than other similar algorithms. The thesis discusses the difficulties that arise when an operator controls a remote robot to perform manipulation tasks. The main problems facing the operator are time delays in signal transmission and the uncertainties of the remote environment. How object localization techniques can be used together with other techniques, such as predictor display and time desynchronization, to help overcome these difficulties is then discussed. The thesis discusses two cases where object localization can help: (1) the case where direct manual control is used to perform a tele-manipulation task; (2) the case where the remote system has a certain degree of automation capability.
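
A hedged sketch of the central idea of a single joint cost. The thesis minimizes it over a dual-number-quaternion parametrization; a plain (R, t) parametrization is shown here purely to illustrate the objective being minimized.

```python
import numpy as np

def combined_cost(R, t, point_pairs, direction_pairs):
    """Single objective summing position errors over matched points and
    orientation errors over matched unit direction vectors, so that both
    components of the pose are estimated jointly rather than sequentially."""
    e = sum(np.sum((R @ p + t - q) ** 2) for p, q in point_pairs)
    e += sum(np.sum((R @ u - v) ** 2) for u, v in direction_pairs)
    return e
```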

Proceedings ArticleDOI
01 Mar 1990
TL;DR: A new pose estimation algorithm is proposed that takes advantage of all the geometric information inherent in the target and its image, recovering the vectors joining the effective focal point and each of the target points using an all-geometric closed-form unique solution.
Abstract: Pose estimation is an important operation for many robotic tasks. In this paper, we propose a new pose estimation algorithm. The inputs to this algorithm are the six distances joining all feature pairs and the image coordinates of the quadrangular target. The outputs of this algorithm are (1) the effective focal-length of the vision system, (2) the interior orientation parameters of the target, (3) the exterior orientation parameters of the camera with respect to an arbitrary coordinate system if the coordinates of the target are known in this frame, and (4) the final pose of the camera. The contribution of this method is the fast recovery of the vectors joining the effective focal-point and each of the target points using an all-geometric closed-form unique solution. Taking advantage of all the geometric information inherent in the target and its image, each of these vectors is recovered in six different ways. This redundancy is exploited in order to minimize the effect of random errors in the target sizing or in the recovery of its image coordinates. Knowing the relative position of the vision system frame with respect to a fixed coordinate system, the exterior orientation parameters are recovered in the form of a matrix transformation relating the fixed coordinate system to the target coordinate system. The decomposition of the latter matrix transformation into a translation and three rotations about the major axes provides the final pose of the camera.
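
The constraint system behind the vector recovery can be stated compactly. The hedged sketch below poses it numerically (the paper instead solves it geometrically, in closed form and in six redundant ways): the unknown ranges r_i along the known unit rays u_i must reproduce the six known inter-point distances.

```python
import numpy as np
from scipy.optimize import least_squares

def range_residuals(r, rays, dists):
    """rays: list of four unit rays u_i from the focal point through the
    target's image points; dists[(i, j)] = known distance between target
    points i and j. Residuals vanish when r_i * u_i reproduce the target."""
    return [np.linalg.norm(r[i] * rays[i] - r[j] * rays[j]) - d
            for (i, j), d in dists.items()]

# r = least_squares(range_residuals, np.ones(4), args=(rays, dists)).x
# The camera-frame points are then r_i * u_i, and the exterior orientation
# follows by rigid alignment with the target coordinate system.
```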

Proceedings ArticleDOI
01 Jun 1990
TL;DR: The number of considered pairs of image and model features is reduced by selecting at random only a few of all the possible image features and matching them to appropriate model features.
Abstract: We present a method for recognizing polyhedral objects from range images. An object is said to be recognized as one of the models of a library of object models when many features of the model can be made to match the features of the observed object by the same rotation-translation transformation (the object pose). In the proposed approach, the number of considered pairs of image and model features is reduced by selecting at random only a few of all the possible image features and matching them to appropriate model features. The rotation and translation required for each match are computed, and a robust LMS (Least Median of Squares) method is applied to determine clusters in translation and rotation spaces. The validity of the object pose suggested by the clusters is verified by a similarity measure which evaluates how well a model in the suggested pose would fit the original range image. The pose estimation and verification are performed for all models in the model library. The recognized model is the model which yields the smallest value of the similarity measure, and the pose of the object is found in the process.
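
A hedged sketch of the robust selection step only: among candidate poses hypothesized from random feature pairings, least median of squares prefers the pose whose median squared residual is smallest, which tolerates up to half the matches being wrong.

```python
import numpy as np

def least_median_pose(candidates, residuals):
    """candidates: iterable of (R, t) hypotheses; residuals(R, t) returns a
    vector of per-feature fit errors against the range image."""
    return min(candidates,
               key=lambda pose: np.median(residuals(*pose) ** 2))
```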

01 Jan 1990
TL;DR: A stochastic method is proposed which treats camera positioning as a Markov process, allowing simulated annealing to be applied for optimization; the characteristic view concept is used to speed up the search.
Abstract: One approach to model-based computer vision as used for recognition is to store a database of wireframe models and then compare these to some digitized image of a scene. For even a single object, how one should effectively match an object model to an image remains an open question. We restrictively define recognition as determining the presence or absence of an object, and determining its pose--position and orientation fixed by six spatial parameters of translation and rotation. These parameters are incorporated into a camera model which sets the viewpoint of an observer and obtains a 2-D perspective projection of the object model. One objective is to efficiently position the camera such that the model's projection coincides with an object's digitized image--an object of unknown pose. Several researchers have described the above as an optimization problem and have proposed various deterministic solutions which are all somewhat limited in effectiveness. We propose a stochastic method which considers camera positioning as a Markov process, hence allowing simulated annealing to be applied for optimization. The objective function is a least squares distance between model projection and input image vertices. In addition, to speed up the search we employ the characteristic view concept. A "qualitative" match is first performed by checking that the number of visible object faces agree between model and image. An underlying philosophy (inspired by the Marr paradigm), that long-term computer vision advances will come mainly from biologically plausible algorithms, spawns discussions that examine links between human and machine vision.
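
A hedged sketch of the annealing loop itself: a random walk over the six pose parameters that always accepts downhill moves and accepts uphill ones with probability exp(-dE/T) under a geometric cooling schedule. The cost function (least-squares distance between projected model vertices and image vertices) and all schedule constants are stand-ins.

```python
import numpy as np

def anneal(cost, x0, T0=1.0, cooling=0.995, steps=20000, scale=0.05,
           rng=np.random.default_rng()):
    """Simulated annealing over a pose vector x (3 rotation + 3 translation)."""
    x = np.array(x0, dtype=float)
    e, T = cost(x), T0
    for _ in range(steps):
        cand = x + rng.normal(scale=scale, size=x.shape)  # Markov proposal
        de = cost(cand) - e
        if de < 0 or rng.random() < np.exp(-de / T):      # Metropolis rule
            x, e = cand, e + de
        T *= cooling                                      # cool the schedule
    return x
```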

Proceedings ArticleDOI
01 Jan 1990
TL;DR: The relational pyramid representation is a hierarchical relational description of an object that can be used in recognition and pose estimation: primitive features appear at the bottom of the pyramid, relations among primitives appear at level one, and relations among lower-level structures appear at levels above those in which the structures were defined.
Abstract: The relational pyramid representation is a hierarchical relational description of an object that can be used in recognition and pose estimation. In this representation, primitive features appear at the bottom of the pyramid, relations among primitives appear at level one, and, in general, relations among lower-level structures appear at the levels above the levels in which these structures were defined. A pose-estimation system is constructed that uses the relational pyramid representation for the view classes of a 3D object and for the description of the unknown view of an object extracted from an image. A previously described system uses summary information to rank and select view classes to which the unknown view is compared. This paper describes a best-first search procedure to find correspondences between the selected view class pyramid and the image pyramid.
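
A hedged sketch of generic best-first search of the kind described, using a priority queue over partial correspondences between the view-class pyramid and the image pyramid; the expansion and scoring of partial matches are pyramid-specific and appear as stand-in callables.

```python
import heapq, itertools

def best_first(start, expand, is_goal, score):
    """Pop the best-scoring partial correspondence first; expand(state)
    yields extensions of a partial match, score(state) is lower-is-better."""
    tie = itertools.count()                    # break score ties stably
    heap = [(score(start), next(tie), start)]
    while heap:
        _, _, state = heapq.heappop(heap)
        if is_goal(state):
            return state
        for nxt in expand(state):
            heapq.heappush(heap, (score(nxt), next(tie), nxt))
    return None
```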

01 Mar 1990
TL;DR: Two object localization algorithms are presented together with the methods of extracting several important types of object features, and the resulting method performs better, in both accuracy and speed, than other similar algorithms.
Abstract: Object localization and its application in tele-autonomous systems are studied. Two object localization algorithms are presented together with the methods of extracting several important types of object features. The first algorithm is based on line-segment to line-segment matching. Line range sensors are used to extract line-segment features from an object. The extracted features are matched to corresponding model features to compute the location of the object. The inputs of the second algorithm are not limited to line features: featured points (point to point matching) and featured unit direction vectors (vector to vector matching) can also be used, and there is no upper limit on the number of features input. The algorithm allows the use of redundant features to find a better solution. It uses dual number quaternions to represent the position and orientation of an object and uses the least squares optimization method to find an optimal solution for the object's location. The advantage of this representation is that the method solves for the location estimate by minimizing a single cost function associated with the sum of the orientation and position errors, and thus performs better, in both accuracy and speed, than other similar algorithms. The difficulties that arise when an operator controls a remote robot to perform manipulation tasks are also discussed. The main problems facing the operator are time delays in signal transmission and the uncertainties of the remote environment. How object localization techniques can be used together with other techniques, such as predictor display and time desynchronization, to help overcome these difficulties is then discussed.

Proceedings ArticleDOI
01 Jan 1990
TL;DR: In this article, the authors describe an overall system architecture for the types of visual surveillance tasks that RAMBO is to perform, and then provide an overview of their research in vision and planning.
Abstract: This paper describes research being performed in the area of visual surveillance. In the context of a project we call RAMBO - for Robot Acting on Moving Bodies - we are developing algorithms for object pose estimation, object motion estimation, and visual planning. Many of these algorithms are massively parallel algorithms that have been implemented on a Connection Machine II. We describe an overall system architecture for the types of visual surveillance tasks that RAMBO is to perform, and then provide an overview of our research in vision and planning.

Journal ArticleDOI
TL;DR: A formal language, developed earlier, for describing objects represented as edges in terms of arcs parametrized by knot angle, curvature, and length is now expressed in the context of a novel structure referred to as the arc space, which is shown to have properties important in supporting machine vision tasks efficiently with respect to both time and space.