
Showing papers in "CVGIP: Image Understanding" in 1991


Journal ArticleDOI
TL;DR: A model of deformation which solves some of the problems encountered with the original method of energy-minimizing curves and makes the curve behave like a balloon which is inflated by an additional force.
Abstract: The use of energy-minimizing curves, known as “snakes,” to extract features of interest in images has been introduced by Kass, Witkin & Terzopoulos (Int. J. Comput. Vision 1, 1987, 321–331). We present a model of deformation which solves some of the problems encountered with the original method. The external forces that push the curve to the edges are modified to give more stable results. The original snake, when it is not close enough to contours, is not attracted by them and straightens to a line. Our model makes the curve behave like a balloon which is inflated by an additional force. The initial curve need no longer be close to the solution to converge. The curve passes over weak edges and is stopped only if the edge is strong. We give examples of extracting a ventricle in medical images. We have also made a first step toward 3D object reconstruction, by tracking the extracted contour on a series of successive cross sections.
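A minimal sketch of the balloon idea described above: at each iteration a discretized closed contour is updated by an internal smoothing term, an image-based edge force, and an outward pressure along the contour normals. This is an illustrative explicit update, not the paper's finite-element formulation; the function names, weights, and toy contour are invented for the example.

```python
import numpy as np

def balloon_step(pts, edge_force, k_inflate=0.5, k_smooth=0.2):
    """One explicit update of a closed 2-D contour (N x 2 array of vertices).

    edge_force : callable mapping the N x 2 vertices to an N x 2 image force
    k_inflate  : weight of the outward "balloon" pressure
    k_smooth   : weight of the internal smoothing term
    """
    # Internal force: pull each vertex toward the midpoint of its neighbours.
    smooth = 0.5 * (np.roll(pts, 1, axis=0) + np.roll(pts, -1, axis=0)) - pts

    # Outward normals of a counterclockwise closed polygon (rotated tangents).
    tangents = np.roll(pts, -1, axis=0) - np.roll(pts, 1, axis=0)
    normals = np.column_stack([tangents[:, 1], -tangents[:, 0]])
    normals /= np.linalg.norm(normals, axis=1, keepdims=True) + 1e-12

    # The pressure term inflates the curve; a strong edge force should stop it.
    return pts + k_smooth * smooth + k_inflate * normals + edge_force(pts)

# Toy usage: with no image force, a small circle simply inflates.
theta = np.linspace(0, 2 * np.pi, 50, endpoint=False)
contour = np.column_stack([np.cos(theta), np.sin(theta)])
for _ in range(10):
    contour = balloon_step(contour, edge_force=lambda p: np.zeros_like(p))
print(np.linalg.norm(contour, axis=1).mean())   # mean radius has grown past 1
```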

2,432 citations


Journal ArticleDOI
TL;DR: In this paper, the problem of estimating the position and orientation of an object is formulated as an optimization problem using dual number quaternions, and is solved by minimizing a single cost function associated with the sum of the orientation and position errors, which is expected to give better estimation performance in both accuracy and speed.
Abstract: This paper describes a new algorithm for estimating the position and orientation of objects. The problem is formulated as an optimization problem using dual number quaternions. The advantage of using this representation is that the method solves for the location estimate by minimizing a single cost function associated with the sum of the orientation and position errors and thus is expected to give better estimation performance, both in accuracy and in speed. Several forms of sensory information can be used by the algorithm. That is, the measured data can be a combination of measured points on an object’s surfaces and measured unit direction vectors located on the object. Simulations have been carried out on a Compaq 386/20 computer and the simulation results are analyzed.
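The "single cost function" idea can be illustrated with a generic nonlinear least-squares solver, mixing point-position residuals and direction-vector residuals in one objective. The sketch below uses an ordinary rotation-vector-plus-translation parameterization, not the paper's dual number quaternions, and all data and weights are synthetic.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.spatial.transform import Rotation

def combined_cost(x, model_pts, measured_pts, model_dirs, measured_dirs):
    """Single cost mixing position error (on points) and orientation error
    (on unit direction vectors), solved for jointly rather than separately."""
    R = Rotation.from_rotvec(x[:3]).as_matrix()
    t = x[3:]
    e_pos = measured_pts - (model_pts @ R.T + t)    # point residuals
    e_dir = measured_dirs - model_dirs @ R.T        # direction residuals
    return np.sum(e_pos**2) + np.sum(e_dir**2)

# Toy data: a known pose, then recover it by minimizing the combined cost.
rng = np.random.default_rng(0)
R_true = Rotation.from_rotvec([0.1, -0.2, 0.3])
t_true = np.array([1.0, 2.0, -0.5])
model_pts = rng.normal(size=(6, 3))
model_dirs = rng.normal(size=(4, 3))
model_dirs /= np.linalg.norm(model_dirs, axis=1, keepdims=True)
measured_pts = model_pts @ R_true.as_matrix().T + t_true
measured_dirs = model_dirs @ R_true.as_matrix().T

res = minimize(combined_cost, x0=np.zeros(6),
               args=(model_pts, measured_pts, model_dirs, measured_dirs))
print(res.x[:3], res.x[3:])   # should be close to [0.1, -0.2, 0.3] and [1.0, 2.0, -0.5]
```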

346 citations


Journal ArticleDOI
TL;DR: It is found that phase signals are occasionally very sensitive to spatial position and to variations in scale, in which cases incorrect measurements occur, and the primary cause for this instability is the existence of singularities in phase signals.
Abstract: The measurement of image disparity is a fundamental precursor to binocular depth estimation. Recently, Jenkin and Jepson (in Computational Processes in Human Vision (V. Pylyshyn, Ed.), Ablex, New Jersey, 1988) and Sanger (Biol. Cybernet, 59, 1988 , 405–418) described promising methods based on the output phase behavior of bandpass Gabor filters. Here we discuss further justification for such techniques based on the stability of bandpass phase behavior as a function of typical distortions that exist between left and right views. In addition, despite this general stability, we show that phase signals are occasionally very sensitive to spatial position and to variations in scale, in which cases incorrect measurements occur. We find that the primary cause for this instability is the existence of singularities in phase signals. With the aid of the local frequency of the filter output (provided by the phase derivative) and the local amplitude information, the regions of phase instability near the singularities are detected so that potentially incorrect measurements can be identified. In addition, we show how the local frequency can be used away from the singularity neighbourhoods to improve the accuracy of the disparity estimates. Some experimental results are reported.

315 citations


Journal ArticleDOI
TL;DR: In this article, the skeleton of a continuous shape is approximated from the Voronoi diagram of points sampled along the shape boundary, and the regeneration error is bounded in terms of the sampling density and the regularity parameter.
Abstract: The skeleton of a continuous shape can be approximated from the Voronoi diagram of points sampled along the shape boundary. To bound the error of this approximation, one must relate the spatial complexity of the shape to the boundary sampling density. The regular set model of mathematical morphology provides a practical basis to establish such a relationship. Given a binary image shape, we exhibit a corresponding continuous, regular shape such that the sequence of points describing its boundary constitutes a sufficiently dense sampling for an accurate skeleton approximation. Additionally, we bound the regeneration error from the sampling density and the regularity parameter. This approach opens significant new possibilities for shape analysis by the exact, Euclidean skeleton. As a simple example, we describe how the skeleton can be refined by pruning, without introducing significant error in the regenerated image.
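The basic construction can be sketched directly with a Voronoi routine: sample the boundary, build the Voronoi diagram of the samples, and keep the Voronoi vertices that fall inside the shape as the skeleton approximation. The pruning and error bounds that are the paper's contribution are not reproduced here; the ellipse and sampling density are arbitrary choices.

```python
import numpy as np
from scipy.spatial import Voronoi

# Boundary samples of a simple shape (an ellipse with half-axes 2 and 1);
# denser sampling gives a better skeleton approximation.
theta = np.linspace(0, 2 * np.pi, 400, endpoint=False)
boundary = np.column_stack([2.0 * np.cos(theta), np.sin(theta)])

# Voronoi diagram of the boundary samples; the skeleton is approximated by
# the Voronoi vertices that fall inside the shape.
vor = Voronoi(boundary)
vx, vy = vor.vertices[:, 0], vor.vertices[:, 1]
inside = (vx / 2.0) ** 2 + vy ** 2 < 1.0      # analytic inside test for the ellipse
skeleton = vor.vertices[inside]
print(skeleton.shape)   # points clustered along the major axis (the medial axis)
```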

300 citations


Journal ArticleDOI
TL;DR: Much attention is given to specific methods for building openings and closings (very special classes of idempotent operators) in an economical way; in particular, annular openings and inf-overfilters are studied.
Abstract: For Part I, see ibid., Vol. 50, pp. 245–295, 1990. In Part I the authors introduced and investigated an abstract algebraic framework for mathematical morphology. The main assumption is that the object space is a complete lattice. Of interest are all (increasing) operators which are invariant under a given abelian group of automorphisms on the lattice. In Part I the authors were mainly concerned with the basic operations dilation and erosion. In this paper they concentrate on openings and closings, which are very special classes of idempotent operators. Much attention is given to specific methods for building openings and closings in an economical way; in particular they study annular openings and inf-overfilters. They also consider the possibility of generating new openings by iteration of anti-extensive operators. Some examples illustrate the abstract theory.
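For the familiar special case of binary images with a translation-invariant structuring element, an opening is an erosion followed by a dilation and a closing is the reverse, and both are idempotent. A small sketch of that concrete case (not the general lattice framework of the paper):

```python
import numpy as np
from scipy import ndimage

img = np.zeros((64, 64), dtype=bool)
img[20:44, 20:44] = True           # a square object
img[30, 10:54] = True              # a thin line the opening should remove

se = np.ones((5, 5), dtype=bool)   # structuring element

# Opening: erosion then dilation. Idempotent and anti-extensive.
opened = ndimage.binary_dilation(ndimage.binary_erosion(img, se), se)
# Closing: dilation then erosion. Idempotent and extensive.
closed = ndimage.binary_erosion(ndimage.binary_dilation(img, se), se)

assert np.array_equal(
    opened,
    ndimage.binary_dilation(ndimage.binary_erosion(opened, se), se),
)  # idempotence: opening the opening changes nothing
```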

231 citations


Journal ArticleDOI
TL;DR: The concept of a surviving cell, defined as a local maximum of an interest operator, is introduced for the adaptive pyramid; a root is then defined as a particular surviving cell, and a top-down process is presented in the context of shape decomposition.
Abstract: The adaptive pyramid is a new framework for 2D image analysis. It is based on the principle of local evidence accumulation for global interpretation. The adaptive structure of this pyramid is a new approach to reduced-resolution representation. We introduce the concept of a surviving cell as a local maximum of an interest operator. A root is then defined as a particular surviving cell. Segmentation examples are shown for different images. The limits of this technique are presented on an example of a highly textured image. A particular top-down process is presented in the context of shape decomposition.

183 citations


Journal ArticleDOI
TL;DR: This method is an extension to the 3D case of the optimal 2D edge detector recently introduced by R. Deriche and presents better theoretical and experimental performance than some classical approaches used to date.
Abstract: This paper proposes a new algorithm for three-dimensional edge detection. This method is an extension to the 3D case of the optimal 2D edge detector recently introduced by R. Deriche (Int. J. Comput. Vision 1, 1987). It presents better theoretical and experimental performance than some classical approaches used to date. Experimental results obtained on magnetic resonance images and on echographic images are shown. We stress that this approach can be used to detect edges in other multidimensional data, for instance 2D-t or 3D-t images.
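As a rough stand-in for such a separable 3-D smoothing/derivative filter, one can compute a smoothed 3-D gradient magnitude and threshold it. The sketch below uses Gaussian derivatives from scipy.ndimage rather than the recursive Deriche filter, so it illustrates only the structure of the computation, not its efficiency; the threshold rule and test volume are arbitrary.

```python
import numpy as np
from scipy import ndimage

def edges_3d(volume, sigma=1.5, threshold=None):
    """Smoothed 3-D gradient magnitude (Gaussian derivatives as a stand-in
    for a separable optimal smoothing/derivative filter), then a threshold."""
    gx = ndimage.gaussian_filter(volume, sigma, order=(0, 0, 1))
    gy = ndimage.gaussian_filter(volume, sigma, order=(0, 1, 0))
    gz = ndimage.gaussian_filter(volume, sigma, order=(1, 0, 0))
    mag = np.sqrt(gx**2 + gy**2 + gz**2)
    if threshold is None:
        threshold = mag.mean() + 2 * mag.std()
    return mag > threshold

# Toy volume: a bright ball in a dark background.
z, y, x = np.mgrid[:32, :32, :32]
vol = ((x - 16)**2 + (y - 16)**2 + (z - 16)**2 < 8**2).astype(float)
print(edges_3d(vol).sum())    # voxels flagged near the sphere's surface
```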

142 citations


Journal ArticleDOI
TL;DR: An efficient, non-iterative, polynomial-time approximation algorithm which minimizes the proximal uniformity cost function and establishes correspondence over n frames is proposed; it combines the qualities of the gradient-based and token-based methods for motion correspondence.
Abstract: Given n frames taken at different time instants and m points in each frame, the problem of motion correspondence is to map a point in one frame to another point in the next frame such that no two points map onto the same point. This problem is combinatorially explosive; one needs to introduce constraints to limit the search space. We propose a proximal uniformity constraint to solve the correspondence problem. According to this constraint, most objects in the real world follow smooth paths and cover a small distance in a small time. Therefore, given a location of a point in a frame, its location in the next frame lies in the proximity of its previous location. Further, resulting trajectories are smooth and uniform and do not show abrupt changes in velocity vector over time. An efficient, non-iterative polynomial time approximation algorithm which minimizes the proximal uniformity cost function and establishes correspondence over n frames is proposed. It is argued that any method using smoothness of motion alone cannot operate correctly without assuming correct initial correspondence, the correspondence in the first two frames. Therefore, we propose the use of gradient based optical flow for establishing the initial correspondence. This way the proposed approach combines the qualities of the gradient and token based methods for motion correspondence. The algorithm is then extended to take care of restricted cases of occlusion. A metric called distortion measure for measuring the goodness of solution to this n frame correspondence problem is also proposed. The experimental results for real and synthetic sequences are presented to support our claims.
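The flavor of the proximal uniformity constraint can be sketched as a frame-to-frame assignment problem whose cost penalizes both displacement (proximity) and change of velocity (uniformity). The greedy, two-frame-at-a-time version below only illustrates that cost, not the paper's n-frame algorithm or its occlusion handling; all weights and points are synthetic.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_frame(prev_pts, prev_vel, curr_pts, w_prox=1.0, w_unif=1.0):
    """Assign each point in the previous frame to one point in the current
    frame, minimizing displacement (proximity) plus change in velocity
    (uniformity)."""
    disp = curr_pts[None, :, :] - prev_pts[:, None, :]      # m x m x 2 displacements
    cost = (w_prox * np.linalg.norm(disp, axis=2)
            + w_unif * np.linalg.norm(disp - prev_vel[:, None, :], axis=2))
    rows, cols = linear_sum_assignment(cost)                 # one-to-one matching
    return cols, curr_pts[cols] - prev_pts                   # assignment and new velocities

# Toy usage over three frames of two points moving on smooth paths.
f0 = np.array([[0.0, 0.0], [5.0, 0.0]])
f1 = np.array([[1.0, 0.1], [6.0, -0.1]])
f2 = np.array([[2.0, 0.2], [7.0, -0.2]])
vel0 = np.zeros_like(f0)                 # initial correspondence assumed known
match01, vel1 = match_frame(f0, vel0, f1)
match12, vel2 = match_frame(f1[match01], vel1, f2)
print(match01, match12)
```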

119 citations


Journal ArticleDOI
TL;DR: In this article, an algorithm for recovering the six degrees of freedom of motion of a vehicle from a sequence of range images of a static environment taken by a range camera rigidly attached to the vehicle is described.
Abstract: An algorithm is described for recovering the six degrees of freedom of motion of a vehicle from a sequence of range images of a static environment taken by a range camera rigidly attached to the vehicle. The technique utilizes a least-squares minimization of the difference between the measured rate of change of elevation at a point and the rate predicted by the so-called elevation rate constraint equation. It is assumed that most of the surface is smooth enough so that local tangent planes can be constructed, and that the motion between frames is smaller than the size of most features in the range image. This method does not depend on the determination of correspondences between isolated high-level features in the range images. The algorithm has been successfully applied to data obtained from the range imager on the Autonomous Land Vehicle (ALV). Other sensors on the ALV provide an initial approximation to the motion between frames. It was found that the outputs of the vehicle sensors themselves are not suitable for accurate motion recovery because of errors in dead reckoning resulting from such problems as wheel slippage. The sensor measurements are used only to approximately register range data. The algorithm described here then recovers the difference between the true motion and that estimated from the sensor outputs.

116 citations


Journal ArticleDOI
TL;DR: A new definition of disparity is presented that is tied to the interocular phase difference in bandpass versions of the monocular images, and how this technique surmounts some of the difficulties encountered by current disparity detection mechanisms is shown.
Abstract: Many different approaches have been suggested for the measurement of structure in space from spatially separated cameras. In this report we critically examine some of these techniques. Through a series of examples we show that none of the current mechanisms of disparity measurement are particularly robust. By considering some of the implications of disparity in the frequency domain, we present a new definition of disparity that is tied to the interocular phase difference in bandpass versions of the monocular images. Finally, we present a new technique for measuring disparity as the local phase difference between bandpass versions of the two images, and we show how this technique surmounts some of the difficulties encountered by current disparity detection mechanisms.
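In one dimension the phase-difference definition reduces to: bandpass both signals with a complex Gabor filter and divide the wrapped interocular phase difference by the filter's tuning frequency. A minimal sketch under that reading (the filter parameters and test signal are arbitrary):

```python
import numpy as np

def gabor_phase(signal, freq=0.1, sigma=8.0):
    """Complex Gabor filtering of a 1-D signal; returns the local phase."""
    x = np.arange(-4 * sigma, 4 * sigma + 1)
    gabor = np.exp(-x**2 / (2 * sigma**2)) * np.exp(1j * 2 * np.pi * freq * x)
    response = np.convolve(signal, gabor, mode="same")
    return np.angle(response)

# A random 'left' signal and a 'right' signal shifted by a known disparity.
rng = np.random.default_rng(1)
left = rng.normal(size=512)
true_disp = 3
right = np.roll(left, true_disp)          # right[x] = left[x - true_disp]

freq = 0.1
phase_l, phase_r = gabor_phase(left, freq), gabor_phase(right, freq)
# With right[x] = left[x - d], locally phase_l - phase_r ~ d * (2*pi*freq).
dphi = np.angle(np.exp(1j * (phase_l - phase_r)))   # wrap to (-pi, pi]
disparity = dphi / (2 * np.pi * freq)
print(np.median(disparity))   # close to the true shift of 3, except near phase singularities
```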

114 citations


Journal ArticleDOI
TL;DR: A new algorithm, hierarchical basis conjugate gradient descent, is used to provide a faster solution to the shape from shading problem, similar to the multigrid techniques that have previously been used to speed convergence, but it does not require heuristic approximations to the true irradiance equation.
Abstract: Extracting surface orientation and surface depth from a shaded image is one of the classic problems in computer vision. Many previous algorithms either violate integrability, i.e., the surface normals do not correspond to a feasible surface, or use regularization, which biases the solution away from the true answer. A recent iterative algorithm proposed by Horn overcomes both of these limitations but converges slowly. This paper uses a new algorithm, hierarchical basis conjugate gradient descent, to provide a faster solution to the shape from shading problem. This approach is similar to the multigrid techniques that have previously been used to speed convergence, but it does not require heuristic approximations to the true irradiance equation. The paper compares the accuracy and the convergence rates of the new techniques to previous algorithms.

Journal ArticleDOI
TL;DR: Filter parameters and performance criteria are presented for several designs, and experimental results are presented on a variety of images which demonstrate the behavior in the presence of very adverse noise, with respect to scale, and as compared to other “optimal” IIR filters which have been reported.
Abstract: We present formal optimality criteria and a complete design methodology for a family of zero crossing based, infinite impulse response (recursive) edge detection filters. In particular, we adapt the optimality criteria proposed by Canny (IEEE Trans. Pattern Anal. Mach. Intelligence PAMI-8, 1986, 679–714) to filters designed to respond with a zero crossing in the output at an edge location and additionally to impulse responses which are (allowed to be) infinite in extent. The spurious response criterion is captured directly by an appropriate measure of filter spatial extent for infinite responses. Infinite duration impulse responses may be implemented efficiently with recursive filtering techniques and so require constant computation time with respect to scale. As we will show, we can achieve both superior performance and increased speed by designing directly for an infinite impulse response rather than by any of the proposed finite duration approaches. We also show that the optimal filter which responds with a zero crossing in its output may not be implemented by designing the optimal peak responding filter (similar to Canny) and taking an additional derivative. It is necessary to formulate the criteria and design for a zero crossing response from the outset, else optimality is sacrificed. Filter parameters and performance criteria are presented for several designs, and experimental results are presented on a variety of images which demonstrate the behavior in the presence of very adverse noise, with respect to scale, and as compared to other “optimal” IIR filters which have been reported.

Journal ArticleDOI
TL;DR: This paper reviews the different approaches developed to estimate motion parameters from a sequence of two range images and gives the mathematical formulation of the problem along with the various modifications by different investigators to adapt the formulation to their algorithms.
Abstract: The estimation of motion of a moving object from a sequence of images is of prime interest in computer vision. This paper reviews the different approaches developed to estimate motion parameters from a sequence of two range images. We give the mathematical formulation of the problem along with the various modifications by different investigators to adapt the formulation to their algorithms. The shortcomings and the advantages of each method are also briefly mentioned. The methods are divided according to the type of feature used in the motion estimation task. We address the representational and the computational issues for each of the methods described. Most of the earlier approaches used local features such as corners (points) or edges (lines) to obtain the transformation. Local features are sensitive to noise and quantization errors. This causes uncertainties in the motion estimation. Using global features, such as surfaces, makes the procedure of motion computation more robust at the expense of making the procedure very complex. A common error is assuming that the best affine transform is the best estimate of the desired motion, which in general is false. It is important to make the distinction between the motion transform and the general affine transform, since an affine transform may not be realized physically by a rigid object.

Journal ArticleDOI
TL;DR: In this article, the authors aim to provoke discussion and actions that may lead to corrections in the field of computer vision, arguing that active researchers can make it a balanced science applicable in many disparate areas by following the research approaches used in most successful applied scientific fields.
Abstract: After a period of tremendous excitement and enthusiasm, many industrial people and researchers are disenchanted with computer vision, and others are certainly much less enthusiastic about it. The time has come to regroup. To restore the upward trend of our field, critical introspection followed by serious corrective action is required. Active researchers in computer vision can make it a balanced science that can be applied in many disparate areas by following research approaches used in most of the successful applied scientific fields. Our aim in this paper is to provoke discussion and actions that may lead to corrections in our favorite research field.

Journal ArticleDOI
TL;DR: Three new fast, fully parallel 2-D thinning algorithms using reduction operators with 11-pixel supports are presented and evaluated; their parallel computation times are shown to closely approach or surpass estimates of the best possible parallel computation time, and in this sense the algorithms are near-optimally fast.
Abstract: Three new fast fully parallel 2-D thinning algorithms using reduction operators with 11-pixel supports are presented and evaluated. These are compared to earlier fully parallel thinning algorithms in tests on artificial and natural images; the new algorithms produce either superior parallel computation time (number of parallel iterations) or thinner medial curve results with comparable parallel computation time. Further, estimates of the best possible parallel computation time are developed and applied to the specific test sets used. The parallel computation times of the new algorithms and one earlier algorithm are shown to approach closely or surpass these estimates and are in this sense near optimally fast.

Journal ArticleDOI
TL;DR: A new accumulator space formulation for the whole line accumulation approach is described which fulfills an important property for robustness: the constancy of the detectability of a vanishing point whatever its location in the image plane.
Abstract: The extraction of 3-dimensional information from an image is a central problem in computer vision. Location of the vanishing point of a line in the image provides its 3-dimensional direction. This paper is concerned with the detection of the vanishing points in an image. A new accumulator space formulation for the whole line accumulation approach is described which fulfills an important property for robustness: the constancy of the detectability of a vanishing point whatever its location in the image plane. It is compared with the Gaussian sphere accumulator space. Results are presented from interpreting indoor scenes of a nuclear plant.
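Independently of any particular accumulator-space formulation, a vanishing point is the common intersection of a bundle of image lines, which can be estimated in homogeneous coordinates as the null vector of the stacked line equations. A small sketch of that underlying computation (not the paper's accumulator method; the segments are synthetic):

```python
import numpy as np

def vanishing_point(segments):
    """Least-squares intersection of a bundle of line segments.
    Each segment is ((x1, y1), (x2, y2)); lines are represented in
    homogeneous coordinates, and the vanishing point is the null vector
    (smallest singular vector) of the stacked line equations."""
    lines = []
    for (x1, y1), (x2, y2) in segments:
        lines.append(np.cross([x1, y1, 1.0], [x2, y2, 1.0]))  # line through the two points
    _, _, Vt = np.linalg.svd(np.array(lines))
    v = Vt[-1]
    return v[:2] / v[2]                    # back to image coordinates

# Toy usage: segments of lines that all pass through (100, 50).
segs = [((0, 0), (50, 25)), ((0, 100), (60, 70)), ((0, 25), (40, 35))]
print(vanishing_point(segs))               # approximately [100, 50]
```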

Journal ArticleDOI
TL;DR: The results of a study of projective invariants and their applications in image analysis and object recognition are presented, including a generalization into three dimensions of the invariant cross-ratio of distances between points on a line.
Abstract: This paper presents the results of a study of projective invariants and their applications in image analysis and object recognition. The familiar cross-ratio theorem, relating collinear points in the plane to the projections through a point onto a line, provides a starting point for their investigation. Methods are introduced in two dimensions for extending the cross-ratio theorem to relate noncollinear object points to their projections on multiple image lines. The development is further extended to three dimensions. It is well known that, for a set of points distributed in three dimensions, stereo pairs of images can be made and relative distances of the points from the film plane computed from measurements of the disparity of the image points in the stereo pair. These computations require knowledge of the effective focal length and baseline of the imaging system. It is less obvious, but true, that invariant metric relationships among the object points can be derived from measured relationships among the image points. These relationships are a generalization into three dimensions of the invariant cross-ratio of distances between points on a line. In three dimensions the invariants are cross-ratios of areas and volumes defined by the object points. These invariant relationships, which are independent of the parameters of the imaging system, are derived and demonstrated with examples.
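The starting point, invariance of the cross-ratio of four collinear points under projective maps, is easy to check numerically; the 1-D homography below is arbitrary.

```python
import numpy as np

def cross_ratio(a, b, c, d):
    """Cross-ratio of four collinear points given by scalar coordinates on the line."""
    return ((a - c) * (b - d)) / ((a - d) * (b - c))

def homography_1d(x, h):
    """Apply a 1-D projective map x -> (h00*x + h01) / (h10*x + h11)."""
    return (h[0, 0] * x + h[0, 1]) / (h[1, 0] * x + h[1, 1])

pts = np.array([0.0, 1.0, 2.5, 4.0])
H = np.array([[2.0, 1.0], [0.3, 1.5]])     # an arbitrary 1-D homography
mapped = homography_1d(pts, H)

print(cross_ratio(*pts))      # value before the projective map (1.25 here)
print(cross_ratio(*mapped))   # identical value after the map (up to rounding)
```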

Journal ArticleDOI
TL;DR: In this article, a negative shape is introduced in order to develop an algebraic system of geometric shapes within which one can add and subtract shapes exactly as one adds and subtracts within the integer number system.
Abstract: A new notion of negative shape is introduced in order to develop an algebraic system of geometric shapes within which one can add and subtract shapes exactly as one adds and subtracts within the integer number system. Concentrating on polygonal shapes in 2 dimensions, we show that this simple extension of our commonsense concept of geometric shapes opens up many new areas with a great potential for understanding and developing 2-dimensional geometry and geometric algorithms. In the course of this pursuit the concept of a new equivalence relation on convex polygons evolves that also appears to be significant in understanding convex polygons, particularly various symmetries in them. In constructing the algebraic system of shapes we use the Minkowski addition operation (dilation, in mathematical morphology) as the composition operation.
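The composition operation, Minkowski addition, has a particularly simple form for convex polygons: the edges of the sum are the edges of the two summands merged by polar angle. A hedged sketch, assuming both polygons are convex, counterclockwise, and listed from their bottommost-leftmost vertex:

```python
import numpy as np

def minkowski_sum(P, Q):
    """Minkowski sum of two convex polygons given as counterclockwise
    vertex arrays (n x 2), each starting at its bottommost-leftmost vertex.
    Edges of the sum are the merged edges of P and Q sorted by polar angle."""
    def edges(V):
        return np.roll(V, -1, axis=0) - V
    all_edges = np.vstack([edges(P), edges(Q)])
    angles = np.mod(np.arctan2(all_edges[:, 1], all_edges[:, 0]), 2 * np.pi)
    order = np.argsort(angles, kind="stable")
    start = P[0] + Q[0]                           # sum of the two starting vertices
    verts = start + np.cumsum(all_edges[order], axis=0)
    return np.vstack([start, verts[:-1]])         # last cumsum point closes back to start

# Toy usage: a unit square plus a triangle.
square = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], dtype=float)
triangle = np.array([[0, 0], [2, 0], [1, 1]], dtype=float)
print(minkowski_sum(square, triangle))
```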

Journal ArticleDOI
TL;DR: A sequential distortion-compensation procedure is formulated that addresses the major distortion factors involved in the transformation of a curve from the object space to the image space and the effectiveness of the sub-pixel edge detector and the global interpolation technique is demonstrated.
Abstract: Accurate estimation of the parameters of a curve present in a grey-level image is required in various machine-vision and computer-vision problems. Quadratic curves are more common than other curve types in these fields. The accuracy of the estimated parameters depends not only on the global interpolation technique used, but, as well, on compensation of major sources of error. In this paper, first, as a preliminary step in accurate parameter estimation of quadratic curves, a sequential distortion-compensation procedure is formulated. This procedure addresses the major distortion factors involved in the transformation of a curve from the object space to the image space. Subsequently, as a means for accurate estimation of the coordinates of edge points, a new subpixel edge detector based on the principle of the sample-moment-preserving transform (SMPT) is developed. A circular-arc geometry is assumed for the boundary inside the detection area. The new arc-edge detector is designed as a cascade process using a linear-edge detector and a look-up table. Its performance is compared with that of a linear subpixel edge detector. Then, as a part of the main theme of the paper, the estimation of the five basic parameters of an elliptical shape based on its edge-point data is addressed. To achieve the desired degree of accuracy, a new error function is introduced and as the basis for a comparative study, an objective and independent measure for “goodness” of fit is derived. The proposed new error function and two other error functions previously developed are applied to six different situations. The comparative performance of these error functions is discussed. Finally, as the basis for evaluation of the total process, a 3D location estimation problem is considered. The objective is to accurately estimate the orientation and position in 3D of a set of circular features. The experimental results obtained are significant in two separate ways: in general, they show the validity of the overall process introduced here in the accurate estimation of 3D location; in particular, they demonstrate the effectiveness of the sub-pixel edge detector and the global interpolation technique, both developed here.

Journal ArticleDOI
TL;DR: This N-vector formalism is further extended to infer 3-D translational motions from 2-D motion images, and three computer vision applications are briefly discussed—interpretation of a rectangle, interpretation of a road, and interpretation of planar surface motion.
Abstract: A computational formalism is given for computer vision problems involving collinearity and concurrency of points and lines on a 2-D plane from the viewpoint of projective geometry. The image plane is regarded as a 2-D projective space, and points and lines are represented by unit vectors consisting of homogeneous coordinates, called N-vectors. Fundamental notions of projective geometry such as collineations, correlations, polarities, poles, polars, and conics are reformulated as “computational” processes in terms of N-vectors. They are also given 3-D interpretations by regarding 2-D images as perspective projection of 3-D scenes. This N-vector formalism is further extended to infer 3-D translational motions from 2-D motion images. Stereo is also viewed as a special type of translational motion. Three computer vision applications are briefly discussed: interpretation of a rectangle, interpretation of a road, and interpretation of planar surface motion.
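The N-vector of an image point is just the unit vector of its homogeneous coordinates, and incidence relations such as collinearity reduce to vanishing scalar triple products. A small sketch of that computation (the focal length is an arbitrary choice here):

```python
import numpy as np

def n_vector(x, y, f=1.0):
    """N-vector of an image point: the unit vector along (x, y, f)."""
    v = np.array([x, y, f])
    return v / np.linalg.norm(v)

def collinear(p1, p2, p3, tol=1e-9):
    """Three image points are collinear iff their N-vectors are coplanar,
    i.e., the scalar triple product vanishes."""
    return abs(np.dot(p1, np.cross(p2, p3))) < tol

a, b, c = n_vector(0, 0), n_vector(1, 1), n_vector(2, 2)   # on the line y = x
d = n_vector(2, 1)                                          # off that line
print(collinear(a, b, c))   # True
print(collinear(a, b, d))   # False
```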

Journal ArticleDOI
TL;DR: This paper presents a sampling strategy for gray-level functions based on mathematical morphology and shows that a reconstruction which dominates the original image is obtained by application of the adjoint erosion.
Abstract: This paper presents a sampling strategy for gray-level functions based on mathematical morphology. Flexible sampling strategies are required for the construction of multiresolution image representations like pyramids and quad-trees. The sampling operator introduced here is a dilation. A number of different algorithms to reconstruct an image from its sampled version are presented. It is shown that a reconstruction which dominates the original image is obtained by application of the adjoint erosion.

Journal ArticleDOI
TL;DR: In this paper, the location code addition and subtraction operations are defined for linear quadtree and octree data structures and generalized to linear octrees without difficulty to linear quadtrees.
Abstract: Linear quadtrees and octrees are data structures which are of interest in image processing, computer graphics, and solid modeling. Their representation involves spatial addresses called location codes. For many of the operations on objects in linear quadtree and octree representation, finding neighbors is a basic operation. By considering the components of a location code, named dilated integers, a representation and associated addition and subtraction operations may be defined which are efficient in execution. The operations form the basis for the definition of location code addition and subtraction, with which finding neighbors of equal size is accomplished in constant time. The translation of pixels is a related operation. The results for linear quadtrees can be generalized without difficulty to linear octrees.
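Dilated integers can be sketched concretely: spread the bits of x and y so they interleave into a single location code, and perform per-coordinate addition directly on the dilated form by filling the other coordinate's bit positions with ones so that carries skip over them. The 16-bit coordinate width below is an assumption for the example.

```python
# Dilated-integer arithmetic for quadtree location codes (a sketch; 16-bit
# coordinates assumed, giving 32-bit interleaved location codes).

EVEN = 0x55555555   # bit positions holding the x coordinate
ODD  = 0xAAAAAAAA   # bit positions holding the y coordinate

def dilate(n):
    """Spread the bits of a 16-bit integer so they occupy even positions."""
    n &= 0xFFFF
    n = (n | (n << 8)) & 0x00FF00FF
    n = (n | (n << 4)) & 0x0F0F0F0F
    n = (n | (n << 2)) & 0x33333333
    n = (n | (n << 1)) & 0x55555555
    return n

def location_code(x, y):
    """Interleave x and y into a single quadtree location code."""
    return dilate(x) | (dilate(y) << 1)

def add_x(code, dx):
    """Add dx to the x part of a location code without de-interleaving.
    Filling the y bit positions with ones makes carries skip over them."""
    xsum = ((code | ODD) + dilate(dx)) & EVEN
    return xsum | (code & ODD)

# Neighbor of the cell at (5, 9) one step in +x, found in constant time.
code = location_code(5, 9)
east = add_x(code, 1)
print(bin(code), bin(east))   # east encodes the cell at (6, 9)
```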

Journal ArticleDOI
TL;DR: In this paper, a region matching 2-D motion estimation algorithm is presented, which searches for that set of motion parameters which describe best the 2D motion of the central projection.
Abstract: The problem of motion estimation is reviewed and a region matching 2-D motion estimation algorithm is presented. The central projection of a 3-D object moving in 3-D space is a nonrigid 2-D object which moves on the image plane. The motion of the central projection is called 2-D motion and is modeled by an affine linear transformation. If the 2-D motion is not linear or the segmentation is imperfect, this model will give a good approximation. An objective function is defined in terms of the motion parameters. The algorithm searches for the set of motion parameters which best describes the 2-D motion of the central projection. The algorithm is robust in the presence of noise, and it does not require small motion or smooth intensity profiles. It assumes that the images have been segmented. Its computation time can be significantly reduced by an appropriate choice of the initial values of the motion parameters. The performance of the algorithm is examined for different kinds of motion and various SNRs using computer generated and real images.

Journal ArticleDOI
TL;DR: The main result is that, contrary to previous belief, the image of the occluding boundary does not strongly constrain the surface solution and it is shown that characteristic strips are curves of steepest ascent on the imaged surface.
Abstract: For general objects, and for illumination from a general direction, we study the constraints on shape imposed by shading. Assuming generalized Lambertian reflectance, we argue that, for a typical image, shading determines shape essentially up to a finite ambiguity. Thus regularization is often unnecessary, and should be avoided. More conjectural arguments imply that shape is typically determined with little ambiguity. However, it is pointed out that the degree to which shape is constrained depends on the image. Some images uniquely determine the imaged surface, while, for others, shape can be uniquely determined over most of the image, but infinitely ambiguous in small regions bordering the image boundary, even though the image contains singular points. For these images, shape from shading is a partially well-constrained problem. The ambiguous regions may cause shape reconstruction to be unstable at the image boundary. Our main result is that, contrary to previous belief, the image of the occluding boundary does not strongly constrain the surface solution. Also, it is shown that characteristic strips are curves of steepest ascent on the imaged surface. Finally, a theorem characterizing the properties of generic images is presented.

Journal ArticleDOI
TL;DR: It is shown that for both uniform and textured Lambertian surfaces the equations which are functions of three independent variables can be reduced to a single nonlinear equation of depth, i.e., the distance between the camera and the point on the surface.
Abstract: The independent calculation of local position and orientation of the Lambertian surface of an opaque object is proposed using the photometric stereo method. A number of shaded video images are taken using different positions of an ideal point light source which is placed close to the object. Normally, three images are required for a uniform and four for a textured Lambertian surface. By restricting three light sources to lie in a straight line, the depth of an arbitrary surface with textured Lambertian reflection characteristics can also be determined; however, in this case the orientation of the surface cannot be calculated independently. It is shown that for both uniform and textured Lambertian surfaces the equations which are functions of three independent variables, namely, depth (D) and surface normal direction vector (n = [p, q, −1]), can be reduced to a single nonlinear equation of depth, i.e., the distance between the camera and the point on the surface. Both convergence and a unique solution are ensured because of the simple behavior of the nonlinear equation within a practical range of depth and gradient values. The robustness of the algorithm is demonstrated by synthetic as well as experimental data. The calculation of the approximate positions and orientations of discontinuous surfaces is demonstrated when random noise is added to the synthetically calculated image intensities. Two parallel planes with a gap, two sloped planes, and a spherical surface are used to demonstrate that the algorithms work well. An important feature of calculating both depth and orientation independently is that for smooth surfaces they must obey the partial differential expressions p = ∂D/∂x and q = ∂D/∂y. If we are certain that the experimental errors are within a known limit then the numerical approximation to these partial derivative expressions can be used to determine discontinuities within the image. On the other hand, if we know that the surfaces are smooth then errors in the numerical evaluation of these differential expressions allow the estimation of experimental errors.

Journal ArticleDOI
TL;DR: Simulation results indicate that the Kalman filter equations derived in this paper represent an accurate model for 3-D motion estimation in spite of the first-order approximation used in the derivation.
Abstract: This paper presents a Kalman filter approach for accurately estimating the 3-D position and orientation of a moving object from a sequence of stereo images. Emphasis is given to finding a solution for the following problem incurred by the use of a long sequence of images: the images taken from a longer distance suffer from a larger noise-to-signal ratio, which results in larger errors in 3-D reconstruction and, thereby, causes a serious degradation in motion estimation. To this end, we have derived a new set of discrete Kalman filter equations for motion estimation: (1) The measurement equation is obtained by analyzing the effect of white Gaussian noise in 2-D images on 3-D positional errors (instead of directly assigning Gaussian noise to 3-D feature points) and by incorporating an optimal 3-D reconstruction under the constraints of consistency satisfaction among 3-D feature points. (2) The state propagation equation, or the system dynamic equation, is formulated by describing the rotation between two consecutive 3-D object poses, based on quaternions and representing the error between the true rotation and the nominal rotation (obtained by 3-D reconstruction) in terms of the measurement noise in 2-D images. Furthermore, we can estimate object position from the estimation of object orientation in such a way that an object position can be directly computed once the estimation of an object orientation is obtained. Simulation results indicate that the Kalman filter equations derived in this paper represent an accurate model for 3-D motion estimation in spite of the first-order approximation used in the derivation. The accuracy of this model is demonstrated by the significant error reduction in the presence of large triangulation errors in a long sequence of images and by a shorter transition period for convergence to the true values.
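For readers unfamiliar with the machinery, the generic discrete Kalman predict/update cycle that the paper builds on looks as follows. This is the textbook filter with a toy constant-velocity model, not the paper's quaternion-based measurement and propagation equations; all matrices and noise levels are illustrative.

```python
import numpy as np

def kalman_step(x, P, z, F, H, Q, R):
    """One predict/update cycle of a discrete Kalman filter.
    x, P : state estimate and covariance;  z : new measurement
    F, H : state transition and measurement matrices
    Q, R : process and measurement noise covariances
    """
    # Predict
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    # Update
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new

# Toy usage: track a scalar position with a constant-velocity model.
F = np.array([[1.0, 1.0], [0.0, 1.0]])   # position/velocity transition
H = np.array([[1.0, 0.0]])               # only position is measured
Q = 1e-4 * np.eye(2)
R = np.array([[0.25]])
x, P = np.zeros(2), np.eye(2)
rng = np.random.default_rng(2)
for t in range(20):
    z = np.array([0.5 * t + rng.normal(scale=0.5)])   # noisy position readings
    x, P = kalman_step(x, P, z, F, H, Q, R)
print(x)    # estimated position and velocity after 20 steps
```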

Journal ArticleDOI
TL;DR: A new, efficient contour tracing algorithm based on the extended boundary concept, a common boundary representation which shifts the inter-pixel boundary by one-half pixel toward the right and lower directions is presented.
Abstract: In most of the literature, the boundary of a region R is defined as the set of pixels in R that have at least one neighbor outside R. Applying this definition to images with multiple labels, for example segmentation results, one ends up with nonoverlapping boundaries between regions. This contradicts the physical boundary concept and makes shape analysis among regions difficult. This paper presents a new, efficient contour tracing algorithm based on the extended boundary concept, a common boundary representation which shifts the inter-pixel boundary by one-half pixel toward the right and lower directions. It is shown that, in addition to the common boundary representation, the new algorithm has the following advantages over the old approaches: (1) It provides more detailed topological information in each contour. (2) The implementation in a table lookup method is easier and more efficient. (3) The complexity can be reduced to half because only one of each pair of neighboring regions needs to be traced. (4) Parallel processing is feasible. (5) The shapes of regions are well preserved.

Journal ArticleDOI
TL;DR: A system of nonlinear equations of the unknown estimate of the velocity vector field is introduced using a novel variational principle applied to the weighted average of the optical flow and the directional smoothness constraints and a stable iterative method for solving this system is developed.
Abstract: Changes in successive images from a time-varying image sequence of a scene can be characterized by velocity vector fields. The estimate of the velocity vector field is determined as a compromise between optical flow and directional smoothness constraints. The optical flow constraints relate the values of the time-varying image function at the corresponding points of the successive images of the sequence. The directional smoothness constraints relate the values of neighboring velocity vectors. To achieve the compromise, we introduce a system of nonlinear equations of the unknown estimate of the velocity vector field using a novel variational principle applied to the weighted average of the optical flow and the directional smoothness constraints. A stable iterative method for solving this system is developed. The optical flow and the directional smoothness constraints are selectively suppressed in the neighborhoods of the occluding boundaries by implicitly adjusting their weights. These adjustments are based on the spatial variations of the estimates of the velocity vectors and the spatial variations of the time-varying image function. The system of nonlinear equations is defined in terms of the time-varying image function and its derivatives. The initial image functions are in general discontinuous and cannot be directly differentiated. These difficulties are overcome by treating the initial image functions as generalized functions and their derivatives as generalized derivatives. These generalized functions are evaluated (observed) on the parametric family of testing (smoothing) functions to obtain parametric families of secondary images, which are used in the system of nonlinear equations. The parameter specifies the degree of smoothness of each secondary image. The secondary images with progressively higher degrees of smoothness are sampled with progressively lower resolutions. Then coarse-to-fine control strategies are used to obtain the estimate.
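The "compromise between optical flow and smoothness constraints" can be illustrated with the classical uniform-weight iteration (Horn-Schunck style); the paper's directional smoothness, selective suppression near occluding boundaries, and coarse-to-fine control are not reproduced in this sketch, and the parameters are arbitrary.

```python
import numpy as np
from scipy import ndimage

def optical_flow_hs(I1, I2, alpha=0.05, iters=200):
    """Horn-Schunck style estimate: a compromise between the optical-flow
    constraint Ix*u + Iy*v + It = 0 and a uniform smoothness term,
    solved by Jacobi-like iterations."""
    Ix = ndimage.sobel(I1, axis=1) / 8.0
    Iy = ndimage.sobel(I1, axis=0) / 8.0
    It = I2 - I1
    u = np.zeros_like(I1)
    v = np.zeros_like(I1)
    kernel = np.array([[0, 0.25, 0], [0.25, 0, 0.25], [0, 0.25, 0]])
    for _ in range(iters):
        u_avg = ndimage.convolve(u, kernel)       # neighborhood averages
        v_avg = ndimage.convolve(v, kernel)
        t = (Ix * u_avg + Iy * v_avg + It) / (alpha**2 + Ix**2 + Iy**2)
        u = u_avg - Ix * t
        v = v_avg - Iy * t
    return u, v

# Toy usage: a smooth blob translated by one pixel to the right.
y, x = np.mgrid[:64, :64]
I1 = np.exp(-((x - 30.0)**2 + (y - 32.0)**2) / 50.0)
I2 = np.exp(-((x - 31.0)**2 + (y - 32.0)**2) / 50.0)
u, v = optical_flow_hs(I1, I2)
print(u[32, 26], v[32, 26])   # u is positive (rightward); v stays near zero by symmetry
```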

Journal ArticleDOI
TL;DR: It is shown that if N = 3, then in general there can be up to four solutions for the motion problem, though there can also be infinitely many solutions in certain cases where the image points are collinear.
Abstract: We consider the problem of determining the position and orientation in three-dimensional space of a camera knowing the locations of a number N of points in the three-dimensional object space and their corresponding coordinates obtained by perspective projection onto the camera plane. A mathematically equivalent problem is that of the camera being attached to a vehicle which is moving in a 3D environment with landmarks at known locations and the goal being to guide this vehicle by observing images of these landmarks. (This is known as passive navigation.) We show that if N = 3, then in general there can be up to four solutions for the motion problem, though there can be infinitely many solutions in certain cases where the image points are collinear. Also, if N = 4 and the feature points are coplanar, we give conditions under which there may be two or infinitely many solutions. In the cases where multiple solutions arise the image points are collinear, though the feature points themselves may not be. When N = 4 and the feature points are noncoplanar, the solution is generally unique, though there can be up to four solutions. We give several examples of multiple solutions obtained in each of these cases.

Journal ArticleDOI
TL;DR: Here the optimal rescaling factors are expressed as functions of the integer local distances for both 3 × 3 and 5 × 5 neighborhood distance functions, so the rescaling factor can be easily computed for any integer distance transform.
Abstract: In Comput. Vision Graphics Image Process. 43, 1988, 88–97, Vossepoel introduced rescaling of integer valued digital distance transforms. He showed that rescaling can improve the approximation to the Euclidean distance transform significantly. However, he computed the rescaling factors only for selected distance transforms, and only approximately, using numerical iteration. Here the optimal rescaling factors are expressed as functions of the integer local distances for both 3 × 3 and 5 × 5 neighborhood distance functions. Thus the rescaling factor can be easily computed for any integer distance transform. We also give the optimal, real-valued, local distances for the 5 × 5 neighborhood distance transform in the general case. Previously only the solution in a special case has been published (Borgefors, Comput. Vision Graphics Image Process. 34, 1986, 344–371).
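As a concrete reminder of what is being rescaled, the sketch below computes a two-pass 3 × 3 chamfer distance transform with integer local distances (3, 4) and divides by the straight-step cost; the paper's point is that a slightly different, optimally chosen rescaling factor (given there in closed form) reduces the deviation from true Euclidean distance. The implementation details are illustrative only.

```python
import numpy as np

def chamfer_3_4(binary):
    """Two-pass 3x3 chamfer distance transform with integer local distances
    (3, 4): distance from each pixel to the nearest object pixel, in units
    of the local step costs."""
    INF = 10**6
    d = np.where(binary, 0, INF).astype(np.int64)
    h, w = d.shape
    # Forward pass (top-left to bottom-right).
    for i in range(h):
        for j in range(w):
            if i > 0:
                d[i, j] = min(d[i, j], d[i - 1, j] + 3)
                if j > 0:
                    d[i, j] = min(d[i, j], d[i - 1, j - 1] + 4)
                if j < w - 1:
                    d[i, j] = min(d[i, j], d[i - 1, j + 1] + 4)
            if j > 0:
                d[i, j] = min(d[i, j], d[i, j - 1] + 3)
    # Backward pass (bottom-right to top-left).
    for i in range(h - 1, -1, -1):
        for j in range(w - 1, -1, -1):
            if i < h - 1:
                d[i, j] = min(d[i, j], d[i + 1, j] + 3)
                if j < w - 1:
                    d[i, j] = min(d[i, j], d[i + 1, j + 1] + 4)
                if j > 0:
                    d[i, j] = min(d[i, j], d[i + 1, j - 1] + 4)
            if j < w - 1:
                d[i, j] = min(d[i, j], d[i, j + 1] + 3)
    return d

img = np.zeros((21, 21), dtype=bool)
img[10, 10] = True                      # a single object pixel in the center
dt = chamfer_3_4(img)
# Naive rescaling divides by the straight-step cost 3; an optimally chosen
# factor (the paper's result) reduces the maximum deviation from Euclidean.
approx_euclid = dt / 3.0
print(approx_euclid[10, 0], approx_euclid[0, 0])   # compare with 10 and 10*sqrt(2) ≈ 14.1
```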