
Showing papers on "Motion estimation published in 1989"


Journal ArticleDOI
TL;DR: Motion detection may be one of the first examples in computational neurosciences where common principles can be found not only at the cellular level (e.g., dendritic integration, spike propagation, synaptic transmission) but also at the level of computations performed by small neural networks.

526 citations


Journal ArticleDOI
TL;DR: The presented approach to error estimation applies to a wide variety of problems that involve least-squares optimization or the pseudoinverse, and the analysis shows, among other things, that the errors are very sensitive to the translation direction and the range of the field of view.
Abstract: Deals with estimating motion parameters and the structure of the scene from point (or feature) correspondences between two perspective views. An algorithm is presented that gives a closed-form solution for motion parameters and the structure of the scene. The algorithm utilizes redundancy in the data to obtain more reliable estimates in the presence of noise. An approach is introduced to estimating the errors in the motion parameters computed by the algorithm. Specifically, the standard deviation of the error is estimated in terms of the variance of the errors in the image coordinates of the corresponding points. The estimated errors indicate the reliability of the solution as well as any degeneracy or near degeneracy that causes the failure of the motion estimation algorithm. The presented approach to error estimation applies to a wide variety of problems that involve least-squares optimization or the pseudoinverse. Finally, the relationships between the errors and the parameters of motion and the imaging system are analyzed. The results of the analysis show, among other things, that the errors are very sensitive to the translation direction and the range of the field of view. Simulations are conducted to demonstrate the performance of the algorithms and error estimation as well as the relationships between the errors and the parameters of motion and imaging systems. The algorithms are tested on images of real-world scenes with point correspondences computed automatically.
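The least-squares error-propagation idea the paper applies is standard and worth stating explicitly. As a sketch (not the paper's exact derivation): for a linearized model $Ax \approx b$ with i.i.d. noise of variance $\sigma^2$ in $b$,

```latex
\hat{x} = (A^{\top} A)^{-1} A^{\top} b,
\qquad
\operatorname{Cov}(\hat{x}) \approx \sigma^{2} \, (A^{\top} A)^{-1}.
```

Near-degeneracy of the motion problem shows up as near-singularity of $A^{\top} A$, which inflates the predicted standard deviations, consistent with the degeneracy indicator described above.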

495 citations


Journal ArticleDOI
TL;DR: In this article, it was shown that the optical flow and the motion field can be interpreted as vector fields tangent to flows of planar dynamical systems, which can be used to reconstruct the 3D structure of a moving scene.
Abstract: It is shown that the motion field, the 2-D vector field which is the perspective projection on the image plane of the 3-D velocity field of a moving scene, and the optical flow, defined as the estimate of the motion field which can be derived from the first-order variation of the image brightness pattern, are in general different, unless special conditions are satisfied. Therefore, dense optical flow is often ill-suited for computing structure from motion and for reconstructing the 3-D velocity field by algorithms which require a locally accurate estimate of the motion field. A different use of the optical flow is suggested. It is shown that the (smoothed) optical flow and the motion field can be interpreted as vector fields tangent to flows of planar dynamical systems. Stable qualitative properties of the motion field, which give useful information about the 3-D velocity field and the 3-D structure of the scene, can usually be obtained from the optical flow. The idea is supported by results from the theory of structural stability of dynamical systems.

309 citations


Journal ArticleDOI
TL;DR: It is shown that the second-order moment invariants can be used to predict whether estimation using noisy data is reliable or not, and the new derivation of vector forms also facilitates the calculation of motion estimation in a tensor approach.
Abstract: The 3-D moment method is applied to object identification and positioning. A general theory for deriving 3-D moment invariants is proposed. The notion of complex moments is introduced. Complex moments are defined as linear combinations of moments with complex coefficients and are collected into multiplets such that each multiplet transforms irreducibly under 3-D rotations. The application of the 3-D moment method to motion estimation is also discussed. Using group-theoretic techniques, various invariant scalars are extracted from compounds of complex moments via Clebsch-Gordan expansion. Twelve moment invariants consisting of the second-order and third-order moments are explicitly derived. Based on a perturbation formula, it is shown that the second-order moment invariants can be used to predict whether estimation using noisy data is reliable or not. The new derivation of vector forms also facilitates the calculation of motion estimation in a tensor approach. Vectors consisting of the third-order moments can be derived in a similar manner.

280 citations


Journal ArticleDOI
TL;DR: The constraint line clustering algorithm uses a statistical test to estimate the image flow velocity field in the presence of step discontinuities in the image irradiance or velocity field, with particular emphasis on motion estimation and segmentation in situations where motion is the only cue to segmentation.
Abstract: Image flow is the velocity field in the image plane caused by the motion of the observer, objects in the scene, or apparent motion, and can contain discontinuities due to object occlusion in the scene. An algorithm that can estimate the image flow velocity field when there are discontinuities due to occlusions is described. The constraint line clustering algorithm uses a statistical test to estimate the image flow velocity field in the presence of step discontinuities in the image irradiance or velocity field. Particular emphasis is placed on motion estimation and segmentation in situations such as random dot patterns where motion is the only cue to segmentation. Experimental results on a demanding synthetic test case and a real image are presented. A smoothing algorithm for improving the velocity field estimate is also described. The smoothing algorithm constructs a smooth estimate of the velocity field by approximating a surface between step discontinuities. It is noted that the velocity field estimate can be improved using surface reconstruction between velocity field boundaries.

182 citations


Patent
Achim von Brandt1
22 Aug 1989
TL;DR: In this paper, a method for the determination of motion vector fields from digital image sequences derives a motion vector field from two successive image frames; the relation between frames is defined by a motion vector which reproduces the displacement of the picture elements relative to one another, and all picture elements in a square or rectangular block of picture elements receive the same motion vector.
Abstract: A method for the determination of motion vector fields from digital image sequences derives a motion vector field from two successive image frames, with the motion vector field relating a picture element of the other image frame to every picture element of the one image frame, whereby the relation is defined by a motion vector which reproduces the displacement of the picture elements relative to one another and whereby respectively all picture elements in a square or rectangular block of picture elements receive the same motion vector. The determination of the motion vectors is carried out by minimization of a composite objective function which, first, takes into consideration the difference in the luminance values of the mutually allocated picture elements of the two established frames and, second, weights the differences between adjacent or neighboring motion vectors with the assistance of a smoothing measure. The minimization of this objective function is carried out in such fashion that, first, the motion vectors minimizing the objective function are determined, given the restriction that the motion vectors in blocks larger than the blocks ultimately desired are constant, and that, subsequently, each of these blocks (16×16) is subdivided into smaller, preferably equal-sized blocks until the desired block size (4×4) is achieved.
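The coarse-to-fine minimization can be sketched in a few lines. The following is an illustrative reconstruction, not the patent's exact procedure: the SAD data term, the L1 smoothness weight `lam`, the search range, and the block-size schedule are assumed placeholder choices.

```python
import numpy as np

def block_cost(prev, cur, y, x, bs, dy, dx):
    """Sum of absolute luminance differences for one displaced block."""
    h, w = prev.shape
    ys, xs = y + dy, x + dx
    if ys < 0 or xs < 0 or ys + bs > h or xs + bs > w:
        return np.inf
    return np.abs(cur[y:y+bs, x:x+bs].astype(float)
                  - prev[ys:ys+bs, xs:xs+bs].astype(float)).sum()

def estimate_field(prev, cur, bs, search=2, lam=1.0, init=None):
    """One level: per block, pick the displacement minimizing
    SAD + lam * deviation from the inherited neighboring vector."""
    h, w = cur.shape
    by, bx = h // bs, w // bs
    field = np.zeros((by, bx, 2), int)
    for i in range(by):
        for j in range(bx):
            guess = init[i, j] if init is not None else np.zeros(2, int)
            best, best_v = np.inf, guess
            for dy in range(guess[0] - search, guess[0] + search + 1):
                for dx in range(guess[1] - search, guess[1] + search + 1):
                    c = block_cost(prev, cur, i * bs, j * bs, bs, dy, dx)
                    c += lam * (abs(dy - guess[0]) + abs(dx - guess[1]))
                    if c < best:
                        best, best_v = c, np.array([dy, dx])
            field[i, j] = best_v
    return field

def hierarchical_motion(prev, cur, sizes=(16, 8, 4)):
    """Estimate at large blocks first, then subdivide toward 4x4."""
    field = None
    for bs in sizes:
        if field is not None:
            # each smaller block inherits the vector of its parent block
            field = field.repeat(2, axis=0).repeat(2, axis=1)
        field = estimate_field(prev, cur, bs, init=field)
    return field
```

Starting each level from the parent block's vector keeps neighboring vectors consistent, which is the role the smoothing measure plays in the composite objective.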

174 citations


01 Jan 1989
TL;DR: This work begins by using cameras on-board a robot vehicle to estimate the motion of the vehicle by tracking 3-D feature-points or "landmarks", develops sequential methods for estimating the vehicle motion and updating the landmark model, and implements a system that successfully tracks landmarks through stereo image sequences.
Abstract: Sensing 3-D shape and motion is an important problem in autonomous navigation and manipulation. Stereo vision is an attractive approach to this problem in several domains. In this thesis, I address fundamental components of this problem by using stereo vision to estimate the 3-D structure or "depth" of objects visible to a robot, as well as to estimate the motion of the robot as it travels through an unknown environment. I begin by using cameras on-board a robot vehicle to estimate the motion of the vehicle by tracking 3-D feature-points or "landmarks". I formulate this task as a statistical estimation problem, develop sequential methods for estimating the vehicle motion and updating the landmark model, and implement a system that successfully tracks landmarks through stereo image sequences. In laboratory experiments, this system has achieved an accuracy of 2% of distance over 5.5 meters and 55 stereo image pairs. These results establish the importance of statistical modelling in this problem and demonstrate the feasibility of visual motion estimation in unknown environments. This work embodies a successful paradigm for feature-based depth and motion estimation, but the feature-based approach results in a very limited 3-D model of the environment. To extend this aspect of the system, I address the problem of estimating "depth maps" from stereo images. Depth maps specify scene depth for each pixel in the image. I propose a system architecture in which exploratory camera motion is used to acquire a narrow-baseline image pair by moving one camera of the stereo system. Depth estimates obtained from this image pair are used to "bootstrap" matching of a wide-baseline image pair acquired with both cameras of the system.
I formulate the bootstrap operation statistically by modelling depth maps as random fields and developing Bayesian matching algorithms in which depth information from the narrow-baseline image pair determines the prior density for matching the wide baseline image pair. This leads to efficient, area-based matching algorithms that are applied independently for each pixel or each scanline of the image. Experimental results with images of complex, outdoor scene models demonstrate the power of the approach.

156 citations


Patent
18 Jul 1989
TL;DR: In this paper, a motion sequence pattern detector detects a periodic pattern of motion sequences within a succession of video fields, such as film mode or progressive scan mode, and comprises a motion detector for detecting the presence of motion from increment to increment within predetermined increments of the video fields and for thereupon putting out a first motion detection signal for each said increment.
Abstract: A motion sequence pattern detector detects a periodic pattern of motion sequences within a succession of video fields, such as film mode or progressive scan mode. It comprises a motion detector for detecting the presence of motion from increment to increment within predetermined increments of the succession of video fields and for thereupon putting out a first motion detection signal for each said increment, and logic circuitry responsive to the first motion detection signal for detecting the periodic pattern of motion sequences within the succession of video fields.

126 citations


Journal ArticleDOI
TL;DR: In this paper, a differential motion estimation technique for television sequences is presented which measures the parameters of global motion in the image plane due to zoom and pan of the camera using a two-dimensional signal model and a three-parameter model for motion description.

111 citations


Proceedings ArticleDOI
04 Jun 1989
TL;DR: The authors present approaches to estimating errors in the optimal solutions, investigate the theoretical lower bounds on the errors in the solutions and compare them with actual errors, and analyze two types of optimization algorithms: batch and sequential.
Abstract: The problem of estimating motion and structure of a rigid scene from two perspective monocular views is studied. The optimization approach presented is motivated by the following observations about linear algorithms: (1) for certain types of motion, even pixel-level perturbations (such as digitization noise) may override the information characterized by the epipolar constraint; (2) existing linear algorithms do not use the constraints on the essential parameter matrix E in solving for this matrix. The authors present approaches to estimating errors in the optimal solutions, investigate the theoretical lower bounds on the errors in the solutions and compare them with actual errors, and analyze two types of optimization algorithms: batch and sequential. The analysis and experiments show that, in general, a batch technique performs better than a sequential technique for nonlinear problems. A recursive batch processing technique is proposed for nonlinear problems that require recursive estimation.

108 citations


Journal ArticleDOI
TL;DR: The singular points of the motion field, the points where the field vanishes, together with the time evolution of their local structure, capture essential features of three-dimensional motion: they make it possible to distinguish translation, rotation, and general motion, and they permit the computation of the relevant motion parameters.
Abstract: The motion field, that is, the two-dimensional vector field associated with the velocity of points on the image plane, can be seen as the flow vector of the solution to a planar system of differential equations. Therefore the theory of planar dynamical systems can be used to understand qualitative and quantitative properties of motion. In this paper it is shown that singular points of the motion field, which are the points where the field vanishes, and the time evolution of their local structure capture essential features of three-dimensional motion that make it possible to distinguish translation, rotation, and general motion and also make possible the computation of the relevant motion parameters. Singular points of the motion field are the perspective projection onto the image plane of the intersection between a curve called the characteristic curve, which depends only on the motion parameters, and the surface of the moving object. In most cases, singular points of the motion field are left unchanged in location and spatial structure by small perturbations affecting the vector field. Therefore a description of motion based on singular points can be used even when the motion field of an image sequence has not been estimated with high accuracy.

Journal ArticleDOI
TL;DR: The authors present a summary of several psychophysical results that illustrate the complex interrelationship between stimulus factors influencing Phi and the organization of the resulting motion percepts, and a neural network model whose mechanisms are capable of explaining these percepts.

Journal ArticleDOI
TL;DR: This paper deals with the problem of locating a rigid object and estimating its motion in three dimensions by determining the position and orientation of the object at each instant when an image is captured by a camera, and recovering the motion of theobject between consecutive frames.
Abstract: This paper deals with the problem of locating a rigid object and estimating its motion in three dimensions. This involves determining the position and orientation of the object at each instant when an image is captured by a camera, and recovering the motion of the object between consecutive frames. In the implementation scheme used here, a sequence of camera images, digitized at the sample instants, is used as the initial input data. Measurements are made of the locations of certain features (e.g., maximum curvature points of an image contour, corners, edges, etc.) on the 2-D images. To measure the feature locations a matching algorithm is used, which produces correspondences between the features in the image and the object. Using the measured feature locations on the image, an algorithm is developed to solve the location and motion problem. The algorithm is an extended Kalman filter modeled for this application.
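The extended Kalman filter at the core of such a scheme follows the standard predict/update cycle. A minimal generic sketch (the state-transition and measurement maps are caller-supplied placeholders here, not the paper's pose parameterization):

```python
import numpy as np

def ekf_step(x, P, z, f, F, h, H, Q, R):
    """One predict/update cycle of an extended Kalman filter.
    f, h are the (possibly nonlinear) state-transition and measurement
    maps; F, H return their Jacobians at the current estimate."""
    # predict
    x_pred = f(x)
    Fx = F(x)
    P_pred = Fx @ P @ Fx.T + Q
    # update
    Hx = H(x_pred)
    y = z - h(x_pred)                      # innovation
    S = Hx @ P_pred @ Hx.T + R             # innovation covariance
    K = P_pred @ Hx.T @ np.linalg.inv(S)   # Kalman gain
    x_new = x_pred + K @ y
    P_new = (np.eye(len(x)) - K @ Hx) @ P_pred
    return x_new, P_new
```

For the location-and-motion problem described above, `x` would hold pose and motion parameters and `h` would project object features into the image plane.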

Proceedings ArticleDOI
01 Nov 1989
TL;DR: This paper reviews these multiresolution techniques and discusses how they may be usefully combined in the future; methods used in image compression, for example, should match the requirements of human perception.
Abstract: Important new techniques for representing and analyzing image data at multiple resolutions have been developed over the past several years. Closely related multiresolution structures and procedures have been developed more or less independently in diverse scientific fields. For example, pyramid and subband representations have been applied to image compression, and promise excellent performance and flexibility. Similar pyramid structures have been developed as models for the neural coding of images within the human visual system. The pyramid has been developed in the computer vision field as a general framework for implementing highly efficient algorithms, including algorithms for motion analysis and object recognition. In this paper I review these multiresolution techniques and discuss how they may be usefully combined in the future. Methods used in image compression, for example, should match the requirements of human perception, and future 'smart' transmission systems will need to perform rapid analysis in order to selectively encode the most critical information in a scene.

Journal ArticleDOI
TL;DR: Matching preserved black/white identity regardless of whether frames were viewed binocularly or dichoptically, and results suggest that correspondence is computed by a weighted metric containing terms for image features coded early in visual processing.
Abstract: To maintain figural identity during motion perception, the visual system must match images over space and time. Correct matching requires a metric for identifying “corresponding” images, those representing the same physical object. To test whether matching is based on achromatic (black/white) polarity and chromatic (red/green) color, observers viewed an ambiguous motion display and judged the path of apparent motion. Matching preserved black/white identity regardless of whether frames were viewed binocularly or dichoptically. Red/green identity was also preserved, but coherence of motion depended in part on the number of frames in the motion sequence and on the background luminance. These results suggest that correspondence is computed by a weighted metric containing terms for image features coded early in visual processing.

Journal ArticleDOI
TL;DR: A spatio-temporal, inseparable surface model is proposed for motion detection, and it is analyzed mathematically how the geometry of the intensity hypersurface gives information about motion in an image.
Abstract: We present an analysis of existing motion detectors for determining desirable characteristics of a motion detector. A spatio-temporal, inseparable surface model is then proposed for motion detection. Based on this model, we analyze mathematically how the geometry of the intensity hypersurface gives information about motion in an image. The local motion information, obtained from the parameters of the Monge patch approximating the intensity hypersurface in the spatio-temporal space, may be used for segmentation of dynamic scenes. Motion detection results for real sequences show the robustness of this detector.

Proceedings ArticleDOI
20 Mar 1989
TL;DR: It is shown that some of the difficulties inherent in the two-frame approach disappear when redundancy in the data is introduced, and the authors present two efficient ways to approximate the problem.
Abstract: The problem of using feature correspondences to recover the structure and 3-D motion of a moving object from its successive images is analyzed. The authors formulate the problem as a quadratic minimization problem with a nonlinear constraint, then derive the condition for the solution to be optimal, in the maximum-likelihood sense, under the assumption of Gaussian noise in the input. They present two efficient ways to approximate the problem and discuss some inherent limitations of structure from motion when two frames are used, limitations that should be taken into account in robotics applications involving dynamic imagery. Finally, it is shown that some of the difficulties inherent in the two-frame approach disappear when redundancy in the data is introduced. This is concluded from experiments using a structure-from-motion algorithm that is based on multiple frames and uses only the rigidity assumption.

Proceedings ArticleDOI
20 Mar 1989
TL;DR: In this paper, a number of constraints are proposed for which both the components of optical flow can be obtained by local differential techniques and the aperture problem can usually be solved, and experiments on real images are reported which show that the obtained optical flows allow the estimate of 3D motion parameters, detection of discontinuities in the flow field, and the segmentation of the image in different moving objects.
Abstract: A number of constraints are proposed for which both components of optical flow can be obtained by local differential techniques and the aperture problem can usually be solved. The constraints are suggested by the observation that it is possible to describe spatial and temporal changes of the image brightness in terms of infinitesimal deformations. An arbitrary choice of two of the four equations which correspond to the elementary deformations of a 2-D pattern implies that the spatial gradient of the image brightness is stationary and leads to a linear system of equations for optical flow which seems best suited for numerical implementation on real data in the absence of a priori information. In that case, the error term between the computed optical flow and the motion field (that is, the 2-D vector field associated with the true displacement of points on the image plane) is derived, and the conditions under which it can safely be neglected are discussed. Experiments on real images are reported which show that the obtained optical flows allow the estimation of 3-D motion parameters, the detection of discontinuities in the flow field, and the segmentation of the image into different moving objects.
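For comparison, the most common local differential scheme solves the brightness-constancy equations in least squares over a small window. This is a generic Lucas-Kanade-style sketch with the same ingredients (spatial and temporal brightness derivatives, a local window), not the authors' deformation-based constraints:

```python
import numpy as np

def local_flow(I0, I1, y, x, half=2):
    """Least-squares solution of the brightness-constancy equations
    Ix*u + Iy*v + It = 0 over a (2*half+1)^2 window around (y, x)."""
    Iy, Ix = np.gradient(I0.astype(float))          # spatial derivatives
    It = I1.astype(float) - I0.astype(float)        # temporal derivative
    win = (slice(y - half, y + half + 1), slice(x - half, x + half + 1))
    A = np.stack([Ix[win].ravel(), Iy[win].ravel()], axis=1)
    b = -It[win].ravel()
    (u, v), *_ = np.linalg.lstsq(A, b, rcond=None)
    return u, v   # horizontal, vertical displacement estimate
```

The aperture problem appears here as a rank-deficient matrix A; the extra constraint equations discussed above are one way to make the system well-posed.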

Proceedings ArticleDOI
04 Jun 1989
TL;DR: A nonlinear least-squares optimization technique is proposed which uses the Levenberg-Marquardt method and estimates the motion and structure parameters to a global scale factor by minimizing an objective function.
Abstract: A nonlinear least-squares optimization technique is proposed which uses the Levenberg-Marquardt method and estimates the motion and structure parameters to a global scale factor by minimizing an objective function. This objective function is the mean-square difference between the measured coordinates of feature points in the image plane and the coordinates predicted from the current state estimate. In comparison to existing approaches, this technique converges faster and yields better estimates. A recursive version of this algorithm is developed using the block approach. This algorithm is shown to also track eventful motion effectively. The performance of the proposed technique on real image sequences is also presented. Some performance results are indicated to illustrate the efficacy of this approach.
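The Levenberg-Marquardt iteration itself is compact. A generic sketch over an arbitrary residual function (the damping schedule and iteration count are illustrative choices, not the paper's settings):

```python
import numpy as np

def levenberg_marquardt(residual, jac, x0, iters=100, lam=1e-2):
    """Minimize ||residual(x)||^2. The damping lam interpolates between
    Gauss-Newton (small lam) and gradient descent (large lam)."""
    x = np.asarray(x0, float)
    for _ in range(iters):
        r, J = residual(x), jac(x)
        A = J.T @ J + lam * np.eye(len(x))   # damped normal equations
        step = np.linalg.solve(A, J.T @ r)
        x_new = x - step
        if np.sum(residual(x_new) ** 2) < np.sum(r ** 2):
            x, lam = x_new, lam * 0.5        # accept, trust the model more
        else:
            lam *= 2.0                       # reject, damp harder
    return x
```

Applied to the motion problem, `residual` would be the difference between measured image-plane feature coordinates and those predicted from the current state estimate.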

Proceedings ArticleDOI
01 Nov 1989
TL;DR: In this article, the filtering of noise in image sequences using spatio-temporal motion compensated techniques is considered, and a number of filtering techniques are proposed and compared in this work.
Abstract: In this paper the filtering of noise in image sequences using spatio-temporal motion compensated techniques is considered. Noise in video signals degrades both the image quality and the performance of subsequent image processing algorithms. Although the filtering of noise in single images has been studied extensively, there have been few results in the literature on the filtering of noise in image sequences. A number of filtering techniques are proposed and compared in this work. They are grouped into recursive spatio-temporal and motion compensated filtering techniques. A 3-D point estimator which is an extension of a 2-D estimator due to Kak [5] belongs in the first group, while a motion compensated recursive 3-D estimator and 2-D estimators followed by motion compensated temporal filters belong in the second group. The motion in the sequences is estimated using the pel-recursive Wiener-based algorithm [8] and the block-matching algorithm. The methods proposed are compared experimentally on the basis of the signal-to-noise ratio improvement and the visual quality of the restored image sequences.

Proceedings ArticleDOI
04 Jun 1989
TL;DR: A correspondence method is developed for determining optical flow where the primitive motion tokens to be matched between consecutive time frames are regions, which is simple, computationally efficient, and more robust than iterative gradient methods, especially for medium-range motion.
Abstract: A correspondence method is developed for determining optical flow where the primitive motion tokens to be matched between consecutive time frames are regions. The computation of optical flow consists of three stages: region extraction, region matching, and optical flow smoothing. The computation is completed by smoothing the initial optical flow, where the sparse velocity data are either smoothed with a vector median filter or interpolated to obtain dense velocity estimates by using a motion-coherence regularization. The proposed region-based method for optical flow is simple, computationally efficient, and more robust than iterative gradient methods, especially for medium-range motion.
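The vector median used in the smoothing stage selects, among the candidate velocity vectors in a neighborhood, the one with minimum summed distance to all the others; a direct sketch:

```python
import numpy as np

def vector_median(vectors):
    """Return the input vector minimizing the summed Euclidean
    distance to all the others (the vector median)."""
    V = np.asarray(vectors, float)
    # pairwise distance matrix, summed per candidate
    d = np.linalg.norm(V[:, None, :] - V[None, :, :], axis=2).sum(axis=1)
    return V[np.argmin(d)]
```

Unlike a componentwise median, the result is always one of the observed vectors, so a coherent velocity is never fabricated at a motion boundary.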

Proceedings ArticleDOI
23 May 1989
TL;DR: A VLSI architecture that achieves a single-chip real-time implementation of motion estimation is presented, and the versatile architecture of the chip allows displacement vectors to be computed for various sizes of template block and matching window, depending on the pixel rate.
Abstract: A VLSI architecture that achieves a single-chip real-time implementation of motion estimation is presented. The case of a block matching algorithm that computes a displacement vector for each block of a segmented image is considered. The versatile architecture of the chip allows displacement vectors to be computed for various sizes of template block and matching window, depending on the pixel rate. The chip computes minimum and maximum values of distances, and the distances can be randomly accessed from the outside. Displacements of ±7 at video rate, or ±15 for videophone, can be computed for 16×16 blocks.
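In software, the full-search block matching that such chips accelerate is a pair of nested loops over candidate displacements; a reference sketch (SAD distance; the block size and search range are illustrative):

```python
import numpy as np

def full_search(prev, cur, y, x, bs=16, search=7):
    """Exhaustive block matching: try every displacement in
    [-search, search]^2 and keep the one with minimum SAD."""
    tgt = cur[y:y+bs, x:x+bs].astype(float)
    best, best_v = np.inf, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            ys, xs = y + dy, x + dx
            if ys < 0 or xs < 0 or ys + bs > prev.shape[0] or xs + bs > prev.shape[1]:
                continue   # candidate block falls outside the frame
            sad = np.abs(tgt - prev[ys:ys+bs, xs:xs+bs].astype(float)).sum()
            if sad < best:
                best, best_v = sad, (dy, dx)
    return best_v
```

The (2·search+1)² candidate evaluations per block are what motivates the parallel datapaths of the VLSI designs described here.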

Proceedings ArticleDOI
23 May 1989
TL;DR: A novel algorithm for spatial interpolation of 2-to-1 interlaced television pictures based on two spatially varying image models that can be applied to color as well as black-and-white pictures is presented.
Abstract: The authors present a novel algorithm for spatial interpolation of 2-to-1 interlaced television pictures. The basic problem that this algorithm addresses is reconstruction of a frame from one of the fields. The algorithm is based on two spatially varying image models. It can be applied to color as well as black-and-white pictures. Experimental results demonstrate that the resulting frames contain less spatial aliasing and are sharper than frames generated with commonly used approaches involving zero- or first-order interpolation.
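For context, the first-order (line-averaging) baseline the authors compare against reconstructs each missing line as the mean of its two field neighbors. This is a sketch of that baseline, not of the model-based algorithm itself:

```python
import numpy as np

def deinterlace_line_average(field_lines):
    """Reconstruct a full frame from the even field by averaging the
    lines above and below each missing (odd) line."""
    n, w = field_lines.shape
    frame = np.zeros((2 * n, w), float)
    frame[0::2] = field_lines                              # keep field lines
    frame[1:-1:2] = 0.5 * (field_lines[:-1] + field_lines[1:])  # interpolate
    frame[-1] = field_lines[-1]                            # last line: repeat
    return frame
```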

Proceedings ArticleDOI
23 May 1989
TL;DR: The VLSI architecture is based on special data-flow designs that allow sequential inputs but perform parallel processing with 100% efficiency for integer motion vector estimation and nearly 100% for fractional motion vector estimation.
Abstract: VLSI architecture design and implementation of a chip pair for the motion compensation full search block matching algorithm are described. This pair of ASICs (application-specific integrated circuits) is motivated by the intensive computational demands of performing motion compensation in real time. They have been developed to calculate fractional motion vectors with quarter-pel precision. The VLSI architecture is based on some special data-flow designs that allow sequential inputs but perform parallel processing with 100% efficiency for integer motion vector estimation and nearly 100% for fractional motion vector estimation. The chip-pair design has been laid out and simulated using a silicon compiler tool, and the chip statistics are summarized. Testing circuitry is included to increase the observability of the chips.


Proceedings ArticleDOI
01 Nov 1989
TL;DR: Subband coding was found to offer numerical and subjective improvement over DPCM coding alone and stable and better SNRs over non-subband coding schemes.
Abstract: Several methods to perform subband coding of video sequences are studied. The subband decomposition can be used in coding the error frames as well as in estimating the motion in the image sequence. When motion compensation was not used, subband coding was found to offer numerical and subjective improvement over DPCM coding alone. Robust motion estimation using the pel recursive motion estimation was performed with the QMF pyramid in a hierarchical structure. A method for subband coding the residue in a hierarchical motion estimation environment is given. This method gave stable and better SNRs over non-subband coding schemes. Finally motion compensated interpolation and extrapolation are studied using subband coding and motion estimation schemes.

Journal ArticleDOI
TL;DR: A modular and flexible architecture that realizes a parallel algorithm for real-time image template matching is described, which is especially suitable for applications in which adjustments of the dimension of the search area are constantly required.
Abstract: A modular and flexible architecture that realizes a parallel algorithm for real-time image template matching is described. Symmetrically permuted template data (SPTD) are employed in this algorithm to obtain a processing structure with a high degree of parallelism and pipelining, reduce the number of memory accesses to a minimum, and eliminate the use of delay elements that render the dimension of search area to be processed unchangeable. The inherent temporal parallelism and spatial parallelism of the algorithm are fully exploited in developing the hardware architecture. The latter, which is mainly constructed from two types of basic cells, exhibits a high degree of modularity and regularity. The architecture is especially suitable for applications in which adjustments of the dimension of the search area are constantly required. A hardware prototype has been constructed using standard integrated circuits for moving-object detection and interframe motion estimation. It is capable of operating on a search area of size up to 256×256 pixels in real time.

Journal ArticleDOI
TL;DR: This work considers the problem of determining the motion (3-D rotation and translation) of rigid objects from their images taken at two time instants, derives a variety of necessary and sufficient conditions a solution must satisfy, and uses algebraic geometry to bound the number of solutions.
Abstract: We consider the problem of determining motion (3-D rotation and translation) of rigid objects from their images taken at two time instants. We assume that the locations of the perspective projection on the image plane of n points from the surface of the rigid body are known at two time instants. For n = 5, we show that there are at most ten possible motion values (in rotation and translation) and give many examples. For n ≥ 6, we show that the solution is generally unique. We derive a variety of necessary and sufficient conditions a solution must satisfy, show their equivalence, and use algebraic geometry to derive the bound on the number of solutions. A homotopy method is then used to compute all the solutions. Several examples are worked out and our computational experience is summarized.
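The constraint that the n-point correspondences must satisfy is the standard two-view epipolar relation x2ᵀ E x1 = 0, with essential matrix E = [t]× R. The sketch below (illustrative numbers of our own choosing, not from the paper) verifies that projections of five rigidly moved 3-D points satisfy this constraint exactly:

```python
import numpy as np

def skew(t):
    """Cross-product matrix [t]_x so that skew(t) @ v == np.cross(t, v)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

# Ground-truth rigid motion: rotation about the optical axis plus translation
a = np.deg2rad(10.0)
R = np.array([[np.cos(a), -np.sin(a), 0.0],
              [np.sin(a),  np.cos(a), 0.0],
              [0.0,        0.0,       1.0]])
t = np.array([0.3, -0.1, 1.0])
E = skew(t) @ R                       # essential matrix

# Five 3-D points in front of the camera, projected before and after motion
rng = np.random.default_rng(1)
P = rng.uniform(-1.0, 1.0, size=(5, 3)) + np.array([0.0, 0.0, 5.0])
residuals = []
for X in P:
    x1 = X / X[2]                     # perspective projection, first view
    X2 = R @ X + t
    x2 = X2 / X2[2]                   # perspective projection, second view
    residuals.append(x2 @ E @ x1)     # epipolar constraint: should be 0
```

Each correspondence contributes one such polynomial equation in the motion unknowns; with n = 5 points the resulting system has the up-to-ten solutions counted in the paper, while n ≥ 6 generically pins down a unique motion.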

Journal ArticleDOI
TL;DR: This paper examines model-based approaches for motion and structure estimation from a long sequence of images by using Cramer–Rao lower bounds on the estimation error variance, and shows that noisy sequences with fewer than four images often do not contain enough information to permit accurate estimation of motion and structure parameters.
Abstract: The problem considered here involves the use of a sequence of monocular images of a three-dimensional moving object to estimate both its structure and kinematics. The object is assumed to be rigid, and its motion is assumed to be smooth. A set of object match points is assumed to be available, consisting of fixed features on the object, the image plane coordinates of which have been extracted from successive images in the sequence. The measured data are the noisy image plane coordinates of this set of object match points, taken from each image in the sequence. In previous papers [IEEE Trans. Pattern Anal. Mach. Intell. PAMI-8, 90 (1986); in Proceedings of the IEEE Workshop on Motion: Representation and Analysis (Institute of Electrical and Electronics Engineers, New York, 1986), p. 95; in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (Institute of Electrical and Electronics Engineers, New York, 1986), p. 176] we discussed model-based approaches for motion and structure estimation from a long sequence of images. We examine here the performance of such techniques by using Cramer–Rao lower bounds on the estimation error variance. This method permits a priori prediction of estimation accuracy as a function of a number of factors, including the number of images in the sequence, the time at which each image is made, the number of feature points used, the image-plane noise level, and the type of motion that is involved. Theoretical performance predictions are compared with the statistics of Monte Carlo simulation, and it is shown that the actual estimation accuracy is close to the Cramer–Rao bounds in most cases. These results also show that noisy sequences with fewer than four images often do not contain enough information to permit accurate estimation of motion and structure parameters. This conclusion is consistent with the observed instability of so-called two-frame estimation methods in the presence of noise.
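The comparison of Monte Carlo statistics against the Cramer–Rao bound can be illustrated on a deliberately simple one-parameter problem (our own toy example, not the paper's multi-parameter motion model): estimating a scalar from N Gaussian measurements, where the Fisher information is N/σ² and the bound on estimator variance is σ²/N:

```python
import numpy as np

# Estimate a scalar mu from N i.i.d. Gaussian measurements with known
# noise level sigma. Fisher information = N / sigma^2, so the
# Cramer-Rao lower bound on any unbiased estimator's variance is
# sigma^2 / N.
mu, sigma, N, trials = 2.0, 0.5, 20, 20000
crb = sigma**2 / N                      # = 0.0125

rng = np.random.default_rng(42)
samples = rng.normal(mu, sigma, size=(trials, N))
estimates = samples.mean(axis=1)        # ML estimator: the sample mean
mc_var = estimates.var()                # Monte Carlo estimate of variance
```

Here the sample mean is efficient, so the Monte Carlo variance lands essentially on the bound, mirroring the paper's finding that its estimator is close to the Cramer–Rao bound in most cases; the bound also shrinks as N grows, the analogue of accuracy improving with more images in the sequence.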

Proceedings ArticleDOI
01 Jan 1989
TL;DR: Two different techniques based on the study of the Jacobian matrix of optical flow have been implemented; they segment the image plane into regions that distinguish between different kinds of motion, such as translation, rotation, and relative motion, and identify the different moving objects.
Abstract: This note describes two different methods for motion segmentation from optical flow. In the first method, the Jacobian matrix of the first spatial derivatives of the components of optical flow is used to compute the amount of uniform expansion, pure rotation, and shear at every flow point. In the second, local properties of optical flow that are invariant under non-singular linear transformations are computed from the trace and the determinant of the matrix itself. Both methods make it possible to distinguish between different kinds of motion, such as translation, rotation, and relative motion, in sequences of time-varying images. Preliminary results show that they can also be useful for identifying the different moving objects in the viewed scene. Time-varying images provide useful information for the understanding of several visual problems. This information, which can be thought of as encoded in optical flow [1], a dense planar vector field giving the velocity of points over the image plane, appears to be essential for important visual tasks such as passive navigation and dynamic scene understanding (see [2-5] for example). In this note, the optical flow computed with a technique recently proposed in refs. [6,7] is used for motion segmentation. Two different local descriptions of image motion that are invariant under orthogonal transformations of coordinates on the image plane (i.e., arbitrary rotation of the viewing camera around the optical axis) are discussed. In the first description, obtained by viewing the changing image brightness as a 2-D deformable body, the Jacobian matrix of the first spatial derivatives of the components of optical flow is used to compute the amount of uniform expansion, pure rotation, and shear at every flow point.
The second description focuses on local properties of optical flow that are invariant under non-singular linear transformations and can therefore be inferred from the trace and the determinant of the matrix itself. According to these descriptions, image motion can be segmented into regions where either the average percentage of uniform expansion, pure rotation, or shear is larger than a fixed value, or where the qualitative nature of the eigenvalues does not change. Experiments on real images show that, in many cases, these regions roughly correspond to the images of the observed moving objects and make it possible to distinguish between different kinds of 3D motion. The computation of the percentages of uniform expansion, pure rotation, and shear appears to be less sensitive to noise than the study of the qualitative nature of the eigenvalues of the Jacobian matrix. In the case of translation, which is discussed analytically, the percentage of uniform expansion is usually much larger than the percentages of pure rotation and shear, and the two eigenvalues of the Jacobian matrix are real and often almost equal. At boundary points, instead, the shear component is larger and the eigenvalues may have opposite signs. Motion segmentation for rotation and relative motion is also discussed. Finally, it is shown that integrating the presented motion segmentation with other visual cues, such as intensity edges, yields more accurate image segmentation. Some conclusions can be drawn from this analysis. First, it has been shown that motion segmentation can be obtained from optical flow. Two different techniques based on the study of the Jacobian matrix of optical flow have been implemented; they segment the image plane into regions that distinguish between different kinds of motion, such as translation, rotation, and relative motion, and identify the different moving objects.
The presented results complement recent results [8] on qualitative and quantitative properties of the Jacobian matrix at the singular points of optical flow, that is, the points where the flow vanishes. Here, the segmentation and the analysis of the spatial structure of optical flow in the neighborhood of singular points, which were essential for understanding the observed 3D motion in ref. [8], are obtained easily and reliably from local analysis. In fact, the obtained motion segmentation is useful even when no singular point is found in the optical flow. Finally, the technique used for the computation of optical flow [6,7] appears adequate not only for 3D motion recovery from singular points of optical flow [6,7,9] but also for motion and object segmentation.
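The Jacobian decomposition at the heart of these two descriptions can be sketched concretely. The standard split of a 2×2 flow Jacobian into divergence (uniform expansion), curl (pure rotation), and shear, together with the eigenvalue test, is shown below on two synthetic flows; the function name and the example Jacobians are our own illustrations, not the paper's implementation:

```python
import numpy as np

def decompose_jacobian(J):
    """Split the 2x2 Jacobian of the flow (u, v) into uniform
    expansion (divergence), pure rotation (curl), and shear magnitude."""
    div = J[0, 0] + J[1, 1]        # uniform expansion: u_x + v_y
    curl = J[1, 0] - J[0, 1]       # pure rotation: v_x - u_y
    shear1 = J[0, 0] - J[1, 1]     # axis-aligned shear: u_x - v_y
    shear2 = J[0, 1] + J[1, 0]     # diagonal shear: u_y + v_x
    return div, curl, np.hypot(shear1, shear2)

# A purely expanding flow: u = 0.2*x, v = 0.2*y
J_exp = np.array([[0.2, 0.0], [0.0, 0.2]])
# A pure rotation: u = -0.3*y, v = 0.3*x
J_rot = np.array([[0.0, -0.3], [0.3, 0.0]])

d1, c1, s1 = decompose_jacobian(J_exp)      # all expansion, no curl/shear
d2, c2, s2 = decompose_jacobian(J_rot)      # all curl, no expansion/shear
eig_exp = np.linalg.eigvals(J_exp)          # real, equal eigenvalues
eig_rot = np.linalg.eigvals(J_rot)          # purely imaginary eigenvalues
```

The second description's eigenvalue criterion is visible in the last two lines: expansion yields real, nearly equal eigenvalues, while rotation yields purely imaginary ones, so a change in the qualitative nature of the eigenvalues marks a change in the kind of local motion.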