
Showing papers on "Motion analysis" published in 1993


Journal ArticleDOI
TL;DR: A twin-comparison approach has been developed to solve the problem of detecting transitions implemented by special effects, and a motion analysis algorithm is applied to determine whether an actual transition has occurred.
Abstract: Partitioning a video source into meaningful segments is an important step for video indexing. We present a comprehensive study of a partitioning system that detects segment boundaries. The system is based on a set of difference metrics that measure content changes between video frames. A twin-comparison approach has been developed to solve the problem of detecting transitions implemented by special effects. To eliminate the false interpretation of camera movements as transitions, a motion analysis algorithm is applied to determine whether an actual transition has occurred. A technique for determining the threshold for a difference metric and a multi-pass approach to improve computation speed and accuracy have also been developed.
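A minimal sketch of the twin-comparison idea in Python, assuming grayscale NumPy frames; the histogram metric, threshold values, and accumulation rule are illustrative stand-ins for the paper's difference metrics and threshold-selection technique:

```python
import numpy as np

def hist_diff(f1, f2, bins=64):
    # Difference metric: normalized L1 distance between grayscale histograms.
    h1, _ = np.histogram(f1, bins=bins, range=(0, 255))
    h2, _ = np.histogram(f2, bins=bins, range=(0, 255))
    return np.abs(h1 - h2).sum() / f1.size

def twin_comparison(frames, t_high=0.5, t_low=0.1):
    """Return (cut_indices, gradual_transitions) using two thresholds."""
    cuts, gradual = [], []
    start, acc = None, 0.0
    for i in range(1, len(frames)):
        d = hist_diff(frames[i - 1], frames[i])
        if d >= t_high:                      # abrupt cut
            cuts.append(i)
            start, acc = None, 0.0
        elif d >= t_low:                     # potential gradual transition
            if start is None:
                start, acc = i, 0.0
            acc += d
        else:
            # candidate ends: accept if the accumulated change looks like a cut
            if start is not None and acc >= t_high:
                gradual.append((start, i))
            start, acc = None, 0.0
    return cuts, gradual
```

Abrupt cuts trip the high threshold directly, while gradual transitions are caught by accumulating the smaller frame-to-frame differences that fall between the two thresholds.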

1,360 citations


Proceedings ArticleDOI
15 Jun 1993
TL;DR: A set of techniques is devised for segmenting images into coherently moving regions using affine motion analysis and clustering; this makes it possible to decompose an image into a set of layers along with information about occlusion and depth ordering.
Abstract: Standard approaches to motion analysis assume that the optic flow is smooth; such techniques have trouble dealing with occlusion boundaries. The image sequence can be decomposed into a set of overlapping layers, where each layer's motion is described by a smooth flow field. The discontinuities in the description are then attributed to object opacities rather than to the flow itself, mirroring the structure of the scene. A set of techniques is devised for segmenting images into coherently moving regions using affine motion analysis and clustering techniques. It is possible to decompose an image into a set of layers along with information about occlusion and depth ordering. The techniques are applied to a flower garden sequence. The scene can be analyzed into four layers, and the entire 30-frame sequence can be represented with a single image of each layer, along with associated motion parameters.
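As a rough illustration of the affine layer machinery (not the authors' exact algorithm), the sketch below fits a six-parameter affine motion model by least squares and reassigns each pixel to the layer whose model best predicts a precomputed dense flow field; the paper iterates such fitting and clustering in parameter space:

```python
import numpy as np

def fit_affine(x, y, u, v):
    # Least-squares affine motion: u = a0 + a1*x + a2*y, v = a3 + a4*x + a5*y
    A = np.stack([np.ones_like(x), x, y], axis=1)
    pu, _, _, _ = np.linalg.lstsq(A, u, rcond=None)
    pv, _, _, _ = np.linalg.lstsq(A, v, rcond=None)
    return np.concatenate([pu, pv])          # 6 affine parameters

def assign_layers(flow, models):
    """Assign each pixel to the affine model with the smallest flow residual."""
    h, w, _ = flow.shape
    ys, xs = np.mgrid[0:h, 0:w]
    A = np.stack([np.ones(h * w), xs.ravel(), ys.ravel()], axis=1)
    residuals = []
    for m in models:
        pred_u, pred_v = A @ m[:3], A @ m[3:]
        err = (flow[..., 0].ravel() - pred_u) ** 2 + (flow[..., 1].ravel() - pred_v) ** 2
        residuals.append(err)
    return np.argmin(np.stack(residuals), axis=0).reshape(h, w)
```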

344 citations


Journal ArticleDOI
TL;DR: A multimodal approach to motion estimation is presented in which the computation of visual motion is based on several complementary constraints, and it is shown that multiple constraints can provide more accurate flow estimation in a wide range of circumstances.
Abstract: The estimation of dense velocity fields from image sequences is basically an ill-posed problem, primarily because the data only partially constrain the solution. It is rendered especially difficult by the presence of motion boundaries and occlusion regions, which are not taken into account by standard regularization approaches. In this paper, the authors present a multimodal approach to the problem of motion estimation in which the computation of visual motion is based on several complementary constraints. It is shown that multiple constraints can provide more accurate flow estimation in a wide range of circumstances. The theoretical framework relies on Bayesian estimation associated with global statistical models, namely, Markov random fields. The constraints introduced here aim to address the following issues: optical flow estimation while preserving motion boundaries, processing of occlusion regions, and fusion of gradient-based and feature-based motion constraint equations. Deterministic relaxation algorithms are used to merge information and to provide a solution to the maximum a posteriori estimation of the unknown dense motion field. The algorithm is well suited to a multiresolution implementation, which brings an appreciable speed-up as well as a significant improvement in estimation accuracy when large displacements are present in the scene. Experiments on synthetic and real world image sequences are reported.
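For context, the classical gradient constraint with quadratic smoothness that this work generalizes can be sketched as a Horn-Schunck-style iteration; the paper's Markov random field formulation adds boundary-preserving, occlusion, and feature-based terms on top of this baseline, so the code below is only the starting point, with illustrative parameters:

```python
import numpy as np

def horn_schunck(I1, I2, alpha=10.0, iters=100):
    """Dense flow from the gradient constraint Ix*u + Iy*v + It = 0 plus a
    quadratic smoothness term (classical Horn-Schunck iteration)."""
    I1, I2 = I1.astype(float), I2.astype(float)
    Ix = np.gradient(I1, axis=1)
    Iy = np.gradient(I1, axis=0)
    It = I2 - I1
    u = np.zeros_like(I1)
    v = np.zeros_like(I1)
    avg = lambda f: (np.roll(f, 1, 0) + np.roll(f, -1, 0) +
                     np.roll(f, 1, 1) + np.roll(f, -1, 1)) / 4.0
    for _ in range(iters):
        u_bar, v_bar = avg(u), avg(v)
        t = (Ix * u_bar + Iy * v_bar + It) / (alpha ** 2 + Ix ** 2 + Iy ** 2)
        u = u_bar - Ix * t
        v = v_bar - Iy * t
    return u, v
```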

322 citations


Dissertation
01 Jan 1993
TL;DR: A probabilistic "coarse-to-fine" algorithm that functions much like a Kalman filter over scale is developed and it is demonstrated that such a model can account quantitatively for a set of psychophysical data on the perception of moving sinusoidal plaid patterns.
Abstract: The central theme of the thesis is that the failure of image motion algorithms is due primarily to the use of vector fields as a representation for visual motion. We argue that the translational vector field representation is inherently impoverished and error-prone. Furthermore, there is evidence that a direct optical flow representation scheme is not used by biological systems for motion analysis. Instead, we advocate distributed representations of motion, in which the encoding of image plane velocity is implicit. As a simple example of this idea, and in consideration of the errors in the flow vectors, we re-cast the traditional optical flow problem as a probabilistic one, modeling the measurement and constraint errors as random variables. The resulting framework produces probability distributions of optical flow, allowing proper handling of the uncertainties inherent in the optical flow computation and facilitating the combination with information from other sources. We demonstrate the advantages of this probabilistic approach on a set of examples. In order to overcome the temporal aliasing commonly found in time-sampled imagery (e.g., video), we develop a probabilistic "coarse-to-fine" algorithm that functions much like a Kalman filter over scale. We implement an efficient version of this algorithm and show its success in computing Gaussian distributions of optical flow for both synthetic and real image sequences. We then extend the notion of distributed representation to a generalized framework that is capable of representing multiple motions at a point. We develop an example representation through a series of modifications of the differential approach to optical flow estimation. We show that this example is capable of representing multiple motions at a single image location, and we demonstrate its use near occlusion boundaries and on simple synthetic examples containing transparent objects. Finally, we show that these distributed representations are effective as models for biological motion representation. We show qualitative comparisons of stages of the algorithm with neurons found in mammalian visual systems, suggesting experiments to test the validity of the model. We demonstrate that such a model can account quantitatively for a set of psychophysical data on the perception of moving sinusoidal plaid patterns.
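The probabilistic reading of the gradient constraint can be illustrated with a small sketch: treating the constraint and measurement errors as Gaussian random variables turns the usual least-squares flow estimate for a patch into a posterior mean and covariance. The noise parameters below are illustrative, and the thesis's coarse-to-fine Kalman machinery is omitted:

```python
import numpy as np

def flow_distribution(Ix, Iy, It, sigma_n=1.0, sigma_p=10.0):
    """Gaussian distribution over the flow of a patch: the gradient constraint
    Ix*u + Iy*v + It ~ N(0, sigma_n^2) per pixel, with a broad zero-mean prior
    N(0, sigma_p^2 I) on the flow vector. Returns posterior (mean, covariance)."""
    A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)     # n x 2 constraint matrix
    b = -It.ravel()
    precision = A.T @ A / sigma_n**2 + np.eye(2) / sigma_p**2
    cov = np.linalg.inv(precision)
    mean = cov @ (A.T @ b) / sigma_n**2
    return mean, cov    # a broad covariance flags the aperture problem or low texture
```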

161 citations


Journal ArticleDOI
TL;DR: It is shown that the proposed method outperforms the conventional full search method by improving the overall SNR of the prediction error by at least 1 dB and reducing the bit rate by 10%.
Abstract: A general approach to block-matching motion estimation is introduced. It handles the complex motion found in broadcast television signals well by comparing each block of the current frame with a deformed quadrilateral of the previous one. Calculating the extra motion information requires additional operations that increase the computational load, but the improved prediction reduces the bit rate. It is shown that the proposed method outperforms the conventional full-search method, improving the overall SNR of the prediction error by at least 1 dB and reducing the bit rate by 10%.
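A hedged sketch of the core comparison step: the SAD between a block of the current frame and a bilinearly deformed quadrilateral of the previous frame, parameterized by four corner displacements. The search over candidate deformations and the paper's exact interpolation are omitted; nearest-neighbour sampling is used for brevity:

```python
import numpy as np

def warped_block_sad(cur, prev, x0, y0, bs, corner_disp):
    """SAD between a bs x bs block of `cur` at (x0, y0) and a bilinearly deformed
    quadrilateral of `prev`. corner_disp holds (dx, dy) displacements for the
    block's TL, TR, BL, BR corners."""
    ys, xs = np.mgrid[0:bs, 0:bs] / float(bs - 1)      # normalized block coords
    dtl, dtr, dbl, dbr = corner_disp
    # Bilinear interpolation of the corner displacements over the block.
    dx = (1 - ys) * ((1 - xs) * dtl[0] + xs * dtr[0]) + ys * ((1 - xs) * dbl[0] + xs * dbr[0])
    dy = (1 - ys) * ((1 - xs) * dtl[1] + xs * dtr[1]) + ys * ((1 - xs) * dbl[1] + xs * dbr[1])
    # Sample the previous frame at the deformed positions (nearest neighbour).
    px = np.clip(np.round(x0 + xs * (bs - 1) + dx).astype(int), 0, prev.shape[1] - 1)
    py = np.clip(np.round(y0 + ys * (bs - 1) + dy).astype(int), 0, prev.shape[0] - 1)
    block = cur[y0:y0 + bs, x0:x0 + bs].astype(float)
    return np.abs(block - prev[py, px].astype(float)).sum()
```

Setting all four corner displacements equal reduces this to conventional translational block matching, which is the baseline the paper compares against.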

154 citations


Proceedings ArticleDOI
11 May 1993
TL;DR: A physically based deformable model is presented that can be used to track and analyze the non-rigid motion of dynamic structures in time sequences of 2-D or 3-D medical images; it provides a sound framework for modal analysis, which allows a compact representation of a general deformation by a reduced number of parameters.
Abstract: The authors present a physically based deformable model which can be used to track and analyze non-rigid motion of dynamic structures in time sequences of 2-D or 3-D medical images. The model considers an object undergoing an elastic deformation as a set of masses linked by springs, where the natural length of the springs is set equal to zero and is replaced by a set of constant equilibrium forces, which characterize the shape of the elastic structure in the absence of external forces. This model has the attractive property of yielding dynamic equations which are linear and decoupled for each coordinate, irrespective of the amplitude of the deformation. It provides reduced algorithmic complexity and a sound framework for modal analysis, which allows a compact representation of a general deformation by a reduced number of parameters. The power of the approach to segment, track and analyze 2-D and 3-D images is demonstrated by a set of experimental results on various complex medical images.
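A minimal sketch of the resulting dynamics, assuming the stiffness of the zero-natural-length springs has been assembled into a matrix K: the internal force is then linear in the node positions, so one integration step applies independently to each coordinate column. The integrator and parameter names are illustrative, not the authors' implementation:

```python
import numpy as np

def mass_spring_step(X, V, K, f_eq, f_ext, mass=1.0, damping=0.5, dt=0.01):
    """One semi-implicit Euler step for n nodes with positions X (n x d) and
    velocities V (n x d). With zero natural-length springs the internal force
    is -K @ X, so each coordinate column evolves by the same linear equation."""
    F = -K @ X + f_eq + f_ext - damping * V   # K: n x n stiffness matrix
    V = V + dt * F / mass
    X = X + dt * V
    return X, V
```

Here f_eq is the constant equilibrium force field that encodes the rest shape, and f_ext would be image-derived forces pulling the model toward structure boundaries.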

126 citations


BookDOI
01 Jan 1993
TL;DR: An edited volume on motion analysis and image sequence processing, covering hierarchical model-based motion estimation, optical flow computation, motion-compensated processing and filtering, and video coding.
Abstract: Preface.
1. Hierarchical Model-Based Motion Estimation P. Anandan, J.R. Bergen, K.J. Hanna, R. Hingorani.
2. An Estimation Theoretic Perspective on Image Processing and the Calculation of Optical Flow T.M. Chin, M.R. Luettgen, W.C. Karl, A.S. Willsky.
3. Estimation of 2-D Motion Fields from Image Sequences with Application to Motion-Compensated Processing J. Konrad, E. Dubois.
4. Edge-Based 3-D Camera Motion Estimation with Application to Video Coding E. Zakhor, F. Lari.
5. Motion Compensation: Visual Aspects, Accuracy, and Fundamental Limits B. Girod.
6. Motion Field Estimators and their Application to Image Interpolation S. Tubaro, F. Rocca.
7. Subsampling of Digital Image Sequences using Motion Information R.A.F. Belfor, R.L. Lagendijk, J. Biemond.
8. Image Sequence Coding using Motion-Compensated Subband Decomposition A. Nicoulin, M. Mattavelli, W. Li, A. Basso, A. Popat, M. Kunt.
9. Vector Quantization for Video Data Compression R.M. Mersereau, M.J.T. Smith, C.S. Kim, F. Kossentini, K.K. Truong.
10. Model-Based Image Sequence Coding M. Buck, N. Diehl.
11. Human Facial Motion Analysis and Synthesis with Applications to Model-Based Coding K. Aizawa, C.S. Choi, H. Harashima, T.S. Huang.
12. Motion Compensated Spatiotemporal Kalman Filtering J.W. Woods, J. Kim.
13. Multiframe Wiener Restoration of Image Sequences M.K. Ozkan, M.I. Sezan, A.T. Erdem, A.M. Tekalp.
14. 3-D Median Structures for Image Sequence Filtering and Coding T. Viero, Y. Neuvo.
15. Video Compression for Digital ATV Systems J.G. Apostolopoulos, J.S. Lim.
Index.

109 citations


Journal ArticleDOI
TL;DR: This paper provides a validation of an optoelectric motion-tracking system used in a dynamic knee assessment study and suggests that all systems used in two- or three-dimensional motion analysis should be tested similarly in the actual configuration used.

78 citations


Proceedings ArticleDOI
11 May 1993
TL;DR: The authors describe an algorithm based on hierarchical model-based estimation and refinement that aims to make the best use of the information in stereo and motion data sets to estimate scene structure; local ambiguities in the information provided by one data set are resolved by information provided by the other.
Abstract: The authors describe an algorithm based on hierarchical model-based estimation and refinement that aims to make the best use of the information in stereo and motion data sets to estimate scene structure. Information from both data sets is used to simultaneously compute stereo and motion correspondences that are consistent with a single scene structure. One result is that local ambiguities in information provided by one data set are resolved by information provided by the other data set. The algorithm uses an infinitesimal rigid-body motion model to estimate relative camera orientation and local ranges for both the stereo and motion components of the data. If the relative orientation of the cameras in the stereo data set is known, the solution proposed can be re-derived using fixed rather than variable camera parameters for the stereo data set.

73 citations


Patent
04 Jun 1993
TL;DR: In this article, the first and second circuit apparatus, in response to relatively high-resolution image data from an ongoing input series of successive given pixel-density image-data frames that occur at a relatively high frame rate (e.g., 30 frames per second), derive, after a certain processing-system delay, an ongoing output series of successive given pixel-density vector-data frames that occur at the same given frame rate.
Abstract: First circuit apparatus, comprising a given number of prior-art image-pyramid stages, together with second circuit apparatus, comprising the same given number of novel motion-vector stages, perform cost-effective hierarchical motion analysis (HMA) in real time, with minimum system processing delay and/or employing minimum hardware structure. Specifically, the first and second circuit apparatus, in response to relatively high-resolution image data from an ongoing input series of successive given pixel-density image-data frames that occur at a relatively high frame rate (e.g., 30 frames per second), derive, after a certain processing-system delay, an ongoing output series of successive given pixel-density vector-data frames that occur at the same given frame rate. Each vector-data frame is indicative of image motion occurring between each pair of successive image frames.
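The patent describes dedicated pyramid and motion-vector hardware stages; purely as a software illustration of the same coarse-to-fine dataflow, the sketch below builds a simple box-filter pyramid and refines a flow field from the coarsest level down, leaving the per-level estimator abstract:

```python
import numpy as np

def downsample(img):
    # 2x2 box-filter decimation: one level of a simple image pyramid.
    return (img[0::2, 0::2] + img[1::2, 0::2] + img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def hierarchical_flow(f1, f2, estimate_level, levels=3):
    """Coarse-to-fine motion estimation. `estimate_level(a, b, init_flow)` is any
    single-level estimator that refines a flow field between images a and b."""
    m = 2 ** (levels - 1)                       # crop so every level halves cleanly
    h, w = (f1.shape[0] // m) * m, (f1.shape[1] // m) * m
    pyr = [(f1[:h, :w].astype(float), f2[:h, :w].astype(float))]
    for _ in range(levels - 1):
        pyr.append((downsample(pyr[-1][0]), downsample(pyr[-1][1])))
    flow = np.zeros(pyr[-1][0].shape + (2,))
    for a, b in reversed(pyr):
        if flow.shape[:2] != a.shape:           # upsample and rescale the flow
            flow = 2.0 * np.kron(flow, np.ones((2, 2, 1)))
        flow = estimate_level(a, b, flow)
    return flow
```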

63 citations


Journal ArticleDOI
TL;DR: A simple algorithm is presented which matches two single trajectories using only motion information, and a second algorithm is proposed which matches multiple trajectories by combining motion and spatial match scores.

Book
02 Jan 1993
TL;DR: This article reviews some computational studies of vision, focusing on edge detection, binocular stereo, motion analysis, intermediate vision, and object recognition.
Abstract: The computational approach to the study of vision inquires directly into the sort of information processing needed to extract important information from the changing visual image---information such as the three-dimensional structure and movement of objects in the scene, or the color and texture of object surfaces. An important contribution that computational studies have made is to show how difficult vision is to perform, and how complex are the processes needed to perform visual tasks successfully. This article reviews some computational studies of vision, focusing on edge detection, binocular stereo, motion analysis, intermediate vision, and object recognition.

Proceedings ArticleDOI
27 Apr 1993
TL;DR: A sequence is decomposed into a set of overlapping 2-D regions, each region occupying a depth-ordered layer with specified transparency, intensity, and motion, and the resulting representation is efficient and flexible.
Abstract: A sequence is decomposed into a set of overlapping 2-D regions, each region occupying a depth-ordered layer with specified transparency, intensity, and motion. The resulting representation is efficient and flexible. While synthesis is straightforward, the analysis problem is challenging. The authors introduce a set of analysis techniques and demonstrate their use on some of the MPEG (Moving Picture Experts Group) sequences. The MPEG flower garden sequence is decomposed into a set of four static layer images. Given these four images and a small number of motion parameters, one can resynthesize the original 30-frame sequence remarkably well.
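The "straightforward" synthesis side can be made concrete with a hedged sketch: each layer stores an intensity map, an opacity map, and per-frame affine parameters, and a frame is resynthesized by warping and compositing the layers back to front. The nearest-neighbour warping and the dictionary layout are illustrative, not the authors' implementation:

```python
import numpy as np

def warp_affine(img, params, out_shape):
    """Inverse-map an image through an affine transform (nearest neighbour).
    params = (a0, a1, a2, a3, a4, a5): source x = a0 + a1*x + a2*y,
    source y = a3 + a4*x + a5*y for each output pixel (x, y)."""
    a0, a1, a2, a3, a4, a5 = params
    ys, xs = np.mgrid[0:out_shape[0], 0:out_shape[1]]
    sx = np.round(a0 + a1 * xs + a2 * ys).astype(int)
    sy = np.round(a3 + a4 * xs + a5 * ys).astype(int)
    valid = (sx >= 0) & (sx < img.shape[1]) & (sy >= 0) & (sy < img.shape[0])
    out = np.zeros(out_shape)
    out[valid] = img[sy[valid], sx[valid]]
    return out

def resynthesize_frame(layers, out_shape):
    """Composite depth-ordered layers (assumed listed back to front). Each layer
    is a dict with 'intensity', 'alpha' (opacity map) and per-frame 'params'."""
    frame = np.zeros(out_shape)
    for layer in layers:
        tex = warp_affine(layer['intensity'], layer['params'], out_shape)
        alpha = warp_affine(layer['alpha'], layer['params'], out_shape)
        frame = alpha * tex + (1.0 - alpha) * frame
    return frame
```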

Journal ArticleDOI
TL;DR: Both the algorithms for the stereo and motion estimation are presented here, together with some experimental results on images obtained from natural scenes containing motion.
Abstract: In this paper, an approach with combined stereo and motion analysis to establish correspondences in a sequence of stereo images is outlined. The advantages of the presented approach are (1) in both the motion and stereo estimation, no restriction to rigid and/or planar objects is assumed; (2) by introducing the image pyramid into the matching process (pyramid-guided edge-point matching for motion estimation and multi-resolution dynamic programming for disparity estimation), large motion and disparity vectors can be computed easily; (3) in order to exclude ambiguities with the dynamic programming, the cost function takes interline, interframe and multi-resolution spatial information into account. Both the algorithms for the stereo and motion estimation are presented here, together with some experimental results on images obtained from natural scenes containing motion.

Journal ArticleDOI
01 Feb 1993
TL;DR: A new method of 3-D motion estimation of a speaker's head is presented, consisting of two steps, which is more robust than existing methods, even when the motion of an object is rather large or complicated.
Abstract: Model-based image coding applied to interpersonal communication achieves very low bit-rate image transmission. To accomplish it, accurate three-dimensional (3-D) motion estimation of a speaker is necessary. A new method of 3-D motion estimation is presented, consisting of two steps. In the first, facial contours and feature points of a speaker are extracted using filtering and Snake algorithms. Five feature points on a speaker's facial image are tracked between consecutive picture frames, which gives 2-D motion vectors of the feature points. Then, in the second step, the 3-D motion of a speaker's head is estimated using a three-layered neural network model, after training with many possible motion patterns of the human head using an existing 3-D general shape model. Experimental results show that our method not only achieves good results but is also more robust than existing methods, even when the motion of an object is rather large or complicated. Accurately estimated 3-D motion parameters can realise image transmission at a very low bit rate.

Proceedings ArticleDOI
15 Jun 1993
TL;DR: The dynamic retina exploits normally undesirable camera motion as a necessary step in detecting image contrast, using dynamic receptive fields instead of traditional spatial-neighborhood operators; its photoreceptor response function is based on light-adaptation models for vertebrate receptors.
Abstract: The dynamic retina is an efficient, biologically-inspired early vision architecture that is well-suited to active vision platforms. It exploits normally undesirable camera motion as a necessary step in detecting image contrast, using dynamic receptive fields instead of traditional spatial-neighborhood operators. A receptor response function, based on a light-adaptation model for vertebrate receptors, works together with the camera movements to compute spatial image contrast. The dynamic retina also responds to moving objects, producing a clear signature from which motion parameters can be extracted.

Journal ArticleDOI
TL;DR: An enhanced set of displays is developed for an existing opto-electronic device employed for the non-invasive measurement of movement in the upper spine; these are expected to be valuable in improving the accuracy of attempts to identify normal versus pathological motion in the cervical spine.

Journal ArticleDOI
TL;DR: The proposed motion tracking method has been successfully applied to the behavioral analysis of a slug in the biological study of learning and memory formation in slugs, and also to the problem of tracking the boundary of the left ventricle of the heart from time-varying ultrasonic echocardiographic images.

Proceedings ArticleDOI
14 Jun 1993
TL;DR: In this article, the authors consider specular interreflections and explore the effects of both motion parallax and changes in shading on qualitative shape recovery from moving surfaces, concluding that reliable qualitative shape information is generally available only at discontinuities in the image flow field.
Abstract: The authors address the problem of qualitative shape recovery from moving surfaces. The analysis is unique in that they consider specular interreflections and explore the effects of both motion parallax and changes in shading. To study this situation, they define an image flow field called the reflection flow field, which describes the motion of reflection points and the motion of the surface. From a kinematic analysis, they show that the reflection flow is qualitatively different from the motion parallax because it is discontinuous at or near parabolic curves. They also show that when the gradient of the reflected image is strong, gradient-based flow measurement techniques approximate the reflection flow field and not the motion parallax. They conclude that reliable qualitative shape information is generally available only at discontinuities in the image flow field.

Journal ArticleDOI
TL;DR: The peripheral visual system analyzes motion of rigid patterns containing texture boundaries more accurately than does the fovea, consistent with a current model of motion analysis that combines responses of Fourier and non-Fourier motion pathways using a vector sum operation.

Proceedings ArticleDOI
08 Sep 1993
TL;DR: In this article, the authors studied the spatio-temporal shape of receptive fields of simple cells in the monkey visual cortex and proposed a Gaussian Derivative (GD) model that fits these fields well in a transformed variable space.
Abstract: We studied the spatio-temporal shape of "receptive fields" of simple cells in the monkey visual cortex. Receptive fields are maps of the regions in space and time that affect a cell's electrical responses. Fields with no change in shape over time responded to all directions of motion; fields with changing shape over time responded to only some directions of motion. A Gaussian Derivative (GD) model fit these fields well, in a transformed variable space that aligned the centers and principal axes of the field and model in space-time. The model accounts for fields that vary in orientation, location, spatial scale, motion properties, and number of lobes. The model requires only ten parameters (the minimum possible) to describe fields in two dimensions of space and one of time. A difference-of-offset-Gaussians (DOOG) provides a plausible physiological means to form GD model fields. Because of its simplicity, the GD model improves the efficiency of machine vision systems for analyzing motion. An implementation produced robust local estimates of the direction and speed of moving objects in real scenes.

Journal ArticleDOI
TL;DR: A new two-view motion algorithm is presented and then extended to long sequence motion analysis, which automatically finds the proper model that applies to an image sequence and gives the globally optimal solution for the motion and structure parameters under the chosen model.


Journal ArticleDOI
TL;DR: Simulation results show that the accuracy of the most widely used geometric centroid estimation falls short of the theoretical limit and may be outperformed by other algorithms.

Journal ArticleDOI
01 Nov 1993
TL;DR: This paper describes both the architecture and the software of a vision processor for moving-object analysis in time-varying images, which consists of three components corresponding to the three stages of motion analysis.
Abstract: This paper proposes a vision processor for moving-object analysis in time-varying images. The process of motion analysis can be divided into three stages: moving-object candidate detection, object tracking, and final motion analysis. The processor consists of three components corresponding to these three stages. The first is an overall image processing unit with a local parallel architecture; it locates candidate regions for moving objects. The second is a multimicroprocessor system consisting of 16 local modules, each of which tracks one candidate region. The third is the host workstation. In this paper, we describe both the architecture and the software of the vision processor.

Proceedings ArticleDOI
15 Jun 1993
TL;DR: The authors propose an information measure approach, based on comparisons between an individual (or pixel) and a class (or set of pixels), to obtain the optimal motion description from two consecutive images including multiple moving parts.
Abstract: A method to obtain the optimal motion description from two consecutive images including multiple moving parts is presented. It copes with segmentation and motion estimation problems: segmentation is necessary for motion estimation of each part, and vice versa. The authors propose to use an information measure approach, based on comparisons between an individual (or pixel) and a class (or set of pixels). First, the motion of an edge segment is optimally modeled. Next, merging and splitting processes are iterated until the minimum description is obtained for the whole image. As a result, the image is segmented into several regions, each of which is represented by an edge segment list, and, at the same time, the maximum likelihood motion estimation is obtained for each region. Experiments performed on real images are shown.

Journal ArticleDOI
TL;DR: The normal form is derived for Chua's circuit in which the piecewise-linear function is replaced by a cubic nonlinearity, and a partial bifurcation analysis of the normal form equations is used to show how Chua's system can be made to track the motion of low-level image features through parameter variations in the bifurcation function.
Abstract: Low level vision for feature detection, motion analysis, or image segmentation is typically performed in parallel and is computationally intensive. The dynamic nature of scenes and the requirements for real-time processing place further demands upon visual sensing. Dynamical systems which mimic the complexity of natural scenes provide an alternative to traditional computer vision approaches. However, the design of such systems and the synthesis of complex, nonlinear dynamical systems from the interactions of simpler, low-order systems remain a critical problem. One approach to this problem is to use the relative simplicity of Chua's circuit to provide a convenient model for the dynamics and bifurcation phenomena in more complex systems. In this paper the normal form is derived for Chua's circuit in which the piecewise-linear function is replaced by a cubic nonlinearity. A partial bifurcation analysis of the normal form equations is then used to show how Chua's system can be made to track the motion of low-level image features through parameter variations in the bifurcation function.
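For reference, the dimensionless Chua equations with the piecewise-linear element replaced by a cubic nonlinearity can be integrated directly; the parameter values and cubic coefficients below are illustrative, not the paper's normal-form coefficients:

```python
import numpy as np

def chua_cubic(state, alpha=10.0, beta=16.0, a=-1.0 / 6.0, b=1.0 / 16.0):
    """Dimensionless Chua equations with a cubic nonlinearity f(x) = a*x + b*x**3."""
    x, y, z = state
    f = a * x + b * x**3
    return np.array([alpha * (y - x - f), x - y + z, -beta * y])

def integrate(state0, dt=1e-3, steps=50000):
    # Plain RK4 integration of the circuit trajectory.
    traj = np.empty((steps, 3))
    s = np.asarray(state0, dtype=float)
    for i in range(steps):
        k1 = chua_cubic(s)
        k2 = chua_cubic(s + 0.5 * dt * k1)
        k3 = chua_cubic(s + 0.5 * dt * k2)
        k4 = chua_cubic(s + dt * k3)
        s = s + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
        traj[i] = s
    return traj
```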

Journal ArticleDOI
TL;DR: A complex associative memory model based on a neural network architecture is proposed for tracking three-dimensional objects in a dynamic environment and is readily amenable to optoelectronic implementation.
Abstract: A complex associative memory model based on a neural network architecture is proposed for tracking three-dimensional objects in a dynamic environment. The storage representation of the complex associative memory model is based on an efficient amplitude-modulated phase-only matched filter. The input to the memory is derived from the discrete Fourier transform of the edge coordinates of the to-be-recognized moving object, where the edges are obtained through motion-based segmentation of the image scene. An adaptive threshold is used during the decision-making process to indicate a match or identify a mismatch. Computer simulation on real-world data proves the effectiveness of the proposed model. The proposed scheme is readily amenable to optoelectronic implementation.
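As a simplified stand-in for the amplitude-modulated phase-only matched filter (not the paper's exact memory model), the sketch below scores a probe against a stored pattern by phase-only correlation of their DFTs; the decision threshold here is fixed rather than adaptive:

```python
import numpy as np

def phase_only_score(stored, probe):
    """Phase-only correlation between a stored pattern and a probe (e.g. complex
    sequences of edge coordinates). The peak is near 1.0 when the probe is a
    shifted copy of the stored pattern, and lower otherwise."""
    F_s, F_p = np.fft.fft(stored), np.fft.fft(probe)
    cross = F_p * np.conj(F_s)
    R = cross / (np.abs(cross) + 1e-12)        # keep phase, discard amplitude
    return float(np.abs(np.fft.ifft(R)).max())

def matches(stored, probe, threshold=0.5):
    # Fixed illustrative threshold; the paper uses an adaptive threshold.
    return phase_only_score(stored, probe) >= threshold
```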

Proceedings ArticleDOI
15 Jun 1993
TL;DR: An analytical method for recovering the 3-D motion and structure of four or more points from one motion of a stereo rig is described; nonlinear minimization can then be used to improve the result.
Abstract: An analytical method for recovering 3-D motion and structure of four or more points from one motion of a stereo rig is described. The extrinsic parameters are unknown. Because of the exploitation of information redundancy, the approach gains over the traditional motion-and-structure-from-motion approach in that fewer features and fewer motions are required. Thus, a more robust estimation of motion and structure can be obtained. Since the constraint on the rotation matrix is not fully exploited in the analytical method, nonlinear minimization can be used to improve the result. Both computer-simulated data and real data are used to validate the proposed algorithm. Very promising results are obtained.

Proceedings ArticleDOI
R. Hsu, M. Kageyama, H. Fukui, Y. Nakaya, H. Harashima
03 Nov 1993
TL;DR: This paper proposes a novel approach to image coding of the human arm, based on kinematic modeling of the human arm and motion analysis of head-shoulder image sequences; shoulder image sequences are synthesized from one frame of texture image, the wire-frame arm model, and the estimated kinematic motion.
Abstract: Analysis/synthesis image coding is potentially a powerful technique for compressing scenes dominated by head-shoulder images, as in a videotelephone scene with a close-up of the human upper body. In this paper, we propose a novel approach to image coding of the human arm, based on kinematic modeling of the human arm and motion analysis of head-shoulder image sequences. The main characteristics of this approach are that (i) it relies on the assumption that the human arm can be modeled as a kinematic linkage connected by movable joints, (ii) it approximates the 3-D kinematic arm motion from the 2-D image velocities, and (iii) it synthesizes shoulder image sequences from one frame of texture image, the wire-frame arm model, and the estimated kinematic motion. We evaluated our approach in simulations involving a grayscale monocular image sequence of a moving arm.
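A hedged, planar simplification of the kinematic idea: model the arm as a two-link linkage and recover joint angular velocities from observed 2-D image velocities through the arm Jacobian. The link lengths and the least-squares inversion are illustrative; the paper works with a 3-D linkage and a wire-frame model:

```python
import numpy as np

def two_link_points(theta1, theta2, l1=0.3, l2=0.25):
    """Elbow and wrist positions of a planar two-link arm (shoulder at origin)."""
    elbow = np.array([l1 * np.cos(theta1), l1 * np.sin(theta1)])
    wrist = elbow + np.array([l2 * np.cos(theta1 + theta2),
                              l2 * np.sin(theta1 + theta2)])
    return elbow, wrist

def joint_rates_from_image_velocity(theta1, theta2, wrist_vel, l1=0.3, l2=0.25):
    """Recover joint angular velocities from the observed 2-D velocity of the
    wrist, using the arm Jacobian (least squares)."""
    s1, c1 = np.sin(theta1), np.cos(theta1)
    s12, c12 = np.sin(theta1 + theta2), np.cos(theta1 + theta2)
    J = np.array([[-l1 * s1 - l2 * s12, -l2 * s12],
                  [ l1 * c1 + l2 * c12,  l2 * c12]])
    dtheta, *_ = np.linalg.lstsq(J, wrist_vel, rcond=None)
    return dtheta
```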