scispace - formally typeset

Showing papers on "Motion analysis published in 1998"


Proceedings ArticleDOI
04 Jan 1998
TL;DR: This paper presents a comprehensive framework for tracking moving humans in an indoor environment from sequences of synchronized monocular grayscale images captured from multiple fixed cameras.
Abstract: This paper presents a comprehensive framework for tracking moving humans in an indoor environment from sequences of synchronized monocular grayscale images captured from multiple fixed cameras. The proposed framework consists of three main modules: Single View Tracking (SVT), Multiple View Transition Tracking (MVTT), and Automatic Camera Switching (ACS). Bayesian classification schemes based on motion analysis of human features are used to track (spatially and temporally) a subject image of interest between consecutive frames. The automatic camera switching module predicts the position of the subject along a spatial-temporal domain and then selects the camera which provides the best view and requires the least switching to continue tracking. Limited degrees of occlusion are tolerated within the system. Tracking is based upon the images of upper human bodies captured from various viewing angles, and non-human moving objects are excluded using Principal Component Analysis (PCA). Experimental results are presented to evaluate the performance of the tracking system.
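The PCA rejection step could be sketched along these lines: candidate patches are projected onto an eigenspace learned from upper-body images, and a large reconstruction error flags a non-human object. All names, dimensions, thresholds, and data here are illustrative, not from the paper.

```python
import numpy as np

def fit_eigenspace(patches, k):
    """patches: (n_samples, n_pixels) training images of upper bodies."""
    mean = patches.mean(axis=0)
    # SVD of the centered data yields the principal components directly.
    _, _, vt = np.linalg.svd(patches - mean, full_matrices=False)
    return mean, vt[:k]                      # top-k eigenvectors, (k, n_pixels)

def reconstruction_error(patch, mean, basis):
    coeffs = basis @ (patch - mean)          # project into the eigenspace
    recon = mean + basis.T @ coeffs          # back-project
    return float(np.linalg.norm(patch - recon))

rng = np.random.default_rng(0)
subspace = rng.normal(size=(10, 64))             # synthetic 10-D "human" subspace
human = rng.normal(size=(50, 10)) @ subspace     # training patches lie in it
mean, basis = fit_eigenspace(human, k=10)
err_in = reconstruction_error(human[0], mean, basis)
err_out = reconstruction_error(rng.normal(size=64), mean, basis)
print(err_in < err_out)   # an in-class patch reconstructs far better
```

In practice a threshold on the reconstruction error separates human from non-human candidates.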

134 citations


Journal ArticleDOI
T.R. Kronhamn
01 Aug 1998
TL;DR: The author presents a multihypothesis Cartesian Kalman filter (MHCKF) applied to the problem of bearings-only target motion analysis (TMA) from a single moving platform, and a method to adaptively control the ownship motion based on a measure of "available range information" extracted from the MHCKF.

Abstract: The author presents a multihypothesis Cartesian Kalman filter (MHCKF) applied to the problem of bearings-only target motion analysis (TMA) from a single moving platform. A method to adaptively control the ownship motion, based on a measure of "available range information" extracted from the MHCKF, is also presented. The properties of the MHCKF algorithm are discussed qualitatively and illustrated by examples. The adaptive ownship motion is demonstrated and the estimated range is compared with the results from a fixed, two-leg ownship trajectory.
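A single Cartesian filter hypothesis of the kind the MHCKF banks together can be sketched as an extended Kalman filter with a linearized bearing measurement; the state layout, noise levels, and initial range hypothesis below are illustrative assumptions, not Kronhamn's implementation.

```python
import numpy as np

def predict(x, P, dt, q=1e-3):
    """Constant-velocity prediction for state [x, y, vx, vy]."""
    F = np.array([[1, 0, dt, 0],
                  [0, 1, 0, dt],
                  [0, 0, 1, 0],
                  [0, 0, 0, 1]], float)
    return F @ x, F @ P @ F.T + q * np.eye(4)

def update_bearing(x, P, z, ownship, r=1e-4):
    """EKF update with a bearing (atan2) measurement from the ownship."""
    dx, dy = x[0] - ownship[0], x[1] - ownship[1]
    pred = np.arctan2(dy, dx)
    rho2 = dx * dx + dy * dy
    H = np.array([-dy / rho2, dx / rho2, 0.0, 0.0])   # Jacobian of the bearing
    innov = (z - pred + np.pi) % (2 * np.pi) - np.pi  # wrap to [-pi, pi)
    S = H @ P @ H + r
    K = P @ H / S
    return x + K * innov, P - np.outer(K, H @ P)

x = np.array([1000.0, 0.0, 0.0, 5.0])   # one range hypothesis: 1 km east
P = np.diag([1e4, 1e4, 25.0, 25.0])
x, P = predict(x, P, dt=1.0)
x, P = update_bearing(x, P, z=0.01, ownship=(0.0, 0.0))
print(x[:2])
```

The multihypothesis scheme runs several such filters, each initialized at a different assumed range, and weighs them against the bearing data.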

120 citations


Journal ArticleDOI
TL;DR: Detection thresholds were invariant with direction of motion, but direction-discrimination thresholds were significantly higher for motion in oblique directions, even at low coherence levels; control conditions indicate that the oblique effect is defined relative to retinal coordinates.
Abstract: We measured motion-detection and motion-discrimination performance for different directions of motion, using stochastic motion sequences. Random-dot cinematograms containing 200 dots in a circular aperture were used as stimuli in a two-interval forced-choice procedure. In the motion-detection experiment, observers judged which of two intervals contained weak coherent motion, the other interval containing random motion only. In the direction-discrimination experiment, observers viewed a standard direction of motion followed by comparison motion in a slightly different direction. Observers indicated whether the comparison was clockwise or counterclockwise, relative to the standard. Twelve directions of motion were tested in the detection task and five standard directions (three cardinal directions and two oblique directions) in the discrimination task. Detection thresholds were invariant with direction of motion, but direction-discrimination thresholds were significantly higher for motion in oblique directions, even at low coherence levels. Results from control conditions ruled out monitor artifacts and indicate that the oblique effect is relative to retinal coordinates. These results have broad implications for computational and physiological models of motion perception.
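A stimulus of this kind, a two-frame random-dot cinematogram with a given coherence and direction, can be sketched as follows (illustrative parameters; the experiments used circular apertures and full motion sequences):

```python
import numpy as np

def rdk_pair(n_dots=200, coherence=0.1, direction_deg=0.0, step=0.02, rng=None):
    """Return two dot frames: a `coherence` fraction of dots shares one
    displacement; the remaining noise dots are redrawn at random."""
    rng = rng or np.random.default_rng(0)
    f1 = rng.uniform(-1, 1, (n_dots, 2))          # dots in a unit square
    f2 = f1.copy()
    n_signal = int(round(coherence * n_dots))
    theta = np.radians(direction_deg)
    f2[:n_signal] += step * np.array([np.cos(theta), np.sin(theta)])
    f2[n_signal:] = rng.uniform(-1, 1, (n_dots - n_signal, 2))  # noise dots
    return f1, f2

f1, f2 = rdk_pair(coherence=0.1, direction_deg=45.0)
moved = f2[:20] - f1[:20]          # the 20 signal dots share one vector
print(np.allclose(moved, moved[0]))
```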

111 citations


Book ChapterDOI
02 Jun 1998
TL;DR: The main goal of this paper is to put well-established techniques for two-view motion analysis in the context of the theory of Total Least Squares and to make clear that robust and reliable motion analysis algorithms cannot be designed without a thorough statistical consideration of the consequences of errors in the input data.
Abstract: The main goal of this paper is to put well-established techniques for two-view motion analysis in the context of the theory of Total Least Squares and to make clear that robust and reliable motion analysis algorithms cannot be designed without a thorough statistical consideration of the consequences of errors in the input data.

101 citations


Proceedings ArticleDOI
12 May 1998
TL;DR: A hybrid real-time face tracker based on both sound and visual cues is presented that is robust to nonlinear source motions, complex backgrounds, varying lighting conditions, and a variety of source-camera depths.
Abstract: A hybrid real-time face tracker based on both sound and visual cues is presented. Initial talker locations are estimated acoustically from microphone array data while precise localization and tracking are derived from image information. A computationally efficient algorithm for face detection via motion analysis is employed to track individual faces at rates up to 30 frames per second. The system is robust to nonlinear source motions, complex backgrounds, varying lighting conditions, and a variety of source-camera depths. While the direct focus of this work is automated video conferencing, the face tracking capability has utility to many multimedia and virtual reality applications.
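The acoustic stage of such a tracker typically rests on time-delay estimation between microphone pairs; a minimal cross-correlation sketch on synthetic signals (not the authors' array processing) is:

```python
import numpy as np

def tdoa_samples(sig_a, sig_b):
    """Return the lag (in samples) of sig_b relative to sig_a, taken at
    the peak of their cross-correlation."""
    corr = np.correlate(sig_b, sig_a, mode="full")
    return int(np.argmax(corr)) - (len(sig_a) - 1)

rng = np.random.default_rng(1)
speech = rng.normal(0, 1, 1000)                     # stand-in talker signal
delay = 7
mic_a = speech
mic_b = np.concatenate([np.zeros(delay), speech[:-delay]])  # delayed copy
print(tdoa_samples(mic_a, mic_b))  # → 7
```

The estimated delay, together with the microphone geometry, constrains the talker's bearing, which then seeds the visual tracker.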

96 citations


Proceedings ArticleDOI
16 Aug 1998
TL;DR: The approach exploits a statistical analysis of the temporal distribution of appropriate local motion-based measures to perform a global motion characterization, considering motion features extracted from temporal cooccurrence matrices and related to properties of homogeneity, acceleration, or complexity.
Abstract: This paper describes an original approach for motion interpretation with a view to content-based video indexing. We exploit a statistical analysis of the temporal distribution of appropriate local motion-based measures to perform a global motion characterization. We consider motion features extracted from temporal cooccurrence matrices, and related to properties of homogeneity, acceleration or complexity. Results on various real video sequences are reported and provide a first validation of the approach.
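A temporal cooccurrence matrix and one derived feature could be sketched as follows; the quantization and the homogeneity formula are illustrative assumptions, not the paper's exact definitions.

```python
import numpy as np

def cooccurrence(q, levels):
    """q: (n_frames, h, w) quantized motion measures. C[i, j] counts how
    often level i at time t co-occurs with level j at time t+1."""
    C = np.zeros((levels, levels))
    for a, b in zip(q[:-1], q[1:]):                  # consecutive frames
        np.add.at(C, (a.ravel(), b.ravel()), 1)
    return C / C.sum()

def homogeneity(C):
    """Mass near the diagonal: high for temporally steady motion."""
    i, j = np.indices(C.shape)
    return float((C / (1.0 + np.abs(i - j))).sum())

steady = np.tile(np.arange(4).reshape(2, 2), (5, 1, 1))  # same map each frame
rng = np.random.default_rng(0)
jumpy = rng.integers(0, 4, (5, 2, 2))                    # uncorrelated in time
h_steady = homogeneity(cooccurrence(steady, 4))
h_jumpy = homogeneity(cooccurrence(jumpy, 4))
print(h_steady, h_jumpy)   # steady motion scores higher
```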

84 citations


Proceedings Article
01 Dec 1998
TL;DR: From synthetic data, the relationships between image and scene patches, and between a scene patch and neighboring scene patches, are modeled; this yields an efficient method to form low-level scene interpretations.
Abstract: We seek the scene interpretation that best explains image data. For example, we may want to infer the projected velocities (scene) which best explain two consecutive image frames (image). From synthetic data, we model the relationship between image and scene patches, and between a scene patch and neighboring scene patches. Given a new image, we propagate likelihoods in a Markov network (ignoring the effect of loops) to infer the underlying scene. This yields an efficient method to form low-level scene interpretations. We demonstrate the technique for motion analysis and estimating high resolution images from low-resolution ones.
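The propagation step can be illustrated on a loop-free chain, where the same message-passing rule is exact; the evidence vectors and compatibility matrix below are made-up numbers, not learned from data as in the paper.

```python
import numpy as np

def chain_marginals(evidence, compat):
    """Two-pass belief propagation for discrete states on a chain.
    evidence: (n_nodes, k) local likelihoods; compat: (k, k) pairwise term."""
    n, k = evidence.shape
    fwd = [np.ones(k)]
    for t in range(1, n):                          # left-to-right messages
        m = compat.T @ (evidence[t - 1] * fwd[-1])
        fwd.append(m / m.sum())
    bwd = [np.ones(k)]
    for t in range(n - 2, -1, -1):                 # right-to-left messages
        m = compat @ (evidence[t + 1] * bwd[0])
        bwd.insert(0, m / m.sum())
    marg = evidence * np.array(fwd) * np.array(bwd)
    return marg / marg.sum(axis=1, keepdims=True)

evidence = np.array([[0.9, 0.1], [0.5, 0.5], [0.2, 0.8]])  # per-patch likelihoods
compat = np.array([[0.8, 0.2], [0.2, 0.8]])                # neighbors tend to agree
marg = chain_marginals(evidence, compat)
print(marg[1])   # the ambiguous middle patch is disambiguated by its neighbors
```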

73 citations


Journal ArticleDOI
TL;DR: A 4-D polar transformation is defined to describe left-ventricle (LV) motion, and a method is presented to estimate it from sequences of 3-D images, with a demonstration of its feasibility on a series of gated SPECT sequences.

70 citations


Proceedings ArticleDOI
04 Jan 1998
TL;DR: A novel method is presented for the shape and motion estimation of a deformable model using error residuals from model-based motion analysis; experiments demonstrate that this framework is a considerable improvement over one that uses only optical flow information and edges.
Abstract: We present a novel method for the shape and motion estimation of a deformable model using error residuals from model-based motion analysis. The motion of the model is first estimated using a model-based least squares method. Using the residuals from the least squares solution, the non-rigid structure of the model can be better estimated by computing how changes in the shape of the model affect its motion parameterization. This method is implemented as a component in a deformable model-based framework that uses optical flow information and edges. This general model-based framework is applied to human face shape and motion estimation. We present experiments that demonstrate that this framework is a considerable improvement over a framework that uses only optical flow information and edges.

65 citations


Journal ArticleDOI
TL;DR: The application of a video-based motion analysis system for goniometry of finger joints during measurement of the fingertip motion area indicates that the results are comparable with those obtained by a conventional goniometer.

Abstract: The application of a video-based motion analysis system for goniometry of finger joints during measurement of the fingertip motion area has been assessed. The results indicate that the motion analysis system is reliable for angular measurements of finger joints, comparable with those obtained by a conventional goniometer. The advantage of using the motion analysis system is that it can record and show the changes in angle of all finger joints continuously during finger motion.

65 citations


Proceedings ArticleDOI
23 Jun 1998
TL;DR: A multiple-cue-based localization scheme, combined with a tracking framework to reliably track human arm dynamics in unconstrained environments, is proposed, together with an interaction scheme between tracking and localization that improves the estimation process while reducing the computational requirements.

Abstract: The use of hand gestures provides an attractive means of interacting naturally with a computer-generated display. Using one or more video cameras, hand movements can potentially be interpreted as meaningful gestures. One key problem in building such an interface without a restricted setup is the ability to localize and track the human arm robustly in image sequences. This paper proposes a multiple-cue-based localization scheme combined with a tracking framework to reliably track the human arm dynamics in unconstrained environments. The localization scheme integrates the multiple cues of motion, shape, and color for locating a set of key image features. Using constraint fusion, these features are tracked by a modified Extended Kalman Filter that exploits the articulated structure of the arm. We also propose an interaction scheme between tracking and localization for improving the estimation process while reducing the computational requirements. The performance of the framework is validated with the help of extensive experiments and simulations.

Journal ArticleDOI
Zhengyou Zhang
TL;DR: The author shows that, given a reasonable initial guess of the epipolar geometry, the last two criteria are equivalent when the epipoles are at infinity, and differ from each other only a little even when the epipoles are in the image, as shown experimentally.
Abstract: The three best-known criteria in two-view motion analysis are based, respectively, on the distances between points and their corresponding epipolar lines, on the gradient-weighted epipolar errors, and on the distances between points and the re-projections of their reconstructed points. The last one has a better statistical interpretation, but is significantly slower than the first two. The author shows that, given a reasonable initial guess of the epipolar geometry, the last two criteria are equivalent when the epipoles are at infinity, and differ from each other only a little even when the epipoles are in the image, as shown experimentally. The first two criteria are equivalent only when the epipoles are at infinity and when the observed object/scene has the same scale in the two images. This suggests that the second criterion is sufficient in practice because of its computational efficiency. Experiments with several thousand computer simulations and four sets of real data confirm the analysis.
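The first criterion above reduces to a point-to-line distance; a minimal sketch (with an arbitrary illustrative fundamental matrix, not one estimated from the paper's data) is:

```python
import numpy as np

def epipolar_distance(F, x, xp):
    """Distance from homogeneous point xp (second image) to the epipolar
    line F @ x of its correspondence x (first image)."""
    l = F @ x                          # line coefficients (a, b, c)
    return abs(xp @ l) / np.hypot(l[0], l[1])

# Fundamental matrix of a pure x-translation: epipolar lines are image rows.
F = np.array([[0, 0, 0],
              [0, 0, -1],
              [0, 1, 0]], float)
x = np.array([2.0, 3.0, 1.0])
on_line = np.array([5.0, 3.0, 1.0])    # same row: zero epipolar error
off_line = np.array([5.0, 4.0, 1.0])   # one row off
print(epipolar_distance(F, x, on_line), epipolar_distance(F, x, off_line))
```

The gradient-weighted and reprojection criteria replace this raw distance with statistically weighted variants, which is exactly the comparison the paper makes.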

Proceedings ArticleDOI
23 Jun 1998
TL;DR: A new direct estimation method for motion estimation and 3D reconstruction from stereo image sequences obtained by a stereo rig moving through a rigid world is described.
Abstract: We investigate the relationship between the kinematics (infinitesimal motion model) of a calibrated stereo rig and point and line image feature measurements seen at two time instances of the rig's motion (four images in all). In particular, we are interested in the byproduct of this analysis providing a direct connection between the spatio-temporal derivatives of the images at two time instances and the kinematics of the 3D motion of the rig. We establish a fundamental result showing that 3 quadruples of point-line-line-line matches (i.e., a point in the reference image and lines coincident with the corresponding points in the remaining three images) are sufficient for a unique linear solution for the kinematics of the rig. In other words, the projected instantaneous motion of "one and a half" 3D lines is sufficient for recovering the kinematics of the moving rig. In particular, spatio-temporal derivatives across 3 points are sufficient for a direct estimation of the rig's motion. Consequently, we describe a new direct estimation method for motion estimation and 3D reconstruction from stereo image sequences obtained by a stereo rig moving through a rigid world. Correspondences (optic flow) are not required, as spatio-temporal derivatives are used instead. One can then use the images from both pairs combined to compute a dense depth map. Finally, since the basic equations are linear, we combine the contribution coming from all pixels in the image using a least squares approach.

Proceedings ArticleDOI
16 Aug 1998
TL;DR: A new algorithm is presented for feature point based motion tracking in long image sequences; dynamic scenes with multiple, independently moving objects are considered, in which feature points may temporarily disappear, enter, and leave the field of view.

Abstract: A new algorithm is presented for feature point based motion tracking in long image sequences. Dynamic scenes with multiple, independently moving objects are considered, in which feature points may temporarily disappear, enter, and leave the field of view. The existing approaches to feature point tracking have limited capabilities in handling incomplete trajectories, especially when the number of points and their speeds are large and trajectory ambiguities are frequent. The proposed algorithm was designed to efficiently resolve these ambiguities.

Proceedings ArticleDOI
23 Jun 1998
TL;DR: This work describes the generation of a large pose-mosaic dataset: a collection of several thousand digital images, grouped by spatial position into spherical mosaics, each annotated with estimates of the acquiring camera's 6 DOF pose in an absolute coordinate system.
Abstract: We describe the generation of a large pose-mosaic dataset: a collection of several thousand digital images, grouped by spatial position into spherical mosaics, each annotated with estimates of the acquiring camera's 6 DOF pose (3 DOF position and 3 DOF orientation) in an absolute coordinate system. The pose-mosaic dataset was generated by acquiring images, grouped by spatial position into nodes (essentially, spherical mosaics). A prototype mechanical pan-tilt head was manually deployed to acquire the data. Manual surveying provided initial position estimates for each node. A back-projecting scheme provided initial rotational estimates. Relative rotations within each node, along with internal camera parameters, were refined automatically by an optimization-correlation scheme. Relative translations and rotations among nodes were refined according to point correspondences, generated automatically and by a human operator. The resulting pose-imagery is self-consistent under a variety of evaluation metrics. Pose-mosaics are useful "first-class" data objects, for example in automatic reconstruction of textured 3D CAD models which represent urban exteriors.

Journal ArticleDOI
TL;DR: A procedure is derived to determine the optimal block size that minimizes the encoding rate for a typical block-based video coder; the resulting formula shows that the best block size is a function of the accuracy with which the motion vectors are encoded and of several parameters related to key characteristics of the video scene.
Abstract: Despite the widespread experience with block-based video coders, there is little analysis or theory that quantitatively explains the effect of block size on encoding bit rate, and ordinarily the block size for a coder is chosen based on empirical experiments on video sequences of interest. In this work, we derive a procedure to determine the optimal block size that minimizes the encoding rate for a typical block-based video coder. To do this, we analytically model the effect of block size and derive expressions for the encoding rates for both motion vectors and difference frames as functions of block size. Minimizing these expressions leads to a simple formula that indicates how to choose the block size in these types of coders. This formula also shows that the best block size is a function of the accuracy with which the motion vectors are encoded and several parameters related to key characteristics of the video scene, such as image texture, motion activity, interframe noise, and coding distortion. We implement the video coder and use our analysis to optimize and explain its performance on real video frames.
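The trade-off the analysis captures can be illustrated with a toy rate model; the two expressions below are made-up stand-ins for the paper's derived rates, chosen only to show how a minimizing block size emerges.

```python
# Toy rate model: motion-vector rate falls as block size b grows (fewer
# blocks per frame), while the residual (difference-frame) rate grows
# (one vector fits the block's motion less well). All constants are
# illustrative, not the paper's derived parameters.

def total_rate(b, width=352, height=288, bits_per_mv=12.0, residual_coef=0.05):
    n_blocks = (width / b) * (height / b)
    mv_rate = n_blocks * bits_per_mv                      # falls like 1/b^2
    residual_rate = width * height * residual_coef * b    # grows with b
    return mv_rate + residual_rate

best = min([4, 8, 16, 32], key=total_rate)
print(best)   # the rate is minimized at an intermediate block size
```

In the paper, minimizing the analogous closed-form expressions yields the formula relating the optimal block size to motion-vector accuracy and scene characteristics.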


Journal ArticleDOI
TL;DR: A two-stage model of motion perception is developed that identifies moving spatial features and computes their velocity, achieving both high spatial localisation and reliable estimates of velocity; this allows it to simulate human visual performance in the detection of noise images, transparent motion, some motion illusions, and second-order motion.

Journal ArticleDOI
TL;DR: The method allows one to operate the motion analysis system in real-time; even when the data elaboration unit is required to perform other processing functions, the only consequence is a decrease in system sampling rate.
Abstract: A method for real-time motion analysis based on passive markers is presented. An opto-electronic automatic motion analyser was used as hardware platform and the real-time operation was based on the interfacing between two levels of the system architecture. True real-time acquisition, processing and representation of two-dimensional and three-dimensional kinematics data were implemented through a newly conceived data acquisition procedure and high speed optimisation of the kinematics data processing. The method allows one to operate the motion analysis system in real-time; even when the data elaboration unit is required to perform other processing functions, the only consequence is a decrease in system sampling rate. The maximum number of processed and plotted markers in three dimensions at the highest system sampling rate (100 Hz) turned out to be suitable for the implementation of analytical and visual kinematics biofeedback. An example of the achievable level of complexity in terms of marker disposition model and graphic representation is reported by describing a demonstration of the real-time representation of human face movements. A clinical application of the method for patient position definition and control at radiotherapy units is presented.

Journal ArticleDOI
TL;DR: An object-based video coding system with new ideas in both the motion analysis and source encoding procedures is described; it produced visually more pleasing video with less blurriness and devoid of block artifacts, confirming the advantages of object-based coding at very low bit-rates.
Abstract: This paper describes an object-based video coding system with new ideas in both the motion analysis and source encoding procedures. The moving objects in a video are extracted by means of a joint motion estimation and segmentation algorithm based on the Markov random field (MRF) model. The two important features of the presented technique are the temporal linking of the objects, and the guidance of the motion segmentation with spatial color information. This facilitates several aspects of an object-based coder. First, a new temporal updating scheme greatly reduces the bit rate to code the object boundaries without resorting to crude lossy approximations. Next, the uncovered regions can be extracted and encoded in an efficient manner by observing their revealed contents. The objects are classified adaptively as P objects or I objects and encoded accordingly. Subband/wavelet coding is applied in encoding the object interiors. Simulations at very low bit rates yielded comparable performance in terms of reconstructed PSNR to the H.263 coder. The object-based coder produced visually more pleasing video with less blurriness and devoid of block artifacts, thus confirming the advantages of object-based coding at very low bit-rates.

Patent
23 Dec 1998
TL;DR: In this paper, a method and apparatus for processing and encoding video data is presented that allocates available bandwidth, and hence image quality, in dependence upon the relative speed of motion of objects in a sequence of images forming the video data.
Abstract: A method and apparatus for processing and encoding video data is presented that allocates available bandwidth, and hence image quality, in dependence upon the relative speed of motion of objects in a sequence of images forming the video data. Fast moving objects are allocated less quality, or precision, than slower moving or stationary objects. In a preferred embodiment of this invention, the quantization step size is dependent upon the magnitude of the motion vector associated with each block in each frame of a video sequence. In a further embodiment of this invention, the quantization step size is also dependent upon the location of each block in each frame, providing more precision to a central area of each frame. To reduce computational complexity, a motion activity map is created to identify areas of higher precision based upon the location and motion associated with each block. To further reduce computational complexity in a preferred embodiment, the sets of parameters for effecting the desired quality levels are predefined, and include, for example, an initial value and bounds for the quantizing factor that is used for encoding independent and predictive frames of the sequence of images. In a further preferred embodiment, the sets of parameters for effecting the desired quality levels are adjustable based upon a user's preferences.
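The core allocation rule might be sketched as follows, under assumed numbers: the quantization step for a block grows with the magnitude of its motion vector, so fast-moving blocks are coded more coarsely. The base step, scale, and clamp are illustrative values, not from the patent.

```python
import math

def quant_step(mv, base=4.0, scale=0.5, max_step=31.0):
    """Map a block's motion vector (dx, dy) to a quantization step:
    larger motion -> coarser quantization, clamped to a maximum."""
    magnitude = math.hypot(mv[0], mv[1])
    return min(max_step, base + scale * magnitude)

print(quant_step((0, 0)), quant_step((6, 8)), quant_step((60, 80)))
# stationary block: 4.0; moving: 9.0; very fast: clamped at 31.0
```

The patent additionally biases the step by block location (finer quantization near the frame center), which would add a second term to the same mapping.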

Patent
01 Oct 1998
TL;DR: In this paper, a spatiotemporal finite element mesh model is used for non-rigid cyclic motion analysis using a series of images acquired from phase contrast magnetic resonance imaging.
Abstract: Disclosed is a method for nonrigid cyclic motion analysis using a series of images covering the cycle, acquired, for example, from phase contrast magnetic resonance imaging. The method is based on fitting a global spatiotemporal finite element mesh model to motion data samples of an extended region at all time frames. A spatiotemporal model is composed of time-varying finite elements, with the nonrigid motion of each characterized by a set of Fourier harmonics. The model is suitable for accurately modeling the kinematics of a cyclically moving and deforming object with complex geometry, such as that of the myocardium. The model has controllable built-in smoothing in space and time for achieving satisfactory reproducibility in the presence of noise. Motion data measured, with PC MRI for example, can be used to quantify motion and deformation by fitting the model to data.
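The temporal half of such a model can be sketched as a truncated Fourier series fit by linear least squares; the harmonic count and test signal below are illustrative, and the actual method fits this jointly with spatial finite elements.

```python
import numpy as np

def fit_harmonics(t, y, n_harm, period):
    """Least-squares fit of y(t) with a DC term plus n_harm Fourier
    harmonics of the given period."""
    w = 2 * np.pi / period
    cols = [np.ones_like(t)]
    for k in range(1, n_harm + 1):
        cols += [np.cos(k * w * t), np.sin(k * w * t)]
    A = np.stack(cols, axis=1)
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return A, coef

t = np.linspace(0, 1, 40, endpoint=False)           # one cycle, 40 frames
y = 2.0 + 1.5 * np.sin(2 * np.pi * t) - 0.4 * np.cos(4 * np.pi * t)
A, coef = fit_harmonics(t, y, n_harm=2, period=1.0)
resid = float(np.max(np.abs(A @ coef - y)))
print(resid)   # two harmonics reproduce this cyclic signal almost exactly
```

Truncating the harmonic count is what provides the built-in temporal smoothing the abstract mentions.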

Journal ArticleDOI
TL;DR: A low-cost REal-TIme Motion Analysis Chip, RETIMAC, is presented, which is suitable for dynamic scene analysis in computer vision applications and implements a gradient-based solution demonstrated to be more reliable and precise than several solutions proposed in the literature.

Abstract: Motion estimation is relevant for applications of both motion-compensated image sequence processing and dynamic scene analysis in computer vision. Different approaches and solutions have been proposed for these two application fields. In some cases, parallel architectures and dedicated chips for real-time motion estimation have been developed. In this paper, a low-cost REal-TIme Motion Analysis Chip, RETIMAC, is presented, which is suitable for dynamic scene analysis in computer vision applications. This chip is capable of estimating optical flow fields in real-time, and has been especially developed for project OFCOMP (Optical Flow for COunting Moving People) DTM 45 ESPRIT III MEPI. It can also be used profitably for autonomous navigation, tracking, surveillance, counting moving objects, measuring velocity, etc., and for several computer vision applications which require as a first processing step the estimation of the apparent velocity of each pixel in the image plane (e.g., optical flow, velocity field). RETIMAC implements a gradient-based solution which has been demonstrated to be more reliable and precise than several solutions proposed in the literature.

Journal ArticleDOI
TL;DR: In this paper, a simple method for the estimation of global motion parameters from sparse translational vector fields is presented, where differential operations on pairs of adjacent motion vectors are used to derive local estimates of rotation and scale.
Abstract: A simple method for the estimation of global motion parameters from sparse translational vector fields is presented. Differential operations on pairs of adjacent motion vectors are used to derive local estimates of rotation and change of scale. A majority vote is then applied to identify global trends corresponding to the required estimates. Key features of the proposed technique are its low complexity and its compatibility with standardised motion estimation tools.
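The per-pair estimate can be derived under a small-angle similarity model (an assumption for this sketch, not necessarily the paper's exact formulation): the difference of two motion vectors relates linearly to the difference of their positions, dv = [[s-1, -r], [r, s-1]] @ dp, and a vote over pairs (here a median) rejects outliers from independently moving objects.

```python
import numpy as np

def pair_estimate(dp, dv):
    """Solve dv = [[s-1, -r], [r, s-1]] @ dp for scale change s-1 and
    rotation r (radians) from one pair of adjacent motion vectors."""
    n2 = dp[0] ** 2 + dp[1] ** 2
    scale = (dv[0] * dp[0] + dv[1] * dp[1]) / n2   # s - 1
    rot = (dv[1] * dp[0] - dv[0] * dp[1]) / n2     # rotation
    return scale, rot

rng = np.random.default_rng(0)
pos = rng.uniform(-100, 100, (30, 2))              # block positions
true_scale, true_rot = 0.02, 0.01
M = np.array([[true_scale, -true_rot], [true_rot, true_scale]])
vec = pos @ M.T + np.array([3.0, -1.0])            # synthetic global motion field
est = [pair_estimate(pos[i] - pos[i - 1], vec[i] - vec[i - 1])
       for i in range(1, 30)]
scale_hat = float(np.median([e[0] for e in est]))  # the "vote" over all pairs
rot_hat = float(np.median([e[1] for e in est]))
print(round(scale_hat, 4), round(rot_hat, 4))
```

Because the translation term cancels in the differences, only rotation and scale survive, which is what makes the method compatible with standard block-matching motion vectors.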

Proceedings ArticleDOI
04 Oct 1998
TL;DR: The presented technique provides a way to address the 3-D recognition problem when rigid motion is assumed, and shows that information about translation, rotation and uniform scaling of a 2-D or 3D image can be represented in its 1-D projections.
Abstract: A new technique for matching a 2-D or 3-D image to a translated, rotated, and uniformly scaled reference image is proposed. This method comprises three steps: (1) calculating the Radon transform of the reference image and the images to be matched, (2) calculating the Fourier invariant (FI) descriptor for each 1-D projection image, and (3) matching the FI descriptor in the projection space. The theory shows that information about translation, rotation and uniform scaling of a 2-D or 3-D image can be represented in its 1-D projections; thus a 2-D or 3-D matching can be accomplished as a set of 1-D operations. In particular, the presented technique provides a way to address the 3-D recognition problem when rigid motion is assumed.
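Step (2) rests on the fact that the Fourier magnitude of a 1-D projection is invariant to translation along the projection axis; the toy below uses a plain row-sum projection and omits the Radon geometry that handles rotation and scale.

```python
import numpy as np

def fi_descriptor(image):
    """Translation-invariant descriptor of one 1-D projection: project the
    image onto the x axis and take the Fourier magnitude."""
    proj = image.sum(axis=0)
    return np.abs(np.fft.fft(proj))

img = np.zeros((16, 16))
img[4:8, 3:9] = 1.0                     # a small rectangle
shifted = np.roll(img, 5, axis=1)       # translate (circularly) in x
d1, d2 = fi_descriptor(img), fi_descriptor(shifted)
print(np.allclose(d1, d2))  # → True
```

In the full method, descriptors of many Radon projections are matched jointly, reducing 2-D or 3-D matching to a set of such 1-D comparisons.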

Journal ArticleDOI
TL;DR: A segmentation approach combining local spatial homogeneity with motion homogeneity is proposed, showing the promise of such techniques for the automatic construction of hierarchical representations of time-varying images, allowing flexible access to and manipulation of video content.

Proceedings ArticleDOI
09 Jan 1998
TL;DR: Simulation results show that successful moving object detection is performed at the macroblock level on several test sequences, and the proposed method has a significant advantage as a motion analysis tool.

Abstract: We describe a method of moving object detection directly from MPEG coded data. Since motion information in MPEG coded data is determined from a coding-efficiency point of view, it does not always provide the real motion information of objects. We use a wide variety of coding information, including motion vectors and DCT coefficients, to estimate real object motion. Since such information can be obtained directly from the coded bitstream, very fast operation can be expected. Moving objects are detected basically by analyzing motion vectors and the spatio-temporal correlation of motion in P- and B-pictures. Moving objects are also detected in intra macroblocks by analyzing the coding characteristics of intra macroblocks in P- and B-pictures and by investigating temporal motion continuity in I-pictures. The simulation results show that successful moving object detection is performed at the macroblock level on several test sequences. Since the proposed method is very simple and requires much less computational power than conventional object detection methods, it has a significant advantage as a motion analysis tool.
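The motion-vector stage might be sketched as follows: a macroblock counts as moving only when its vector is large and spatially supported by its neighbors, suppressing vectors that reflect coding decisions rather than object motion. The thresholds here are assumptions, not the paper's values.

```python
import numpy as np

def neighbor_count(mask):
    """Number of True 8-neighbors for each cell of a boolean grid."""
    p = np.pad(mask.astype(int), 1)
    h, w = mask.shape
    return sum(p[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
               for dy in (-1, 0, 1) for dx in (-1, 0, 1)
               if (dy, dx) != (0, 0))

def moving_mask(mv_field, mag_thresh=1.0, min_neighbors=3):
    """mv_field: (rows, cols, 2) macroblock motion vectors."""
    mag = np.hypot(mv_field[..., 0], mv_field[..., 1])
    cand = mag > mag_thresh
    return cand & (neighbor_count(cand) >= min_neighbors)

mv = np.zeros((6, 6, 2))
mv[1:4, 1:4] = (3.0, 0.0)        # a coherent moving region
mv[5, 5] = (4.0, 0.0)            # an isolated, probably spurious vector
mask = moving_mask(mv)
print(int(mask.sum()), bool(mask[5, 5]))  # → 9 False
```

The paper extends this with temporal correlation across P- and B-pictures and a separate treatment of intra macroblocks, which this toy omits.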

Journal ArticleDOI
TL;DR: Maintaining camera horizontal and vertical separations above a sum of 30 degrees is sufficient for clinical testing, and there is a need to explore the errors involved in placing two cameras at less than 60 degrees.

Book ChapterDOI
04 Jan 1998
TL;DR: The spatial blur and temporal smear effects induced by the camera's finite aperture and shutter speed are used for inferring both the shape and motion of the imaged objects.
Abstract: This paper addresses 3D shape recovery and motion estimation using a realistic camera model with an aperture and a shutter. The spatial blur and temporal smear effects induced by the camera's finite aperture and shutter speed are used for inferring both the shape and motion of the imaged objects.

Proceedings ArticleDOI
23 Jun 1998
TL;DR: The authors develop an original method which relies on an orthogonality constraint between the spatial image gradient field and the motion model velocity field, while explicitly formalizing and handling both model and measurement noises.
Abstract: This paper is concerned with the analysis of 2D fluid motion from numerical images. The interpretation of such deformable flow fields can be derived from the characterization of linear motion models provided that first order approximations are considered in an adequate neighborhood of so-called singular points where the velocity becomes null. However, locating such points, delimiting this neighborhood, and estimating the associated 2D affine motion model, are intricate difficult problems. We explicitly address these three joint problems according to a statistical adaptive approach. In the fluid mechanics images we are dealing with, the motion model can be directly inferred from a single image, since the visualized form accounts for the underlying motion. We have developed an original method which relies on an orthogonality constraint between the spatial image gradient field and the motion model velocity field, while explicitly formalizing and handling both model and measurement noises. This method has been validated on several real fluid flow images.