Journal ArticleDOI

The computation of optical flow

01 Sep 1995-ACM Computing Surveys (ACM)-Vol. 27, Iss: 3, pp 433-466
TL;DR: The computation of optical flow is investigated in this survey: widely known methods for estimating optical flow are classified and examined by scrutinizing the hypotheses and assumptions they use.
Abstract: Two-dimensional image motion is the projection of the three-dimensional motion of objects, relative to a visual sensor, onto its image plane. Sequences of time-ordered images allow the estimation of projected two-dimensional image motion as either instantaneous image velocities or discrete image displacements. These are usually called the optical flow field or the image velocity field. Provided that optical flow is a reliable approximation to two-dimensional image motion, it may then be used to recover the three-dimensional motion of the visual sensor (to within a scale factor) and the three-dimensional surface structure (shape or relative depth) through assumptions concerning the structure of the optical flow field, the three-dimensional environment, and the motion of the sensor. Optical flow may also be used to perform motion detection, object segmentation, time-to-collision and focus of expansion calculations, motion compensated encoding, and stereo disparity measurement. We investigate the computation of optical flow in this survey: widely known methods for estimating optical flow are classified and examined by scrutinizing the hypotheses and assumptions they use. The survey concludes with a discussion of current research issues.
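The differential techniques examined in the survey start from the brightness-constancy assumption, which linearizes to the gradient constraint Ix*u + Iy*v + It = 0. A minimal sketch of that constraint (the synthetic pattern, the `make_frame` helper, and the velocity values below are invented for illustration, not taken from the survey):

```python
import numpy as np

def make_frame(shift_x, shift_y, size=32):
    """A smooth intensity pattern translated by (shift_x, shift_y)."""
    y, x = np.mgrid[0:size, 0:size].astype(float)
    return np.sin(0.3 * (x - shift_x)) + np.cos(0.2 * (y - shift_y))

u_true, v_true = 1.0, 0.5            # ground-truth image velocity (pixels/frame)
f0 = make_frame(0.0, 0.0)
f1 = make_frame(u_true, v_true)

# central-difference spatial gradients and a forward temporal difference
Ix = np.gradient(f0, axis=1)
Iy = np.gradient(f0, axis=0)
It = f1 - f0

# the constraint residual Ix*u + Iy*v + It is near zero at the true velocity,
# up to discretization error; it is far from zero for the wrong velocity
residual = Ix * u_true + Iy * v_true + It
print(np.abs(residual[4:-4, 4:-4]).mean())   # small, limited by discretization
```

A single such equation is underdetermined per pixel (the aperture problem), which is why the methods the survey classifies add extra constraints, e.g. local constancy or global smoothness of the flow field.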


Citations
Journal ArticleDOI
TL;DR: A review of recent as well as classic image registration methods to provide a comprehensive reference source for the researchers involved in image registration, regardless of particular application areas.

6,842 citations


Cites methods from "The computation of optical flow"

  • ...Finally, the optical flow approach was originally motivated by estimation of relative motion between images [19]....


Journal ArticleDOI
TL;DR: A comprehensive survey of efforts in the past couple of decades to address the problems of representation, recognition, and learning of human activities from video and related applications is presented.
Abstract: The past decade has witnessed a rapid proliferation of video cameras in all walks of life and has resulted in a tremendous explosion of video content. Several applications such as content-based video annotation and retrieval, highlight extraction and video summarization require recognition of the activities occurring in the video. The analysis of human activities in videos is an area with increasingly important consequences from security and surveillance to entertainment and personal archiving. Several challenges at various levels of processing-robustness against errors in low-level processing, view and rate-invariant representations at midlevel processing and semantic representation of human activities at higher level processing-make this problem hard to solve. In this review paper, we present a comprehensive survey of efforts in the past couple of decades to address the problems of representation, recognition, and learning of human activities from video and related applications. We discuss the problem at two major levels of complexity: 1) "actions" and 2) "activities." "Actions" are characterized by simple motion patterns typically executed by a single human. "Activities" are more complex and involve coordinated actions among a small number of humans. We will discuss several approaches and classify them according to their ability to handle varying degrees of complexity as interpreted above. We begin with a discussion of approaches to model the simplest of action classes known as atomic or primitive actions that do not require sophisticated dynamical modeling. Then, methods to model actions with more complex dynamics are discussed. The discussion then leads naturally to methods for higher level representation of complex activities.

1,426 citations


Cites background from "The computation of optical flow"

  • ...Examples of such complex actions include the steps in a ballet dancing video, a juggler juggling a ball, and a music conductor conducting an orchestra using complex hand gestures....


Book
18 Feb 2002
TL;DR: The new edition of Feature Extraction and Image Processing provides an essential guide to the implementation of image processing and computer vision techniques, explaining techniques and fundamentals in a clear and concise manner, and features a companion website that includes worksheets, links to free software, Matlab files, solutions and new demonstrations.
Abstract: Image processing and computer vision are currently hot topics with undergraduates and professionals alike. "Feature Extraction and Image Processing" provides an essential guide to the implementation of image processing and computer vision techniques, explaining techniques and fundamentals in a clear and concise manner. Readers can develop working techniques, with usable code provided throughout and working Matlab and Mathcad files on the web. Focusing on feature extraction while also covering issues and techniques such as image acquisition, sampling theory, point operations and low-level feature extraction, the authors have a clear and coherent approach that will appeal to a wide range of students and professionals. The new edition includes: new coverage of curvature in low-level feature extraction (SIFT and saliency) and features (phase congruency); geometric active contours; morphology; camera models; and updated coverage of image smoothing (anisotropic diffusion), skeletonization, edge detection, curvature, and shape descriptions (moments). It is essential reading for engineers and students working in this cutting-edge field. It is an ideal module text and background reference for courses in image processing and computer vision. It features a companion website that includes worksheets, links to free software, Matlab files, solutions and new demonstrations.

929 citations


Cites background from "The computation of optical flow"

  • ...The major survey (Beauchemin and Barron, 1995) of the approaches to optical flow is rather dated now, as is their performance appraisal (Barron et al....


Journal ArticleDOI
TL;DR: 3-D model-based approaches have the capability to improve the diagnostic value of cardiac images, but issues such as robustness, 3-D interaction, computational complexity and clinical validation still require significant attention.
Abstract: Three-dimensional (3-D) imaging of the heart is a rapidly developing area of research in medical imaging. Advances in hardware and methods for fast spatio-temporal cardiac imaging are extending the frontiers of clinical diagnosis and research on cardiovascular diseases. In the last few years, many approaches have been proposed to analyze images and extract parameters of cardiac shape and function from a variety of cardiac imaging modalities. In particular, techniques based on spatio-temporal geometric models have received considerable attention. This paper surveys the literature of two decades of research on cardiac modeling. The contribution of the paper is three-fold: (1) to serve as a tutorial of the field for both clinicians and technologists, (2) to provide an extensive account of modeling techniques in a comprehensive and systematic manner, and (3) to critically review these approaches in terms of their performance and degree of clinical evaluation with respect to the final goal of cardiac functional analysis. From this review it is concluded that whereas 3-D model-based approaches have the capability to improve the diagnostic value of cardiac images, issues such as robustness, 3-D interaction, computational complexity and clinical validation still require significant attention.

625 citations


Cites methods from "The computation of optical flow"

  • ...6For a survey of optic flow methods in computer vision, see Beauchemin and Barron [81]....


  • ...[81] S. S. Beauchemin and J. L. Barron, “The computation of optical flow,” ACM Comput....


References
Journal ArticleDOI
TL;DR: The analogy between images and statistical mechanics systems is made and the analogous operation under the posterior distribution yields the maximum a posteriori (MAP) estimate of the image given the degraded observations, creating a highly parallel ``relaxation'' algorithm for MAP estimation.
Abstract: We make an analogy between images and statistical mechanics systems. Pixel gray levels and the presence and orientation of edges are viewed as states of atoms or molecules in a lattice-like physical system. The assignment of an energy function in the physical system determines its Gibbs distribution. Because of the equivalence between Gibbs distributions and Markov random fields (MRFs), this assignment also determines an MRF image model. The energy function is a more convenient and natural mechanism for embodying picture attributes than are the local characteristics of the MRF. For a range of degradation mechanisms, including blurring, nonlinear deformations, and multiplicative or additive noise, the posterior distribution is an MRF with a structure akin to the image model. By the analogy, the posterior distribution defines another (imaginary) physical system. Gradual temperature reduction in the physical system isolates low energy states ("annealing"), or what is the same thing, the most probable states under the Gibbs distribution. The analogous operation under the posterior distribution yields the maximum a posteriori (MAP) estimate of the image given the degraded observations. The result is a highly parallel "relaxation" algorithm for MAP estimation. We establish convergence properties of the algorithm and we experiment with some simple pictures, for which good restorations are obtained at low signal-to-noise ratios.
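The annealing scheme this abstract describes can be sketched on a toy problem: MAP restoration of a binary image under an Ising-style smoothness prior plus a data-fidelity term, using a Gibbs sampler with a decreasing temperature schedule. All energy weights, the cooling schedule, and the test image below are invented for the sketch:

```python
import numpy as np

rng = np.random.default_rng(1)

# ground truth: two flat regions; observation: truth with ~15% of pixels flipped
truth = np.zeros((16, 16), dtype=int)
truth[:, 8:] = 1
obs = truth ^ (rng.random(truth.shape) < 0.15).astype(int)

def local_energy(img, i, j, val, beta=2.0, lam=1.5):
    """Ising smoothness term over the 4-neighbourhood plus a data term."""
    e = 0.0
    for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        ni, nj = i + di, j + dj
        if 0 <= ni < img.shape[0] and 0 <= nj < img.shape[1]:
            e += beta * (val != img[ni, nj])
    return e + lam * (val != obs[i, j])

img = obs.copy()
for sweep in range(30):
    T = 2.0 * 0.9 ** sweep                 # geometric cooling schedule
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            e0 = local_energy(img, i, j, 0)
            e1 = local_energy(img, i, j, 1)
            p1 = 1.0 / (1.0 + np.exp((e1 - e0) / T))   # Gibbs update
            img[i, j] = int(rng.random() < p1)

# restored image typically has far fewer errors than the noisy observation
print((img != truth).mean())
```

As the temperature falls, the Gibbs updates become nearly deterministic and the sampler settles into a low-energy (high-posterior) configuration, which is the "relaxation" view of MAP estimation in the paper.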

18,761 citations


"The computation of optical flow" refers to methods in this paper

  • ...One strategy to handle occlusion involves using binary line processes that explicitly model intensity discontinuities [Geman and Geman 1984]....


Proceedings Article
24 Aug 1981
TL;DR: In this paper, the spatial intensity gradient of the images is used to find a good match using a type of Newton-Raphson iteration, which can be generalized to handle rotation, scaling and shearing.
Abstract: Image registration finds a variety of applications in computer vision. Unfortunately, traditional image registration techniques tend to be costly. We present a new image registration technique that makes use of the spatial intensity gradient of the images to find a good match using a type of Newton-Raphson iteration. Our technique is faster because it examines far fewer potential matches between the images than existing techniques. Furthermore, this registration technique can be generalized to handle rotation, scaling and shearing. We show how our technique can be adapted for use in a stereo vision system.
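For pure 2-D translation, the Newton-Raphson registration idea in this abstract reduces to a linear least-squares solve built from image gradients. A rough illustration under that restriction (function and variable names are ours, and a smooth Gaussian blob stands in for real images; the paper's full method also handles rotation, scaling and shearing):

```python
import numpy as np

def lk_translation_step(ref, cur):
    """One least-squares update for the shift d that best aligns cur to ref."""
    Ix = np.gradient(ref, axis=1)
    Iy = np.gradient(ref, axis=0)
    It = cur - ref
    # normal equations of  [Ix Iy] d = -It  accumulated over all pixels
    A = np.array([[np.sum(Ix * Ix), np.sum(Ix * Iy)],
                  [np.sum(Ix * Iy), np.sum(Iy * Iy)]])
    b = -np.array([np.sum(Ix * It), np.sum(Iy * It)])
    return np.linalg.solve(A, b)

# a smooth test pattern shifted by a known subpixel amount
y, x = np.mgrid[0:64, 0:64].astype(float)
ref = np.exp(-((x - 32) ** 2 + (y - 32) ** 2) / 100.0)
cur = np.exp(-((x - 32 - 0.4) ** 2 + (y - 32 - 0.2) ** 2) / 100.0)

d = lk_translation_step(ref, cur)
print(d)   # close to [0.4, 0.2]
```

In the full iterative scheme this step is repeated, warping `cur` by the current estimate each time, which is what makes it a Newton-Raphson-style iteration rather than a single linearization.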

12,944 citations

Journal ArticleDOI
TL;DR: A technique for image encoding in which local operators of many scales but identical shape serve as the basis functions, which tends to enhance salient image features and is well suited for many image analysis tasks as well as for image compression.
Abstract: We describe a technique for image encoding in which local operators of many scales but identical shape serve as the basis functions. The representation differs from established techniques in that the code elements are localized in spatial frequency as well as in space. Pixel-to-pixel correlations are first removed by subtracting a lowpass filtered copy of the image from the image itself. The result is a net data compression since the difference, or error, image has low variance and entropy, and the low-pass filtered image may be represented at reduced sample density. Further data compression is achieved by quantizing the difference image. These steps are then repeated to compress the low-pass image. Iteration of the process at appropriately expanded scales generates a pyramid data structure. The encoding process is equivalent to sampling the image with Laplacian operators of many scales. Thus, the code tends to enhance salient image features. A further advantage of the present code is that it is well suited for many image analysis tasks as well as for image compression. Fast algorithms are described for coding and decoding.
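The encode/decode cycle described above can be sketched in a few lines: each level stores the difference between the image and an upsampled low-pass copy, and the low-pass image is recursively compressed. A 2x box filter stands in here for the paper's Gaussian-like kernel, and quantization is omitted, so reconstruction is exact:

```python
import numpy as np

def down(img):
    """Low-pass and 2x subsample (box filter stand-in for a Gaussian)."""
    return img.reshape(img.shape[0] // 2, 2, img.shape[1] // 2, 2).mean(axis=(1, 3))

def up(img):
    """2x nearest-neighbour expand."""
    return img.repeat(2, axis=0).repeat(2, axis=1)

def build_pyramid(img, levels):
    pyr = []
    for _ in range(levels):
        low = down(img)
        pyr.append(img - up(low))    # band-pass "Laplacian" level
        img = low
    pyr.append(img)                  # coarsest low-pass residue
    return pyr

def reconstruct(pyr):
    img = pyr[-1]
    for lap in reversed(pyr[:-1]):
        img = up(img) + lap
    return img

rng = np.random.default_rng(0)
image = rng.random((32, 32))
pyr = build_pyramid(image, 3)
print(np.allclose(reconstruct(pyr), image))   # True: lossless without quantization
```

The compression in the paper comes from quantizing the low-variance difference levels and storing the low-pass residue at reduced sample density; optical flow methods reuse the same structure for coarse-to-fine motion estimation.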

6,975 citations


"The computation of optical flow" refers to methods in this paper

  • ...A Laplacian pyramid is used to hierarchically represent the image data [Burt and Adelson 1983], and motion estimation is performed by SSD minimization with respect to a particular model of motion....


Journal ArticleDOI
TL;DR: The theory of edge detection explains several basic psychophysical findings, and the operation of forming oriented zero-crossing segments from the output of centre-surround ∇2G filters acting on the image forms the basis for a physiological model of simple cells.
Abstract: A theory of edge detection is presented. The analysis proceeds in two parts. (1) Intensity changes, which occur in a natural image over a wide range of scales, are detected separately at different scales. An appropriate filter for this purpose at a given scale is found to be the second derivative of a Gaussian, and it is shown that, provided some simple conditions are satisfied, these primary filters need not be orientation-dependent. Thus, intensity changes at a given scale are best detected by finding the zero values of ∇²G(x,y)*I(x,y) for image I, where G(x,y) is a two-dimensional Gaussian distribution and ∇² is the Laplacian. The intensity changes thus discovered in each of the channels are then represented by oriented primitives called zero-crossing segments, and evidence is given that this representation is complete. (2) Intensity changes in images arise from surface discontinuities or from reflectance or illumination boundaries, and these all have the property that they are spatially localized. Because of this, the zero-crossing segments from the different channels are not independent, and rules are deduced for combining them into a description of the image. This description is called the raw primal sketch. The theory explains several basic psychophysical findings, and the operation of forming oriented zero-crossing segments from the output of centre-surround ∇²G filters acting on the image forms the basis for a physiological model of simple cells (see Marr & Ullman 1979).
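The two-stage recipe above (smooth with a Gaussian, take the Laplacian, mark zero-crossings) can be sketched as follows; the discrete kernels and the step-edge test image are illustrative stand-ins for the paper's continuous ∇²G filters:

```python
import numpy as np

def gaussian_kernel(sigma, radius):
    """Normalized 2-D Gaussian kernel of the given radius."""
    ax = np.arange(-radius, radius + 1, dtype=float)
    xx, yy = np.meshgrid(ax, ax)
    g = np.exp(-(xx ** 2 + yy ** 2) / (2 * sigma ** 2))
    return g / g.sum()

def convolve2d(img, k):
    """'Valid' 2-D convolution via shifted accumulation."""
    r = k.shape[0] // 2
    out = np.zeros((img.shape[0] - 2 * r, img.shape[1] - 2 * r))
    for i in range(k.shape[0]):
        for j in range(k.shape[1]):
            out += k[i, j] * img[i:i + out.shape[0], j:j + out.shape[1]]
    return out

def log_zero_crossings(img, sigma=1.5):
    smooth = convolve2d(img, gaussian_kernel(sigma, 4))
    # discrete 5-point Laplacian of the smoothed image
    lap = (smooth[:-2, 1:-1] + smooth[2:, 1:-1] +
           smooth[1:-1, :-2] + smooth[1:-1, 2:] - 4 * smooth[1:-1, 1:-1])
    # a pixel is an edge if the Laplacian changes sign horizontally or vertically
    zc_h = (lap[:, :-1] * lap[:, 1:]) < 0
    zc_v = (lap[:-1, :] * lap[1:, :]) < 0
    return zc_h[:-1, :] | zc_v[:, :-1]

# a vertical step edge between columns 15 and 16
img = np.zeros((32, 32))
img[:, 16:] = 1.0
edges = log_zero_crossings(img)   # marks a single column of zero-crossings
```

The key property the paper proves is that these zero-crossings sit at intensity changes at the filter's scale; combining the segments across several sigmas yields the raw primal sketch.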

6,893 citations

Journal ArticleDOI
TL;DR: These comparisons are primarily empirical, and concentrate on the accuracy, reliability, and density of the velocity measurements; they show that performance can differ significantly among the techniques the authors implemented.
Abstract: While different optical flow techniques continue to appear, there has been a lack of quantitative evaluation of existing methods. For a common set of real and synthetic image sequences, we report the results of a number of regularly cited optical flow techniques, including instances of differential, matching, energy-based, and phase-based methods. Our comparisons are primarily empirical, and concentrate on the accuracy, reliability, and density of the velocity measurements; they show that performance can differ significantly among the techniques we implemented.

4,771 citations