
Showing papers on "Motion estimation published in 1991"


Journal ArticleDOI
TL;DR: Direct methods for recovering the motion of an observer in a static environment in the case of pure rotation, pure translation, and arbitrary motion when the rotation is known are developed.
Abstract: We have developed direct methods for recovering the motion of an observer in a static environment in the case of pure rotation, pure translation, and arbitrary motion when the rotation is known. Some of these methods are based on the minimization of the difference between the observed time derivative of brightness and that predicted from the spatial brightness gradient, given the estimated motion. We minimize the integral of the square of this difference taken over the image region of interest. Other methods presented here exploit the fact that surfaces have to be in front of the observer in order to be seen. We do not establish point correspondences, nor do we estimate the optical flow. We use only first-order derivatives of the image brightness, and we do not assume an analytic form for the surface. We show that the field of view should be large to accurately recover the components of motion in the direction toward the image region. We also demonstrate the importance of points where the time derivative of brightness is small and discuss difficulties resulting from very large depth ranges. We emphasize the need for adequate filtering of the image data before sampling to avoid aliasing, in both the spatial and temporal dimensions.
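The least-squares machinery behind such direct methods can be sketched for the simplest possible case, a single global 2-D image translation recovered from first-order brightness derivatives alone. This is an illustrative reduction, not the paper's 3-D formulation:

```python
import numpy as np

def global_translation(Ex, Ey, Et):
    """Least-squares (u, v) minimizing sum((Ex*u + Ey*v + Et)**2) over
    the image region: the brightness-change constraint applied to a
    single global 2-D translation, using only first derivatives."""
    A = np.stack([Ex.ravel(), Ey.ravel()], axis=1)
    b = -Et.ravel()
    (u, v), *_ = np.linalg.lstsq(A, b, rcond=None)
    return u, v

# Synthetic derivatives consistent with I(x, y) = x^2 + y^2 translating by (1, 2).
ys, xs = np.mgrid[0:8, 0:8].astype(float)
Ex, Ey = 2 * xs, 2 * ys
Et = -(Ex * 1.0 + Ey * 2.0)
u, v = global_translation(Ex, Ey, Et)
```

No point correspondences or optical flow field are computed; the motion parameters come straight from the brightness derivatives, which is the defining feature of a direct method.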

379 citations


Proceedings ArticleDOI
03 Jun 1991
TL;DR: A highly parallel incremental stochastic minimization algorithm is presented which has a number of advantages over previous approaches and the incremental nature of the scheme makes it dynamic and permits the detection of occlusion and disocclusion boundaries.
Abstract: A novel approach to incrementally estimating visual motion over a sequence of images is presented. The authors start by formulating constraints on image motion to account for the possibility of multiple motions. This is achieved by exploiting the notions of weak continuity and robust statistics in the formulation of a minimization problem. The resulting objective function is non-convex. Traditional stochastic relaxation techniques for minimizing such functions prove inappropriate for the task. A highly parallel incremental stochastic minimization algorithm is presented which has a number of advantages over previous approaches. The incremental nature of the scheme makes it dynamic and permits the detection of occlusion and disocclusion boundaries.

293 citations


Journal ArticleDOI
Atul Puri1, Rangarajan Aravind1
TL;DR: The authors address the problem of adapting the Motion Picture Experts Group (MPEG) quantizer for scenes of different complexity (at bit rates around 1 Mb/s), such that the perceptual quality of the reconstructed video is optimized.
Abstract: The authors address the problem of adapting the Motion Picture Experts Group (MPEG) quantizer for scenes of different complexity (at bit rates around 1 Mb/s), such that the perceptual quality of the reconstructed video is optimized. Adaptive quantization techniques conforming to the MPEG syntax can significantly improve the performance of the encoder. The authors concentrate on a one-pass causal scheme to limit the complexity of the encoder. The system employs prestored models for perceptual quality and bit rate that have been experimentally derived. A framework is provided for determining these models as well as adapting them to locally varying scene characteristics. The variance of an 8×8 (luminance) block is basic to the techniques developed. Following standard practice, it is defined as the average of the square of the deviations of the pixels in the block from the mean pixel value.
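The block variance definition given in the abstract is straightforward to state in code:

```python
import numpy as np

def block_variance(block):
    """Variance of a luminance block as defined in the text: the mean of
    the squared deviations of the pixels from the mean pixel value."""
    block = np.asarray(block, dtype=float)
    return float(np.mean((block - block.mean()) ** 2))

flat = np.full((8, 8), 128.0)                # uniform block: variance 0
half = np.zeros((8, 8)); half[:, :4] = 2.0   # mean 1, deviations +/-1
```

In adaptive schemes of this kind, such a variance serves as a cheap local activity measure for steering the quantizer step size.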

201 citations


Journal ArticleDOI
TL;DR: Results of an experiment with real imagery are presented, involving estimation of 28 unknown translational, rotational, and structural parameters, based on 12 images with seven feature points.
Abstract: The problem considered involves the use of a sequence of noisy monocular images of a three-dimensional moving object to estimate both its structure and kinematics. The object is assumed to be rigid, and its motion is assumed to be smooth. A set of object match points is assumed to be available, consisting of fixed features on the object, the image plane coordinates of which have been extracted from successive images in the sequence. Structure is defined as the 3-D positions of these object feature points, relative to each other. Rotational motion occurs about the origin of an object-centered coordinate system, while translational motion is that of the origin of this coordinate system. In this work, which is a continuation of the research done by the authors and reported previously (ibid., vol.PAMI-8, p.90-9, Jan. 1986), results of an experiment with real imagery are presented, involving estimation of 28 unknown translational, rotational, and structural parameters, based on 12 images with seven feature points.

200 citations


Journal ArticleDOI
Norbert Diehl1
TL;DR: In this paper, a method for segmenting video scenes hierarchically into several differently moving objects and subobjects is presented, where both contour and texture information from the single images and information from successive images are used to split up a scene into various objects.
Abstract: This contribution presents a method for segmenting video scenes hierarchically into several differently moving objects and subobjects. To this end, both contour and texture information from the single images and information from successive images are used to split up a scene into various objects. Furthermore, each of these objects is characterized by a transform h(x, T) with a parameter vector T which implicitly describes the surface shape and the three-dimensional motion of the objects in the scene. In order to estimate T for these transforms, an efficient algorithm is introduced. Thus, we obtain an object-oriented segmentation and a prediction of the image contents from one image to the next, which can be used in low bit-rate image coding.

177 citations


Proceedings ArticleDOI
07 Oct 1991
TL;DR: A layered model of scene segmentation based on explicitly representing the support of a homogeneous region is introduced, which employs parallel robust estimation techniques, and uses a minimal-covering optimization to estimate the number of objects in the scene.
Abstract: In order to recover an accurate representation of a scene containing multiple moving objects, one must use estimation methods that can recover both model parameters and segmentation at the same time. Traditional approaches to this problem rely on an edge-based discontinuity model, and have problems with transparent phenomena. The authors introduce a layered model of scene segmentation based on explicitly representing the support of a homogeneous region. The model employs parallel robust estimation techniques, and uses a minimal-covering optimization to estimate the number of objects in the scene. Using a simple direct motion model of translating objects, they successfully segment real image sequences containing multiple motions.

164 citations


Patent
11 Jun 1991
TL;DR: In this article, a block-based motion estimation method is used to determine motion vectors for blocks of pixels in a current frame or field, where only a portion of blocks from a predetermined pattern of blocks in the current frame or field are searched for a match with a block of pixels from a previous frame or field over a designated search area.
Abstract: In an image coding system, block based motion estimation is used to determine motion vectors for blocks of pixels in a current frame or field. Only a portion of blocks from a predetermined pattern of blocks in a current frame or field are searched for a match with a block of pixels in a previous frame or field over a designated search area. Motion vectors for the blocks of the current frame or field not selected for searching are obtained by interpolation from motion vectors obtained for associated searched neighboring blocks, respectively, or by using the zero motion vector, or by performing a search over a limited range around the position defined by each of the motion vectors from neighboring blocks.
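The interpolation fallback for unsearched blocks can be sketched as a component-wise average of the neighboring searched vectors, reverting to the zero motion vector when none are available. This is one plausible reading of the fallbacks described; the exact rule is not specified here:

```python
def fallback_vector(neighbor_vectors):
    """Motion vector for a block not selected for searching: the
    component-wise average of the vectors of its searched neighboring
    blocks, or the zero motion vector when none are available
    (illustrative; the patent's precise interpolation is assumed)."""
    if not neighbor_vectors:
        return (0.0, 0.0)  # zero-motion-vector fallback
    n = len(neighbor_vectors)
    return (sum(v[0] for v in neighbor_vectors) / n,
            sum(v[1] for v in neighbor_vectors) / n)
```

The appeal of the scheme is that only the blocks in the predetermined pattern pay the cost of a full search; the rest are filled in essentially for free.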

160 citations


Proceedings ArticleDOI
07 Oct 1991
TL;DR: In this paper, a method for segmenting monocular images of people in motion from a cinematic sequence of frames is described, based on image intensities, motion, and an object model.
Abstract: A method for segmenting monocular images of people in motion from a cinematic sequence of frames is described. This method is based on image intensities, motion, and an object model-i.e., a model of the image of a person in motion. Though each part of a person may move in different directions at any instant, the time averaged motion of all parts must converge to a global average value over a few seconds. People in an image may be occluded by other people, and usually it is not easy to detect their boundaries. These boundaries can be detected with motion information if they move in different directions, even if there are almost no apparent differences among object intensities or colors. Each image of a person in a scene usually can be divided into several parts, each with distinct intensities or colors. The parts of a person can be merged into a single group by an iterative merging algorithm based on the object model and the motion information because the parts move coherently. This merging is analogous to the property of perceptual grouping in human visual perception of motion. Experiments based on a sequence of complex real scenes produced results that are supportive of the authors' approach to the segmentation of people in motion.

136 citations


Proceedings ArticleDOI
14 Apr 1991
TL;DR: Simulations suggest the algorithm is robust and accurate, and can significantly reduce both the energy of the motion compensated residual image as well as the zeroth-order entropy of the local displacement vector field.
Abstract: An algorithm is presented for estimating and compensating camera zooms and pans. It models the global motion in each frame with just two parameters: a zoom factor and a two-dimensional pan vector, both based on local displacement vectors found by conventional means (such as block matching). Since motion by objects in the scene obscures global motion, the algorithm can iterate to refine its estimate. Simulations suggest the algorithm is robust and accurate, and can significantly reduce both the energy of the motion compensated residual image and the zeroth-order entropy of the local displacement vector field.
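A single non-iterated pass of such a global model can be sketched as a least-squares fit of d = (z − 1)·x + p to the local displacement vectors, with x the pel position relative to the image center. This is a hedged simplification; the paper's iterative outlier rejection is omitted:

```python
import numpy as np

def fit_zoom_pan(positions, displacements):
    """Least-squares fit of d_i = (z - 1) * x_i + p, where x_i is the
    position relative to the image center, z the scalar zoom factor and
    p the 2-D pan vector (one pass, no iterative refinement)."""
    x = np.asarray(positions, float)       # (N, 2)
    d = np.asarray(displacements, float)   # (N, 2)
    A = np.zeros((2 * len(x), 3))
    A[0::2, 0], A[0::2, 1] = x[:, 0], 1.0  # x-rows: s*x + px
    A[1::2, 0], A[1::2, 2] = x[:, 1], 1.0  # y-rows: s*y + py
    s, px, py = np.linalg.lstsq(A, d.reshape(-1), rcond=None)[0]
    return 1.0 + s, (px, py)

pos = [(-1.0, -1.0), (1.0, -1.0), (-1.0, 1.0), (1.0, 1.0), (0.0, 0.0)]
disp = [(0.1 * x + 2.0, 0.1 * y - 1.0) for x, y in pos]  # z = 1.1, p = (2, -1)
z, (px, py) = fit_zoom_pan(pos, disp)
```

In the full algorithm, vectors that disagree with the fitted global model would be discarded and the fit repeated, which is how foreground object motion is prevented from biasing the estimate.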

135 citations


Proceedings ArticleDOI
02 Dec 1991
TL;DR: The authors show that a variable block size algorithm using an optimized tree structure yields a significant improvement in rate-distortion performance over traditional motion compensation with a fixed block size.
Abstract: The authors describe a method for optimizing in a rate-distortion sense the performance of block matching motion compensation for video compression using fixed or variable size blocks. They apply recent advances in rate allocation theory and optimal tree structures to the choice of motion vector and block size for each region of the prediction image. They show that a variable block size algorithm using an optimized tree structure yields a significant improvement in rate-distortion performance over traditional motion compensation with a fixed block size. The computational complexity of such a system is not significantly higher than that of a fixed block size system.
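The tree-structured choice can be sketched with a Lagrangian cost J = D + λR and a bottom-up split-or-merge decision per node. This is an illustrative reduction of the rate-allocation idea, not the paper's exact optimization:

```python
def lagrangian_cost(distortion, rate, lam):
    """Rate-distortion cost J = D + lambda * R."""
    return distortion + lam * rate

def split_or_merge(parent, children, lam):
    """Keep the variable-block-size split only when the combined cost of
    the child blocks beats the parent's single-block cost. `parent` is a
    (distortion, rate) pair; `children` a list of such pairs."""
    child_total = sum(lagrangian_cost(d, r, lam) for d, r in children)
    parent_cost = lagrangian_cost(*parent, lam)
    return ("split", child_total) if child_total < parent_cost else ("merge", parent_cost)
```

Applying this decision bottom-up over a quadtree yields the optimal mix of block sizes for a given λ, which is then tuned to hit the bit budget.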

131 citations


Journal ArticleDOI
TL;DR: This paper presents an algorithm based on multiple frames that employs only the rigidity assumption, is simple and mathematically elegant and, experimentally, proves to be a major improvement over the two-frame algorithms.
Abstract: One of the main issues in the area of motion estimation given the correspondences of some features in a sequence of images is sensitivity to error in the input. The main way to attack the problem, as with several other problems in science and engineering, is redundancy in the data. Up to now all the algorithms developed either used two frames or depended on assumptions about the motion or the shape of the scene. We present in this paper an algorithm based on multiple frames that employs only the rigidity assumption, is simple and mathematically elegant and, experimentally proves to be a major improvement over the two-frame algorithms. The algorithm does minimization of the squared error which we prove equivalent to an eigenvalue minimization problem. One of the side effects of this mean-square method is that the algorithm can have a very descriptive physical interpretation in terms of the “loaded spring model.”

Proceedings ArticleDOI
K.J. Hanna1
07 Oct 1991
TL;DR: An iterative algorithm that estimates the motion of a camera through an environment directly from brightness derivatives of an image pair and how the ego-motion constraint can help resolve local motion ambiguities that arise from the aperture problem is described.
Abstract: The paper describes an iterative algorithm that estimates the motion of a camera through an environment directly from brightness derivatives of an image pair. A global ego-motion constraint is combined with the local brightness constancy constraint to relate local surface models with the global ego-motion model and local brightness derivatives. In an iterative process, the author first refines the local surface models using the ego-motion as a constraint, and then refines the ego-motion model using the local surface models as constraints. He performs this analysis at multiple resolutions. He shows how information from local corner-like and edge-like image structures contribute to the refinement of the global ego-motion estimate, and how the ego-motion constraint can help resolve local motion ambiguities that arise from the aperture problem. Results of the algorithm are shown on uncalibrated outdoor image sequences, and also on a computer-rendered image sequence.

Proceedings ArticleDOI
01 Nov 1991
TL;DR: A new motion compensation technique using a window which satisfies the perfect reconstruction condition is proposed, which gives a smooth predicted image for a typical MC + DCT coding scheme.
Abstract: A new motion compensation technique using a window which satisfies the perfect reconstruction condition is proposed. The conventional motion compensation using rectangular blocks often gives discontinuities between neighboring motion compensation blocks in the predicted image. The proposed method is based on a window operation applied to data that overlap the area of the conventional motion compensation block. Computer simulation is carried out using the MPEG video coding algorithm to evaluate the proposed method. The performance of the proposed method is better than that of the conventional method in terms of mean square error, and a large improvement can be obtained at the block boundaries. This gives a smooth predicted image for a typical MC + DCT coding scheme. © (1991) COPYRIGHT SPIE--The International Society for Optical Engineering.
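The perfect reconstruction condition means the overlapped windows must sum to one everywhere. A 1-D raised-cosine window at 50% overlap is a standard example of such a window; the paper's actual 2-D window is not specified here:

```python
import numpy as np

def raised_cosine(n):
    """Length-n raised-cosine window; copies shifted by n//2 sum to one,
    which is the perfect reconstruction condition in 1-D."""
    return np.sin(np.pi * (np.arange(n) + 0.5) / n) ** 2

n, hop = 16, 8
w = raised_cosine(n)
cover = np.zeros(4 * hop + n)
for start in range(0, 4 * hop + 1, hop):
    cover[start:start + n] += w
interior = cover[n:-n]  # region covered by two overlapping windows
```

Because the overlapping weights blend each pel's prediction from neighboring blocks, block-boundary discontinuities in the predicted image are smoothed away while the windowed sum still reconstructs the signal exactly.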

Patent
09 Aug 1991
TL;DR: A generalized compliant motion with sensor fusion primitive uses a set of input parameters provided from a local control site to a remote execution site to control a telerobot with a combination of a priori trajectory motion and real and virtual local and remote sensor inputs.
Abstract: A generalized compliant motion with sensor fusion primitive uses a set of input parameters provided from a local control site to a remote execution site to control a telerobot with a combination of a priori trajectory motion and real and virtual local and remote sensor inputs. The set of input parameters specify the desired telerobot behavior based on a combination of local and remote information. This general motion primitive requires less computer memory size, and provides more capabilities, than the task specific primitives it replaces because redundancies are eliminated while permutations of capabilities are available. Trajectory motion occurs during a nominal motion time segment while termination conditions are monitored during an ending time segment to stop motion when a termination condition occurs. Force and compliant motion, teleoperation, dither, virtual springs restoration and joint limit control are combined with the trajectory motion at the remote site.

Journal ArticleDOI
TL;DR: In this article, an algorithm for recovering the six degrees of freedom of motion of a vehicle from a sequence of range images of a static environment taken by a range camera rigidly attached to the vehicle is described.
Abstract: An algorithm is described for recovering the six degrees of freedom of motion of a vehicle from a sequence of range images of a static environment taken by a range camera rigidly attached to the vehicle. The technique utilizes a least-squares minimization of the difference between the measured rate of change of elevation at a point and the rate predicted by the so-called elevation rate constraint equation. It is assumed that most of the surface is smooth enough so that local tangent planes can be constructed, and that the motion between frames is smaller than the size of most features in the range image. This method does not depend on the determination of correspondences between isolated high-level features in the range images. The algorithm has been successfully applied to data obtained from the range imager on the Autonomous Land Vehicle (ALV). Other sensors on the ALV provide an initial approximation to the motion between frames. It was found that the outputs of the vehicle sensors themselves are not suitable for accurate motion recovery because of errors in dead reckoning resulting from such problems as wheel slippage. The sensor measurements are used only to approximately register range data. The algorithm described here then recovers the difference between the true motion and that estimated from the sensor outputs. © 1991 Academic Press, Inc.

Patent
29 Nov 1991
TL;DR: In this paper, the authors propose a motion compensated encoder where motion vectors are selected based on the prediction error generated in localized areas of the encoded image and based on an available bit budget.
Abstract: A motion compensated encoder where motion vectors are selected based on the prediction error generated in localized areas of the encoded image and based on an available bit budget. The motion vectors are created by dividing the image into blocks of two sizes and by considering the best mix of large and small size blocks, and their associated motion vectors, that minimize the overall prediction error, within the constraints of the bit budget. For convenience, the image division is arranged so that a given number of small sized blocks forms one large sized block (e.g. 16:1). Also, the block sizes are arranged so that employing only large sized blocks does not exceed the given bit budget, while employing only the small sized blocks does exceed the given bit budget.

Patent
24 Oct 1991
TL;DR: In this paper, a plurality of block-matching motion compensators, each using a different block size, compare current video image data to prior video data to find which motion compensator results in the least amount of compressed data.
Abstract: Digital video signals are adaptively compressed for communication to a receiver. A plurality of block-matching motion compensators, each using a different block size, compare current video image data to prior video image data. Video image data output from the motion compensators is compressed, and the compressed data from each motion compensator is compared to find which motion compensator results in the least amount of compressed data for a region of a current video image corresponding to the smallest of the block sizes. The compressed data having the lowest bit count is transmitted to a receiver for recovery of a motion vector. The recovered motion vector is used to recover current video image data from the transmitted data and previously received video image data.

Journal ArticleDOI
TL;DR: This paper reviews the different approaches developed to estimate motion parameters from a sequence of two range images and gives the mathematical formulation of the problem along with the various modifications by different investigators to adapt the formulation to their algorithms.
Abstract: The estimation of motion of a moving object from a sequence of images is of prime interest in computer vision. This paper reviews the different approaches developed to estimate motion parameters from a sequence of two range images. We give the mathematical formulation of the problem along with the various modifications by different investigators to adapt the formulation to their algorithms. The shortcomings and the advantages of each method are also briefly mentioned. The methods are divided according to the type of feature used in the motion estimation task. We address the representational and the computational issues for each of the methods described. Most of the earlier approaches used local features such as corners (points) or edges (lines) to obtain the transformation. Local features are sensitive to noise and quantization errors. This causes uncertainties in the motion estimation. Using global features, such as surfaces, makes the procedure of motion computation more robust at the expense of making the procedure very complex. A common error is assuming that the best affine transform is the best estimate of the desired motion, which in general is false. It is important to make the distinction between the motion transform and the general affine transform, since an affine transform may not be realized physically by a rigid object.

Patent
14 Jan 1991
TL;DR: In this article, an iterative process, implemented by a feedback loop, responds to all image data in the respective analysis regions of three consecutive frames of a motion picture, to provide, after a plurality of cycles of operation thereof, an accurate estimation of the motion of either one or both of two differently moving patterns defined by the image data of these respective regions.
Abstract: An iterative process, implemented by a feedback loop, responds to all the image data in the respective analysis regions of three consecutive frames of a motion picture, to provide, after a plurality of cycles of operation thereof, an accurate estimation of the motion of either one or both of two differently moving patterns defined by the image data of these respective analysis regions. The analysis region of each frame is preferably large, and may occupy the entire frame area. The types of differently moving patterns include (1) separation by a motion boundary, (2) overlapping transparent surfaces in motion, (3) "picket fence" motion, (4) masking of a small and/or low-contrast pattern by a dominant pattern, and (5) two-component aperture effects. Also, the operation of this iterative process inherently accurately estimates the motion of image data defining a single moving pattern.

Proceedings ArticleDOI
07 Oct 1991
TL;DR: In this paper, the robustness of phase information for measuring image velocity and binocular disparity, its stability with respect to geometric deformations, and its linearity as a function of spatial position are discussed.
Abstract: This paper concerns the robustness of phase information for measuring image velocity and binocular disparity, its stability with respect to geometric deformations, and its linearity as a function of spatial position. These properties are shown to depend on the form of the filters used and their frequency bandwidths. The authors also discuss situations in which phase is unstable, many of which can be detected using the model of phase singularities (see Image Vis. Comput. (UK) vol.9, no.5, p.333-7 (Oct. 1991)).

Journal ArticleDOI
TL;DR: The authors present an efficient block-matching algorithm called the parallel hierarchical one-dimensional search (PHODS) for motion estimation that is more suitable for hardware realization of a VLSI motion estimator.
Abstract: The authors present an efficient block-matching algorithm called the parallel hierarchical one-dimensional search (PHODS) for motion estimation. Instead of finding the two-dimensional motion vector directly, the PHODS finds two one-dimensional displacements in parallel on the two axes (say x and y) independently within the search area. The major feature of this algorithm is that its search speed for the motion vector is faster than that of the other search algorithms on account of its simpler computations and parallelism. Compared with previous research in terms of four measurements, the PHODS can rival those algorithms for performance. The hardware-oriented features of the PHODS, i.e., regularity, simplicity, and parallelism, guarantee that the PHODS is more suitable for hardware realization of a VLSI motion estimator.
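The two independent one-dimensional searches can be sketched as follows. This is a simplified single-level pass, not the full hierarchical scheme:

```python
import numpy as np

def sad(cur, ref, x, y):
    """Sum of absolute differences against the reference block at (x, y)."""
    h, w = cur.shape
    return np.abs(cur - ref[y:y + h, x:x + w]).sum()

def phods_step(cur, ref, x0, y0, search=4):
    """One pass of the two independent one-dimensional searches: the x
    and y displacements are found separately, each along its own axis."""
    dx = min(range(-search, search + 1), key=lambda d: sad(cur, ref, x0 + d, y0))
    dy = min(range(-search, search + 1), key=lambda d: sad(cur, ref, x0, y0 + d))
    return dx, dy

ys, xs = np.mgrid[0:16, 0:16]
ref = ((7 * xs + 13 * ys) % 23).astype(float)  # aperiodic test pattern
cur = ref[6:10, 6:10]                          # zero-motion block at (6, 6)
```

Searching each axis independently costs O(2s) candidate positions per axis instead of O(s²) for a full 2-D search, which is the source of both the speedup and the algorithm's hardware-friendly regularity.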

Journal ArticleDOI
TL;DR: It is demonstrated experimentally that the affine matching algorithm performs better in estimating displacements than other standard approaches, especially for long-range motion with possible changes in scene illumination.
Abstract: A model is developed for estimating the displacement field in spatio-temporal image sequences that allows for affine shape deformations of corresponding spatial regions and for affine transformations of the image intensity range. This model includes the block matching method as a special case. The model parameters are found by using a least-squares algorithm. We demonstrate experimentally that the affine matching algorithm performs better in estimating displacements than other standard approaches, especially for long-range motion with possible changes in scene illumination. The algorithm is successfully applied to various classes of moving imagery, including the tracking of cloud motion.

Proceedings ArticleDOI
01 Nov 1991
TL;DR: A new motion estimation algorithm called hexagonal matching, which iteratively refines the estimated displacement vectors and produces less prediction error, is presented, along with another algorithm of similar function but less computational complexity.
Abstract: In order to overcome the drawback of the conventional block-based motion compensation, a new triangle-based method which utilizes triangular patches instead of blocks has recently been proposed. Compared to conventional methods, which represent the motion of scene objects by translational displacements of blocks, the new method can cope with a wider range of motions since it allows for rotation and deformation of the triangular patches. In block-based motion compensation, a simple local minimization algorithm (i.e., block matching) is applied to obtain the displacement vector of each block. However, it is inappropriate to apply this algorithm in triangle-based motion compensation because of the complicated linkage between the deformation of the triangular patches and the displacement of the grid points (vertices of triangles). Consequently, the primary issue is to find an optimal way to estimate the motion of the grid points. In this paper, we present a new motion estimation algorithm called hexagonal matching which iteratively refines the estimated displacement vectors. Simulation results show that the motion estimation algorithm produces less prediction error than the previously proposed triangle-based method or the block-based method. We also propose another algorithm with similar function but less computational complexity.
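The linkage between grid-point displacement and patch deformation comes from the affine map that a triangle's three vertices determine. A minimal sketch of recovering that map (function name illustrative):

```python
import numpy as np

def triangle_affine(src, dst):
    """Affine transform (A, t) sending the three src vertices of a
    triangular patch onto their displaced dst positions, so that any
    interior pel p warps as A @ p + t."""
    src = np.asarray(src, float)
    dst = np.asarray(dst, float)
    M = np.hstack([src, np.ones((3, 1))])  # rows: [x, y, 1]
    sol = np.linalg.solve(M, dst)          # 3x2 matrix: A^T stacked over t
    return sol[:2].T, sol[2]

# Translating all three vertices by (2, 3) must give A = I, t = (2, 3).
A, t = triangle_affine([(0, 0), (1, 0), (0, 1)],
                       [(2, 3), (3, 3), (2, 4)])
```

Because each grid point is shared by the surrounding patches (six, in a regular triangular mesh), moving one vertex deforms all of them at once, which is why a simple independent block search does not apply.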

Proceedings ArticleDOI
07 Oct 1991
TL;DR: The authors show that a matrix of image measurements can be factored by singular value decomposition into the product of two matrices that represent shape and motion, respectively.
Abstract: Recovering scene geometry and camera motion from a sequence of images is an important problem in computer vision. If the scene geometry is specified by depth measurements, that is, by specifying distances between the camera and feature points in the scene, noise sensitivity worsens rapidly with increasing depth. The authors show that this difficulty can be overcome by computing scene geometry directly in terms of shape, that is, by computing the coordinates of feature points in the scene with respect to a world-centered system, without recovering camera-centered depth as an intermediate quantity. More specifically, the authors show that a matrix of image measurements can be factored by singular value decomposition into the product of two matrices that represent shape and motion, respectively. The results in this paper extend to three dimensions the solution the authors described in a previous paper for planar camera motion (ICCV, Osaka, Japan, 1990).
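The rank property behind the factorization can be illustrated on synthetic data: a measurement matrix built as motion times shape has rank at most 3, and SVD recovers a rank-3 factorization exactly. This sketch ignores the registration (centroid subtraction) and metric-upgrade steps of the full method:

```python
import numpy as np

rng = np.random.default_rng(7)
shape = rng.standard_normal((3, 10))      # 10 feature points in 3-D
motion = rng.standard_normal((2 * 5, 3))  # 5 frames, two camera rows each
W = motion @ shape                        # measurement matrix, rank <= 3

U, s, Vt = np.linalg.svd(W, full_matrices=False)
W3 = (U[:, :3] * s[:3]) @ Vt[:3]          # best rank-3 approximation of W
```

With noisy measurements, W is only approximately rank 3, and truncating the SVD at rank 3 gives the optimal least-squares separation into motion and shape factors (up to an invertible 3×3 ambiguity resolved by the metric constraints).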

Patent
26 Apr 1991
TL;DR: In this paper, a method and apparatus for performing block-matching motion estimation in a video coder is disclosed which estimates the motion vector associated with each block of pels in the current coding frame.
Abstract: A method and apparatus (110) for performing block-matching motion estimation in a video coder is disclosed which estimates the motion vector associated with each block of pels in the current coding frame. The motion vector for each block in the current frame is estimated by searching through a larger search window in the previous frame for a best match. At each possible shift position within the search window a pel-by-pel comparison (304) is made between the intensity of the pels in the block in the current frame and the corresponding pels in the previous frame. Each pel is classified as either a matching pel or a mismatching pel depending upon the pel difference and a threshold (306). The number of matching pels at each possible shift position is counted (307) and the motion vector is determined from the shift position that yields the maximum number of matching pels.
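The matching-pel criterion can be sketched directly from the description: classify each pel by thresholding its intensity difference, then pick the shift with the maximum match count:

```python
import numpy as np

def matching_pels(cur, cand, threshold):
    """Classify each pel as matching when its absolute intensity
    difference is within the threshold, and count the matches."""
    return int((np.abs(cur - cand) <= threshold).sum())

def best_shift(cur, prev, y, x, search, threshold):
    """Shift position in the search window with the maximum number of
    matching pels (the criterion used instead of SAD or MSE)."""
    h, w = cur.shape
    return max(((dy, dx) for dy in range(-search, search + 1)
                         for dx in range(-search, search + 1)),
               key=lambda v: matching_pels(
                   cur, prev[y + v[0]:y + v[0] + h, x + v[1]:x + v[1] + w],
                   threshold))

ys, xs = np.mgrid[0:16, 0:16]
prev = ((7 * xs + 13 * ys) % 23).astype(float)
cur = prev[7:11, 4:8]  # block displaced by (dy, dx) = (1, -2) from (6, 6)
```

Counting matches rather than summing differences makes the criterion robust to a few badly mismatched pels and reduces the comparison at each pel to a threshold test, which suits hardware implementation.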

Journal ArticleDOI
TL;DR: In this paper, a predictive block-matching motion estimation scheme was implemented for efficient video coder design, which is based on the so-called inertia effect of natural video scenes and takes advantage of the motion vectors obtained in the previous frames.
Abstract: For pt.I, see ibid., vol.37, no.3, p.97-101 (1991). A predictive block-matching motion estimation scheme was implemented for efficient video coder design. The scheme is based on the so-called inertia effect of natural video scenes and takes advantage of the motion vectors obtained in the previous frames. The benefits from this prediction process are threefold. First, the search area is greatly reduced, and so is the computational complexity. Second, the motion vector overhead information is reduced, since the motion vectors are decorrelated by the prediction process. Finally, the motion vectors estimated from this procedure are more realistic, since they reflect the real physical phenomena. These advantages were also demonstrated by simulation results, including the coded data rate, displaced frame difference entropy, motion vectors, and reconstructed signal-to-noise ratio. Only a simple prediction model was implemented; further results with more general autoregressive (AR) models are still under study.

Proceedings ArticleDOI
07 Oct 1991
TL;DR: In this paper, the authors proposed a pyramid framework to separate motion components based on their spatial and temporal frequency characteristics so that each can be estimated independently of the others, which can provide important guidance in practical applications of motion analysis.
Abstract: Pyramid techniques are commonly used to provide computational efficiency in the analysis of image motion. But these techniques can play an even more important role in the analysis of multiple motions, where, for example, a transparent pattern moves in front of a differently moving background pattern. The pyramid framework then separates motion components based on their spatial and temporal frequency characteristics so that each can be estimated independently of the others. This property is key to recently proposed selective stabilization algorithms for the sequential analysis of multiple motions and for the detection of moving objects from a moving platform. The authors determine the conditions for component selection. The results can provide important guidance in practical applications of motion analysis.
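The band separation the abstract relies on can be illustrated with a minimal Laplacian-pyramid decomposition; the binomial kernel and level count below are illustrative choices, not the paper's implementation:

```python
import numpy as np

def blur(img):
    # separable 1-2-1 binomial smoothing, edges handled by clamping
    k = np.array([1, 2, 1]) / 4.0
    pad = np.pad(img, 1, mode='edge')
    tmp = k[0] * pad[:-2, 1:-1] + k[1] * pad[1:-1, 1:-1] + k[2] * pad[2:, 1:-1]
    pad = np.pad(tmp, 1, mode='edge')
    return k[0] * pad[1:-1, :-2] + k[1] * pad[1:-1, 1:-1] + k[2] * pad[1:-1, 2:]

def laplacian_pyramid(img, levels=3):
    """Band-pass decomposition of one frame.

    Each level isolates a spatial-frequency band, so differently moving
    patterns that occupy different bands (e.g. a fine transparent texture
    over a coarse background) can be estimated independently.
    """
    pyr = []
    cur = img.astype(float)
    for _ in range(levels - 1):
        low = blur(cur)
        pyr.append(cur - low)   # band-pass residue at this scale
        cur = low[::2, ::2]     # decimate for the next coarser level
    pyr.append(cur)             # low-pass remainder
    return pyr
```

Motion estimation run per level then sees mostly one component per band, which is the selection property the paper analyzes.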

Proceedings ArticleDOI
14 Apr 1991
TL;DR: A motion-compensated noise suppression algorithm that employs temporally adaptive filtering along motion trajectories that is far superior to methods that incorporate implicit or explicit motion compensation, especially in cases of low SNR and/or significant interframe motion.
Abstract: A motion-compensated noise suppression algorithm that employs temporally adaptive filtering along motion trajectories is proposed for image sequences. Filtering is performed via linear minimum mean square error (LMMSE) point estimation. Motion trajectories are determined using a recent motion estimation algorithm that performs very well at low signal-to-noise ratios (SNRs). The results suggest that the proposed method is far superior to methods that incorporate implicit or explicit motion compensation, especially in cases of low SNR and/or significant interframe motion.
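The LMMSE point estimate along a trajectory can be sketched as a Wiener-style shrinkage toward the trajectory mean. This is a sketch of the idea under simple stationarity assumptions, not the paper's exact filter; the function name and interface are illustrative:

```python
import numpy as np

def lmmse_temporal(traj, noise_var):
    """LMMSE estimate of the central pel from its motion trajectory.

    `traj` holds the noisy intensities of the same scene point across
    frames, gathered by a motion estimator.  The central sample is shrunk
    toward the trajectory mean by the gain max(var - noise_var, 0) / var,
    so flat (noise-dominated) trajectories are averaged hard while real
    intensity changes are preserved -- the "temporally adaptive" part.
    """
    traj = np.asarray(traj, dtype=float)
    mean = traj.mean()
    var = traj.var()
    if var <= noise_var or var == 0.0:
        return mean                      # signal buried in noise: plain average
    gain = (var - noise_var) / var
    center = traj[len(traj) // 2]
    return mean + gain * (center - mean)
```

Good trajectories matter: if the motion estimate is wrong, the samples come from different scene points and the averaging blurs, which is why the abstract stresses a low-SNR-capable motion estimator.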

Proceedings ArticleDOI
Eric Viscito1, Cesar A. Gonzales1
01 Nov 1991
TL;DR: This paper describes an MPEG encoder designed to produce good quality coded sequences for a wide range of video source characteristics and over a range of bit rates.
Abstract: The emerging ISO MPEG video compression standard is a hybrid algorithm that employs motion compensation, spatial discrete cosine transforms, quantization, and Huffman coding. The MPEG standard specifies the syntax of the compressed data stream and the method of decoding, but leaves considerable latitude in the design of the encoder. Although the algorithm is geared toward fixed-bit-rate storage media, the rules for bit rate control allow a good deal of variation in the number of bits allocated to each picture. In addition, the allocation of bits within a picture is subject to no rules whatsoever. One would like to design an encoder that optimizes the visual quality of the decoded video sequence subject to these bit rate restrictions. However, this is difficult due to the elusive nature of a quantitative distortion measure for images and motion sequences that correlates well with human perception. This paper describes an MPEG encoder designed to produce good quality coded sequences for a wide range of video source characteristics and over a range of bit rates. The novel parts of the algorithm include a temporal bit allocation strategy, spatially adaptive quantization, and a bit rate control scheme. © (1991) COPYRIGHT SPIE--The International Society for Optical Engineering.
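A toy version of temporal bit allocation splits a group-of-pictures budget across picture types in proportion to assumed complexity weights. The weights and function name below are illustrative assumptions, not the paper's strategy:

```python
def allocate_bits(group_bits, picture_types, weights=None):
    """Split a group-of-pictures bit budget across its pictures.

    Intra (I) pictures cannot exploit motion compensation and so get the
    largest share; predicted (P) and bidirectional (B) pictures get
    proportionally less.  The weights here are illustrative defaults.
    """
    weights = weights or {"I": 4.0, "P": 2.0, "B": 1.0}
    total = sum(weights[t] for t in picture_types)
    return [group_bits * weights[t] / total for t in picture_types]
```

A real encoder would then adapt the quantizer within each picture (the "spatially adaptive quantization" of the abstract) and feed back buffer fullness to keep the stream at the fixed rate.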

Patent
15 Mar 1991
TL;DR: In this article, a method and apparatus for compressing motion pictures is described, where the frames of the motion picture are divided into adjacent groups and each group is treated as a three-dimensional image.
Abstract: A method and apparatus for compressing motion pictures is disclosed. The frames of the motion picture are divided into adjacent groups. Each group is treated as a three-dimensional image. The three-dimensional image is then filtered via a three-dimensional FIR filter to generate three-dimensional component images that are more efficiently quantized. The degree of quantization of each component image is determined in part by the spatial frequencies represented by the component image in question. For motion pictures derived from interlaced scanning devices, the quantization of specific component images is altered to prevent artifacts.
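The simplest instance of the 3-D FIR filtering described above is a 2-tap sum/difference split along the temporal axis of a frame group; the sketch below is illustrative and is not the patented filter:

```python
import numpy as np

def temporal_subbands(frames):
    """Split a pair of frames into low/high temporal component images.

    The low band carries the static content; the high band carries
    frame-to-frame change and, like the high spatial-frequency component
    images in the abstract, tolerates coarser quantization.
    """
    a, b = (f.astype(float) for f in frames)
    low = (a + b) / 2.0
    high = (a - b) / 2.0
    return low, high

def quantize(band, step):
    # uniform quantization with the step size chosen per component image
    return np.round(band / step) * step
```

Cascading such splits along the two spatial axes as well yields the full set of three-dimensional component images that the patent quantizes with frequency-dependent precision.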