Journal Article•DOI•

Evaluation of multiresolution block matching techniques for motion and disparity estimation

TL;DR: Multiresolution block matching methods for both monocular and stereoscopic image sequence coding are evaluated and found to drastically reduce the amount of processing needed for block correspondence without seriously affecting the quality of the reconstructed images.
Abstract: Multiresolution block matching methods for both monocular and stereoscopic image sequence coding are evaluated. These methods are seen to drastically reduce the amount of processing needed for block correspondence without seriously affecting the quality of the reconstructed images. The evaluation criteria are the prediction error and the speed of the algorithm for motion, disparity, and fused motion and disparity estimation, in comparison with the full search (exhaustive) method. A new method, based on multiresolution techniques, is also proposed for efficient coding of the disparity or the displacement vector field.
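The coarse-to-fine idea behind multiresolution block matching can be illustrated with a minimal sketch. This is not the paper's algorithm: the 2x2 averaging pyramid, the SAD criterion, and every name below (`sad`, `search`, `downsample`, `multires_match`) are assumptions chosen for illustration. A small exhaustive search runs at the coarsest level, and the resulting vector is doubled and refined at each finer level, so far fewer candidates are tested than in a full search over the same range.

```python
def sad(cur, ref, by, bx, dy, dx, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (by, bx) and the block of `ref` displaced by (dy, dx).
    Returns None when the displaced block falls outside `ref`."""
    h, w = len(ref), len(ref[0])
    if not (0 <= by + dy <= h - n and 0 <= bx + dx <= w - n):
        return None
    return sum(abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
               for y in range(n) for x in range(n))

def search(cur, ref, by, bx, n, center, radius):
    """Exhaustive search in a (2*radius+1)^2 window around `center`."""
    best, best_cost = center, None
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            cost = sad(cur, ref, by, bx, dy, dx, n)
            if cost is not None and (best_cost is None or cost < best_cost):
                best, best_cost = (dy, dx), cost
    return best

def downsample(img):
    """Halve the resolution by averaging 2x2 neighbourhoods (an assumption;
    other low-pass filters could be used)."""
    return [[(img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) / 4.0
             for x in range(0, len(img[0]) - 1, 2)]
            for y in range(0, len(img) - 1, 2)]

def multires_match(cur, ref, by, bx, n, levels=2, radius=2):
    """Coarse-to-fine displacement estimate for the block of `cur` at (by, bx)."""
    pyramid = [(cur, ref)]
    for _ in range(levels - 1):
        c, r = pyramid[-1]
        pyramid.append((downsample(c), downsample(r)))
    vec = (0, 0)
    for level in range(levels - 1, -1, -1):
        c, r = pyramid[level]
        scale = 2 ** level
        vec = search(c, r, by // scale, bx // scale, max(n // scale, 1), vec, radius)
        if level > 0:
            vec = (vec[0] * 2, vec[1] * 2)  # propagate to the next finer level
    return vec

# Synthetic example: `cur` is `ref` displaced by (4, -2).
def pattern(y, x):
    return x * x + 3 * y * y

ref = [[pattern(y, x) for x in range(32)] for y in range(32)]
cur = [[pattern(y + 4, x - 2) for x in range(32)] for y in range(32)]
print(multires_match(cur, ref, 12, 12, 8))  # (4, -2)
```

With two levels and a radius of 2, the sketch tests 25 candidates per level (50 in total) while covering displacements of up to 6 pixels, which a full search would cover with 169 candidates.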
Citations
Journal Article•DOI•
TL;DR: A survey of optical flow estimation is proposed that classifies the main principles elaborated during its evolution, with particular attention given to recent developments.

368 citations

Journal Article•DOI•
Yu-Wen Huang1, Ching-Yeh Chen1, Chen-Han Tsai1, Chun-Fu Shen1, Liang-Gee Chen1 •
01 Mar 2006
TL;DR: The main idea is quick checking of the entire search range with a simplified matching criterion to globally eliminate impossible candidates, followed by finer selection among potential best matched candidates.
Abstract: Block matching motion estimation is the heart of video coding systems. During the last two decades, hundreds of fast algorithms and VLSI architectures have been proposed. In this paper, we try to provide an extensive exploration of motion estimation with our new developments. The main concepts of fast algorithms can be classified into six categories: reduction in search positions, simplification of matching criterion, bitwidth reduction, predictive search, hierarchical search, and fast full search. Comparisons of various algorithms in terms of video quality and computational complexity are given as useful guidelines for software applications. As for hardware implementations, full search architectures derived from systolic mapping are first introduced. The systolic arrays can be divided into inter-type and intra-type with 1-D, 2-D, and tree structures. Hexagonal plots are presented for system designers to clearly evaluate the architectures in six aspects including gate count, required frequency, hardware utilization, memory bandwidth, memory bitwidth, and latency. Next, architectures supporting fast algorithms are also reviewed. Finally, we propose our algorithmic and architectural co-development. The main idea is quick checking of the entire search range with a simplified matching criterion to globally eliminate impossible candidates, followed by finer selection among potential best matched candidates. The operations of the two stages are mapped to the same hardware for resource sharing. Simulation results show that our design is ten times more area-speed efficient than full search architectures while the video quality is competitively the same.
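The two-stage idea in this abstract (a cheap global check followed by a finer selection) might be sketched as follows. The abstract does not say which simplified criterion is used, so the pixel-subsampled SAD below is an assumption, as are the shortlist size `keep` and the names `block_sad` and `two_stage_search`.

```python
import heapq

def block_sad(cur, ref, by, bx, dy, dx, n, step=1):
    """SAD over an n x n block; step > 1 skips pixels to cheapen the test."""
    return sum(abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
               for y in range(0, n, step) for x in range(0, n, step))

def two_stage_search(cur, ref, by, bx, n, radius, keep=4):
    """Stage 1 scans the whole range with a cheap criterion and keeps the
    `keep` best candidates; stage 2 picks among them with the full SAD."""
    h, w = len(ref), len(ref[0])
    candidates = [(dy, dx)
                  for dy in range(-radius, radius + 1)
                  for dx in range(-radius, radius + 1)
                  if 0 <= by + dy <= h - n and 0 <= bx + dx <= w - n]
    # Stage 1: simplified criterion (assumed here: every second pixel).
    shortlist = heapq.nsmallest(
        keep, candidates,
        key=lambda v: block_sad(cur, ref, by, bx, v[0], v[1], n, step=2))
    # Stage 2: full criterion on the surviving candidates only.
    return min(shortlist,
               key=lambda v: block_sad(cur, ref, by, bx, v[0], v[1], n))

# Synthetic example: `cur` is `ref` displaced by (3, -1).
def pattern(y, x):
    return x * x + 3 * y * y

ref = [[pattern(y, x) for x in range(32)] for y in range(32)]
cur = [[pattern(y + 3, x - 1) for x in range(32)] for y in range(32)]
print(two_stage_search(cur, ref, 12, 12, 8, radius=4))  # (3, -1)
```

In this sketch the first stage touches a quarter of the pixels per candidate, and the full criterion is evaluated only `keep` times instead of once per search position.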

199 citations

Journal Article•DOI•
TL;DR: An object-based coding scheme is proposed for the coding of a stereoscopic image sequence using motion and disparity information and the use of the depth map information for the generation of intermediate views at the receiver is discussed.
Abstract: An object-based coding scheme is proposed for the coding of a stereoscopic image sequence using motion and disparity information. A hierarchical block-based motion estimation approach is used for initialization, while disparity estimation is performed using a pixel-based hierarchical dynamic programming algorithm. A split-and-merge segmentation procedure based on three-dimensional (3-D) motion modeling is then used to determine regions with similar motion parameters. The segmentation part of the algorithm is interleaved with the estimation part in order to optimize the coding performance of the procedure. Furthermore, a technique is examined for propagating the segmentation information with time. A 3-D motion-compensated prediction technique is used for both intensity and depth image sequence coding. Error images and depth maps are encoded using discrete cosine transform (DCT) and Huffman methods. Alternately, an efficient wireframe depth modeling technique may be used to convey depth information to the receiver. Motion and wireframe model parameters are then quantized and transmitted to the decoder along with the segmentation information. As a straightforward application, the use of the depth map information for the generation of intermediate views at the receiver is also discussed. The performance of the proposed compression methods is evaluated experimentally and is compared to other stereoscopic image sequence coding schemes.

124 citations

Journal Article•DOI•
TL;DR: An efficient disparity estimation and occlusion detection algorithm for multiocular systems is presented and techniques are developed for the coding of occlusion and disparity information, which is needed at the receiver for the reproduction of a multiview sequence using the two encoded extreme views.
Abstract: An efficient disparity estimation and occlusion detection algorithm for multiocular systems is presented. A dynamic programming algorithm, using a multiview matching cost as well as pure geometrical constraints, is used to estimate disparity and to identify the occluded areas in the extreme left and right views. A significant advantage of the proposed approach is that the exact number of views in which each point appears (is not occluded) can be determined. The disparity and occlusion information obtained may then be used to create virtual images from intermediate viewpoints. Furthermore, techniques are developed for the coding of occlusion and disparity information, which is needed at the receiver for the reproduction of a multiview sequence using the two encoded extreme views. Experimental results illustrate the performance of the proposed techniques.

93 citations

Journal Article•DOI•
TL;DR: A novel unsupervised video object segmentation algorithm groups regions into objects with different motion using their long-term trajectories rather than the motion at the frame level, and can efficiently segment video sequences with fast-moving or newly appearing objects.
Abstract: A novel unsupervised video object segmentation algorithm is presented, aiming to segment a video sequence into objects: spatiotemporal regions representing a meaningful part of the sequence. The proposed algorithm consists of three stages: initial segmentation of the first frame using color, motion, and position information, based on a variant of the K-means-with-connectivity-constraint algorithm; a temporal tracking algorithm, using a Bayes classifier and rule-based processing to reassign changed pixels to existing regions and to efficiently handle the introduction of new regions; and a trajectory-based region merging procedure that employs the long-term trajectory of regions, rather than the motion at the frame level, so as to group them into objects with different motion. As shown by experimental evaluation, this scheme can efficiently segment video sequences with fast-moving or newly appearing objects. A comparison with other methods shows segmentation results corresponding more accurately to the real objects appearing in the image sequence.

75 citations

References
Journal Article•DOI•
TL;DR: A technique for image encoding in which local operators of many scales but identical shape serve as the basis functions, which tends to enhance salient image features and is well suited for many image analysis tasks as well as for image compression.
Abstract: We describe a technique for image encoding in which local operators of many scales but identical shape serve as the basis functions. The representation differs from established techniques in that the code elements are localized in spatial frequency as well as in space. Pixel-to-pixel correlations are first removed by subtracting a lowpass filtered copy of the image from the image itself. The result is a net data compression since the difference, or error, image has low variance and entropy, and the low-pass filtered image may be represented at reduced sample density. Further data compression is achieved by quantizing the difference image. These steps are then repeated to compress the low-pass image. Iteration of the process at appropriately expanded scales generates a pyramid data structure. The encoding process is equivalent to sampling the image with Laplacian operators of many scales. Thus, the code tends to enhance salient image features. A further advantage of the present code is that it is well suited for many image analysis tasks as well as for image compression. Fast algorithms are described for coding and decoding.
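The pyramid construction described in this abstract can be sketched in one dimension. The abstract concerns 2-D images and also quantizes the difference images; the sketch below works on a 1-D signal and omits quantization, and the 5-tap kernel, the border clamping, and all function names are assumptions made for illustration.

```python
KERNEL = (1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16)  # assumed low-pass weights

def blur(signal, gain=1.0):
    """Low-pass filter with border clamping."""
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, weight in enumerate(KERNEL):
            j = min(max(i + k - 2, 0), n - 1)  # clamp index at the borders
            acc += gain * weight * signal[j]
        out.append(acc)
    return out

def reduce_level(signal):
    """Low-pass filter, then keep every second sample."""
    return blur(signal)[::2]

def expand_level(signal, length):
    """Insert zeros between samples, then interpolate with the same kernel."""
    up = [0.0] * length
    up[::2] = signal
    return blur(up, gain=2.0)

def build_pyramid(signal, levels):
    """Return [difference_0, ..., difference_{levels-1}, low-pass residual]."""
    pyramid, current = [], [float(v) for v in signal]
    for _ in range(levels):
        low = reduce_level(current)
        predicted = expand_level(low, len(current))
        pyramid.append([c - p for c, p in zip(current, predicted)])
        current = low
    pyramid.append(current)
    return pyramid

def reconstruct(pyramid):
    """Invert build_pyramid by expanding and adding back each difference."""
    current = pyramid[-1]
    for band in reversed(pyramid[:-1]):
        predicted = expand_level(current, len(band))
        current = [b + p for b, p in zip(band, predicted)]
    return current

signal = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9, 3]
pyramid = build_pyramid(signal, levels=3)
restored = reconstruct(pyramid)
```

Because each difference band stores exactly what the expanded low-pass copy fails to predict, `restored` matches `signal` up to floating-point rounding when no quantization is applied.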

6,975 citations

Journal Article•DOI•
01 Apr 1985
TL;DR: This paper presents a review of the advances in digital coding of video signals during the last four years, and summarizes the first promising results of motion adaptive frame interpolation.
Abstract: This paper presents a review of the advances in digital coding of video signals during the last four years. Displacement estimation algorithms for coding applications are compared first and the relationship between the algorithms is pointed out. The developments in predictive and transform coding are described and discussed with a view to broadcast television and video-conferencing applications. One chapter summarizes the first promising results of motion adaptive frame interpolation. Some problems to be solved in the future are pointed out in the conclusions.

895 citations

Proceedings Article•DOI•
25 Oct 1988
TL;DR: A hierarchical blockmatching algorithm for the estimation of displacement vector fields in digital television sequences is presented, which yields reliable and homogeneous displacement vector fields that are close to the true displacements.
Abstract: A hierarchical blockmatching algorithm for the estimation of displacement vector fields in digital television sequences is presented. Known blockmatching techniques fail frequently as a result of using a fixed measurement window size. Using distinct sizes of measurement windows at different levels of a hierarchy, the presented blockmatching technique yields reliable and homogeneous displacement vector fields, which are close to the true displacements, rather than only a match in the sense of a minimum mean absolute luminance difference. In the environment of a low bit rate hybrid coder for image sequences, the hierarchical blockmatching algorithm is well suited for both motion compensating prediction and motion compensating interpolation. Compared to other highly sophisticated displacement estimation techniques, the computational effort is decreased drastically. Due to the regularity and the very small number of necessary operations, the presented hierarchical blockmatching algorithm can be implemented in hardware very easily.

460 citations

Journal Article•DOI•
TL;DR: In this paper, a displacement vector with integer components for each picture element of the fields to be interpolated is provided, and a change detector is used to assure zero displacement vectors in unchanged areas.

193 citations

Journal Article•DOI•
M. Waldowski1•
TL;DR: The stereo image pair of a speaking person in front of a stationary background taken by two CCD-cameras is used as an input scene for a new segmentation algorithm based on the phase correlation technique which provides a disparity vector field.
Abstract: The stereo image pair of a speaking person in front of a stationary background taken by two CCD-cameras is used as an input scene for a new segmentation algorithm. The algorithm is based on the phase correlation technique which provides a disparity vector field. A brightness adjustment procedure is performed to provide stereo image pairs suited for the segmentation procedure. Then, a coarse segmentation into background and speaking person is made to achieve a reliable segmentation result. Finally, finer segmentation using a coarse-to-fine control strategy is performed only at the object boundaries. An application is demonstrated by applying a lowpass filter selectively to the background of input sequences for low bit rate image coding algorithms.
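Phase correlation, which this abstract uses to obtain the disparity vector field, can be illustrated on a 1-D signal. The sketch below is a simplification under stated assumptions: it handles a single circular shift of a whole scanline rather than a 2-D field, it uses a direct O(N^2) DFT for self-containment, and the names `dft` and `phase_correlation_shift` are hypothetical.

```python
import cmath

def dft(values, inverse=False):
    """Direct discrete Fourier transform (O(N^2), adequate for a sketch)."""
    n = len(values)
    sign = 1 if inverse else -1
    out = [sum(values[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
               for t in range(n))
           for k in range(n)]
    return [v / n for v in out] if inverse else out

def phase_correlation_shift(a, b):
    """Estimate d such that b[n] == a[(n - d) % N], using only the phase of
    the cross-power spectrum."""
    fa, fb = dft(a), dft(b)
    cross = [va.conjugate() * vb for va, vb in zip(fa, fb)]
    # Normalise each bin to unit magnitude; drop bins that carry no energy.
    normalized = [c / abs(c) if abs(c) > 1e-12 else 0j for c in cross]
    surface = [v.real for v in dft(normalized, inverse=True)]
    # The correlation surface peaks at the shift.
    return max(range(len(surface)), key=surface.__getitem__)

left = [40, 3, 1, 4, 1, 5, 9, 2]
right = left[-3:] + left[:-3]  # `left` circularly shifted by 3 samples
print(phase_correlation_shift(left, right))  # 3
```

Normalising the cross-power spectrum discards magnitude, so the peak location depends on the shift alone and is unchanged when one signal is multiplied by a positive gain.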

29 citations