Multiresolution based hierarchical disparity estimation for stereo image pair compression

01 Jan 1994
TL;DR: A multiresolution based approach is proposed for compressing 'still' stereo image pairs and the typical computational gains and compression ratios possible with this approach are provided.
Abstract: Stereo vision is the process of viewing two different perspective projections of the same real world scene and perceiving the depth that was present in the original scene. These projections offer a compact 2-dimensional means of representing a 3-dimensional scene, as seen by one observer. Different display schemes have been developed to ensure that each eye sees the image that is intended for it. Each image in the image pair is referred to as the left or right image depending on the eye it is intended for. The binocular cues contain unambiguous information in contrast to monocular cues like shading or coloring. Hence binocular stereo may be quite useful, for instance, in video based training of personnel. On the entertainment side, it can make mundane TV material lively. Though the concept has been around for more than half a century, only recently have technically effective ways of making stereoscopic displays and the usually required eyewear emerged. Despite this progress, stereo TV can be made a cost effective add-on option only if the increased bandwidth requirement is relaxed somehow. Since the two images are projections of the same scene from two nearby points of view, they are bound to have a lot of redundancy between them. By properly exploiting this redundancy, the two image streams might be compressed and transmitted through a single monocular channel's bandwidth. The first step towards stereoscopic image sequence compression is 'still' stereo image pair compression that exploits the high correlation between the left and right images, in addition to exploiting the spatial correlation within each image. The temporal correlation between the frames can be taken advantage of, along the lines of the MPEG (Moving Picture Experts Group) standards, to achieve further compression. The final step would be to explore the correlation between left and right frames with a time offset between them.
In this paper a multiresolution based approach is proposed for compressing 'still' stereo image pairs. In Section II the task at hand is contrasted with the stereo disparity estimation problem in the machine vision community; a block based scheme on the lines of a motion estimation scheme is suggested as a possible approach. In Section III, the suitability of hierarchical techniques for disparity estimation is outlined. Section IV provides an overview of wavelet decomposition. Section V details the multiresolution approach taken. In Section VI, the typical computational gains and compression ratios possible with this …
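
The block based, hierarchical disparity search outlined in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a rectified pair (horizontal disparity only), uses 2x2 averaging as a stand-in for the low-pass band of a wavelet decomposition, and uses only two resolution levels. All function names and parameters (`downsample`, `sad`, `best_disparity`, `hierarchical_disparity`, `bs`, `max_d`) are hypothetical.

```python
def downsample(img):
    """Halve resolution by averaging 2x2 neighbourhoods (assumed stand-in for a low-pass band)."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [[(img[2 * y][2 * x] + img[2 * y][2 * x + 1]
              + img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1]) / 4.0
             for x in range(w)] for y in range(h)]


def sad(left, right, y0, x0, d, bs):
    """Sum of absolute differences between a right-image block and the left block shifted by d."""
    total = 0.0
    for y in range(y0, y0 + bs):
        for x in range(x0, x0 + bs):
            total += abs(right[y][x] - left[y][x + d])
    return total


def best_disparity(left, right, y0, x0, bs, candidates):
    """Return the candidate shift with the lowest SAD, skipping shifts that leave the image."""
    width = len(left[0])
    best_d, best_cost = 0, float("inf")
    for d in candidates:
        if x0 + d < 0 or x0 + d + bs > width:
            continue
        cost = sad(left, right, y0, x0, d, bs)
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d


def hierarchical_disparity(left, right, bs=4, max_d=8):
    """Two-level coarse-to-fine search; returns {(y0, x0): d} for each block of the right image.

    Assumes bs is even and both images have the same size.
    """
    left_c, right_c = downsample(left), downsample(right)
    bs_c, max_dc = bs // 2, max_d // 2
    field = {}
    for y0 in range(0, len(right) - bs + 1, bs):
        for x0 in range(0, len(right[0]) - bs + 1, bs):
            # Full search at the coarse level, where the range is halved.
            d_c = best_disparity(left_c, right_c, y0 // 2, x0 // 2, bs_c,
                                 range(-max_dc, max_dc + 1))
            # Refine around the doubled coarse estimate at full resolution.
            field[(y0, x0)] = best_disparity(left, right, y0, x0, bs,
                                             range(2 * d_c - 1, 2 * d_c + 2))
    return field
```

In this sketch the coarse level evaluates `max_d + 1` candidate shifts on blocks with a quarter of the pixels, and the fine level evaluates only three, which is where the computational gain of a hierarchical search comes from compared with a full-resolution search over `2 * max_d + 1` shifts.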

Citations
Proceedings Article
22 Mar 1996
TL;DR: This paper extends a binocular disparity based segmentation into the temporal dimension for efficient stereoscopic sequence compression: one stream is segmented by disparity and coded with MPEG-type motion compensated prediction, and the other stream is encoded by reversing the resulting disparity map.
Abstract: A binocular disparity based segmentation scheme to compactly represent one image of a stereoscopic image pair given the other image was proposed earlier by us. That scheme adapted the excess bitcount, needed to code the additional image, to the binocular disparity detail present in the image pair. This paper addresses the issue of extending such a segmentation in the temporal dimension to achieve efficient stereoscopic sequence compression. The easiest conceivable temporal extension would be to code one of the sequences using an MPEG-type scheme while the frames of the other stream are coded based on the segmentation. However such independent compression of one of the streams fails to take advantage of the segmentation or the additional disparity information available. To achieve better compression by exploiting this additional information, we propose the following scheme. Each frame in one of the streams is segmented based on disparity. An MPEG-type frame structure is used for motion compensated prediction of the segments in this segmented stream. The corresponding segments in the other stream are encoded by reversing the disparity-map obtained during the segmentation. Areas without correspondence in this stream, arising from binocular occlusions and disparity estimation errors, are filled in using a disparity-map based predictive error concealment method. Over a test set of several different stereoscopic image sequences, high perceived stereoscopic image qualities were achieved at an excess bandwidth that is roughly 40% above that of a highly compressed monoscopic sequence. Stereo perception can be achieved at significantly smaller excess bandwidths, albeit with a perceivable loss in the image quality.

14 citations

Journal Article
TL;DR: Controlled psychophysical experiments show that subjects perceived 3-D color images even when they were presented with only one color image in a stereoscopic pair, with no depth perception degradation and only limited color degradation.
Abstract: Utilizing remote color stereoscopic scenes typically requires the acquisition, transmission, and processing of two color images. However, the amount of information transmitted and processed is large, compared to either monocular images or monochrome stereo images. Existing approaches to this challenge focus on compression and optimization. This paper introduces an innovative complementary approach to the presentation of a color stereoscopic scene, specialized for human perception. It relies on the hypothesis that a stereo pair consisting of one monochromatic image and one color image (a MIX stereo pair) will be perceived by a human observer as a 3-D color scene. Taking advantage of color redundancy, this presentation of a monochromatic-color pair allows for a drastic reduction in the required bandwidth, even before any compression method is employed. Herein we describe controlled psychophysical experiments on up to 15 subjects. These experiments tested both color and depth perception using various combinations of color and monochromatic images. The results show that subjects perceived 3-D color images even when they were presented with only one color image in a stereoscopic pair, with no depth perception degradation and only limited color degradation. This confirms the hypothesis and validates the new approach.

12 citations

Proceedings Article
14 Mar 2010
TL;DR: A bandelet-based coding scheme for stereo images that efficiently integrates the coding of the disparity map with the reference image; the transform coefficients and the disparity map are encoded and transmitted in partitions (squares), which leads to lower entropy.
Abstract: In this work, a bandelet-based coding scheme for stereo images is presented. The proposed scheme efficiently integrates the coding of the disparity map with the reference image. The disparity map is obtained via disparity estimation in a geometric similarity framework. The scheme first computes the bandelet transform of both left and right images. Consequently, each image is segmented into a quadtree where each dyadic square regroups pixels sharing the same geometric flow direction. Then, the disparity map is obtained by studying the geometric similarities between the dyadic squares of both images' quadtrees. This is accomplished by the minimization of a cost measure function that is defined on the geometric properties of the quadtrees. Finally, the bandelet transform coefficients of the reference and residual images with the disparity map are encoded and transmitted in partitions (squares), which leads to lower entropy. The experimental evaluation of the proposed scheme shows beneficial performance over other stereoscopic coders in the literature.

11 citations


Cites methods from "Multiresolution based hierarchical ..."

  • ...The block-matching algorithm (BMA) may also be applied on the objects that appear in a stereo pair after an object contour extraction in the two images [6] or on the subbands of a wavelet decomposed stereo image pair in a hierarchical way [10]....

Proceedings Article
18 Sep 2003
TL;DR: A new optimised method of coding stereoscopic image sequences, called IMDE, is presented and compared with already known methods; it estimates the P and B frames of the right channel with an interpolative scheme, and the joint motion and disparity vector estimation together with the choice of weighting factors is investigated to optimise the whole framework.
Abstract: In this paper, a new optimised method of coding stereoscopic image sequences is presented and compared with already known methods. Two basic methods of coding a stereoscopic image sequence are the compatible and joint. The first one uses MPEG for coding the left channel and takes advantage of the spatial disparity redundancy between the two sequences for coding the right channel. The second one employs MPEG for coding the left channel but takes advantage of both temporal redundancy among the right channel frames and spatial redundancy between the corresponding frames of the two channels. The proposed method, which is called IMDE, estimates the P and B type of frames of the right channel by an interpolative scheme that takes into account both the temporal and disparity characteristics. Investigating the effectiveness of the joint motion and disparity vectors estimation as well as the choice of the weighting factors that participate in the proposed interpolative scheme optimises the whole framework.

11 citations


Cites methods from "Multiresolution based hierarchical ..."

  • ...Several stereo compression algorithms have been developed that use block matching or alternative implementations, as hierarchical disparity estimation [2], multiresolution block matching [3], block matching with geometric transform [4], etc....

Journal Article
TL;DR: The experimental evaluation of the proposed stereoscopic image coder based on the MRF model and MAP estimation of the disparity field shows beneficial performance over other stereoscopic coders in the literature.
Abstract: This paper presents a stereoscopic image coder based on the MRF model and MAP estimation of the disparity field. The MRF model minimizes the noise of disparity compensation, because it takes into account the residual energy, smoothness constraints on the disparity field, and the occlusion field. Disparity compensation is formulated as an MAP-MRF problem in the spatial domain, where the MRF field consists of the disparity vector and occlusion fields. The occlusion field is partitioned into three regions by an initial double-threshold setting. The MAP search is conducted in a block-based sense on one or two of the three regions, providing faster execution. The reference and residual images are decomposed by a discrete wavelet transform and the transform coefficients are encoded by employing the morphological representation of wavelet coefficients algorithm. As a result of the morphological encoding, the reference and residual images together with the disparity vector field are transmitted in partitions, lowering total entropy. The experimental evaluation of the proposed scheme on synthetic and real images shows beneficial performance over other stereoscopic coders in the literature.

10 citations


References
Journal Article
TL;DR: In this paper, it is shown that the difference of information between the approximation of a signal at the resolutions $2^{j+1}$ and $2^{j}$ (where $j$ is an integer) can be extracted by decomposing this signal on a wavelet orthonormal basis of $L^2(\mathbb{R}^n)$, the vector space of measurable, square-integrable n-dimensional functions.
Abstract: Multiresolution representations are effective for analyzing the information content of images. The properties of the operator which approximates a signal at a given resolution were studied. It is shown that the difference of information between the approximation of a signal at the resolutions $2^{j+1}$ and $2^{j}$ (where $j$ is an integer) can be extracted by decomposing this signal on a wavelet orthonormal basis of $L^2(\mathbb{R}^n)$, the vector space of measurable, square-integrable n-dimensional functions. In $L^2(\mathbb{R})$, a wavelet orthonormal basis is a family of functions which is built by dilating and translating a unique function $\psi(x)$. This decomposition defines an orthogonal multiresolution representation called a wavelet representation. It is computed with a pyramidal algorithm based on convolutions with quadrature mirror filters. Wavelet representation lies between the spatial and Fourier domains. For images, the wavelet representation differentiates several spatial orientations. The application of this representation to data compression in image coding, texture discrimination and fractal analysis is discussed.

20,028 citations
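
The pyramidal decomposition described in the abstract above can be illustrated with one analysis and synthesis step in one dimension. This is only a sketch under stated assumptions: it uses the Haar filter pair, the simplest orthonormal case, rather than the general quadrature mirror filters the abstract refers to, and the function names `haar_analysis` and `haar_synthesis` are hypothetical.

```python
import math


def haar_analysis(signal):
    """One pyramid level: split an even-length signal into approximation and detail.

    The approximation is the signal at half the resolution; the detail holds the
    difference of information between the two resolutions.
    """
    s = 1.0 / math.sqrt(2.0)
    half = len(signal) // 2
    approx = [s * (signal[2 * i] + signal[2 * i + 1]) for i in range(half)]
    detail = [s * (signal[2 * i] - signal[2 * i + 1]) for i in range(half)]
    return approx, detail


def haar_synthesis(approx, detail):
    """Invert haar_analysis: rebuild the finer-resolution signal from both bands."""
    s = 1.0 / math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([s * (a + d), s * (a - d)])
    return out
```

Applying `haar_analysis` again to the returned approximation yields the next coarser level, which is how a multi-level pyramid is built; because the basis is orthonormal, the sum of squared coefficients equals the sum of squared samples.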

Journal Article
Ingrid Daubechies
TL;DR: This work constructs orthonormal bases of compactly supported wavelets, with arbitrarily high regularity, by reviewing the concept of multiresolution analysis as well as several algorithms in vision decomposition and reconstruction.
Abstract: We construct orthonormal bases of compactly supported wavelets, with arbitrarily high regularity. The order of regularity increases linearly with the support width. We start by reviewing the concept of multiresolution analysis as well as several algorithms in vision decomposition and reconstruction. The construction then follows from a synthesis of these different approaches.

8,588 citations


"Multiresolution based hierarchical ..." refers background in this paper

  • ...The maximum vertical disparity (VDMAX) is within 3-4 pixels for reasonably composed image pairs....

Journal Article
TL;DR: A technique for image encoding in which local operators of many scales but identical shape serve as the basis functions, which tends to enhance salient image features and is well suited for many image analysis tasks as well as for image compression.
Abstract: We describe a technique for image encoding in which local operators of many scales but identical shape serve as the basis functions. The representation differs from established techniques in that the code elements are localized in spatial frequency as well as in space. Pixel-to-pixel correlations are first removed by subtracting a lowpass filtered copy of the image from the image itself. The result is a net data compression since the difference, or error, image has low variance and entropy, and the low-pass filtered image may be represented at reduced sample density. Further data compression is achieved by quantizing the difference image. These steps are then repeated to compress the low-pass image. Iteration of the process at appropriately expanded scales generates a pyramid data structure. The encoding process is equivalent to sampling the image with Laplacian operators of many scales. Thus, the code tends to enhance salient image features. A further advantage of the present code is that it is well suited for many image analysis tasks as well as for image compression. Fast algorithms are described for coding and decoding.

6,975 citations
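
The subtract-and-repeat construction in the abstract above can be sketched in one dimension. This is a simplified illustration, not the published algorithm: it assumes pairwise averaging as the low-pass filter and sample repetition as the expansion step instead of a Gaussian-like kernel, and it omits the quantization of the difference signals. The names `reduce_level`, `expand_level`, `build_laplacian_pyramid`, and `reconstruct` are hypothetical.

```python
def reduce_level(signal):
    """Low-pass filter and subsample: average neighbouring pairs (assumed filter)."""
    return [(signal[2 * i] + signal[2 * i + 1]) / 2.0 for i in range(len(signal) // 2)]


def expand_level(signal):
    """Bring a coarse signal back to the finer sample density by repeating samples."""
    out = []
    for v in signal:
        out.extend([v, v])
    return out


def build_laplacian_pyramid(signal, levels):
    """Return [difference_0, ..., difference_{levels-1}, coarsest low-pass signal].

    Assumes len(signal) is divisible by 2 ** levels.
    """
    pyramid = []
    current = list(signal)
    for _ in range(levels):
        low = reduce_level(current)
        # Difference (error) signal: original minus its expanded low-pass copy.
        pyramid.append([c - e for c, e in zip(current, expand_level(low))])
        current = low
    pyramid.append(current)
    return pyramid


def reconstruct(pyramid):
    """Decode by expanding the coarsest level and adding back each difference signal."""
    current = pyramid[-1]
    for diff in reversed(pyramid[:-1]):
        current = [e + d for e, d in zip(expand_level(current), diff)]
    return current
```

Without quantization the reconstruction is exact; in a coder along the lines of the abstract, compression would come from quantizing the low-variance difference signals before they are stored or transmitted.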

Book
01 Jan 1984
TL;DR: A Hierarchical Image Analysis System Based Upon Oriented Zero Crossings of Bandpassed Images and a Tutorial on Quadtree Research.
Abstract (table of contents):
I Image Pyramids and Their Uses: 1. Some Useful Properties of Pyramids. 2. The Pyramid as a Structure for Efficient Computation.
II Architectures and Systems: 3. Multiprocessor Pyramid Architectures for Bottom-Up Image Analysis. 4. Visual and Conceptual Hierarchy - A Paradigm for Studies of Automated Generation of Recognition Strategies. 5. Multiresolution Processing. 6. The Several Steps from Icon to Symbol Using Structured Cone/Pyramids.
III Modelling, Processing, and Segmentation: 7. Time Series Models for Multiresolution Images. 8. Node Linking Strategies in Pyramids for Image Segmentation. 9. Multilevel Image Reconstruction. 10. Sorting, Histogramming, and Other Statistical Operations on a Pyramid Machine.
IV Features and Shape Analysis: 11. A Hierarchical Image Analysis System Based Upon Oriented Zero Crossings of Bandpassed Images. 12. A Multiresolution Representation for Shape. 13. Multiresolution Feature Encodings. 14. Multiple-Size Operators and Optimal Curve Finding.
V Region Representation and Surface Interpolation: 15. A Tutorial on Quadtree Research. 16. Multiresolution 3-d Image Processing and Graphics. 17. Multilevel Reconstruction of Visual Surfaces: Variational Principles and Finite-Element Representations.
VI Time-Varying Analysis: 18. Multilevel Relaxation in Low-Level Computer Vision. 19. Region Matching in Pyramids for Dynamic Scene Analysis. 20. Hierarchical Estimation of Spatial Properties from Motion.
VII Applications: 21. Multiresolution Microscopy. 22. Two-Resolution Detection of Lung Tumors in Chest Radiographs.
Index of Contributors.

623 citations

Book
01 Feb 1991

581 citations