Multiresolution based hierarchical disparity estimation for stereo image pair compression

01 Jan 1994
TL;DR: A multiresolution based approach is proposed for compressing 'still' stereo image pairs and the typical computational gains and compression ratios possible with this approach are provided.
Abstract: Stereo vision is the process of viewing two different perspective projections of the same real world scene and perceiving the depth that was present in the original scene. These projections offer a compact 2-dimensional means of representing a 3-dimensional scene, as seen by one observer. Different display schemes have been developed to ensure that each eye sees the image that is intended for it. Each image in the image pair is referred to as the left or right image depending on the eye it is intended for. The binocular cues contain unambiguous information in contrast to monocular cues like shading or coloring. Hence binocular stereo may be quite useful, for instance, in video-based training of personnel. On the entertainment side, it can make mundane TV material lively. Though the concept has been around for more than half a century, only recently have technically effective ways of making stereoscopic displays and the usually required eyewear emerged. Despite this progress, stereo TV can be made a cost-effective add-on option only if the increased bandwidth requirement is relaxed somehow. Since the two images are projections of the same scene from two nearby points of view, they are bound to have a lot of redundancy between them. By properly exploiting this redundancy, the two image streams might be compressed and transmitted through a single monocular channel's bandwidth. The first step towards stereoscopic image sequence compression is 'still' stereo image pair compression that exploits the high correlation between the left and right images, in addition to exploiting the spatial correlation within each image. The temporal correlation between the frames can be taken advantage of, along the lines of the MPEG (Moving Picture Experts Group) standards, to achieve further compression. The final step would be to explore the correlation between left and right frames with a time offset between them.
In this paper a multiresolution based approach is proposed for compressing 'still' stereo image pairs. In Section II the task at hand is contrasted with the stereo disparity estimation problem in the machine vision community; a block-based scheme along the lines of a motion estimation scheme is suggested as a possible approach. In Section III, the suitability of hierarchical techniques for disparity estimation is outlined. Section IV provides an overview of wavelet decomposition. Section V details the multiresolution approach taken. In Section VI, the typical computational gains and compression ratios possible with this …
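
The block-based, coarse-to-fine idea outlined in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the sum-of-absolute-differences (SAD) criterion, the 2x2 averaging used in place of a wavelet low-pass band, the refinement window of one pixel either side, and all function names (`downsample2`, `match_blocks`, `hierarchical_disparity`) are assumptions made for illustration. Only horizontal disparity is searched, and the block size is assumed to be even.

```python
import numpy as np

def downsample2(img):
    """Halve resolution by averaging 2x2 neighbourhoods (assumed stand-in for a low-pass band)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]
    return 0.25 * (img[0::2, 0::2] + img[1::2, 0::2] + img[0::2, 1::2] + img[1::2, 1::2])

def match_blocks(left, right, block, centers, radius):
    """For each block of `right`, pick the horizontal shift d into `left` with the lowest SAD.

    Candidates are d = max(0, c - radius) .. c + radius, where c is the block's
    entry in `centers`. A right-image block at column x is compared with the
    left-image block at column x + d.
    """
    rows, cols = centers.shape
    w = right.shape[1]
    disp = np.zeros_like(centers)
    for by in range(rows):
        for bx in range(cols):
            y, x = by * block, bx * block
            target = right[y:y + block, x:x + block]
            best_sad, best_d = np.inf, 0  # d = 0 is the fallback if no candidate fits
            c = int(centers[by, bx])
            for d in range(max(0, c - radius), c + radius + 1):
                if x + d + block > w:  # candidate block would leave the image
                    break
                sad = np.abs(target - left[y:y + block, x + d:x + d + block]).sum()
                if sad < best_sad:
                    best_sad, best_d = sad, d
            disp[by, bx] = best_d
    return disp

def hierarchical_disparity(left, right, block=8, max_disp=16):
    """Two-level search: full search at half resolution, then a +/-1 refinement at full resolution."""
    left = np.asarray(left, dtype=np.float64)
    right = np.asarray(right, dtype=np.float64)
    rows, cols = right.shape[0] // block, right.shape[1] // block
    start = np.zeros((rows, cols), dtype=np.int64)
    # Coarse level: half-size images, half-size blocks, half the search range.
    coarse = match_blocks(downsample2(left), downsample2(right),
                          block // 2, start, max_disp // 2)
    # Fine level: a coarse disparity d corresponds to about 2*d at full resolution.
    return match_blocks(left, right, block, 2 * coarse, 1)
```

In this sketch a full search at full resolution would test `max_disp + 1` candidates per block, whereas the two-level version tests `max_disp // 2 + 1` candidates on blocks with a quarter of the pixels and then at most three candidates at full resolution.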

Citations
Proceedings ArticleDOI
10 Dec 2002
TL;DR: A new method for computing a dense disparity map thanks to the use of a special wavelet transform less sensitive to discontinuity in disparities and partial occlusions is proposed.
Abstract: 3D reconstruction from a set of stereoscopic image pairs requires the ability to match both images: any point in one image should match a homologous point in the other image. Knowing the relative position of the cameras, it is then possible to build a dense disparity map and a Z-buffer which will be used for 3D reconstruction. We propose a new method for computing a dense disparity map using a special wavelet transform that is less sensitive to discontinuities in disparity and to partial occlusions. This method is based on the quarter of wavelet transform. Some experimental results are presented at the end of the paper.

5 citations

Proceedings ArticleDOI
14 Nov 2005
TL;DR: Experimental results show that basic block matching gives better results than ground truth, especially on occluded regions and boundaries.
Abstract: In order to compress stereo image pairs effectively, disparity compensation is the most widely used method. In this paper we examined the effects of using different disparity maps and their properties in an embedded JPEG2000 based disparity compensated stereo image coder. These properties include the block size, estimation method and the resulting entropy of the disparity map. Experimental results show that basic block matching gives better results than ground truth, especially on occluded regions and boundaries.

3 citations


Cites methods from "Multiresolution based hierarchical ..."

  • ...Disparity estimation is the most widely used method in stereo image coding [1,2]....

    [...]

Proceedings ArticleDOI
15 Dec 2003
TL;DR: A stereo image CODEC is proposed that uses overlapped block disparity estimation/compensation in the multiresolution wavelet transform domain, combining overlapped block matching (OBM) with the discrete wavelet transform's versatile time-frequency localization to improve rate-distortion performance.
Abstract: In this paper we propose a stereo image CODEC that uses overlapped block disparity estimation/compensation in the multiresolution wavelet transform domain. Overlapped block matching (OBM) is known to be able to reduce blocking artifacts by linearly combining multiple blocks provided by the disparity vectors of a block and its neighbours. This capability of OBM, together with the discrete wavelet transform's (DWT) versatile time-frequency localization, is expected to improve rate-distortion performance. The proposed CODEC achieves gains of up to 1 dB at low bit rates compared with the benchmark of S. Sethuraman et al. (1994) used in the paper.

3 citations

DissertationDOI
01 Jan 2004
TL;DR: A novel DWT-based embedded stereoscopic still-image codec structure is proposed that preserves the progressive transmission capability of still-image coding algorithms, while suitably adapting to the nuances and special characteristics of stereoscopic imagery.
Abstract: This thesis addresses the issue of encoding and decoding still and time-varying stereoscopic imagery. A review of current encoding techniques is undertaken, with special emphasis on algorithms having SNR and spatial scalability. A stereo image pair consists of two views of the same scene. Due to the redundant nature of both views, prediction-based techniques produce superior results when compared with independent encoding of both images. Some of the most widely used embedded still-image coding techniques rely on discrete wavelet transform (DWT)-based analysis. However, these schemes cannot be adapted in a straightforward manner to encode stereoscopic still-image pairs. In this thesis, a novel DWT-based embedded stereoscopic still-image codec structure is proposed. This scheme preserves the progressive transmission capability of still-image coding algorithms, while suitably adapting to the nuances and special characteristics of stereoscopic imagery. A comparative study of variable-block and fixed-block disparity estimation is also undertaken. Partition artifacts result from imperfect disparity compensation. Drawbacks in existing compensation techniques are discussed and a novel loop-filtering scheme is proposed. This is used to smooth disparity-compensated images before generating and subsequently encoding residual images. As seen from this thesis, this scheme improves on the performance of current techniques. In addition, the dyadic sampling structure of a 2-D DWT is exploited to obtain discrete levels of spatial scalability and forms part of an embedded scheme for transmission of stereoscopic still images at different spatial resolutions. The proposed algorithm is suitably modified to encode time-varying stereoscopic imagery. Drawbacks of current moving-picture hierarchies are analyzed and a novel hierarchy is proposed that ensures that a user has the flexibility to view a sequence in either monoscopic (default) or stereoscopic mode.
Independent objective results, explaining SNR and spatial scalability features, are presented when encoding a few pictures of a stereoscopic moving image sequence. In addition, informal subjective results are presented when viewing encoded versions of a time-varying sequence.

2 citations


Cites background from "Multiresolution based hierarchical ..."

  • ...The non-progressive nature of Chang and Wu’s algorithm [20] and problems due to drift, arising from all algorithms in [75, 77, 20, 76], justifies the development of new stereoscopic moving-image coding algorithms....

    [...]

  • ...On the other hand the authors in [75] and [20] use a simplified version of Fig....

    [...]

  • ...This is a departure from traditional open-loop structures proposed in [76, 20] and [75]....

    [...]

  • ...An early implementation of a stereoscopic moving-image encoder can be found in [75]....

    [...]

  • ...This forms the premise for any state-of-the-art stereoscopic moving-image encoding structure [76, 20, 75]....

    [...]

Book ChapterDOI
TL;DR: A stereo video coding scheme for heterogeneous consumer devices is proposed that exploits spatio-temporal scalability; experimental results show its efficiency compared with known methods and the advantages of disparity estimation in terms of scalability overhead.
Abstract: In this paper, we propose a new stereo video coding scheme for heterogeneous consumer devices by exploiting the concept of spatio-temporal scalability. We use the MPEG standard for coding the main sequence and an interpolative prediction scheme for predicting the P- and B-type pictures of the auxiliary sequence. The interpolative scheme predicts matching blocks by interpolating both the motion-predicted macro-block and the disparity-predicted macro-block, and employs weighting factors to minimize the residual errors. To provide flexible stereo video service, we define both a temporally scalable layer and a spatially scalable layer for each eye’s view. The experimental results show the efficiency of the proposed scheme by comparison with already known methods, and the advantages of disparity estimation in terms of scalability overhead. According to the experimental results, we expect the proposed functionalities to play a key role in establishing a highly flexible stereo video service for ubiquitous display environments where devices and network connections are heterogeneous.

2 citations

References
Journal ArticleDOI
TL;DR: In this paper, it is shown that the difference of information between the approximation of a signal at the resolutions $2^{j+1}$ and $2^{j}$ (where $j$ is an integer) can be extracted by decomposing this signal on a wavelet orthonormal basis of $L^2(\mathbf{R}^n)$, the vector space of measurable, square-integrable n-dimensional functions.
Abstract: Multiresolution representations are effective for analyzing the information content of images. The properties of the operator which approximates a signal at a given resolution were studied. It is shown that the difference of information between the approximation of a signal at the resolutions $2^{j+1}$ and $2^{j}$ (where $j$ is an integer) can be extracted by decomposing this signal on a wavelet orthonormal basis of $L^2(\mathbf{R}^n)$, the vector space of measurable, square-integrable n-dimensional functions. In $L^2(\mathbf{R})$, a wavelet orthonormal basis is a family of functions which is built by dilating and translating a unique function $\psi(x)$. This decomposition defines an orthogonal multiresolution representation called a wavelet representation. It is computed with a pyramidal algorithm based on convolutions with quadrature mirror filters. Wavelet representation lies between the spatial and Fourier domains. For images, the wavelet representation differentiates several spatial orientations. The application of this representation to data compression in image coding, texture discrimination and fractal analysis is discussed.

20,028 citations
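
The pyramidal algorithm mentioned in this abstract can be illustrated with the simplest quadrature mirror filter pair, the Haar filters. The sketch below is an assumption-laden illustration, not the construction from the cited article: the choice of Haar filters, the function names `haar_level` and `haar_pyramid`, and the requirement that image sides be divisible by 2 at every level are all made up for the example.

```python
import numpy as np

def haar_level(img):
    """One level of a separable 2-D Haar decomposition.

    Returns (ll, lh, hl, hh): the first letter is the filter applied along the
    horizontal direction and the second the filter applied along the vertical
    direction (l = low-pass sum, h = high-pass difference).
    """
    img = np.asarray(img, dtype=np.float64)
    s = np.sqrt(2.0)  # normalisation that keeps the transform orthonormal
    # Filter and subsample along the horizontal direction (pairs of columns).
    lo = (img[:, 0::2] + img[:, 1::2]) / s
    hi = (img[:, 0::2] - img[:, 1::2]) / s
    # Filter and subsample along the vertical direction (pairs of rows).
    ll = (lo[0::2, :] + lo[1::2, :]) / s
    lh = (lo[0::2, :] - lo[1::2, :]) / s
    hl = (hi[0::2, :] + hi[1::2, :]) / s
    hh = (hi[0::2, :] - hi[1::2, :]) / s
    return ll, lh, hl, hh

def haar_pyramid(img, levels):
    """Repeat the decomposition on the low-pass band, as in a pyramidal algorithm."""
    approx = np.asarray(img, dtype=np.float64)
    details = []
    for _ in range(levels):
        approx, lh, hl, hh = haar_level(approx)
        details.append((lh, hl, hh))  # three oriented detail bands per level
    return approx, details
```

Because the Haar filters form an orthonormal basis, the total energy of the approximation and detail bands equals the energy of the input image, which gives a quick sanity check for the sketch.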

Journal ArticleDOI
Ingrid Daubechies
TL;DR: This work constructs orthonormal bases of compactly supported wavelets with arbitrarily high regularity, starting from a review of multiresolution analysis and of several algorithms in vision decomposition and reconstruction.
Abstract: We construct orthonormal bases of compactly supported wavelets, with arbitrarily high regularity. The order of regularity increases linearly with the support width. We start by reviewing the concept of multiresolution analysis as well as several algorithms in vision decomposition and reconstruction. The construction then follows from a synthesis of these different approaches.

8,588 citations


"Multiresolution based hierarchical ..." refers background in this paper

  • ...The maximum vertical disparity (VDMAX) is within 3-4 pixels for reasonably composed image pairs....

    [...]

Journal ArticleDOI
TL;DR: A technique for image encoding in which local operators of many scales but identical shape serve as the basis functions, which tends to enhance salient image features and is well suited for many image analysis tasks as well as for image compression.
Abstract: We describe a technique for image encoding in which local operators of many scales but identical shape serve as the basis functions. The representation differs from established techniques in that the code elements are localized in spatial frequency as well as in space. Pixel-to-pixel correlations are first removed by subtracting a lowpass filtered copy of the image from the image itself. The result is a net data compression since the difference, or error, image has low variance and entropy, and the low-pass filtered image may be represented at reduced sample density. Further data compression is achieved by quantizing the difference image. These steps are then repeated to compress the low-pass image. Iteration of the process at appropriately expanded scales generates a pyramid data structure. The encoding process is equivalent to sampling the image with Laplacian operators of many scales. Thus, the code tends to enhance salient image features. A further advantage of the present code is that it is well suited for many image analysis tasks as well as for image compression. Fast algorithms are described for coding and decoding.

6,975 citations
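
The subtract-and-repeat procedure described in this abstract can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: 2x2 averaging and pixel replication stand in for the low-pass filter and interpolation of the cited work, no quantization is applied, image sides are assumed divisible by 2 at every level, and the names `reduce2`, `expand2`, `build_laplacian_pyramid`, and `reconstruct` are hypothetical.

```python
import numpy as np

def reduce2(img):
    """Low-pass and subsample: average each 2x2 neighbourhood (assumed filter)."""
    return 0.25 * (img[0::2, 0::2] + img[1::2, 0::2] + img[0::2, 1::2] + img[1::2, 1::2])

def expand2(img):
    """Bring a reduced image back to the finer grid by pixel replication (assumed interpolator)."""
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)

def build_laplacian_pyramid(img, levels):
    """Return (bands, top): one difference image per level plus the coarsest low-pass image."""
    current = np.asarray(img, dtype=np.float64)
    bands = []
    for _ in range(levels):
        low = reduce2(current)
        bands.append(current - expand2(low))  # difference (error) image at this scale
        current = low                         # repeat the steps on the low-pass image
    return bands, current

def reconstruct(bands, top):
    """Invert the pyramid: expand the coarsest image and add back each difference image."""
    current = top
    for band in reversed(bands):
        current = expand2(current) + band
    return current
```

Without quantization the pyramid is exactly invertible up to floating-point rounding; compression in the sense of the abstract would come from quantizing the difference images, which this sketch leaves out.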

BookDOI
01 Jan 1984
TL;DR: A Hierarchical Image Analysis System Based Upon Oriented Zero Crossings of Bandpassed Images and a Tutorial on Quadtree Research.
Abstract: I Image Pyramids and Their Uses.- 1. Some Useful Properties of Pyramids.- 2. The Pyramid as a Structure for Efficient Computation.- II Architectures and Systems.- 3. Multiprocessor Pyramid Architectures for Bottom-Up Image Analysis.- 4. Visual and Conceptual Hierarchy - A Paradigm for Studies of Automated Generation of Recognition Strategies.- 5. Multiresolution Processing.- 6. The Several Steps from Icon to Symbol Using Structured Cone/ Pyramids.- III Modelling, Processing, and Segmentation.- 7. Time Series Models for Multiresolution Images.- 8. Node Linking Strategies in Pyramids for Image Segmentation.- 9. Multilevel Image Reconstruction.- 10. Sorting, Histogramming, and Other Statistical Operations on a Pyramid Machine.- IV Features and Shape Analysis.- 11. A Hierarchical Image Analysis System Based Upon Oriented Zero Crossings of Bandpassed Images.- 12. A Multiresolution Representation for Shape.- 13. Multiresolution Feature Encodings.- 14. Multiple-Size Operators and Optimal Curve Finding.- V Region Representation and Surface Interpolation.- 15. A Tutorial on Quadtree Research.- 16. Multiresolution 3-d Image Processing and Graphics.- 17. Multilevel Reconstruction of Visual Surfaces: Variational Principles and Finite-Element Representations.- VI Time-Varying Analysis.- 18. Multilevel Relaxation in Low-Level Computer Vision.- 19. Region Matching in Pyramids for Dynamic Scene Analysis.- 20. Hierarchical Estimation of Spatial Properties from Motion.- VII Applications.- 21. Multiresolution Microscopy.- 22. Two-Resolution Detection of Lung Tumors in Chest Radiographs.- Index of Contributors.

623 citations

Book
01 Feb 1991

581 citations