Proceedings ArticleDOI

A multiresolution framework for stereoscopic image sequence compression

13 Nov 1994-Vol. 2, pp 361-365
TL;DR: The psychophysical property of the human visual system, that only one high resolution image in a stereo image pair is sufficient for satisfactory depth perception, has been used to further reduce the bit rates in this paper.
Abstract: Stereoscopic sequence compression typically involves the exploitation of the spatial redundancy between the left and right streams to achieve higher compressions than are possible with the independent compression of the two streams. In this paper the psychophysical property of the human visual system, that only one high resolution image in a stereo image pair is sufficient for satisfactory depth perception, has been used to further reduce the bit rates. Thus, one of the streams is independently coded along the lines of the MPEG standards, while the other stream is estimated at a lower resolution from this stream. A multiresolution framework has been adopted to facilitate such an estimation of motion and disparity vectors at different resolutions. Experimental results on typical sequences indicate that the additional stream can be compressed to about one-fifth of a highly compressed independently coded stream, without any significant loss in depth perception or perceived image quality.


Citations
Journal ArticleDOI
TL;DR: This work addresses the problem of blockwise bit allocation for coding of stereo images and shows how, given the special characteristics of the disparity field, one can achieve an optimal solution with reasonable complexity, whereas in similar problems in motion compensated video only approximate solutions are feasible.
Abstract: Research in coding of stereo images has focused mostly on the issue of disparity estimation to exploit the redundancy between the two images in a stereo pair, with less attention being devoted to the equally important problem of allocating bits between the two images. This bit allocation problem is complicated by the dependencies arising from using a prediction based on the quantized reference images. We address the problem of blockwise bit allocation for coding of stereo images and show how, given the special characteristics of the disparity field, one can achieve an optimal solution with reasonable complexity, whereas in similar problems in motion compensated video only approximate solutions are feasible. We present algorithms based on dynamic programming that provide the optimal blockwise bit allocation. Our experiments based on a modified JPEG coder show that the proposed scheme achieves higher mean peak signal-to-noise ratio over the two frames (0.2-0.5 dB improvements) as compared with blockwise independent quantization. We also propose a fast algorithm that provides most of the gain at a fraction of the complexity.

56 citations


Cites methods from "A multiresolution framework for ste..."

  • ...Examples include DE in DCT domain [6] or subband domain [10], DE using Markov Random Fields (MRF) models [11], [12], hierarchical segmentation-based DE [13], multiresolution-based DE [14], [15], pixel-based DE with object-based coding [16], [17]....


Proceedings ArticleDOI
15 Apr 1994
TL;DR: In this paper, the authors exploit the correlations between 3D-stereoscopic left-right image pairs to achieve high compression factors for image frame storage and image stream transmission, and they find extremely high correlations between left-right frames offset in time such that perspective-induced disparity between viewpoints and motion-induced parallax from a single viewpoint are nearly identical; they coin the term "WorldLine correlation" for this condition.
Abstract: We exploit the correlations between 3D-stereoscopic left-right image pairs to achieve high compression factors for image frame storage and image stream transmission. In particular, in image stream transmission, we can find extremely high correlations between left-right frames offset in time such that perspective-induced disparity between viewpoints and motion-induced parallax from a single viewpoint are nearly identical; we coin the term "WorldLine correlation" for this condition. We test these ideas in two implementations, (1) straightforward computing of blockwise cross-correlations, and (2) multiresolution hierarchical matching using a wavelet-based compression method. We find that good 3D-stereoscopic imagery can be had for only a few percent more storage space or transmission bandwidth than is required for the corresponding flat imagery.

1. INTRODUCTION

The successful development of compression schemes for motion video that exploit the high correlation between temporally adjacent frames, e.g., MPEG, suggests that we might analogously exploit the high correlation between spatially or angularly adjacent still frames, i.e., left-right 3D-stereoscopic image pairs. If left-right pairs are selected from 3D-stereoscopic motion streams at different times, such that perspective-induced disparity left-right and motion-induced disparity earlier-later produce about the same visual effect, then extremely high correlation will exist between the members of these pairs. This effect, for which we coin the term "WorldLine correlation", can be exploited to achieve extremely high compression factors for stereo video streams.

Our experiments demonstrate that a reasonable synthesis of one image of a left-right stereo image pair can be estimated from the other uncompressed or conventionally compressed image augmented by a small set of numbers that describe the local cross-correlations in terms of a disparity map. When the set is as small (in bits) as 1 to 2% of the conventionally compressed image the stereoscopically viewed pair consisting of one original and one synthesized image produces convincing stereo imagery. Occlusions, for which this approach of course fails, can be handled efficiently by encoding and transmitting error maps (residuals) of regions where a local statistical operator indicates that an occlusion is probable.

Two cross-correlation mapping schemes independently developed by two of us (P.G. and S.S.) have been coded and tested,

39 citations


Cites background from "A multiresolution framework for ste..."

  • ...The specific references that we cite in the text and the general references that we also include in the bibliography point to background literature, as well as to three recent papers [5,6, 7] in which we document the low level details of our recent work....


Journal ArticleDOI
TL;DR: The algorithm effectively combines the simplicity and adaptability of the existing block based stereo image compression techniques with an edge/contour based object extraction technique to determine appropriate compression strategy for various areas of the right image.
Abstract: We propose a hybrid scheme to implement an object driven, block based algorithm to achieve low bit-rate compression of stereo image pairs. The algorithm effectively combines the simplicity and adaptability of the existing block based stereo image compression techniques with an edge/contour based object extraction technique to determine appropriate compression strategy for various areas of the right image. Unlike the existing object-based coding such as MPEG-4 developed in the video compression community, the proposed scheme does not require any additional shape coding. Instead, the arbitrary shape is reconstructed by the matching object inside the left frame, which has been encoded by standard JPEG algorithm and hence made available at the decoding end for those shapes in right frames. Yet the shape reconstruction for right objects incurs no distortion due to the unique correlation between left and right frames inside stereo image pairs and the nature of the proposed hybrid scheme. Extensive experiments carried out support that significant improvements of up to 20% in compression ratios are achieved by the proposed algorithm in comparison with the existing block-based technique, while the reconstructed image quality is maintained at a competitive level in terms of both PSNR values and visual inspections.

29 citations


Cites background from "A multiresolution framework for ste..."

  • ...As the demand for applications of stereo images for better information access and interpretation are increasing, stereo image compression is becoming a more and more important area for further research and development, which also directly contributes to three-dimensional (3-...


Dissertation
01 Jan 1997
TL;DR: A new algorithm, multiple viewpoint rendering (MVR), is described which produces an equivalent set of images one to two orders of magnitude faster than previous approaches by considering the image set as a single spatio-perspective volume.
Abstract: This thesis describes a computer graphics method for efficiently rendering images of static geometric scene databases from multiple viewpoints. Three-dimensional displays such as parallax panoramagrams, lenticular panoramagrams, and holographic stereograms require samples of image data captured from a large number of regularly spaced camera images in order to produce a three-dimensional image. Computer graphics algorithms that render these images sequentially are inefficient because they do not take advantage of the perspective coherence of the scene. A new algorithm, multiple viewpoint rendering (MVR), is described which produces an equivalent set of images one to two orders of magnitude faster than previous approaches by considering the image set as a single spatio-perspective volume. MVR uses a computer graphics camera geometry based on a common model of parallax-based three-dimensional displays. MVR can be implemented using variations of traditional computer graphics algorithms and accelerated using standard computer graphics hardware systems. Details of the algorithm design and implementation are given, including geometric transformation, shading, texture mapping and reflection mapping. Performance of a hardware-based prototype implementation and comparison of MVR-rendered and conventionally rendered images are included. Applications of MVR to holographic video and other display systems, to three-dimensional image compression, and to other camera geometries are also given. (Copies available exclusively from MIT Libraries, Rm. 14-0551, Cambridge, MA 02139-4307. Ph. 617-253-5668; Fax 617-253-1690).

28 citations


Cites methods from "A multiresolution framework for ste..."

  • ...Sethuraman, Siegel, and Jordan used a hierarchical disparity estimator to compress stereo pairs and stereo sequences [59]....


Journal Article
TL;DR: A more general platform for 3-D image representation is introduced, aiming to outgrow the framework of 3-D “image” communication and to open up a novel field of technology, which should be called “spatial” communication.
Abstract: This paper surveys the results of various studies on 3-D image coding. Themes are focused on efficient compression and display-independent representation of 3-D images. Most of the works on 3-D image coding have been concentrated on the compression methods tuned for each of the 3-D image formats (stereo pairs, multi-view images, volumetric images, holograms and so on). For the compression of stereo images, several techniques concerned with the concept of disparity compensation have been developed. For the compression of multi-view images, the concepts of disparity compensation and epipolar plane image (EPI) are the efficient ways of exploiting redundancies between multiple views. These techniques, however, heavily depend on the limited camera configurations. In order to consider many other multi-view configurations and other types of 3-D images comprehensively, more general platform for the 3-D image representation is introduced, aiming to outgrow the framework of 3-D “image” communication and to open up a novel field of technology, which should be called the “spatial” communication. Especially, the light ray based method has a wide range of application, including efficient transmission of the physical world, as well as integration of the virtual and physical worlds.
Key words: 3-D image coding, stereo images, multi-view images, panoramic images, volumetric images, holograms, display-independent representation, light rays, spatial communication

25 citations

References
Journal ArticleDOI
TL;DR: In this paper, it is shown that the difference of information between the approximation of a signal at the resolutions $2^{j+1}$ and $2^{j}$ (where $j$ is an integer) can be extracted by decomposing this signal on a wavelet orthonormal basis of $L^2(\mathbb{R}^n)$, the vector space of measurable, square-integrable n-dimensional functions.
Abstract: Multiresolution representations are effective for analyzing the information content of images. The properties of the operator which approximates a signal at a given resolution were studied. It is shown that the difference of information between the approximation of a signal at the resolutions $2^{j+1}$ and $2^{j}$ (where $j$ is an integer) can be extracted by decomposing this signal on a wavelet orthonormal basis of $L^2(\mathbb{R}^n)$, the vector space of measurable, square-integrable n-dimensional functions. In $L^2(\mathbb{R})$, a wavelet orthonormal basis is a family of functions which is built by dilating and translating a unique function $\psi(x)$. This decomposition defines an orthogonal multiresolution representation called a wavelet representation. It is computed with a pyramidal algorithm based on convolutions with quadrature mirror filters. Wavelet representation lies between the spatial and Fourier domains. For images, the wavelet representation differentiates several spatial orientations. The application of this representation to data compression in image coding, texture discrimination and fractal analysis is discussed.
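
The pyramidal idea in this abstract can be illustrated with a minimal sketch. It assumes the Haar filter pair, which is the simplest orthonormal quadrature mirror pair and is chosen here only for brevity; the abstract does not say which filters the paper uses. The function names (`haar_analysis`, `haar_synthesis`, `haar_pyramid`) are hypothetical. Each analysis step splits a 1-D signal into a half-resolution approximation and the detail needed to recover the finer resolution.

```python
import math

# Assumption: Haar filters stand in for the quadrature mirror filters
# mentioned in the abstract. All names here are illustrative only.
_S = 1.0 / math.sqrt(2.0)


def haar_analysis(signal):
    """Split an even-length signal into approximation and detail halves."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    approx = [(signal[i] + signal[i + 1]) * _S for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * _S for i in range(0, len(signal), 2)]
    return approx, detail


def haar_synthesis(approx, detail):
    """Invert haar_analysis: rebuild the finer-resolution signal."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) * _S)
        out.append((a - d) * _S)
    return out


def haar_pyramid(signal, levels):
    """Apply the analysis step repeatedly to the approximation (the pyramid)."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, d = haar_analysis(approx)
        details.append(d)  # details[0] is the finest level
    return approx, details
```

Because the Haar basis is orthonormal, the approximation and detail coefficients together carry the same energy as the input, and `haar_synthesis` recovers the input up to floating-point rounding.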

20,028 citations

Journal ArticleDOI
TL;DR: It is proved that the rate distortion limit for coding stereopairs cannot in general be achieved by a coder that first codes and decodes the right picture sequence independently of the left picture sequence, and then codes and decodes the left picture sequence given the decoded right picture sequence.
Abstract: Two fundamentally different techniques for compressing stereopairs are discussed. The first technique, called disparity-compensated transform-domain predictive coding, attempts to minimize the mean-square error between the original stereopair and the compressed stereopair. The second technique, called mixed-resolution coding, is a psychophysically justified technique that exploits known facts about human stereovision to code stereopairs in a subjectively acceptable manner. A method for assessing the quality of compressed stereopairs is also presented. It involves measuring the ability of an observer to perceive depth in coded stereopairs. It was found that observers generally perceived objects to be further away in compressed stereopairs than they did in originals. It is proved that the rate distortion limit for coding stereopairs cannot in general be achieved by a coder that first codes and decodes the right picture sequence independently of the left picture sequence, and then codes and decodes the left picture sequence given the decoded right picture sequence.

243 citations

Journal ArticleDOI
TL;DR: A multiresolution representation for video signals is introduced, and interpolation in an FIR (finite impulse response) scheme solves uncovered area problems, considerably improving the temporal prediction.
Abstract: A multiresolution representation for video signals is introduced. A three-dimensional spatiotemporal pyramid algorithm for high-quality compression of advanced television sequences is presented. The scheme utilizes a finite memory structure and is robust to channel errors, provides compatible subchannels, and can handle different scan formats, making it well suited for the broadcast environment. Additional features such as fast random access and reverse playback make it suitable for digital storage as well. Model-based processing is used both over space and time, where motion-based interpolation is used. Interpolation in an FIR (finite impulse response) scheme solves uncovered area problems, considerably improving the temporal prediction. The complexity is comparable to that of previous recursive schemes. Computer simulations indicate that high compression factors (about an order of magnitude) are easily achieved with no apparent loss of quality. The scheme also has a number of commonalities with the emerging MPEG standard.

204 citations

Journal ArticleDOI
TL;DR: Multiresolution block matching methods for both monocular and stereoscopic image sequence coding are evaluated and shown to drastically reduce the amount of processing needed for block correspondence without seriously affecting the quality of the reconstructed images.
Abstract: Multiresolution block matching methods for both monocular and stereoscopic image sequence coding are evaluated. These methods are seen to drastically reduce the amount of processing needed for block correspondence without seriously affecting the quality of the reconstructed images. The evaluation criteria are the prediction error and the speed of the algorithm for motion, disparity, and fused motion and disparity estimation, in comparison with the full search (exhaustive) method. A new method is also proposed, based on multiresolution techniques, for efficient coding of the disparity or the displacement vector field.
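
As a rough illustration of why multiresolution matching needs less processing than a full search, the sketch below estimates the horizontal disparity of one block in two stages: a search on half-resolution images, then a small refinement at full resolution around twice the coarse result. This is a generic coarse-to-fine scheme under stated assumptions, not the method evaluated in the paper: it uses 2x2 averaging for downsampling, sum of absolute differences (SAD) as the matching cost, horizontal shifts only, and even block coordinates and sizes. All function names are hypothetical.

```python
def downsample(img):
    """Halve resolution by averaging 2x2 neighbourhoods (even dimensions assumed)."""
    h, w = len(img), len(img[0])
    return [[(img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) / 4.0
             for x in range(0, w, 2)]
            for y in range(0, h, 2)]


def sad(ref, tgt, y, x, size, d):
    """SAD between the target block at (y, x) and the reference block shifted by d columns."""
    total = 0.0
    for j in range(size):
        for i in range(size):
            total += abs(tgt[y + j][x + i] - ref[y + j][x + i + d])
    return total


def best_disparity(ref, tgt, y, x, size, centre, radius):
    """Exhaustive search over shifts in [centre - radius, centre + radius]."""
    width = len(ref[0])
    best_d, best_cost = centre, float("inf")
    for d in range(centre - radius, centre + radius + 1):
        if x + d < 0 or x + d + size > width:
            continue  # shifted block would fall outside the reference image
        cost = sad(ref, tgt, y, x, size, d)
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d


def coarse_to_fine_disparity(ref, tgt, y, x, size, coarse_radius, refine_radius=1):
    """Search at half resolution, then refine around twice the coarse estimate."""
    ref_c, tgt_c = downsample(ref), downsample(tgt)
    d_coarse = best_disparity(ref_c, tgt_c, y // 2, x // 2, size // 2, 0, coarse_radius)
    return best_disparity(ref, tgt, y, x, size, 2 * d_coarse, refine_radius)
```

With `coarse_radius=3` and `refine_radius=1`, the sketch evaluates at most 7 candidate shifts on quarter-size blocks plus 3 at full size, while covering roughly the same range as a full-resolution search of radius 6 or 7 (up to 15 candidates on full-size blocks).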

84 citations


"A multiresolution framework for ste..." refers background in this paper

  • ...However, several schemes [1], [2], [3], [4] have been developed, that exploit the disparity relation to achieve compression ratios higher than those, that are possible by the independent compression of the two streams....


Journal ArticleDOI
TL;DR: This paper presents two-dimensional motion estimation methods which take advantage of the intrinsic redundancies inside 3DTV stereoscopic image sequences, subject to the crucial assumption that an initial calibration of the stereoscopic sensors provides us with geometric change of coordinates for two matched features.
Abstract: This paper presents two-dimensional motion estimation methods which take advantage of the intrinsic redundancies inside 3DTV stereoscopic image sequences. Most of the previous studies extract, either disparity vector fields if they are involved in stereovision, or apparent motion vector fields to be applied to motion compensation coding schemes. For 3DTV image sequence analysis and transmission, we can jointly estimate these two feature fields. Locally, initial image data are grouped within two views (the left and right ones) at two successive time samples and spatio-temporal coherence has to be used to enhance motion vector field estimation. Three different levels of ‘coherence’ have been experimented subject to the crucial assumption that an initial calibration of the stereoscopic sensors provides us with geometric change of coordinates for two matched features.

61 citations