A multiresolution framework for stereoscopic image sequence compression

doi:10.1109/ICIP.1994.413592

Proceedings Article•DOI•

Sriram Sethuraman¹, Mel Siegel¹, Angel G. Jordan¹•Institutions (1)

Carnegie Mellon University¹

13 Nov 1994-Vol. 2, pp 361-365

TL;DR: The psychophysical property of the human visual system, that only one high resolution image in a stereo image pair is sufficient for satisfactory depth perception, has been used to further reduce the bit rates in this paper.

Abstract: Stereoscopic sequence compression typically involves the exploitation of the spatial redundancy between the left and right streams to achieve higher compression than is possible with the independent compression of the two streams. In this paper the psychophysical property of the human visual system, that only one high resolution image in a stereo image pair is sufficient for satisfactory depth perception, has been used to further reduce the bit rates. Thus, one of the streams is independently coded along the lines of the MPEG standards, while the other stream is estimated at a lower resolution from this stream. A multiresolution framework has been adopted to facilitate such an estimation of motion and disparity vectors at different resolutions. Experimental results on typical sequences indicate that the additional stream can be compressed to about one-fifth of a highly compressed independently coded stream, without any significant loss in depth perception or perceived image quality.
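
The coarse-to-fine idea behind estimating disparity vectors at different resolutions can be illustrated with a minimal sketch. This is not the authors' implementation: the 2x2 averaging pyramid, the sum-of-absolute-differences cost, the horizontal-only search, and every function name and parameter (`downsample`, `block_sad`, `refine`, `hierarchical_disparity`, `levels`, `bs`, `radius`) are assumptions chosen for illustration.

```python
def downsample(img):
    """Halve resolution by averaging 2x2 neighbourhoods (a simple pyramid filter)."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [[(img[2 * y][2 * x] + img[2 * y][2 * x + 1]
              + img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1]) / 4.0
             for x in range(w)] for y in range(h)]


def block_sad(ref, tgt, top, left, d, bs):
    """Sum of absolute differences between a target block and the reference
    block displaced horizontally by d (columns clamped at the image border)."""
    last = len(ref[0]) - 1
    total = 0.0
    for y in range(top, top + bs):
        for x in range(left, left + bs):
            total += abs(tgt[y][x] - ref[y][min(max(x + d, 0), last)])
    return total


def refine(ref, tgt, init, bs, radius):
    """Search +/- radius around each block's initial disparity guess."""
    field = []
    for r, guesses in enumerate(init):
        row = []
        for c, guess in enumerate(guesses):
            candidates = range(guess - radius, guess + radius + 1)
            # Ties are broken in favour of the candidate closest to the guess.
            row.append(min(candidates, key=lambda d: (
                block_sad(ref, tgt, r * bs, c * bs, d, bs), abs(d - guess))))
        field.append(row)
    return field


def hierarchical_disparity(ref, tgt, levels=2, bs=4, radius=2):
    """Per-block horizontal disparity d such that tgt[y][x] ~ ref[y][x + d],
    estimated at the coarsest level first and refined at each finer level."""
    pyramid = [(ref, tgt)]
    for _ in range(levels - 1):
        pyramid.append(tuple(downsample(img) for img in pyramid[-1]))
    field = None
    for r_img, t_img in reversed(pyramid):
        rows, cols = len(t_img) // bs, len(t_img[0]) // bs
        if field is None:
            # Coarsest level: start every block from zero disparity.
            init = [[0] * cols for _ in range(rows)]
        else:
            # Project the coarser field: each vector doubles at the finer level.
            init = [[2 * field[min(y // 2, len(field) - 1)]
                               [min(x // 2, len(field[0]) - 1)]
                     for x in range(cols)] for y in range(rows)]
        field = refine(r_img, t_img, init, bs, radius)
    return field
```

With two levels and a search radius of 2, the sketch can recover a shift of 4 pixels even though no single level searches further than 2 pixels from its starting guess. Stopping the loop one level early would give a disparity field at reduced resolution, which is the kind of lower-resolution estimate the abstract describes for the second stream.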

Citations

Journal Article•DOI•

Optimal blockwise dependent quantization for stereo image coding

Woontack Woo¹, Antonio Ortega¹•Institutions (1)

University of Southern California¹

01 Sep 1999-IEEE Transactions on Circuits and Systems for Video Technology

TL;DR: This work addresses the problem of blockwise bit allocation for coding of stereo images and shows how, given the special characteristics of the disparity field, one can achieve an optimal solution with reasonable complexity, whereas in similar problems in motion compensated video only approximate solutions are feasible.

Abstract: Research in coding of stereo images has focused mostly on the issue of disparity estimation to exploit the redundancy between the two images in a stereo pair, with less attention being devoted to the equally important problem of allocating bits between the two images. This bit allocation problem is complicated by the dependencies arising from using a prediction based on the quantized reference images. We address the problem of blockwise bit allocation for coding of stereo images and show how, given the special characteristics of the disparity field, one can achieve an optimal solution with reasonable complexity, whereas in similar problems in motion compensated video only approximate solutions are feasible. We present algorithms based on dynamic programming that provide the optimal blockwise bit allocation. Our experiments based on a modified JPEG coder show that the proposed scheme achieves higher mean peak signal-to-noise ratio over the two frames (0.2-0.5 dB improvements) as compared with blockwise independent quantization. We also propose a fast algorithm that provides most of the gain at a fraction of the complexity.
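
For readers unfamiliar with the quality metric quoted above, the following sketch computes PSNR for one image and averages it over a stereo pair. It assumes 8-bit images (peak value 255) stored as lists of rows, and it assumes that "mean PSNR over the two frames" means the arithmetic mean of the two per-image PSNR values; the abstract does not spell this out, and the function names are hypothetical.

```python
import math


def mse(original, decoded):
    """Mean squared error between two equally sized images (lists of rows)."""
    total, count = 0.0, 0
    for row_o, row_d in zip(original, decoded):
        for p_o, p_d in zip(row_o, row_d):
            total += (p_o - p_d) ** 2
            count += 1
    return total / count


def psnr(original, decoded, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite when the images are identical."""
    err = mse(original, decoded)
    return math.inf if err == 0 else 10.0 * math.log10(peak * peak / err)


def mean_psnr(pairs, peak=255.0):
    """Arithmetic mean of PSNR over (original, decoded) pairs, e.g. left and right."""
    values = [psnr(o, d, peak) for o, d in pairs]
    return sum(values) / len(values)
```

On this scale, the 0.2-0.5 dB gains reported in the abstract correspond to a reduction in mean squared error of roughly 5 to 11 percent for the affected image.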

56 citations

Cites methods from "A multiresolution framework for ste..."

...Examples include DE in DCT domain [6] or subband domain [10], DE using Markov Random Fields (MRF) models [11], [12], hierarchical segmentation-based DE [13], multiresolution-based DE [14], [15], pixel-based DE with object-based coding [16], [17]....

Proceedings Article•DOI•

Compression of stereo image pairs and streams

Mel Siegel¹, Priyan Gunatilake¹, Sriram Sethuraman¹, Angel G. Jordan¹•Institutions (1)

Carnegie Mellon University¹

15 Apr 1994

TL;DR: In this paper, the authors exploit the correlations between 3D-stereoscopic left-right image pairs to achieve high compression factors for image frame storage and image stream transmission. They find extremely high correlations between left-right frames offset in time such that perspective-induced disparity between viewpoints and motion-induced parallax from a single viewpoint are nearly identical, and they coin the term "WorldLine correlation" for this condition.

Abstract: We exploit the correlations between 3D-stereoscopic left-right image pairs to achieve high compression factors for image frame storage and image stream transmission. In particular, in image stream transmission, we can find extremely high correlations between left-right frames offset in time such that perspective-induced disparity between viewpoints and motion-induced parallax from a single viewpoint are nearly identical; we coin the term "WorldLine correlation" for this condition. We test these ideas in two implementations, (1) straightforward computing of blockwise cross-correlations, and (2) multiresolution hierarchical matching using a wavelet-based compression method. We find that good 3D-stereoscopic imagery can be had for only a few percent more storage space or transmission bandwidth than is required for the corresponding flat imagery.

1. INTRODUCTION

The successful development of compression schemes for motion video that exploit the high correlation between temporally adjacent frames, e.g., MPEG, suggests that we might analogously exploit the high correlation between spatially or angularly adjacent still frames, i.e., left-right 3D-stereoscopic image pairs. If left-right pairs are selected from 3D-stereoscopic motion streams at different times, such that perspective-induced disparity left-right and motion-induced disparity earlier-later produce about the same visual effect, then extremely high correlation will exist between the members of these pairs. This effect, for which we coin the term "WorldLine correlation", can be exploited to achieve extremely high compression factors for stereo video streams.

Our experiments demonstrate that a reasonable synthesis of one image of a left-right stereo image pair can be estimated from the other uncompressed or conventionally compressed image augmented by a small set of numbers that describe the local cross-correlations in terms of a disparity map. When the set is as small (in bits) as 1 to 2% of the conventionally compressed image, the stereoscopically viewed pair consisting of one original and one synthesized image produces convincing stereo imagery. Occlusions, for which this approach of course fails, can be handled efficiently by encoding and transmitting error maps (residuals) of regions where a local statistical operator indicates that an occlusion is probable. Two cross-correlation mapping schemes independently developed by two of us (P.G. and S.S.) have been coded and tested,

39 citations

Cites background from "A multiresolution framework for ste..."

...The specific references that we cite in the text and the general references that we also include in the bibliography point to background literature, as well as to three recent papers [5, 6, 7] in which we document the low level details of our recent work....

Journal Article•DOI•

A hybrid scheme for low bit-rate coding of stereo images

Jianmin Jiang¹, Eran A. Edirisinghe²•Institutions (2)

University of Bradford¹, Loughborough University²

01 Feb 2002-IEEE Transactions on Image Processing

TL;DR: The algorithm combines the simplicity and adaptability of existing block-based stereo image compression techniques with an edge/contour-based object extraction technique to determine an appropriate compression strategy for various areas of the right image.

Abstract: We propose a hybrid scheme to implement an object-driven, block-based algorithm to achieve low bit-rate compression of stereo image pairs. The algorithm combines the simplicity and adaptability of existing block-based stereo image compression techniques with an edge/contour-based object extraction technique to determine an appropriate compression strategy for various areas of the right image. Unlike existing object-based coding such as MPEG-4 developed in the video compression community, the proposed scheme does not require any additional shape coding. Instead, the arbitrary shape is reconstructed from the matching object inside the left frame, which has been encoded by the standard JPEG algorithm and is hence available at the decoding end for those shapes in right frames. Yet the shape reconstruction for right objects incurs no distortion, due to the unique correlation between left and right frames inside stereo image pairs and the nature of the proposed hybrid scheme. Extensive experiments show that improvements of up to 20% in compression ratio are achieved by the proposed algorithm in comparison with the existing block-based technique, while the reconstructed image quality is maintained at a competitive level in terms of both PSNR values and visual inspection.

29 citations

Cites background from "A multiresolution framework for ste..."

...As the demand for applications of stereo images for better information access and interpretation are increasing, stereo image compression is becoming a more and more important area for further research and development, which also directly contributes to three-dimensional (3-...

Dissertation•

Multiple viewpoint rendering for three-dimensional displays

Michael Halle, Stephen A. Benton

01 Jan 1997

TL;DR: A new algorithm, multiple viewpoint rendering (MVR), is described which produces an equivalent set of images one to two orders of magnitude faster than previous approaches by considering the image set as a single spatio-perspective volume.

Abstract: This thesis describes a computer graphics method for efficiently rendering images of static geometric scene databases from multiple viewpoints. Three-dimensional displays such as parallax panoramagrams, lenticular panoramagrams, and holographic stereograms require samples of image data captured from a large number of regularly spaced camera images in order to produce a three-dimensional image. Computer graphics algorithms that render these images sequentially are inefficient because they do not take advantage of the perspective coherence of the scene. A new algorithm, multiple viewpoint rendering (MVR), is described which produces an equivalent set of images one to two orders of magnitude faster than previous approaches by considering the image set as a single spatio-perspective volume. MVR uses a computer graphics camera geometry based on a common model of parallax-based three-dimensional displays. MVR can be implemented using variations of traditional computer graphics algorithms and accelerated using standard computer graphics hardware systems. Details of the algorithm design and implementation are given, including geometric transformation, shading, texture mapping and reflection mapping. Performance of a hardware-based prototype implementation and comparison of MVR-rendered and conventionally rendered images are included. Applications of MVR to holographic video and other display systems, to three-dimensional image compression, and to other camera geometries are also given. (Copies available exclusively from MIT Libraries, Rm. 14-0551, Cambridge, MA 02139-4307. Ph. 617-253-5668; Fax 617-253-1690).

28 citations

Cites methods from "A multiresolution framework for ste..."

...Sethuraman, Siegel, and Jordan used a hierarchical disparity estimator to compress stereo pairs and stereo sequences [59]....

Journal Article•

Compression and Representation of 3-D Images

Takeshi Naemura, Masahide Kaneko, Hiroshi Harashima

25 Mar 1999-IEICE Transactions on Information and Systems

TL;DR: A more general platform for 3-D image representation is introduced, aiming to outgrow the framework of 3-D “image” communication and to open up a novel field of technology, which should be called “spatial” communication.

Abstract: This paper surveys the results of various studies on 3-D image coding. It focuses on efficient compression and display-independent representation of 3-D images. Most work on 3-D image coding has concentrated on compression methods tuned for each of the 3-D image formats (stereo pairs, multi-view images, volumetric images, holograms and so on). For the compression of stereo images, several techniques based on the concept of disparity compensation have been developed. For the compression of multi-view images, disparity compensation and the epipolar plane image (EPI) are efficient ways of exploiting redundancies between multiple views. These techniques, however, depend heavily on limited camera configurations. In order to consider many other multi-view configurations and other types of 3-D images comprehensively, a more general platform for 3-D image representation is introduced, aiming to outgrow the framework of 3-D “image” communication and to open up a novel field of technology, which should be called “spatial” communication. In particular, the light ray based method has a wide range of applications, including efficient transmission of the physical world, as well as integration of the virtual and physical worlds.

Key words: 3-D image coding, stereo images, multi-view images, panoramic images, volumetric images, holograms, display-independent representation, light rays, spatial communication

25 citations

References

Journal Article•DOI•

A theory for multiresolution signal decomposition: the wavelet representation

Stéphane Mallat¹•Institutions (1)

New York University¹

01 Jul 1989-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: In this paper, it is shown that the difference of information between the approximation of a signal at the resolutions $2^{j+1}$ and $2^{j}$ (where $j$ is an integer) can be extracted by decomposing this signal on a wavelet orthonormal basis of $L^2(\mathbb{R}^n)$, the vector space of measurable, square-integrable n-dimensional functions.

Abstract: Multiresolution representations are effective for analyzing the information content of images. The properties of the operator which approximates a signal at a given resolution were studied. It is shown that the difference of information between the approximation of a signal at the resolutions $2^{j+1}$ and $2^{j}$ (where $j$ is an integer) can be extracted by decomposing this signal on a wavelet orthonormal basis of $L^2(\mathbb{R}^n)$, the vector space of measurable, square-integrable n-dimensional functions. In $L^2(\mathbb{R})$, a wavelet orthonormal basis is a family of functions which is built by dilating and translating a unique function $\psi(x)$. This decomposition defines an orthogonal multiresolution representation called a wavelet representation. It is computed with a pyramidal algorithm based on convolutions with quadrature mirror filters. Wavelet representation lies between the spatial and Fourier domains. For images, the wavelet representation differentiates several spatial orientations. The application of this representation to data compression in image coding, texture discrimination and fractal analysis is discussed.
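
The pyramidal algorithm mentioned in the abstract can be sketched for a one-dimensional signal. The example below uses the Haar filter pair, which is an assumption made for brevity (the abstract does not name a particular wavelet), and the function names are hypothetical. Each analysis step splits a signal into a half-length approximation and the detail lost between the two resolutions.

```python
import math

_S = 1.0 / math.sqrt(2.0)  # normalisation that keeps the Haar basis orthonormal


def haar_analysis(signal):
    """One pyramid step: half-length approximation and detail coefficients."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    approx = [(signal[i] + signal[i + 1]) * _S for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * _S for i in range(0, len(signal), 2)]
    return approx, detail


def haar_synthesis(approx, detail):
    """Inverse of haar_analysis: rebuild the finer-resolution signal."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) * _S, (a - d) * _S])
    return out


def haar_pyramid(signal, levels):
    """Apply the analysis step repeatedly; details are ordered finest first."""
    approx, details = list(signal), []
    for _ in range(levels):
        approx, detail = haar_analysis(approx)
        details.append(detail)
    return approx, details


def haar_reconstruct(approx, details):
    """Rebuild the original signal from the coarsest approximation and all details."""
    for detail in reversed(details):
        approx = haar_synthesis(approx, detail)
    return approx
```

Because the basis is orthonormal, the coefficients carry the same energy as the input and the reconstruction is exact up to floating-point rounding. For images the same step is usually applied along rows and then columns, which is where the orientation-selective detail bands come from.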

20,028 citations

Journal Article•DOI•

Data compression of stereopairs

Michael G. Perkins

01 Apr 1992-IEEE Transactions on Communications

TL;DR: It is proved that the rate distortion limit for coding stereopairs cannot in general be achieved by a coder that first codes and decodes the right picture sequence independently of the left picture sequence, and then codes and decodes the left picture sequence given the decoded right picture sequence.

Abstract: Two fundamentally different techniques for compressing stereopairs are discussed. The first technique, called disparity-compensated transform-domain predictive coding, attempts to minimize the mean-square error between the original stereopair and the compressed stereopair. The second technique, called mixed-resolution coding, is a psychophysically justified technique that exploits known facts about human stereovision to code stereopairs in a subjectively acceptable manner. A method for assessing the quality of compressed stereopairs is also presented. It involves measuring the ability of an observer to perceive depth in coded stereopairs. It was found that observers generally perceived objects to be further away in compressed stereopairs than they did in originals. It is proved that the rate distortion limit for coding stereopairs cannot in general be achieved by a coder that first codes and decodes the right picture sequence independently of the left picture sequence, and then codes and decodes the left picture sequence given the decoded right picture sequence.

243 citations

Journal Article•DOI•

Interpolative multiresolution coding of advanced television with compatible subchannels

Kamil Metin Uz¹, Martin Vetterli¹, Didier J. LeGall²•Institutions (2)

Columbia University¹, Telcordia Technologies²

01 Mar 1991-IEEE Transactions on Circuits and Systems for Video Technology

TL;DR: A multiresolution representation for video signals is introduced, and interpolation in an FIR (finite impulse response) scheme solves uncovered area problems, considerably improving the temporal prediction.

Abstract: A multiresolution representation for video signals is introduced. A three-dimensional spatiotemporal pyramid algorithm for high-quality compression of advanced television sequences is presented. The scheme utilizes a finite memory structure and is robust to channel errors, provides compatible subchannels, and can handle different scan formats, making it well suited for the broadcast environment. Additional features such as fast random access and reverse playback make it suitable for digital storage as well. Model-based processing is used both over space and time, where motion-based interpolation is used. Interpolation in an FIR (finite impulse response) scheme solves uncovered area problems, considerably improving the temporal prediction. The complexity is comparable to that of previous recursive schemes. Computer simulations indicate that high compression factors (about an order of magnitude) are easily achieved with no apparent loss of quality. The scheme also has a number of commonalities with the emerging MPEG standard.

204 citations

Journal Article•DOI•

Evaluation of multiresolution block matching techniques for motion and disparity estimation

Dimitrios Tzovaras¹, Michael G. Strintzis¹, Haralambos Sahinoglou¹•Institutions (1)

Aristotle University of Thessaloniki¹

01 Mar 1994-Signal Processing: Image Communication

TL;DR: Multiresolution block matching methods for both monocular and stereoscopic image sequence coding are evaluated to drastically reduce the amount of processing needed for block correspondence without seriously affecting the quality of the reconstructed images.

Abstract: Multiresolution block matching methods for both monocular and stereoscopic image sequence coding are evaluated. These methods are seen to drastically reduce the amount of processing needed for block correspondence without seriously affecting the quality of the reconstructed images. The evaluation criteria are the prediction error and the speed of the algorithm for motion, disparity, and fused motion and disparity estimation, in comparison with the full search (exhaustive) method. A new method based on multiresolution techniques is also proposed for efficient coding of the disparity or the displacement vector field.
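
A rough count shows why multiresolution matching needs less processing than full search. The sketch below assumes a one-dimensional (horizontal) search, a dyadic pyramid, and the same search radius at every level. It counts candidate positions along one block's coarse-to-fine refinement path and ignores the cost of building the pyramid. The function names are hypothetical and the figures are illustrative, not results from the paper.

```python
def full_search_candidates(search_range):
    """Candidates tested per block when searching -range..+range at full resolution."""
    return 2 * search_range + 1


def multires_candidates(levels, radius):
    """Candidates tested along one refinement path: (2 * radius + 1) per level."""
    return levels * (2 * radius + 1)


def multires_reach(levels, radius):
    """Largest full-resolution displacement the hierarchy can reach.

    A step of `radius` at pyramid level k is worth radius * 2**k full-resolution
    pixels, so the reach is radius * (1 + 2 + ... + 2**(levels - 1)).
    """
    return radius * (2 ** levels - 1)
```

With three levels and a radius of 2, the hierarchy reaches displacements of up to 14 pixels using 15 candidate evaluations, whereas a full search over the same range tests 29 candidates per block. The gap widens quickly for two-dimensional motion search, where the full-search count grows with the square of the range.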

84 citations

"A multiresolution framework for ste..." refers background in this paper

...However, several schemes [1], [2], [3], [4] have been developed that exploit the disparity relation to achieve compression ratios higher than those that are possible by the independent compression of the two streams....

Journal Article•DOI•

Constrained disparity and motion estimators for 3DTV image sequence coding

Ahmed Tamtaoui¹, Claude Labit¹•Institutions (1)

French Institute for Research in Computer Science and Automation¹

01 Nov 1991-Signal Processing: Image Communication

TL;DR: This paper presents two-dimensional motion estimation methods which take advantage of the intrinsic redundancies inside 3DTV stereoscopic image sequences, subject to the crucial assumption that an initial calibration of the stereoscopic sensors provides us with geometric change of coordinates for two matched features.

Abstract: This paper presents two-dimensional motion estimation methods which take advantage of the intrinsic redundancies inside 3DTV stereoscopic image sequences. Most previous studies extract either disparity vector fields, if they are concerned with stereovision, or apparent motion vector fields to be applied to motion-compensated coding schemes. For 3DTV image sequence analysis and transmission, we can jointly estimate these two feature fields. Locally, initial image data are grouped within two views (the left and right ones) at two successive time samples, and spatio-temporal coherence has to be used to enhance motion vector field estimation. Three different levels of ‘coherence’ have been tested, subject to the crucial assumption that an initial calibration of the stereoscopic sensors provides us with the geometric change of coordinates for two matched features.

61 citations