




In the short term, future research will address fine-tuning the architectures and algorithms and understanding their fundamental mathematical and psychophysical efficiencies; in the long term, it will address issues such as multiple-camera schemes and object-based compression methods.
APPROACH
Their basic approach to compression of 3D-stereoscopic imagery is based on the observation that disparity, the relative offset between corresponding points in an image pair, varies only slowly over most of the image field.
When the set is as small (in bits) as 1 to 2% of the conventionally compressed image, the stereoscopically viewed pair, consisting of one original and one synthesized image, produces convincing stereo imagery.
Topics that the authors need to address in the context of compression of 3D-stereoscopic imagery include:
• Optimizing implementation of the WorldLine approach.
The successful development of compression schemes for motion video that exploit the high correlation between temporally adjacent frames, e.g., MPEG, suggests that the authors might analogously exploit the high correlation between spatially or angularly adjacent still frames, i.e., left-right 3D-stereoscopic image pairs.
Using three cameras: compute predictors for left and right views given the middle view, transmit the middle view and the predictors, synthesize 3D-stereoscopic views at the receiver.
The fundamental issue is that when 3D-stereoscopy is implemented on a single display each eye gets in some sense only half the display.
Their experiments demonstrate that a reasonable synthesis of one image of a left-right stereo image pair can be estimated from the other uncompressed or conventionally compressed image augmented by a small set of numbers that describe the local cross-correlations in terms of a disparity map.
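The idea of synthesizing one view from the other plus a small disparity map can be sketched with generic block matching. This is a minimal illustration, not the authors' algorithm: the function names `estimate_disparity` and `synthesize`, the 8-pixel block size, and the search range `max_shift` are all assumptions introduced here, and no residual or occlusion handling is attempted.

```python
import numpy as np

def estimate_disparity(left, right, block=8, max_shift=4):
    """For each block of `right`, find the (dy, dx) offset into `left`
    that minimizes the sum of squared differences (hypothetical helper)."""
    h, w = right.shape
    rows, cols = h // block, w // block
    disp = np.zeros((rows, cols, 2), dtype=np.int64)
    for r in range(rows):
        for c in range(cols):
            y, x = r * block, c * block
            target = right[y:y + block, x:x + block].astype(np.float64)
            best = np.inf
            for dy in range(-max_shift, max_shift + 1):
                for dx in range(-max_shift, max_shift + 1):
                    yy, xx = y + dy, x + dx
                    # Skip candidate blocks that fall outside the left image.
                    if yy < 0 or xx < 0 or yy + block > h or xx + block > w:
                        continue
                    cand = left[yy:yy + block, xx:xx + block].astype(np.float64)
                    err = np.sum((cand - target) ** 2)
                    if err < best:
                        best = err
                        disp[r, c] = (dy, dx)
    return disp

def synthesize(left, disp, block=8):
    """Rebuild an approximation of the right view by copying the
    displaced blocks of the left view (hypothetical helper)."""
    rows, cols, _ = disp.shape
    out = np.zeros((rows * block, cols * block), dtype=left.dtype)
    for r in range(rows):
        for c in range(cols):
            dy, dx = int(disp[r, c, 0]), int(disp[r, c, 1])
            y, x = r * block + dy, c * block + dx
            out[r * block:(r + 1) * block, c * block:(c + 1) * block] = \
                left[y:y + block, x:x + block]
    return out

# Toy stereo pair: the right view is the left view shifted 3 pixels horizontally.
rng = np.random.default_rng(0)
left = rng.integers(0, 256, size=(32, 48)).astype(np.uint8)
right = np.roll(left, 3, axis=1)
disp = estimate_disparity(left, right)
approx_right = synthesize(left, disp)
```

In this sketch only `left` and `disp` would need to be transmitted; the receiver calls `synthesize` to obtain the second view. Real image pairs would not match exactly, which is where a residual or a perceptual tolerance would come in.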
In fact, because the two views comprising a 3D-stereoscopic image pair are nearly identical, i.e., the information content of both together is only a little more than the information content of one alone, it is possible to find representations of image pairs and streams that take up little more storage space and transmission bandwidth than the space or bandwidth that is required by either alone.
The bandwidth must apparently be doubled to transmit 3D-stereoscopic image streams at the same spatial resolution and temporal update frequency as either flat image stream.
One component may be either lossless or slightly lossy, as in conventional compression of flat imagery; the other component is by itself a very lossy (or "deep") method of compression.
The human visual perception system has an effective way to deal with occlusions: people have a detailed understanding of the image semantics, from which they effortlessly and unconsciously draw inferences that fill in the missing information.
This is the obvious candidate for initial experiments because it is easy to code and because the authors have a strong intuitive understanding of its parameters.
The price may be extracted in either essentially the spatial domain, e.g., by assigning the odd lines to the left eye and the even lines to the right eye, or in essentially the temporal domain, e.g., by assigning alternate frames to the left and right eye.
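The two multiplexing options just described, spatial and temporal, can be illustrated with a short sketch. The function names `interleave_lines` and `alternate_frames` are hypothetical, and lines are counted from 1 so that "odd lines" correspond to array rows 0, 2, 4, and so on; this is an assumption about the numbering convention, not something the text specifies.

```python
import numpy as np

def interleave_lines(left, right):
    """Spatial multiplexing: odd lines (1-based) come from the left view,
    even lines from the right view. Each eye keeps half the rows."""
    out = np.empty_like(left)
    out[0::2] = left[0::2]    # lines 1, 3, 5, ... -> left eye
    out[1::2] = right[1::2]   # lines 2, 4, 6, ... -> right eye
    return out

def alternate_frames(left_frames, right_frames):
    """Temporal multiplexing at a fixed display rate: display slot i shows
    the left view when i is even and the right view when i is odd.
    Each eye keeps half the temporal updates."""
    n = min(len(left_frames), len(right_frames))
    return [left_frames[i] if i % 2 == 0 else right_frames[i] for i in range(n)]

# Toy data: the left view is all zeros, the right view is all ones.
left = np.zeros((4, 6), dtype=np.uint8)
right = np.ones((4, 6), dtype=np.uint8)
mixed = interleave_lines(left, right)

sequence = alternate_frames(["L0", "L1", "L2", "L3"], ["R0", "R1", "R2", "R3"])
```

Either way the display budget is fixed, so each eye receives half of it: half the lines in the first case, half the frames in the second.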
Each disparity is a vector with two components, horizontal and vertical, so the net compression has an upper bound of 1/32, about 3%.
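The 1/32 figure can be reproduced with simple arithmetic under one set of assumptions that this excerpt does not state explicitly: one disparity vector per 8×8 block of pixels, with each vector component stored at the same bit depth as a pixel. The helper `disparity_overhead` and its default parameters are hypothetical and only illustrate that reading.

```python
from fractions import Fraction

def disparity_overhead(block_side=8, components=2,
                       bits_per_component=8, bits_per_pixel=8):
    """Ratio of disparity-map bits to raw image bits for one block.
    All defaults are assumptions, not values taken from the source."""
    disparity_bits = components * bits_per_component
    block_bits = block_side * block_side * bits_per_pixel
    return Fraction(disparity_bits, block_bits)

ratio = disparity_overhead()
print(ratio, float(ratio))
```

With these defaults, two components per 64-pixel block give 2/64 = 1/32. A different block size or component precision would change the bound accordingly.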