Journal ArticleDOI

Stereo image quality: effects of mixed spatio-temporal resolution

TL;DR: It was found that spatial filtering of one channel of a stereo video-sequence may be an effective means of reducing the transmission bandwidth: the overall sensation of depth was unaffected by low-pass filtering, while ratings of quality and of sharpness were strongly weighted towards the eye with the greater spatial resolution.
Abstract: We explored the response of the human visual system to mixed-resolution stereo video-sequences, in which one eye view was spatially or temporally low-pass filtered. It was expected that the perceived quality, depth, and sharpness would be relatively unaffected by low-pass filtering, compared to the case where both eyes viewed a filtered image. Subjects viewed two 10-second stereo video-sequences, in which the right-eye frames were filtered vertically (V) and horizontally (H) at 1/2 H, 1/2 V, 1/4 H, 1/4 V, 1/2 H 1/2 V, 1/2 H 1/4 V, 1/4 H 1/2 V, and 1/4 H 1/4 V resolution. Temporal filtering was implemented for a subset of these conditions at 1/2 temporal resolution, or with drop-and-repeat frames. Subjects rated the overall quality, sharpness, and overall sensation of depth. It was found that spatial filtering produced acceptable results: the overall sensation of depth was unaffected by low-pass filtering, while ratings of quality and of sharpness were strongly weighted towards the eye with the greater spatial resolution. By comparison, temporal filtering produced unacceptable results: field averaging and drop-and-repeat frame conditions yielded images with poor quality and sharpness, even though perceived depth was relatively unaffected. We conclude that spatial filtering of one channel of a stereo video-sequence may be an effective means of reducing the transmission bandwidth.
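As an illustration of the conditions described in the abstract, the following Python sketch enumerates the eight spatial conditions and estimates what fraction of the samples of a full-resolution stereo pair remains when only the right-eye view is reduced. The sample-count proxy for bandwidth, the box-average low-pass filter, and all function names are assumptions made for illustration; the abstract does not specify the filters used or quantify the bandwidth saving.

```python
from fractions import Fraction

# The eight spatial conditions listed in the abstract, expressed as
# (horizontal, vertical) fractions of full resolution for the right-eye view.
CONDITIONS = {
    "1/2 H": (Fraction(1, 2), Fraction(1)),
    "1/2 V": (Fraction(1), Fraction(1, 2)),
    "1/4 H": (Fraction(1, 4), Fraction(1)),
    "1/4 V": (Fraction(1), Fraction(1, 4)),
    "1/2 H 1/2 V": (Fraction(1, 2), Fraction(1, 2)),
    "1/2 H 1/4 V": (Fraction(1, 2), Fraction(1, 4)),
    "1/4 H 1/2 V": (Fraction(1, 4), Fraction(1, 2)),
    "1/4 H 1/4 V": (Fraction(1, 4), Fraction(1, 4)),
}


def stereo_sample_fraction(h, v):
    """Samples kept relative to a full-resolution stereo pair.

    Assumption: the left view is untouched and the right view is stored
    at h x v of its original sample count.
    """
    return (1 + h * v) / 2


def box_downsample_row(row, factor):
    """Hypothetical low-pass + decimate: average non-overlapping groups."""
    return [
        sum(row[i:i + factor]) / factor
        for i in range(0, len(row) - factor + 1, factor)
    ]


def drop_and_repeat(frames):
    """Keep every second frame and show it twice (drop-and-repeat)."""
    return [frames[i - (i % 2)] for i in range(len(frames))]


if __name__ == "__main__":
    for name, (h, v) in CONDITIONS.items():
        print(f"{name:12s} -> {stereo_sample_fraction(h, v)} of full stereo samples")
```

Under these assumptions, the 1/2 H 1/2 V condition keeps 5/8 of the samples of the full-resolution pair, and 1/4 H 1/4 V keeps 17/32.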
Citations
Journal ArticleDOI
31 Jan 2011
TL;DR: An overview of the algorithmic design used for extending H.264/MPEG-4 AVC towards MVC is provided and a summary of the coding performance achieved by MVC for both stereo- and multiview video is provided.
Abstract: Significant improvements in video compression capability have been demonstrated with the introduction of the H.264/MPEG-4 advanced video coding (AVC) standard. Since developing this standard, the Joint Video Team of the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG) has also standardized an extension of that technology that is referred to as multiview video coding (MVC). MVC provides a compact representation for multiple views of a video scene, such as multiple synchronized video cameras. Stereo-paired video for 3-D viewing is an important special case of MVC. The standard enables inter-view prediction to improve compression capability, as well as supporting ordinary temporal and spatial prediction. It also supports backward compatibility with existing legacy systems by structuring the MVC bitstream to include a compatible “base view.” Each other view is encoded at the same picture resolution as the base view. In recognition of its high-quality encoding capability and support for backward compatibility, the stereo high profile of the MVC extension was selected by the Blu-Ray Disc Association as the coding format for 3-D video with high-definition resolution. This paper provides an overview of the algorithmic design used for extending H.264/MPEG-4 AVC towards MVC. The basic approach of MVC for enabling inter-view prediction and view scalability in the context of H.264/MPEG-4 AVC is reviewed. Related supplemental enhancement information (SEI) metadata is also described. Various “frame compatible” approaches for support of stereo-view video as an alternative to MVC are also discussed. A summary of the coding performance achieved by MVC for both stereo- and multiview video is also provided. Future directions and challenges related to 3-D video are also briefly discussed.

683 citations


Cites background from "Stereo image quality: effects of mi..."

  • ...In this scheme, one of the views is significantly blurred or more coarsely quantized than the other [52], or is coded with a reduced spatial resolution [53], [54], with an impact on the stereo quality that may be imperceptible....


Journal ArticleDOI
26 Jul 2010
TL;DR: The most important perceptual aspects of stereo vision are discussed and their implications for stereoscopic content creation are formalized into a set of basic disparity mapping operators that enable us to control and retarget the depth of a stereoscopic scene in a nonlinear and locally adaptive fashion.
Abstract: This paper addresses the problem of remapping the disparity range of stereoscopic images and video. Such operations are highly important for a variety of issues arising from the production, live broadcast, and consumption of 3D content. Our work is motivated by the observation that the displayed depth and the resulting 3D viewing experience are dictated by a complex combination of perceptual, technological, and artistic constraints. We first discuss the most important perceptual aspects of stereo vision and their implications for stereoscopic content creation. We then formalize these insights into a set of basic disparity mapping operators. These operators enable us to control and retarget the depth of a stereoscopic scene in a nonlinear and locally adaptive fashion. To implement our operators, we propose a new strategy based on stereoscopic warping of the input video streams. From a sparse set of stereo correspondences, our algorithm computes disparity and image-based saliency estimates, and uses them to compute a deformation of the input views so as to meet the target disparities. Our approach represents a practical solution for actual stereo production and display that does not require camera calibration, accurate dense depth maps, occlusion handling, or inpainting. We demonstrate the performance and versatility of our method using examples from live action post-production, 3D display size adaptation, and live broadcast. An additional user study and ground truth comparison further provide evidence for the quality and practical relevance of the presented work.

418 citations

Journal ArticleDOI
01 Apr 2006
TL;DR: Results on asymmetric and symmetric coding showed that the relationship between perceived image quality and average bit rate is not straightforward, and in some cases, image quality ratings of a symmetric coded pair can be higher than for an asymmetric coded pair, even if the averaged bit rate for the symmetric pair is lower than for the asymmetric pair.
Abstract: JPEG compression of the left and right components of a stereo image pair is a way to save valuable bandwidth when transmitting stereoscopic images. This paper presents results on the effects of camera-base distance (B) and JPEG coding on overall image quality, perceived depth, perceived sharpness, and perceived eye strain. In the experiment, two stereoscopic still scenes were used, varying in depth (three different camera-base distances: 0, 8, and 12 cm) and compression ratio (4 levels: original, 1:30, 1:40, and 1:60). All levels of compression were applied to both the left and right stereo image, resulting in a 4 × 4 matrix of all possible symmetric and asymmetric coding combinations. The observers were asked to assess image quality, sharpness, depth, and eye strain. Results showed that an increase in JPEG coding had a negative effect on image quality, sharpness, and eye strain, but had no effect on perceived depth. An increase in camera-base distance increased perceived depth and reported eye strain, but had no effect on perceived sharpness. Results on asymmetric and symmetric coding showed that the relationship between perceived image quality and average bit rate is not straightforward. In some cases, image quality ratings of a symmetric coded pair can be higher than for an asymmetric coded pair, even if the averaged bit rate for the symmetric pair is lower than for the asymmetric pair. Furthermore, sharpness correlated highly, and eye strain moderately, with perceived image quality.

188 citations

Journal ArticleDOI
TL;DR: The correlation between subjective and objective evaluation of color plus depth video and transmission over Internet protocol (IP) is investigated, and subjective results are used to determine more accurate objective quality assessment metrics for 3D color plus depth video.
Abstract: In the near future, many conventional video applications are likely to be replaced by immersive video to provide a sense of “being there.” This transition is facilitated by the recent advancement of 3D capture, coding, transmission, and display technologies. Stereoscopic video is the simplest form of 3D video available in the literature. “Color plus depth map” based stereoscopic video has attracted significant attention, as it can reduce storage and bandwidth requirements for the transmission of stereoscopic content over communication channels. However, quality assessment of coded video sequences can currently only be performed reliably using expensive and inconvenient subjective tests. To enable researchers to optimize 3D video systems in a timely fashion, it is essential that reliable objective measures are found. This paper investigates the correlation between subjective and objective evaluation of color plus depth video. The investigation is conducted for different compression ratios, and different video sequences. Transmission over Internet protocol (IP) is also investigated. Subjective tests are performed to determine the image quality and depth perception of a range of differently coded video sequences, with packet loss rates ranging from 0% to 20%. The subjective results are used to determine more accurate objective quality assessment metrics for 3D color plus depth video.

169 citations


Cites methods from "Stereo image quality: effects of mi..."

  • ...This method has been used to reduce the storage and bandwidth requirements for stereo video applications [21], [22]....


Journal ArticleDOI
TL;DR: A novel client-driven multiview video streaming system that allows a user to watch 3D video interactively with significantly reduced bandwidth requirements by transmitting a small number of views selected according to his/her head position is presented.
Abstract: We present a novel client-driven multiview video streaming system that allows a user to watch 3D video interactively with significantly reduced bandwidth requirements by transmitting a small number of views selected according to his/her head position. The user's head position is tracked and predicted into the future to select the views that best match the user's current viewing angle dynamically. Prediction of future head positions is needed so that views matching the predicted head positions can be prefetched in order to account for delays due to network transport and stream switching. The system allocates more bandwidth to the selected views in order to render the current viewing angle. Highly compressed, lower quality versions of some other views are also prefetched for concealment if the current user viewpoint differs from the predicted viewpoint. An objective measure based on the abruptness of the head movements and delays in the system is introduced to determine the number of additional lower quality views to be prefetched. The proposed system makes use of multiview coding (MVC) and scalable video coding (SVC) concepts together to obtain improved compression efficiency while providing flexibility in bandwidth allocation to the selected views. Rate-distortion performance of the proposed system is demonstrated under different experimental conditions.

159 citations


Cites background from "Stereo image quality: effects of mi..."

  • ...According to subjective quality tests reported in [12] and [13], humans perceive high quality 3-D video as long as one of the eyes sees a high quality view....


References
Book
01 Jan 1971
TL;DR: Foundations of Cyclopean Perception is a classic work on cyclopean perception that has influenced a generation of vision researchers, cognitive scientists, and neuroscientists and has inspired artists, designers, and computer graphics pioneers.
Abstract: This classic work on cyclopean perception has influenced a generation of vision researchers, cognitive scientists, and neuroscientists and has inspired artists, designers, and computer graphics pioneers. In Foundations of Cyclopean Perception (first published in 1971 and unavailable for years), Bela Julesz traced the visual information flow in the brain, analyzing how the brain combines separate images received from the two eyes to produce depth perception. Julesz developed novel tools to do this: random-dot stereograms and cinematograms, generated by early digital computers at Bell Labs. These images, when viewed with the special glasses that came with the book, revealed complex, three-dimensional surfaces; this mode of visual stimulus became a paradigm for research in vision and perception. This reprint edition includes all 48 color random-dot designs from the original, as well as the special 3-D glasses required to view them. Foundations of Cyclopean Perception has had a profound impact on the vision studies community. It was chosen as one of the one hundred most influential works in cognitive science in a poll conducted by the University of Minnesota's Center for Cognitive Sciences. Many copies are "permanently borrowed" from college libraries; used copies are sought after online. Now, with this facsimile of the 1971 edition, the book is available again to cognitive scientists, neuroscientists, vision researchers, artists, and designers.

2,449 citations


"Stereo image quality: effects of mi..." refers background or result in this paper

  • ...In describing the subjective impression produced by a mixed-resolution stereo image, Julesz [4] noted that the overall sharpness of the image was determined by the high-resolution member of the mixed-resolution pair....


  • ...Observe that the lines are at Reference level in the left-middle panel of Fig. 5. This is consistent with the view that depth can be conveyed by low spatial frequencies alone [4]; these were unaffected by low-pass filtering in the present experiment....


Journal ArticleDOI
TL;DR: It is proved that the rate distortion limit for coding stereopairs cannot in general be achieved by a coder that first codes and decodes the right picture sequence independently of the left picture sequence, and then codes and decodes the left picture sequence given the decoded right picture sequence.
Abstract: Two fundamentally different techniques for compressing stereopairs are discussed. The first technique, called disparity-compensated transform-domain predictive coding, attempts to minimize the mean-square error between the original stereopair and the compressed stereopair. The second technique, called mixed-resolution coding, is a psychophysically justified technique that exploits known facts about human stereovision to code stereopairs in a subjectively acceptable manner. A method for assessing the quality of compressed stereopairs is also presented. It involves measuring the ability of an observer to perceive depth in coded stereopairs. It was found that observers generally perceived objects to be further away in compressed stereopairs than they did in originals. It is proved that the rate distortion limit for coding stereopairs cannot in general be achieved by a coder that first codes and decodes the right picture sequence independently of the left picture sequence, and then codes and decodes the left picture sequence given the decoded right picture sequence.

243 citations


"Stereo image quality: effects of mi..." refers background in this paper

  • ...Perkins [9] has indicated that mixed-resolution stereo image sequences can provide acceptable image quality, although the latter comments were based on informal observations....


Journal ArticleDOI
TL;DR: It is concluded that blur, but not blockiness, is an acceptable form of monocular degradation, provided that binocular vision assigns greater weight to the nondegraded input.
Abstract: For efficient storage and transmission of stereoscopic images over bandwidth-limited channels, compression can be achieved by degrading 1 monocular input of a stereo pair and maintaining the other at the desired quality. The desired quality of the fused stereoscopic image can be achieved, provided that binocular vision assigns greater weight to the nondegraded input. A psychophysical matching procedure was used to determine if such over-weighting occurred when the monocular degradation included blur or blocking artifacts. Over-weighting of the nondegraded input occurred for blur, but under-weighting of the nondegraded input occurred for blockiness. Some participants exhibited ocular dominance, but this did not affect the blur results. The authors conclude that blur, but not blockiness, is an acceptable form of monocular degradation.

103 citations


"Stereo image quality: effects of mi..." refers background in this paper

  • ...Other psychophysical data from our laboratory [15], a portion of which are shown in Fig. 7, supported the view that mixed-resolution stereoscopic sequences created using spatial filtering, produced subjectively sharp binocular percepts, nearly as sharp as the high-resolution channel of the mixed-resolution pair....


  • ...In that experiment [15], subjects were required to adjust the level of blur of one member of a mixed-resolution pair, until the perceived sharpness of the binocular percept matched that of a standard image....


Proceedings ArticleDOI
TL;DR: The main finding was that viewers preferred the stereoscopic version over the non-stereoscopic version of the sequences, provided that the sequence did not contain noticeable stereo artifacts, such as exaggerated disparity.
Abstract: In comparison to conventional displays, 3D stereoscopic displays convey additional information about the 3D structure of a scene by providing information that can be used to extract depth. In the present study we evaluated the psychovisual impact of stereoscopic images on viewers. Thirty-three non-expert viewers rated sensation of depth, perceived sharpness, subjective image quality, and relative preference for stereoscopic over non-stereoscopic images. Rating methods were based on procedures described in ITU-R Rec. 500. Viewers also rated sequences in which the left- and right-eye images were processed independently, using a generic MPEG-2 codec, at bit-rates of 6, 3, and 1 Mbits/s. The main finding was that viewers preferred the stereoscopic version over the non-stereoscopic version of the sequences, provided that the sequence did not contain noticeable stereo artifacts, such as exaggerated disparity. Perceived depth was rated greater for stereoscopic than for non-stereoscopic sequences, and perceived sharpness of stereoscopic sequences was rated the same or lower compared to non-stereoscopic sequences. Subjective image quality was influenced primarily by apparent sharpness of the video sequences, and less so by perceived depth.

78 citations

Journal ArticleDOI
TL;DR: The present paper summarizes the latest results of research and shows in which fields work must be carried out on hitherto unanswered questions for the implementation of three-dimensional television.
Abstract: At present the foundations are being laid out for the implementation of three-dimensional television. Apart from the engineering problems to be solved there are a number of questions connected with human factors which have to be answered. These questions are related to fundamental requirements with regard to certain system features such as screen size, viewing distance and the spatio-temporal resolution of the image in the three dimensions of space. Since the implementation of data-compressing techniques is essential for the recording and transmission of 3DTV picture signals, it is necessary to show the possibilities of irrelevance reduction as well as to define the demands made on the quality of reconstructed images. The present paper summarizes the latest results of research and shows in which fields work must be carried out on hitherto unanswered questions.

68 citations


"Stereo image quality: effects of mi..." refers background in this paper

  • ...Using natural images, Dinstein, Kim, Tselgov and Henik [5], Pastoor [6] and colleagues from the European DISTIMA project, and Yano & Yuyama [7] from NHK showed that it is possible to significantly reduce the spatial resolution of an image viewed by one eye without affecting the overall impression of sharpness....

