Journal ArticleDOI

A perceptual metric for stereoscopic image quality assessment based on the binocular energy

01 Jun 2013-Multidimensional Systems and Signal Processing (Springer US)-Vol. 24, Iss: 2, pp 281-316
TL;DR: A full-reference metric for quality assessment of stereoscopic images, based on the binocular fusion process that characterizes 3D human perception, is proposed; the difference of binocular energy shows a high correlation with human judgement for different impairments and is used to build the Binocular Energy Quality Metric (BEQM).
Abstract: Stereoscopic imaging is becoming very popular, and its deployment through photography, television, cinema, and other media is rapidly increasing. Access to this type of image involves compression and transmission, which may generate artifacts of different natures. Consequently, it is important to have appropriate tools to measure the quality of stereoscopic content. Several studies have tried to extend well-known metrics, such as PSNR or SSIM, to 3D. However, the results are not as good as for 2D images, so metrics that deal with 3D perception are needed. In this work, we propose a full-reference metric for quality assessment of stereoscopic images based on the binocular fusion process characterizing 3D human perception. The main idea is to develop a model that reproduces the binocular signal generated by simple and complex cells, and to estimate the associated binocular energy. The difference of binocular energy shows a high correlation with human judgement for different impairments and is used to build the Binocular Energy Quality Metric (BEQM). Extensive experiments demonstrate the performance of the BEQM with regard to the literature.
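As an illustration of the energy-model machinery the abstract describes, here is a minimal, hypothetical sketch (not the paper's actual implementation): 1-D Gabor responses stand in for simple cells, left and right responses are summed, the quadrature pair is squared and summed to form a binocular energy, and a BEQM-flavoured score is the absolute energy difference between reference and distorted stereopairs. All function names and parameter values are illustrative assumptions.

```python
import math

def gabor_response(signal, freq, sigma, phase):
    """Response of a 1-D Gabor filter centred on the signal (simple-cell stand-in)."""
    c = len(signal) // 2
    resp = 0.0
    for i, v in enumerate(signal):
        x = i - c
        envelope = math.exp(-x * x / (2.0 * sigma * sigma))
        resp += v * envelope * math.cos(2.0 * math.pi * freq * x + phase)
    return resp

def binocular_energy(left, right, freq=0.1, sigma=3.0):
    """Energy-model complex cell: sum the left/right simple-cell responses,
    then square and sum the quadrature (0 and 90 degree phase) pair."""
    even = gabor_response(left, freq, sigma, 0.0) + gabor_response(right, freq, sigma, 0.0)
    odd = gabor_response(left, freq, sigma, math.pi / 2) + gabor_response(right, freq, sigma, math.pi / 2)
    return even ** 2 + odd ** 2

def energy_difference(ref_pair, dist_pair, freq=0.1, sigma=3.0):
    """BEQM-flavoured score: absolute difference of binocular energy between
    the reference and distorted stereopairs (toy version, not the full metric)."""
    return abs(binocular_energy(*ref_pair, freq=freq, sigma=sigma)
               - binocular_energy(*dist_pair, freq=freq, sigma=sigma))
```

In the full metric the energies would be computed per location, scale, and orientation and pooled over the image; this sketch collapses everything to a single filter response to show the structure of the computation.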
Citations
Journal ArticleDOI
TL;DR: Experimental results confirm the hypothesis and show that the proposed framework significantly outperforms conventional 2D QA metrics when predicting the quality of stereoscopically viewed images that may have been asymmetrically distorted.
Abstract: We develop a framework for assessing the quality of stereoscopic images that have been afflicted by possibly asymmetric distortions. An intermediate image is generated which when viewed stereoscopically is designed to have a perceived quality close to that of the cyclopean image. We hypothesize that performing stereoscopic QA on the intermediate image yields higher correlations with human subjective judgments. The experimental results confirm the hypothesis and show that the proposed framework significantly outperforms conventional 2D QA metrics when predicting the quality of stereoscopically viewed images that may have been asymmetrically distorted.

348 citations


Cites background from "A perceptual metric for stereoscopi..."

  • ...[23] proposed a 3D QA algorithm that measures the difference of binocular energy between the reference and tested stereopairs, and thus considers the potential influence of binocularity on perceived 3D quality....


Journal ArticleDOI
TL;DR: A no-reference binocular image quality assessment model that operates on static stereoscopic images is proposed; it significantly outperforms conventional 2D full-reference QA algorithms applied to stereopairs, as well as 3D full-reference IQA algorithms, on asymmetrically distorted stereopair images.
Abstract: We develop a no-reference binocular image quality assessment model that operates on static stereoscopic images. The model deploys 2D and 3D features extracted from stereopairs to assess the perceptual quality they present when viewed stereoscopically. Both symmetrically and asymmetrically distorted stereopairs are handled by accounting for binocular rivalry using a classic linear rivalry model. The extracted natural scene statistics (NSS) features are used to train a support vector machine model to predict the quality of a tested stereopair. The model is tested on the LIVE 3D Image Quality Database, which includes both symmetrically and asymmetrically distorted stereoscopic 3D images. The experimental results show that our proposed model significantly outperforms conventional 2D full-reference QA algorithms applied to stereopairs, as well as 3D full-reference IQA algorithms, on asymmetrically distorted stereopairs.

249 citations

Journal ArticleDOI
26 Jul 2013
TL;DR: The principles and methods of modern algorithms for automatically predicting the quality of visual signals are discussed; by casting the problem as analogous to assessing the efficacy of a visual communication system, quality assessment is divided into understandable modeling subproblems.
Abstract: Finding ways to monitor and control the perceptual quality of digital visual media has become a pressing concern as the volume being transported and viewed continues to increase exponentially. This paper discusses the principles and methods of modern algorithms for automatically predicting the quality of visual signals. By casting the problem as analogous to assessing the efficacy of a visual communication system, it is possible to divide the quality assessment problem into understandable modeling subproblems. Along the way, we will visit models of natural images and videos, of visual perception, and a broad spectrum of applications.

206 citations

Journal ArticleDOI
TL;DR: The binocular integration behaviors, namely binocular combination and binocular frequency integration, are utilized as the bases for measuring the quality of stereoscopic 3D images; the proposed metrics are also found to address the quality assessment of synthesized color-plus-depth 3D images well.
Abstract: The objective approaches of 3D image quality assessment play a key role in the development of compression standards and various 3D multimedia applications. The quality assessment of 3D images faces more new challenges, such as asymmetric stereo compression, depth perception, and virtual view synthesis, than its 2D counterpart. In addition, the widely used 2D image quality metrics (e.g., PSNR and SSIM) cannot be directly applied to deal with these newly introduced challenges. This is verified by the low correlation between the computed objective measures and the subjectively measured mean opinion scores (MOSs) when 3D images are the tested targets. To meet these challenges, in this paper, besides traditional 2D image metrics, the binocular integration behaviors, namely binocular combination and binocular frequency integration, are utilized as the bases for measuring the quality of stereoscopic 3D images. The effectiveness of the proposed metrics is verified by conducting subjective evaluations on publicly available stereoscopic image databases. Experimental results show that significant consistency is reached between the measured MOS and the proposed metrics, with a correlation coefficient of up to 0.88. Furthermore, we find that the proposed metrics can also address the quality assessment of synthesized color-plus-depth 3D images well. Therefore, it is our belief that the binocular integration behaviors are important factors in the development of objective quality assessment for 3D images.
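To make concrete the naive "2D metric applied directly" baseline that this abstract argues correlates poorly with MOS, here is a hedged sketch (an assumed illustrative baseline, not anything from the paper): PSNR computed per view and averaged, ignoring binocular combination entirely.

```python
import math

def psnr(ref, dist, peak=255.0):
    """Peak signal-to-noise ratio between two equally sized pixel sequences."""
    mse = sum((r - d) ** 2 for r, d in zip(ref, dist)) / len(ref)
    if mse == 0:
        return float("inf")  # identical signals: infinite PSNR by convention
    return 10.0 * math.log10(peak * peak / mse)

def naive_stereo_psnr(left_ref, left_dist, right_ref, right_dist):
    """Naive 2D-to-3D extension: average the per-view PSNRs. This treats the
    two views as independent images, which is exactly what binocular
    integration models are designed to improve on."""
    return 0.5 * (psnr(left_ref, left_dist) + psnr(right_ref, right_dist))
```

For an asymmetrically distorted pair (one pristine view, one degraded view), this average can stay high even though the fused percept is visibly impaired, which is one way the low MOS correlation arises.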

201 citations


Cites methods from "A perceptual metric for stereoscopi..."

  • ...The Gabor response of binocular vision was modeled in the study [37] to measure the quality of stereo images....


Journal ArticleDOI
TL;DR: Experimental results show that compared with the relevant existing metrics, the proposed metric can achieve higher consistency with subjective assessment of stereoscopic images.
Abstract: Perceptual quality assessment is a challenging issue in 3D signal processing research. It is important to study the 3D signal directly rather than simply extending 2D metrics to the 3D case, as in some previous studies. In this paper, we propose a new perceptual full-reference quality assessment metric for stereoscopic images that considers binocular visual characteristics. The major technical contribution of this paper is that the binocular perception and combination properties are considered in quality assessment. To be more specific, we first perform left-right consistency checks and compare matching error between corresponding pixels in binocular disparity calculation, and classify the stereoscopic images into non-corresponding, binocular fusion, and binocular suppression regions. Local phase and local amplitude maps are also extracted from the original and distorted stereoscopic images as features in quality assessment. Then, each region is evaluated independently according to its binocular perception property, and all evaluation results are integrated into an overall score. In addition, a binocular just-noticeable-difference model is used to reflect the visual sensitivity of the binocular fusion and suppression regions. Experimental results show that, compared with relevant existing metrics, the proposed metric achieves higher consistency with subjective assessment of stereoscopic images.

187 citations

References
Journal ArticleDOI
TL;DR: It is shown that the difference of information between the approximation of a signal at the resolutions 2^(j+1) and 2^j (where j is an integer) can be extracted by decomposing the signal on a wavelet orthonormal basis of L^2(R^n), the vector space of measurable, square-integrable n-dimensional functions.
Abstract: Multiresolution representations are effective for analyzing the information content of images. The properties of the operator which approximates a signal at a given resolution were studied. It is shown that the difference of information between the approximation of a signal at the resolutions 2^(j+1) and 2^j (where j is an integer) can be extracted by decomposing this signal on a wavelet orthonormal basis of L^2(R^n), the vector space of measurable, square-integrable n-dimensional functions. In L^2(R), a wavelet orthonormal basis is a family of functions built by dilating and translating a unique function psi(x). This decomposition defines an orthogonal multiresolution representation called a wavelet representation. It is computed with a pyramidal algorithm based on convolutions with quadrature mirror filters. The wavelet representation lies between the spatial and Fourier domains. For images, the wavelet representation differentiates several spatial orientations. The application of this representation to data compression in image coding, texture discrimination, and fractal analysis is discussed.
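A one-level Haar transform is the simplest concrete instance of the decomposition described above: an approximation at the next coarser resolution plus a detail band carrying the "difference of information" between the two resolutions. This is an illustrative sketch with the orthonormal Haar filter pair, not Mallat's general pyramidal algorithm with arbitrary quadrature mirror filters.

```python
def haar_step(signal):
    """One level of the orthonormal Haar wavelet transform: averages of
    adjacent pairs give the coarser approximation, differences give the
    detail coefficients (both scaled by 1/sqrt(2) for orthonormality)."""
    s = 2 ** -0.5
    approx = [(a + b) * s for a, b in zip(signal[0::2], signal[1::2])]
    detail = [(a - b) * s for a, b in zip(signal[0::2], signal[1::2])]
    return approx, detail

def inverse_haar_step(approx, detail):
    """Perfect reconstruction: the coarse approximation plus the detail band
    recovers the original signal exactly."""
    s = 2 ** -0.5
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) * s)
        out.append((a - d) * s)
    return out
```

Iterating `haar_step` on its own approximation output yields the full pyramid of resolutions 2^j that the abstract refers to.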

20,028 citations

Journal ArticleDOI
TL;DR: Single-cell recording is used to examine receptive fields of a more complex type and to make additional observations on binocular interaction; this approach is necessary in order to understand the behaviour of individual cells, but it fails to deal with the relationship of one cell to its neighbours.
Abstract: What chiefly distinguishes cerebral cortex from other parts of the central nervous system is the great diversity of its cell types and interconnexions. It would be astonishing if such a structure did not profoundly modify the response patterns of fibres coming into it. In the cat's visual cortex, the receptive field arrangements of single cells suggest that there is indeed a degree of complexity far exceeding anything yet seen at lower levels in the visual system. In a previous paper we described receptive fields of single cortical cells, observing responses to spots of light shone on one or both retinas (Hubel & Wiesel, 1959). In the present work this method is used to examine receptive fields of a more complex type (Part I) and to make additional observations on binocular interaction (Part II). This approach is necessary in order to understand the behaviour of individual cells, but it fails to deal with the problem of the relationship of one cell to its neighbours. In the past, the technique of recording evoked slow waves has been used with great success in studies of functional anatomy. It was employed by Talbot & Marshall (1941) and by Thompson, Woolsey & Talbot (1950) for mapping out the visual cortex in the rabbit, cat, and monkey. Daniel & Whitteridge (1959) have recently extended this work in the primate. Most of our present knowledge of retinotopic projections, binocular overlap, and the second visual area is based on these investigations. Yet the method of evoked potentials is valuable mainly for detecting behaviour common to large populations of neighbouring cells; it cannot differentiate functionally between areas of cortex smaller than about 1 mm^2. To overcome this difficulty a method has in recent years been developed for studying cells separately or in small groups during long micro-electrode penetrations through nervous tissue. Responses are correlated with cell location by reconstructing the electrode tracks from histological material.
These techniques have been applied to

12,923 citations


"A perceptual metric for stereoscopi..." refers background in this paper

  • ...This has always aroused interests of scientists coming from various domains thus leading to many experiments allowing to better understand this property and explain the involved factors (Hubel and Wiesel 1962, 1970; Barlow et al. 1967; Fleet et al. 1996; Ohzawa and Freeman 1986a,b)....


Journal ArticleDOI
TL;DR: A taxonomy of dense, two-frame stereo methods is presented, together with a stand-alone, flexible C++ implementation that enables the evaluation of individual components and can easily be extended to include new algorithms.
Abstract: Stereo matching is one of the most active research areas in computer vision. While a large number of algorithms for stereo correspondence have been developed, relatively little work has been done on characterizing their performance. In this paper, we present a taxonomy of dense, two-frame stereo methods designed to assess the different components and design decisions made in individual stereo algorithms. Using this taxonomy, we compare existing stereo methods and present experiments evaluating the performance of many different variants. In order to establish a common software platform and a collection of data sets for easy evaluation, we have designed a stand-alone, flexible C++ implementation that enables the evaluation of individual components and that can be easily extended to include new algorithms. We have also produced several new multiframe stereo data sets with ground truth, and are making both the code and data sets available on the Web.

7,458 citations

Journal ArticleDOI
TL;DR: A class of models for human motion mechanisms is described in which the first stage consists of linear filters that are oriented in space-time and tuned in spatial frequency; the outputs of quadrature pairs of such filters are squared and summed to give a measure of motion energy.
Abstract: A motion sequence may be represented as a single pattern in x-y-t space; a velocity of motion corresponds to a three-dimensional orientation in this space. Motion information can be extracted by a system that responds to the oriented spatiotemporal energy. We discuss a class of models for human motion mechanisms in which the first stage consists of linear filters that are oriented in space-time and tuned in spatial frequency. The outputs of quadrature pairs of such filters are squared and summed to give a measure of motion energy. These responses are then fed into an opponent stage. Energy models can be built from elements that are consistent with known physiology and psychophysics, and they permit a qualitative understanding of a variety of motion phenomena.
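The quadrature-pair, square-and-sum, opponent-stage pipeline described above can be sketched in a toy form. This is an assumed simplification of the Adelson-Bergen energy model: a single oriented space-time sinusoid stands in for the full bank of spatiotemporal filters, and all names and frequencies are illustrative.

```python
import math

def st_gabor(frames, x_freq, t_freq, phase):
    """Inner product of a small frame stack (list of 1-D frames) with an
    oriented space-time sinusoid; the orientation in x-t encodes a
    preferred velocity."""
    resp = 0.0
    for t, frame in enumerate(frames):
        for x, v in enumerate(frame):
            resp += v * math.cos(2.0 * math.pi * (x_freq * x + t_freq * t) + phase)
    return resp

def motion_energy(frames, x_freq, t_freq):
    """Square and sum the quadrature (0 and 90 degree phase) pair of oriented
    spatiotemporal filters. A pattern moving at velocity v concentrates its
    energy where x_freq * v + t_freq = 0, so the sign of t_freq selects the
    preferred direction."""
    even = st_gabor(frames, x_freq, t_freq, 0.0)
    odd = st_gabor(frames, x_freq, t_freq, math.pi / 2)
    return even ** 2 + odd ** 2

def opponent_energy(frames, x_freq, t_freq):
    """Opponent stage: energy tuned to one direction minus the energy tuned
    to the opposite direction."""
    return motion_energy(frames, x_freq, t_freq) - motion_energy(frames, x_freq, -t_freq)
```

An impulse drifting rightward by one pixel per frame excites the filter whose x-t orientation matches that velocity far more than its mirror image, so the opponent output signals the direction of motion.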

3,504 citations

Journal ArticleDOI
TL;DR: An image information measure is proposed that quantifies the information present in the reference image and how much of this reference information can be extracted from the distorted image; combined, these two quantities form a visual information fidelity measure for image QA.
Abstract: Measurement of visual quality is of fundamental importance to numerous image and video processing applications. The goal of quality assessment (QA) research is to design algorithms that can automatically assess the quality of images or videos in a perceptually consistent manner. Image QA algorithms generally interpret image quality as fidelity or similarity with a "reference" or "perfect" image in some perceptual space. Such "full-reference" QA methods attempt to achieve consistency in quality prediction by modeling salient physiological and psychovisual features of the human visual system (HVS), or by signal fidelity measures. In this paper, we approach the image QA problem as an information fidelity problem. Specifically, we propose to quantify the loss of image information to the distortion process and explore the relationship between image information and visual quality. QA systems are invariably involved with judging the visual quality of "natural" images and videos that are meant for "human consumption." Researchers have developed sophisticated models to capture the statistics of such natural signals. Using these models, we previously presented an information fidelity criterion for image QA that related image quality with the amount of information shared between a reference and a distorted image. In this paper, we propose an image information measure that quantifies the information that is present in the reference image and how much of this reference information can be extracted from the distorted image. Combining these two quantities, we propose a visual information fidelity measure for image QA. We validate the performance of our algorithm with an extensive subjective study involving 779 images and show that our method outperforms recent state-of-the-art image QA algorithms by a sizeable margin in our simulations. The code and the data from the subjective study are available at the LIVE website.

3,146 citations