Journal Article

Toward the Assessment of Quality of Experience for Asymmetric Encoding in Immersive Media

TL;DR: The focus of this contribution is the development of a QoE assessment framework, in line with the latest standardization progress in the field of QoE assessment, for understanding the visual effect of asymmetric and symmetric encoding for immersive media.
Abstract: The assessment of Quality of Experience (QoE) for stereoscopic 3-D video is a challenging task, especially in 3-D video compression and transmission applications. The focus of this contribution is the development of a QoE assessment framework, in line with the latest standardization progress in the field of QoE assessment, for understanding the visual effect of asymmetric and symmetric encoding for immersive media. Asymmetric stereoscopic video coding exploits the binocular suppression of the human vision system by representing one of the two views with a lower quality. This processing, while having a limited effect on image quality, may influence the overall QoE. Many studies show that the QoE of immersive media such as 3-DTV can be thought of as the combination of the perceived visual quality, the perceived depth quality, the visual fatigue, and visual discomfort. In this paper, we aim at: 1) exploiting the concept of preference of experience and protocols recently standardized for characterizing QoE; 2) conducting a case study using these standardized protocols to investigate the factors involving visual discomfort in stereoscopic video sequences with a focus on binocular rivalry; and 3) presenting the results of subjective experiments performed, by using the perceptual quality and preference of experience assessment protocols, for evaluating the impact of symmetrical, asymmetrical, and alternate coding schemes.
Citations
Journal Article
TL;DR: The relation between depth map quality and overall quality of LF images is studied, and the results show that the quality score estimated by the proposed framework correlates significantly with subjective quality ratings.
Abstract: Immersive media, such as free-viewpoint video and 360° video, are expected to be dominant as broadcasting services. Light field (LF) imaging is being considered as a next-generation imaging technology offering the possibility to provide new services, including six-degree-of-freedom video. The drawback of this technology is the size of the generated content, which requires novel compression systems and the design of ad hoc methodologies for evaluating the perceived quality. In this paper, the relation between depth map quality and overall quality of LF images is studied. Next, a reduced-reference quality assessment metric for LF images is presented. To predict the quality of distorted LF images, the measure of distortion in the depth map is exploited. To test and validate the proposed framework, a subjective experiment has been performed, and an LF image quality dataset has been created. The dataset is also used for evaluating the performance of state-of-the-art quality metrics when applied to LF images. The results show that the quality score estimated by the proposed framework has a significant correlation with subjective quality ratings. Consequently, reference data can be delivered to the clients, thus allowing the local estimation of the perceived quality of service.

73 citations


Cites background from "Toward the Assessment of Quality of..."

  • ...Therefore, the research is focused on the design of new models for data delivery while optimizing the streaming performances for increasing the perceived experience [2]–[5]....

    [...]

Journal Article
TL;DR: A novel blind image quality assessment (IQA) method via multiscale natural scene statistical analysis (MNSS) based on two new natural scene statistics (NSS) models specific to the DIBR-synthesized IQA, which achieves better performance than each component and state-of-the-art quality methods.
Abstract: This paper proposes to blindly evaluate the quality of images synthesized via a depth image-based rendering (DIBR) procedure. As a significant branch of virtual reality (VR), superior DIBR techniques provide free viewpoints in many real applications, including remote surveillance and education; however, limited efforts have been made to measure the performance of DIBR techniques, or equivalently the quality of DIBR-synthesized views, especially in the condition when references are unavailable. To achieve this aim, we develop a novel blind image quality assessment (IQA) method via multiscale natural scene statistical analysis (MNSS). The design principle of our proposed MNSS metric is based on two new natural scene statistics (NSS) models specific to the DIBR-synthesized IQA. First, the DIBR-introduced geometric distortions damage the local self-similarity characteristic of natural images, and the damage degrees of self-similarity present particular variations at different scales. Systematically combining the measurements of the variations mentioned above can gauge the naturalness of the input image and thus indirectly reflect the quality changes of images generated using different DIBR methods. Second, it was found that the degradations in main structures of natural images at different scales remain almost the same, whereas the statistical regularity is destroyed in the DIBR-synthesized views. Estimating the deviation of degradations in main structures at different scales between one DIBR-synthesized image and the statistical model, which is constructed based on a large number of natural images, can quantify how a DIBR method damages the main structures and thus infer the image quality. Via trials, the two NSS-based features extracted above can well predict the quality of DIBR-synthesized images. Further, the two features come from distinct points of view, and we hence integrate them via a straightforward multiplication to derive the proposed blind MNSS metric, which achieves better performance than each component and state-of-the-art quality methods.

40 citations

Journal Article
TL;DR: This paper provides some insights into why QoE assessment is so difficult by presenting a few major issues as well as a general summary of quality/QoE formation and conception, including the human auditory and vision systems.
Abstract: Quality of experience (QoE) assessment occupies a key role in various multimedia networks and applications. Recently, large efforts have been devoted to devising objective QoE metrics that correlate with perceived subjective measurements. Despite recent progress, limited success has been attained. In this paper, we provide some insights into why QoE assessment is so difficult by presenting a few major issues as well as a general summary of quality/QoE formation and conception, including the human auditory and vision systems. Also, potential future research directions are described to discern the path forward. This is an academic and perspective article, which is hoped to complement existing studies and prompt interdisciplinary research.

33 citations

Journal Article
TL;DR: A bitrate-based no-reference (NR) VQA metric combining the visual perception of video contents and their visual perception features, namely, BRVPVC, is designed and shown to have a higher accuracy than six common FR VQA metrics and eight NR VQA metrics, and it is close to the other two NR VQA metrics in accuracy.
Abstract: In video communication, the quality of video is mainly determined by bitrate in general. Moreover, the effect of video contents and their visual perception on video quality assessment (VQA) is often overlooked. However, in fact, for different videos, although the bitrates are the same, their VQA scores are still significantly different. Hence, it is assumed that the bitrate, video contents, and human visual characteristics mainly affect the VQA. Based on the above three aspects, in this paper, we designed a bitrate-based no-reference (NR) VQA metric combining the visual perception of video contents, namely, BRVPVC. In this metric, first an initial VQA model was proposed by considering the bitrate alone. Then, the visual perception model for video contents was designed based on the texture complexity and local contrast of image, temporal information of video, and their visual perception features. Finally, the two models were synthesized by adding certain weight coefficients into an overall VQA metric, namely, BRVPVC. Furthermore, ten reference videos and 150 distorted videos in the LIVE video database were used to test the metric. Moreover, based on the results of evaluating the videos in the LIVE, VQEG, IRCCyN, EPFL-PoliMI, IVP, CSIQ, and Lisbon databases, the performance of BRVPVC is compared with that of six full-reference (FR) metrics and ten NR VQA metrics. The results show that our VQA metric has a higher accuracy than six common FR VQA metrics and eight NR VQA metrics, and it is close to the other two NR VQA metrics in accuracy. The corresponding values of the Pearson linear correlation coefficient and the Spearman rank order correlation coefficient reached 0.8547 and 0.8260, respectively. In addition, the computational complexity of the proposed VQA metric is lower than that of the video signal-to-noise ratio, video quality model, motion-based video integrity evaluation, spatiotemporal most apparent distortion, V-BLINDS, and V-CORNIA metrics. Moreover, the proposed metric has a better generalization property than these metrics.

23 citations


Cites background from "Toward the Assessment of Quality of..."

  • ...A few but efficient HVS characteristics should be selected while designing the VQA metrics [13], [33]....

    [...]
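The Pearson linear correlation coefficient (PLCC) and Spearman rank order correlation coefficient (SROCC) quoted in the BRVPVC abstract above are the usual way of scoring an objective metric against subjective ratings. As a minimal sketch only, assuming two equal-length lists of predicted scores and mean opinion scores (the function names and the sample values are illustrative and do not come from the paper), the two coefficients could be computed as follows:

```python
import math


def pearson(x, y):
    """Pearson linear correlation coefficient (PLCC) of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank order correlation (SROCC): Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: metric outputs and subjective mean opinion scores.
predicted = [2.1, 3.4, 3.9, 4.6, 1.5]
mos = [2.0, 3.0, 4.2, 4.5, 1.8]
print("PLCC:", pearson(predicted, mos))
print("SROCC:", spearman(predicted, mos))
```

PLCC measures how linear the relation between the metric and the subjective scores is, while SROCC only checks whether the two agree on the ordering, which is why both are typically reported together.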

Journal Article
TL;DR: Evidence is found that, even if some attributes considered central to immersive experiences are lacking (i.e., interactivity or avatar representation of the user), 360°-video recreations of virtual environments lead to realistic reactions in users.
Abstract: This research analyzes the psychological reactions of participants to the neutral, positive, or negative attitudes of a virtual audience recorded in 360°-video. Participants were asked to deliver three speeches, each accompanied by a different type of reaction from the virtual audience. Measures of user state included questionnaires, psychophysiological measures, and voice recordings. The results showed that, compared to the neutral audience, the negative audience elicited increases in skin conductance level and heart rate variability, decreases in voice intensity, and a higher ratio of silent parts in the speech, as well as a more negative self-reported valence, higher anxiety, and lower social presence. These findings show that, even if some attributes considered central to immersive experiences are lacking (i.e., interactivity or avatar representation of the user), 360°-video recreations of virtual environments lead to realistic reactions in users. These results support the effectiveness of 360°-video virtual audiences for public speaking training and social anxiety treatment.

22 citations

References
Proceedings Article
07 Jan 2007
TL;DR: By augmenting k-means with a very simple, randomized seeding technique, this work obtains an algorithm that is Θ(log k)-competitive with the optimal clustering.
Abstract: The k-means method is a widely used clustering technique that seeks to minimize the average squared distance between points in the same cluster. Although it offers no accuracy guarantees, its simplicity and speed are very appealing in practice. By augmenting k-means with a very simple, randomized seeding technique, we obtain an algorithm that is Θ(log k)-competitive with the optimal clustering. Preliminary experiments show that our augmentation improves both the speed and the accuracy of k-means, often quite dramatically.

7,539 citations


"Toward the Assessment of Quality of..." refers methods in this paper

  • ...We analyzed those results by clustering the HRCs in two groups by using the K-means algorithm [96] based on the...

    [...]
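The randomized seeding technique described in the abstract above picks the first center uniformly at random and each further center with probability proportional to its squared distance from the nearest center chosen so far. The following is a minimal sketch of that seeding step only, not the citing paper's code: the function names are illustrative, and the 2-D points are placeholders, since the features used to cluster the HRCs are elided in the excerpt.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeanspp_seeds(points, k, rng=None):
    """Choose k initial centers with squared-distance (D^2) weighting."""
    rng = rng or random.Random()
    centers = [rng.choice(points)]  # first center: uniform at random
    # d2[i] is the squared distance from points[i] to its nearest center.
    d2 = [sq_dist(p, centers[0]) for p in points]
    while len(centers) < k:
        if sum(d2) == 0:
            # Every point coincides with a chosen center; fall back to uniform.
            nxt = rng.choice(points)
        else:
            # Sample one point with probability proportional to d2.
            nxt = rng.choices(points, weights=d2, k=1)[0]
        centers.append(nxt)
        d2 = [min(old, sq_dist(p, nxt)) for old, p in zip(d2, points)]
    return centers


# Placeholder data standing in for per-HRC feature vectors (hypothetical).
hrc_features = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.25), (4.0, 4.2), (4.1, 3.9)]
print(kmeanspp_seeds(hrc_features, k=2, rng=random.Random(0)))
```

The returned centers would then be handed to an ordinary k-means loop; with `k=2` this matches the two-group split mentioned in the excerpt, although the seeding variant actually used there is not stated.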

Journal Article
TL;DR: The main goal of the HEVC standardization effort is to enable significantly improved compression performance relative to existing standards, in the range of 50% bit-rate reduction for equal perceptual video quality.
Abstract: High Efficiency Video Coding (HEVC) is currently being prepared as the newest video coding standard of the ITU-T Video Coding Experts Group and the ISO/IEC Moving Picture Experts Group. The main goal of the HEVC standardization effort is to enable significantly improved compression performance relative to existing standards, in the range of 50% bit-rate reduction for equal perceptual video quality. This paper provides an overview of the technical features and characteristics of the HEVC standard.

7,383 citations

Journal Article
06 Jun 1996-Nature
TL;DR: The visual processing needed to perform this highly demanding task can be achieved in under 150 ms, and ERP analysis revealed a frontal negativity specific to no-go trials that develops roughly 150 ms after stimulus onset.
Abstract: How long does it take for the human visual system to process a complex natural image? Subjectively, recognition of familiar objects and scenes appears to be virtually instantaneous, but measuring this processing time experimentally has proved difficult. Behavioural measures such as reaction times can be used, but these include not only visual processing but also the time required for response execution. However, event-related potentials (ERPs) can sometimes reveal signs of neural processing well before the motor output. Here we use a go/no-go categorization task in which subjects have to decide whether a previously unseen photograph, flashed on for just 20 ms, contains an animal. ERP analysis revealed a frontal negativity specific to no-go trials that develops roughly 150 ms after stimulus onset. We conclude that the visual processing needed to perform this highly demanding task can be achieved in under 150 ms.

3,284 citations

Book
01 Jan 1971
TL;DR: Foundations of Cyclopean Perception is a classic work on cyclopean perception that has influenced a generation of vision researchers, cognitive scientists, and neuroscientists and has inspired artists, designers, and computer graphics pioneers.
Abstract: This classic work on cyclopean perception has influenced a generation of vision researchers, cognitive scientists, and neuroscientists and has inspired artists, designers, and computer graphics pioneers. In Foundations of Cyclopean Perception (first published in 1971 and unavailable for years), Bela Julesz traced the visual information flow in the brain, analyzing how the brain combines separate images received from the two eyes to produce depth perception. Julesz developed novel tools to do this: random-dot stereograms and cinematograms, generated by early digital computers at Bell Labs. These images, when viewed with the special glasses that came with the book, revealed complex, three-dimensional surfaces; this mode of visual stimulus became a paradigm for research in vision and perception. This reprint edition includes all 48 color random-dot designs from the original, as well as the special 3-D glasses required to view them. Foundations of Cyclopean Perception has had a profound impact on the vision studies community. It was chosen as one of the one hundred most influential works in cognitive science in a poll conducted by the University of Minnesota's Center for Cognitive Sciences. Many copies are "permanently borrowed" from college libraries; used copies are sought after online. Now, with this facsimile of the 1971 edition, the book is available again to cognitive scientists, neuroscientists, vision researchers, artists, and designers.

2,449 citations


"Toward the Assessment of Quality of..." refers background in this paper

  • ...binocular vision [33], exploit the assumption that binocular perception can support differences in quality between the two views: therefore, the two views can be represented at unequal resolutions or bit-rates....

    [...]

Journal Article

1,992 citations


"Toward the Assessment of Quality of..." refers methods in this paper

  • ...In this article, the conversion of the preference frequencies to continuous-scale quality scores is performed using the popular Bradley-Terry (BT) model [90]....

    [...]
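The excerpt above mentions converting preference frequencies to continuous-scale scores with the Bradley-Terry (BT) model, in which item i is preferred over item j with probability p_i / (p_i + p_j). As a hedged sketch, one common way to fit the strengths p is the iterative minorization-maximization update shown below. The fitting procedure, the function name, and the sample counts are assumptions for illustration, not details taken from the paper.

```python
import math


def bradley_terry(wins, iters=200, tol=1e-10):
    """Estimate BT strengths from a pairwise preference count matrix.

    wins[i][j] is the number of times item i was preferred over item j
    (diagonal entries are assumed to be 0). Returns strengths summing to 1.
    Assumes every item is preferred at least once, otherwise its strength
    collapses to 0.
    """
    n = len(wins)
    p = [1.0 / n] * n
    for _ in range(iters):
        updated = []
        for i in range(n):
            total_wins = sum(wins[i])
            # Each comparison against j contributes n_ij / (p_i + p_j).
            denom = sum(
                (wins[i][j] + wins[j][i]) / (p[i] + p[j])
                for j in range(n)
                if j != i
            )
            updated.append(total_wins / denom if denom > 0 else p[i])
        norm = sum(updated)
        updated = [v / norm for v in updated]
        converged = max(abs(a - b) for a, b in zip(p, updated)) < tol
        p = updated
        if converged:
            break
    return p


# Hypothetical preference counts for three coding conditions, 10 votes per pair.
counts = [
    [0, 8, 9],
    [2, 0, 7],
    [1, 3, 0],
]
strengths = bradley_terry(counts)
scores = [math.log(s) for s in strengths]  # log-strengths as a continuous scale
print(strengths)
print(scores)
```

Taking the logarithm of the strengths is one conventional way to place the items on an interval-like scale; whether the paper uses this or another normalization is not stated in the excerpt.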
