HDR-VQM
Summary
I. INTRODUCTION
- High Dynamic Range (HDR) signals fundamentally differ from traditional low dynamic range (LDR) ones in that pixel values are related to the physical luminance in the scene (i.e. they are scene-referred).
- Such high luminance values often exceed the capabilities of the traditional low dynamic range (LDR) capturing and display devices.
- It is therefore important to develop objective methods for HDR video quality measurement and benchmark their performance against subjective ground truth.
- Objective quality assessment employs a computational model to provide estimates of the subjective video quality.
II. BACKGROUND
- Humans perceive the outside visual world through the interaction between luminance (measured in candela per square meter, cd/m²) and the eyes.
- The rods are more sensitive than cones but do not provide color vision.
- With regard to human eyes, their dynamic range depends on the time allowed to adjust or adapt to the given luminance levels.
- HDR imaging technologies therefore aim to overcome the inadequacies of the LDR capture and display technologies via better video signal capture, representation and display, so that the dynamic range of the video can better match the instantaneous range of the eye.
- This, nonetheless, is sufficient for most purposes.
III. THE PROPOSED OBJECTIVE HDR VIDEO QUALITY MEASURE
- It takes as input the source and the distorted HDR video sequences.
- Note that throughout the paper the authors use the notation src and hrc (hypothetical reference circuit) to respectively denote reference and distorted video sequences.
- As shown in the figure, the first two steps are meant to convert the native input luminance to perceived luminance.
- These can therefore be seen as pre-processing steps.
- The last step is that of error pooling which is achieved via spatio-temporal processing of the subband errors.
A. From native HDR values to emitted luminance: modeling the display processing
- The authors begin with two observations with regard to HDR video signal representation.
- Therefore, the exact scene luminance at each pixel location will be, generally, unknown.
- With regard to HDR displays, the inherent hardware limitations will impose a limit on the maximum luminance that can be displayed.
- While one can adopt different strategies (from simple ones like linear scaling to more sophisticated ones) for the said display based pre-processing, this is not the main focus of the work.
- For the purpose of the method described in this paper, it is sufficient to highlight that in the general case, it is important that the characteristics of the HDR display are taken into account and the HDR video transformed (pre-processed) accordingly i.e. Nsrc → Esrc and Nhrc → Ehrc, for objective HDR video quality estimation.
B. Transformation from emitted to perceived luminance
- The second step in the design of HDR-VQM concerns the transformation of the emitted luminance to perceived luminance i.e. Esrc → Psrc and Ehrc → Phrc as indicated in Figure 1.
- An implication of such non-linearity is that the changes introduced by an HDR video processing algorithm in the emitted luminance may not have a direct correspondence to the actual modification of visual quality. (December 1, 2014 DRAFT)
- To further quantify this, it was found that the linear correlation between the original and transformed signals was 0.9334 for PU encoding and 0.9071 for logarithmic encoding, for the range 1 to 200 cd/m².
- Thus, PU encoding can better approximate the response of HVS which is approximately linear at lower luminance and increasingly logarithmic for higher luminance values.
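The logarithmic half of this comparison can be approximated with a short script. This is only a sketch: the sampling of the 1 to 200 cd/m² range (1000 evenly spaced values) is an assumption, and PU encoding is not reproduced because it relies on a fitted lookup that the summary does not give.

```python
import numpy as np

def pearson(a, b):
    """Pearson linear correlation coefficient between two 1-D arrays."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum()))

# Assumed sampling: 1000 evenly spaced luminance values in 1-200 cd/m^2.
luminance = np.linspace(1.0, 200.0, 1000)
log_response = np.log10(luminance)

r_log = pearson(luminance, log_response)
print(f"linear correlation, luminance vs. log10: {r_log:.4f}")
```

A correlation below 1 reflects how far the transform departs from linearity over this range; the exact value depends on the assumed sampling.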
C. Computation of subband error signal
- The proposed HDR-VQM is based on spatio-temporal analysis of an error video whose frames denote the localized perceptual error between a source and distorted video.
- The authors first describe the steps to obtain the subband error signal and then present the details of the spatio-temporal processing.
- The authors employed log-Gabor filters, introduced in [10], to calculate the perceptual error at different scales and orientations.
- Video frames in the perceived luminance domain (i.e. Psrc and Phrc) were decomposed into a set of subbands by computing the inverse DFT of the product of each frame's DFT with the frequency-domain filter defined in (1).
- The authors can then obtain the total error at each pixel in each video frame by pooling across scales and orientations.
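The decomposition described above can be sketched as follows. Filter (1) is not reproduced in this summary, so the code assumes the commonly used log-Gabor form (a Gaussian on the log-frequency axis multiplied by a Gaussian in orientation). The function names, the number of scales and orientations, and all filter parameters are illustrative assumptions. The plain absolute difference used as the per-subband error is a stand-in, not the authors' error equation.

```python
import numpy as np

def log_gabor_bank(shape, n_scales=3, n_orients=4, min_wavelength=3.0,
                   mult=2.0, sigma_f=0.55, sigma_theta=np.pi / 8):
    """Build frequency-domain log-Gabor filters (assumed standard form)."""
    rows, cols = shape
    fy = np.fft.fftfreq(rows)[:, None]
    fx = np.fft.fftfreq(cols)[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)
    radius[0, 0] = 1.0                       # avoid log(0) at the DC term
    theta = np.arctan2(-fy, fx)
    filters = []
    for s in range(n_scales):
        f0 = 1.0 / (min_wavelength * mult ** s)          # centre frequency
        radial = np.exp(-(np.log(radius / f0) ** 2)
                        / (2 * np.log(sigma_f) ** 2))
        radial[0, 0] = 0.0                   # no response at DC
        for o in range(n_orients):
            angle = o * np.pi / n_orients
            # Angular distance to the filter orientation, wrapped to [-pi, pi].
            d_theta = np.arctan2(np.sin(theta - angle), np.cos(theta - angle))
            angular = np.exp(-(d_theta ** 2) / (2 * sigma_theta ** 2))
            filters.append(radial * angular)
    return filters

def subband_error(p_src, p_hrc, filters):
    """Per-pixel error pooled (averaged) across scales and orientations."""
    F_src = np.fft.fft2(p_src)
    F_hrc = np.fft.fft2(p_hrc)
    error = np.zeros(p_src.shape)
    for g in filters:
        # Subband = inverse DFT of (frame DFT x frequency-domain filter).
        sub_src = np.fft.ifft2(F_src * g)
        sub_hrc = np.fft.ifft2(F_hrc * g)
        error += np.abs(sub_src - sub_hrc)   # stand-in error measure
    return error / len(filters)

rng = np.random.default_rng(0)
p_src = rng.random((64, 64))                 # synthetic "perceived luminance"
p_hrc = p_src + 0.05 * rng.standard_normal((64, 64))
bank = log_gabor_bank(p_src.shape)
err = subband_error(p_src, p_hrc, bank)
```

Filtering in the frequency domain keeps each subband at full frame resolution, which is why the pooled error map has one value per pixel.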
D. From spatio-temporal subband errors to overall video quality: The Pooling step
- Video signals propagate information along both spatial and temporal dimensions.
- Due to the visual acuity limitations of the eye, humans fixate their attention on local regions when viewing a video, because only a small area of the retina, generally referred to as the fovea, has high visual acuity.
- In the light of this, a reasonable strategy for objective video quality measurement is to analyze the video in a spatio-temporal (ST) dimension [12], [13], [14], so that the impact of distortions can be localized along both spatial and temporal axes.
- 2) Spatial and long-term temporal pooling:
- Therefore, the authors first perform spatial pooling on the spatio-temporal error frames ST_{v,t_s} in order to obtain the short-term quality scores, as illustrated in Figure 3.
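The order of operations in the pooling step (spatio-temporal error frames, then spatial pooling into short-term scores, then long-term temporal pooling) can be illustrated with a minimal sketch. The summary does not state the tube dimensions or the pooling statistics, so the non-overlapping tubes, the standard deviation inside each tube, and the plain means below are all assumptions chosen for illustration.

```python
import numpy as np

def st_pooling(error_video, block=8, tube_len=4):
    """Pool an error video of shape (frames, rows, cols) into one score.

    Hypothetical choices: non-overlapping block x block x tube_len tubes,
    standard deviation per tube, mean for both pooling stages.
    """
    f, r, c = error_video.shape
    # Crop so the video divides into whole tubes.
    f, r, c = f - f % tube_len, r - r % block, c - c % block
    v = error_video[:f, :r, :c]
    tubes = v.reshape(f // tube_len, tube_len,
                      r // block, block,
                      c // block, block)
    # One value per tube: variation of the error inside the tube.
    st_frames = tubes.std(axis=(1, 3, 5))
    # Spatial pooling: one short-term score per temporal interval.
    short_term = st_frames.mean(axis=(1, 2))
    # Long-term temporal pooling: one score for the whole sequence.
    overall = float(short_term.mean())
    return st_frames, short_term, overall

rng = np.random.default_rng(0)
video_err = rng.random((10, 20, 18))         # synthetic error video
st_frames, short_term, overall = st_pooling(video_err)
```

Working on tubes rather than single frames lets a localized distortion influence the score through both its spatial extent and its duration.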
IV. HDR VIDEO DATASET
- To the best of their knowledge there are currently no publicly available subjectively annotated HDR video datasets dealing with the issue of visual quality.
- Therefore, for verifying the prediction performance of HDR-VQM and other objective methods, an in-house and comprehensive HDR video dataset was used.
- This section provides a brief description of the dataset.
A. Test material preparation
- The dataset used 10 source HDR video sequences.
- The spatial versus temporal information measures (computed on tone-mapped versions of the video frames) for each source sequence are shown in Figure 4.
- In general, any backward-compatible HDR compression scheme comprises three steps [16]: (a) forward tone mapping in order to convert HDR video to LDR (8-bit precision), (b) compression and decompression of the LDR video by a standard LDR video compression method, (c) inverse tone mapping of the decoded LDR bit stream to reconstruct HDR video.
- The LDR video was encoded and decoded using H.264/AVC at different bit rates.
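The three-step backward-compatible chain can be mimicked on a single frame. This is a toy sketch under loud assumptions: the global logarithmic tone curve is a hypothetical operator, and `fake_codec` is uniform quantization standing in for the encode/decode stage. It is not H.264/AVC and does not model bit rate.

```python
import numpy as np

def forward_tone_map(hdr, l_min, l_max):
    """(a) Map HDR luminance to 8-bit LDR with a global log curve (assumed)."""
    span = np.log10(l_max) - np.log10(l_min)
    norm = (np.log10(hdr) - np.log10(l_min)) / span
    return np.round(255.0 * np.clip(norm, 0.0, 1.0)).astype(np.uint8)

def fake_codec(ldr, step):
    """(b) Stand-in for LDR encode/decode: uniform quantization, NOT H.264."""
    q = np.round(ldr.astype(float) / step) * step
    return np.clip(q, 0, 255).astype(np.uint8)

def inverse_tone_map(ldr, l_min, l_max):
    """(c) Invert the log curve to reconstruct HDR luminance."""
    span = np.log10(l_max) - np.log10(l_min)
    norm = ldr.astype(float) / 255.0
    return 10.0 ** (norm * span + np.log10(l_min))

rng = np.random.default_rng(0)
l_min, l_max = 0.1, 4000.0                   # assumed luminance range, cd/m^2
hdr = rng.uniform(l_min, l_max, (32, 32))    # synthetic HDR frame

ldr = forward_tone_map(hdr, l_min, l_max)
rec_fine = inverse_tone_map(fake_codec(ldr, step=1), l_min, l_max)
rec_coarse = inverse_tone_map(fake_codec(ldr, step=16), l_min, l_max)

# Mean absolute error in the log-luminance domain for both settings.
err_fine = np.mean(np.abs(np.log10(rec_fine) - np.log10(hdr)))
err_coarse = np.mean(np.abs(np.log10(rec_coarse) - np.log10(hdr)))
```

Even with a lossless middle stage (`step=1`), the 8-bit tone-mapped representation limits how exactly the HDR values can be recovered; a coarser step plays the role of a lower bit rate.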
B. Rating methodology
- The authors' study involved 25 paid observers who were not experts in image or video processing.
- They were seated in a standardized room conforming to International Telecommunication Union Recommendation ITU-R BT.500-11 [17].
- Prior to the test, observers were screened for visual acuity by using a Monoyer optometric table and for normal color vision by using Ishihara tables.
- For rating the test stimuli, the authors adopted the absolute category rating with hidden reference (ACR-HR), which is one of the rating methods recommended by the ITU in Rec. ITU-T P.910 [18].
- The rating method also includes the source sequences (i.e. undistorted) to be shown as any other test stimulus without informing the observers.
C. Display
- For displaying the HDR video sequences, a SIM2 Solar47 HDR display was used, which has a maximum displayable luminance of 4000 cd/m².
- In their study, this was set to 200 cd/m², as it provided comfortable viewing conditions for the observers [20].
- The authors however observed that this approach suffers from at least two drawbacks.
- To ameliorate these two issues, the authors opted for a temporally more coherent strategy and the normalization factor was determined as the maximum of the mean of top 5% luminance values of all the frames in an HDR video sequence.
- Then, the native HDR values N were converted to emitted luminance values E as
$$E = \frac{N \times 179}{\max(\mathrm{MT5})} \qquad (6)$$
where MT5 holds the mean of the top 5% luminance values of each frame, and the multiplication factor of 179 is the luminous efficacy of equal-energy white light that is defined and used by the Radiance file format (RGBE) for the conversion to actual luminance values.
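A minimal sketch of the normalization in equation (6), assuming the video is held as an array of shape (frames, rows, cols) and that the top 5% is taken over the pixels of each frame. The function name and parameter names are hypothetical.

```python
import numpy as np

def emitted_luminance(native_video, top_fraction=0.05, efficacy=179.0):
    """Convert native HDR values N to emitted luminance E as in (6)."""
    n_frames = native_video.shape[0]
    frames = native_video.reshape(n_frames, -1)
    # Number of pixels making up the top 5% of each frame (at least one).
    k = max(1, int(np.ceil(top_fraction * frames.shape[1])))
    # MT5: per-frame mean of the top 5% values.
    top_values = np.sort(frames, axis=1)[:, -k:]
    mt5 = top_values.mean(axis=1)
    # A single factor, max(MT5), is shared by every frame of the sequence.
    return native_video * efficacy / mt5.max()

rng = np.random.default_rng(0)
native = rng.random((5, 16, 16)) * 10.0      # synthetic native HDR values
emitted = emitted_luminance(native)
```

Because one normalization factor is used for the whole sequence, relative brightness between frames is preserved, which is the temporal coherence the strategy aims for.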
A. Correlation based comparisons
- The first set of experimental results is reported in terms of two criteria: the Pearson linear correlation coefficient Cp (for prediction accuracy) and the Spearman rank order correlation coefficient Cs (for monotonicity), between the subjective score and the objective prediction.
- The better performance of HDR-VQM relative to these methods therefore indicates the added value of taking into account frequency and orientation information.
- As a result, for similar MOS values across sequences, the corresponding RPSNR values can be quite different.
- The RPSNR value for the first condition was 32.70 dB while the corresponding subjective score was 1.04.
- Of course, one should rely on correlation based comparisons and outlier ratio analysis (presented in the next subsection) to draw more general conclusions about the performance of different objective methods.
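The two correlation criteria, Cp and Cs, can be computed as in the sketch below. The score values are made up purely for illustration and are not taken from the paper's dataset; Cs is computed as the Pearson correlation of tie-averaged ranks.

```python
import numpy as np

def pearson(a, b):
    """Cp: Pearson linear correlation coefficient."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum()))

def average_ranks(x):
    """Ranks starting at 1, with tied values sharing their average rank."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, kind="mergesort")
    ranks = np.empty(len(x))
    ranks[order] = np.arange(1, len(x) + 1)
    for value in np.unique(x):
        tie = x == value
        ranks[tie] = ranks[tie].mean()
    return ranks

def spearman(a, b):
    """Cs: Spearman rank order correlation (Pearson on the ranks)."""
    return pearson(average_ranks(a), average_ranks(b))

# Illustrative values only (not from the paper).
mos = [1.0, 2.1, 2.9, 4.2, 4.8]              # subjective scores
pred = [0.20, 0.35, 0.50, 0.70, 0.95]        # objective predictions

cp = pearson(mos, pred)
cs = spearman(mos, pred)
```

Cs depends only on the ordering of the predictions, so it captures monotonicity, whereas Cp also penalizes departures from a straight-line relationship.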
B. Outlier ratio analysis
- Outlier ratio analysis is another approach to evaluate objective methods for their prediction accuracy.
- Particularly, it can be very useful in applications such as video compression where one is generally interested in the rate distortion (RD) behavior of objective methods i.e. how the objective visual quality varies with bit rates for different source sequences and to what extent that compares with the subjective video quality.
- Therefore, the authors first computed the absolute prediction error between the subjective MOS and logistically transformed objective scores for each of the 80 test conditions.
- The authors find that HDR-VQM has the fewest outliers (22%).
- The main advantage of outlier analysis is that it helps to evaluate metric accuracy by taking into account the variability or uncertainty (expressed via 95% confidence intervals in their dataset) in subjective opinions, which are ignored in correlation based comparisons.
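The outlier analysis can be sketched as follows. The exact decision rule is not spelled out in this summary, so the code assumes a test condition counts as an outlier when its absolute prediction error exceeds the 95% confidence interval of its MOS. The numbers are illustrative only.

```python
import numpy as np

def outlier_ratio(mos, predicted, ci95):
    """Fraction of conditions whose |MOS - prediction| exceeds the 95% CI.

    Assumed rule: `ci95` is the per-condition confidence interval half-width,
    and `predicted` is already mapped (e.g. logistically) to the MOS scale.
    """
    mos = np.asarray(mos, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ci95 = np.asarray(ci95, dtype=float)
    abs_error = np.abs(mos - predicted)
    return float(np.mean(abs_error > ci95))

# Illustrative values only (not from the paper).
mos = [1.0, 2.5, 3.0, 4.5]
pred = [1.2, 3.4, 3.1, 4.4]
ci = [0.3, 0.4, 0.3, 0.3]

ratio = outlier_ratio(mos, pred, ci)
```

Only the second condition misses its MOS by more than the confidence interval here, so one of the four conditions is flagged.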
VI. DISCUSSION
- The previous sections proposed and verified the performance of an objective HDR video quality estimator HDR-VQM.
- Also recall that in (3) the authors did not employ a more sophisticated weighting such as one based on CSF.
- The authors find that the relative execution time for HDR-VQM is reasonable considering the improvements (i.e. smallest % of outliers) in performance over other methods.
- The reader will also appreciate the fact that video quality judgment, in general, can depend on several extraneous factors (such as display type, viewing distance, ambient lighting conditions etc.) apart from the distortions themselves.
- This allows HDR-VQM to adapt to some of the physical factors that may affect subjective quality judgment.
VII. CONCLUDING THOUGHTS
- HDR imaging is becoming increasingly popular in the multimedia signal processing community, primarily as a tool for enhancing the immersive video experience of the user.
- There are very few works that address the issue of assessing the impact of HDR video processing algorithms on the perceptual video quality both from subjective and objective angles.
- To that extent and within the scope of its application, HDR-VQM is a reasonable objective tool for HDR video quality measurement.
- To enable others to use the proposed method as well as validate it independently, a software implementation will soon be made available online for free download and use.
ACKNOWLEDGMENT
- The authors wish to thank Romuald Pepion for his help in generating the subjective test results used in this paper.
- This work has been supported by the NEVEx project FUI11 financed by the French government.
Frequently Asked Questions (2)
Q2. What future work have the authors mentioned in the paper "HDR-VQM: An objective quality measure for high dynamic range video"?
The immediate future work will involve further refinement of the presented method in view of some of the mentioned limitations, as well as further validation with larger HDR video datasets.