


HDR-VDP-2.2: a calibrated method for objective quality prediction of high-dynamic range and standard images

Manish Narwaria,a Rafal K. Mantiuk,b,* Matthieu Perreira Da Silva,a and Patrick Le Calleta

aLUNAM University, IRCCyN CNRS UMR 6597, Polytech Nantes, Rue Christian Pauc, La Chantrerie B.P. 50609, 44306 Nantes Cedex 3, France
bBangor University, School of Computer Science, Dean Street, Bangor LL57 1UT, United Kingdom
*Address all correspondence to: Rafal K. Mantiuk, E-mail: mantiuk@gmail.com
Abstract. With the emergence of high-dynamic range (HDR) imaging, the existing visual signal processing systems will need to deal with both HDR and standard dynamic range (SDR) signals. In such systems, computing the objective quality is an important aspect in various optimization processes (e.g., video encoding). To that end, we present a newly calibrated objective method that can tackle both HDR and SDR signals. As it is based on the previously proposed HDR-VDP-2 method, we refer to the newly calibrated metric as HDR-VDP-2.2. Our main contribution is toward improving the frequency-based pooling in HDR-VDP-2 to enhance its objective quality prediction accuracy. We achieve this by formulating and solving a constrained optimization problem and thereby finding the optimal pooling weights. We also carried out extensive cross-validation as well as verified the performance of the new method on independent databases. These indicate clear improvement in prediction accuracy as compared with the default pooling weights. The source code for HDR-VDP-2.2 is publicly available online for free download and use. © 2015 SPIE and IS&T [DOI: 10.1117/1.JEI.24.1.010501]
Keywords: high-dynamic range; objective quality; HDR-VDP-2.
Paper 14600L received Sep. 26, 2014; accepted for publication
Dec. 29, 2014; published online Jan. 22, 2015.
1 Introduction
Human eyes have a remarkable ability to adapt and adjust to varying luminance conditions. As a result, humans can see clearly in lighting conditions ranging from a moonlit night to bright sunshine. In terms of physical luminance values, the former is in the range of about 10^-2 cd/m^2, while the latter is more than 10^7 cd/m^2, a dynamic range in excess of 9 orders of magnitude. However, when it comes to scene capture and display, such large luminance ranges are beyond the capabilities of current standard dynamic range (SDR) imaging systems. Nevertheless, with the emergence of HDR imaging, it is now possible to capture and display scenes that encapsulate a much higher dynamic range (HDR) than traditional or SDR imaging techniques.^1 In particular, typical SDR systems deal with signals spanning up to 3 orders of magnitude. In contrast, HDR imaging can process and display scenes spanning up to 5 orders of magnitude, and it can also accommodate SDR signals (e.g., tone-mapped signals). Therefore, it is logical to assume that future video processing systems will have to deal with both SDR and HDR signals.
2 Background and Motivation
While human judgments of perceptual visual quality remain the most accurate, they cannot be employed in all situations. For instance, in a real-time video streaming application, it may be infeasible to obtain human judgments of visual quality in order to continuously monitor the traffic from a quality standpoint. In light of such scenarios, objective quality measurement via a computational model is more practical. To that end, many objective methods have been proposed in the past. However, most of them have been designed for and tested only on SDR visual signals.^2 As mentioned before, with the emergence of HDR imaging, video processing systems may have to deal with both HDR and SDR signals. Thus, an objective quality measurement method that is applicable over a larger dynamic range (i.e., both SDR and HDR domains) is desirable. In that context, the HDR-VDP-2 algorithm^3 is an attractive solution. HDR-VDP-2 is a visibility prediction metric: it provides a two-dimensional map with the probability of detection at each pixel, which is related to perceived quality because a higher detection probability implies a higher distortion level at that point. However, in the case of supra-threshold distortions (i.e., distortions clearly visible to the eye), the error visibility will mostly be 1, and in such cases a single number denoting the visual quality is more useful. This can be accomplished via pooling of the errors in the frequency bands. In the original implementation, the pooling weights were determined by optimization on an existing SDR dataset. There are, however, three limitations of that approach, especially in the context of dealing with both SDR and HDR conditions. First, the original paper^3 used only an SDR image quality dataset, which did not include any HDR images. Second, the optimization was done on a relatively small number of images. Finally, since the optimization was unconstrained, it led to negative pooling weights that are not easily interpretable.
This letter addresses these specific issues with the pooling in HDR-VDP-2. First, we reoptimized the pooling weights on a combined dataset of subjectively rated HDR and SDR images; as a result, the newly calibrated model is expected to be more effective across both HDR and SDR test conditions. Second, we reformulated the optimization as a constrained problem, so that the resultant weights are bounded and more easily interpretable. Finally, we verified the prediction performance of the new weights via extensive cross-validation studies on a collection of nearly 3000 images (including HDR and SDR content and their corresponding subjective quality ratings).
3 Method Calibration
In HDR-VDP-2, the following equation is used to predict the quality score Q_hdrvdp for a distorted image with respect to its reference:
\[
Q_{\mathrm{hdrvdp}} = \frac{1}{F \cdot O} \sum_{f=1}^{F} \sum_{o=1}^{O} w_f \, \log\!\left( \frac{1}{I} \sum_{i=1}^{I} D_p^2[f,o](i) + \varepsilon \right), \tag{1}
\]
where i is the pixel index, D_p denotes the noise-normalized difference between the reference and test images in the f-th spatial frequency band (f = 1 to F) and o-th orientation (o = 1 to O) of the steerable pyramid, ε = 10^-5 is a constant to avoid singularities when D_p is close to 0, and I is the total number of pixels. In the above equation, w_f is the vector of per-band pooling weights, which can be determined by maximizing correlations with subjective opinion scores. However, unconstrained optimization in this case may lead to some negative w_f. Since w_f determines the weight (importance) of each frequency band, a negative w_f is implausible and may indicate overfitting. Therefore, in this letter, we introduce a constraint on w_f during optimization.
Let Q_hdrvdp and S, respectively, denote the vectors of objective quality scores from HDR-VDP-2 and subjective scores for a given set of N images. The aim is then to maximize the Spearman rank-order correlation between the two vectors, with w_f being the optimization variables. To that end, we first rank the values in Q_hdrvdp and S from 1 to N and obtain new vectors R_hdrvdp and R_subjective, which consist of the respective ranks. Further, define E = R_hdrvdp − R_subjective as the rank-difference vector. The optimization problem can then be written as
\[
\underset{w_f}{\text{maximize}} \;\; 1 - \frac{6 \sum_{i=1}^{N} E_i^2}{N(N^2 - 1)}, \qquad \text{subject to } w_f \geq 0. \tag{2}
\]
Also note that in our case this optimization is solved using the Nelder–Mead method, which does not require computing gradients. This is because our objective function, based on the Spearman rank-order correlation, is neither continuous nor differentiable. Our aim was to calibrate the metric so that it can handle both HDR and SDR conditions. Thus, we computed the optimized w_f on a set of subjectively rated SDR and HDR images. In particular, our study used two recent HDR datasets,^{4,5} which together contain a total of 366 subjectively rated compressed HDR images. In contrast to the HDR case, several SDR datasets are publicly available, and we selected the two largest in terms of the number of images (TID2008^6 and CSIQ^7). Note that these datasets use different rating methodologies. Therefore, for the HDR (scale of 1 to 5) and TID2008 (scale of 1 to 9) datasets, which report mean opinion scores (MOSs), we first converted the MOSs to difference MOSs (DMOSs); the CSIQ dataset already reports DMOS. Finally, we rescaled all the DMOSs to the range 0 to 100. This provided a more consistent rating scale during optimization.
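A minimal sketch of this calibration step is given below, under stated assumptions: the per-image, per-band log energies of Eq. (1) are precomputed into a hypothetical matrix band_logs (N images by F bands), so that the pooled score is linear in the weights and the constant 1/(F·O) factor does not affect the rank correlation; the square-root reparameterization is one possible way, not necessarily the authors', to keep w_f non-negative with the gradient-free Nelder–Mead method:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import spearmanr

def fit_pooling_weights(band_logs, subjective, n_bands):
    """Maximize Spearman correlation of pooled scores with subjective DMOS."""
    def neg_spearman(sqrt_w):
        w = sqrt_w ** 2                     # guarantees w_f >= 0
        scores = band_logs @ w              # pooled scores, up to a positive scale factor
        rho, _ = spearmanr(scores, subjective)
        return -rho                         # minimize the negative correlation
    x0 = np.ones(n_bands)                   # start from uniform weights
    res = minimize(neg_spearman, x0, method="Nelder-Mead",
                   options={"maxiter": 5000, "xatol": 1e-6, "fatol": 1e-6})
    return res.x ** 2
```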
4 Cross-Validation Results
In this section, we outline the method used to verify the performance of the new weights. Recall that we used four datasets: two each for the HDR and SDR cases. The former include a total of 10 source (reference) contents, while the latter include a total of 55 (30 in CSIQ and 25 in TID2008). So, there were 65 source contents in total and 2932 distorted images (obtained by applying different distortion types and levels to the source contents). For the cross-validation studies, we selected all the distorted images from 45 source contents (about 70%) as the training set used to find the optimal w_f vector, and the remaining images from 20 source contents were used as the test set. To enable a more robust estimate of the prediction performance, we randomly repeated this division into training and test sets over 1000 iterations, ensuring that the two sets always differed in terms of source content. Hence, in each of the 1000 iterations, the prediction performance was assessed only on untrained content, providing a reasonably robust approach toward content-independent verification. With the stated data partition (45 source contents in the training set and the remainder in the test set), there were on average 2032 and 900 images, respectively, in the training and test sets in each iteration.
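The content-independent splitting could be sketched as follows; source_id is a hypothetical array giving the source content index of each distorted image:

```python
import numpy as np

def content_split(source_id, n_train_contents=45, rng=None):
    """Split image indices so that no source content appears in both sets."""
    rng = np.random.default_rng() if rng is None else rng
    contents = np.unique(source_id)
    train_contents = rng.choice(contents, size=n_train_contents, replace=False)
    train_mask = np.isin(source_id, train_contents)
    return np.where(train_mask)[0], np.where(~train_mask)[0]

# for it in range(1000):
#     train_idx, test_idx = content_split(source_id)
#     ...fit weights on train_idx, evaluate Pearson/Spearman on test_idx...
```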
The experimental results for this case (exp 1) are shown in Fig. 1(a), where the performance is measured in terms of the mean (over 1000 iterations) Pearson and Spearman correlation values (higher values imply better performance for both measures). Recall that existing LDR methods cannot be directly used for HDR. Nevertheless, we also employed the peak signal-to-noise ratio (PSNR) and the structural similarity index measure (SSIM), both LDR methods, on a perceptually uniform (PU)^8 transformed HDR signal to compute objective quality and to provide a baseline for comparison.
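A sketch of such a PU-based baseline is shown below; pu_encode stands in for the perceptually uniform transform of Ref. 8 and is an assumption of this sketch rather than part of the released code:

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def pu_metrics(ref_luminance, test_luminance, pu_encode):
    """Compute PU-PSNR and PU-SSIM on PU-encoded HDR luminance maps."""
    ref_pu = pu_encode(ref_luminance)       # map cd/m^2 to PU values (hypothetical helper)
    test_pu = pu_encode(test_luminance)
    rng = ref_pu.max() - ref_pu.min()       # data range of the encoded signal
    psnr = peak_signal_noise_ratio(ref_pu, test_pu, data_range=rng)
    ssim = structural_similarity(ref_pu, test_pu, data_range=rng)
    return psnr, ssim
```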
One can notice that the prediction performance using the weights obtained from the training set is better than with the default weights as well as with the two modified LDR methods. We have also plotted in the same figure the 95% confidence intervals (using error bars) to provide an indication of the uncertainty in the measured values. As can be seen, the confidence intervals do not overlap, indicating statistically better performance with the trained weights. It was also found that the retrained weights lead to a larger improvement for HDR images without jeopardizing the prediction accuracy for the LDR case, and this improved the overall prediction performance.

[Fig. 1: Comparative prediction performance. (a) Cross-validation tests (exp 1), 900 test images; (b) new HDR dataset (exp 2), 50 test images; (c) plot of retrained and default weights. Error bars indicate 95% confidence intervals. The same bar colors are used for (a) and (b).]

Finally, to verify the consistency in
prediction performance. Finally, to verify the consistency in
prediction, we computed the number of outliers over 1000
iterations based on box plots (which are convenient tools
to visualize data variability and detect points outside the
quartiles). We found that for all the cases, the number of out-
liers was less than 1% of the total points, indicating good
consistency.
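For illustration, one common box-plot convention counts as outliers the points beyond 1.5 times the interquartile range from the quartiles; the exact whisker rule used in the paper is an assumption of this sketch:

```python
import numpy as np

def count_outliers(values, whisker=1.5):
    """Count values outside the box-plot whiskers (quartiles +/- whisker*IQR)."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    low, high = q1 - whisker * iqr, q3 + whisker * iqr
    return int(np.sum((values < low) | (values > high)))
```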
5 Validation on Independent Dataset
The results in the previous section revealed that the trained weights lead to better prediction accuracy with content-independent training and test sets. This section provides further evidence by using another independent HDR dataset reported in Ref. 9. Note that this dataset includes source content and a set of distortions that did not appear in any of the other datasets we used for calibration. To perform the experiment (exp 2), we first obtained the optimal w_f using all the datasets from the previous section. The resultant w_f (referred to as the new weights) was then used to predict the quality of the HDR images in the new dataset. The comparative results, along with those for PU-PSNR and PU-SSIM, are shown in Fig. 1(b), from which we can see that the new weights improved the prediction accuracy over the default weights as well as over the two modified LDR methods. Note that the statistical differences are not apparent because of the much smaller size of this dataset: 50 images versus the 900 used in Sec. 4.
Finally, we compare the retrained and default weights via the frequency versus weight plot shown in Fig. 1(c). The frequency is expressed in cycles per degree (cyc/deg), and the left and right bars at each cyc/deg indicate the default and retrained weights, respectively. We notice that the retrained weights reduce the importance of the low-frequency bands. However, the weights need not be related to the contrast sensitivity function, because the goal of pooling is to quantify quality (or annoyance level), which is not always at the level of visibility thresholds. Also note that the negative weights found in the original HDR-VDP-2 could cause the predicted quality to increase with a higher amount of distortion. Such behavior is valid only in very specific cases such as denoising and contrast enhancement (where visual quality may indeed be enhanced). Since such conditions are not included in any of the datasets we used, the retrained weights offer better physical interpretability: because all of them are positive, the predicted quality decreases with an increasing level of distortion.
6 Concluding Remarks
Visual quality assessment is a useful tool in many image and video processing applications. In addition, the recent interest of the multimedia signal processing community in HDR imaging has led to activities toward the development and standardization of HDR image and video processing tools (e.g., extension of the JPEG standard to support HDR image compression). In such scenarios, an objective quality prediction tool is needed to validate such tools from the viewpoint of visual quality, benchmarking with both SDR and HDR signals. In that context, the contribution of this letter can be summarized as follows.
We identified and addressed the specific issue of feature pooling in HDR-VDP-2, and thus proposed the extension HDR-VDP-2.2. Specifically, we computed the pooling weights via constrained optimization on a set of subjectively rated SDR and HDR images, so that the resultant metric is effective across a large luminance range of the visual signal. This represents a clear advantage over existing SDR metrics, which may not be directly applicable to HDR signals.
We verified the performance of the new weights by way of extensive cross-validation and also on an independent HDR dataset. In this way, the prediction performance of HDR-VDP-2 (both with the new and the default pooling weights) has been verified and benchmarked on a test bed with nearly 3000 HDR and SDR images.
With regard to the practical implications of the work reported in this letter, we note that the new version, HDR-VDP-2.2, is a more accurate objective visual quality estimator for both HDR and SDR conditions. Hence, it is expected to be useful in standardizing HDR and SDR visual signal processing tools with regard to their impact on visual quality, and it can also be employed as a standalone quality predictor. While no objective quality method can entirely replace subjective opinion, the proposed improved version HDR-VDP-2.2 can still be useful in certain scenarios and applications. A software implementation of HDR-VDP-2.2 is freely available for download at Ref. 10.
It should also be stressed that the retrained pooling weights are related to the characteristics of the perceptual noise introduced in different frequency bands by the processing considered in the datasets used. In the current work, we considered a standard processing chain (including compression, tone mapping, and inverse tone mapping). Hence, HDR-VDP-2.2 is expected to be more accurate for the current use cases of HDR deployment in the HDR delivery chain. For other applications, the pooling weights may need to be revisited, and possible profiles may be added to HDR-VDP-2.2.
Acknowledgments
This work was supported by COST Action IC1005 and the
NEVEx project FUI11 financed by the French government.
References
1. F. Banterle et al., Advanced High Dynamic Range Imaging: Theory and Practice, AK Peters (CRC Press), Natick, Massachusetts (2011). ISBN: 978-1-56881-719-4.
2. W. Lin and C. Kuo, "Perceptual visual quality metrics: a survey," J. Visual Commun. Image Represent. 22, 297–312 (2011).
3. R. Mantiuk et al., "HDR-VDP-2: a calibrated visual metric for visibility and quality predictions in all luminance conditions," ACM Trans. Graphics 30(4), 40 (2011).
4. M. Narwaria et al., "Tone mapping based high dynamic range image compression: study of optimization criterion and perceptual quality," Opt. Eng. 52(10), 102008 (2013).
5. M. Narwaria et al., "Impact of tone mapping in high dynamic range image compression," in Proc. Eighth Int. Workshop on Video Processing and Quality Metrics for Consumer Electronics (VPQM) (2014).
6. N. Ponomarenko et al., "Color image database for evaluation of image quality metrics," in Proc. Int. Workshop on Multimedia Signal Processing, Cairns, Queensland, pp. 403–408, IEEE (2008).
7. E. Larson and D. Chandler, "Categorical image quality (CSIQ) database," 2010, http://vision.okstate.edu/csiq (accessed September 2014).
8. T. Aydin et al., "Extending quality metrics to full luminance range images," Proc. SPIE 6806, 68060B (2008).
9. G. Valenzise et al., "Performance evaluation of objective quality metrics for HDR image compression," Proc. SPIE 9217, 92170C (2014).
10. http://hdrvdp.sf.net/.