
Impact of tone-mapping operators and viewing devices on visual quality of experience


University of Plymouth
PEARL https://pearl.plymouth.ac.uk
Faculty of Arts and Humanities School of Art, Design and Architecture
2016-07-12
Ifeachor, E
http://hdl.handle.net/10026.1/13252
10.1109/ICC.2016.7510690
2016 IEEE International Conference on Communications, ICC 2016

Impact of Tone-mapping Operators and Viewing
Devices on Visual Quality of Experience
Shaymaa Al-Juboori, Is-Haka Mkwawa, Lingfen Sun and Emmanuel Ifeachor
Centre for Signal Processing and Multimedia Communication
Plymouth University, UK
E-mail: {shaymaa.al-juboori, is-haka.mkwawa, l.sun, e.ifeachor}@plymouth.ac.uk
Abstract: The development of HDR imaging is seen as an
important step towards improving the visual quality of
experience (QoE) of the end user in many applications. In
practice, tone-mapping operators (TMOs) provide a useful
means for converting a high dynamic range (HDR) image to a
low dynamic range (LDR) image in order to achieve better
visualization on standard displays. Although mobile devices are
becoming popular, the techniques for displaying the content of
HDR images on the screens of such devices are still in the early
stages. While several studies have been conducted to evaluate
TMOs on conventional displays, few studies have been carried
out to evaluate TMOs on small screen displays, such as those
used in mobile devices. In this paper we evaluate, using
subjective and objective methods, the most popular tone-mapping
operators on different mobile displays and resolutions
under normal viewing conditions for the end user. Preliminary
results show that small screen displays (SSDs) have an impact on
the performance of TMOs compared to computer displays. In
general, the larger the mobile resolution, the better the subjective
results. We also found clear differences between the performance
of TMOs on SSDs and on LDR displays. The best TMO for mobile
displays is iCAM06 and for computer displays it is Photographic
Reproduction.
Keywords: HDR, Tone mapping operators, Subjective tests,
Objective tests, Small screen devices, Mobile devices, Low
dynamic range, Standard dynamic range, Quality of Experience.
I. INTRODUCTION
In recent years, we have witnessed widespread application
of High Dynamic Range (HDR) imaging due to its ability to
capture a wide range of luminance values, similar to that of the
human visual system (HVS). The application areas include
home-entertainment, security, scientific image, video
processing, computer graphics and multimedia
communications [1]. However, in practice the full HDR
content cannot be displayed on standard or low dynamic range
(LDR) displays, and this diminishes the benefits of HDR
technology to many users. To address this, Tone-Mapping
Operators (TMOs) are used to convert HDR images so that they
can be displayed on low-dynamic-range displays while preserving
as far as possible the perception of HDR [2].
A large number of different TMO algorithms have been
proposed in recent years, with varying degrees of success in
preserving the perceptual quality of HDR images. The need to
evaluate the performance of TMO algorithms to inform the
choice of algorithms for different displays and applications is
widely recognised [1]. A number of studies have been
undertaken to address this, but most of these were carried out
using large conventional displays such as those of TV sets and
PC monitors [1,2], and very few using small screen devices
such as mobile phones [3,4]. There is also no concrete
indication of which TMO performs best.
With advances in mobile wireless communication, the
popularity of mobile devices and mobile applications is
growing dramatically. It is predicted that by 2019, there will
be 8.2 billion handheld or personal mobile-ready devices and
3.2 billion machine-to-machine connections [5]. With the ability
and convenience to be used anywhere and at any time, smart
mobile devices have become the main means for receiving
multimedia content [3]. The need remains to understand how
TMO algorithms perform on small screen devices, such as
mobile phones. This need is heightened by the
existence of many different mobile devices and brands with
different resolutions, sizes and models.
It is unclear how current TMO algorithms perform on small
screen devices (SSDs), such as mobile phones and tablets, and
whether they can be used directly on SSDs, i.e. whether they are
SSD-friendly or, more specifically, mobile-friendly. The importance of this
issue has recently begun to be addressed [3,4]. However, only
two studies have been reported so far. Urbano et al. [3] carried
out the first subjective evaluation of seven TMO algorithms on
three different displays, including LDR displays and a mobile device, for
still images. They found that the TMOs perform significantly
differently on SSDs compared with LDR displays. However, only one
mobile device (with a screen size of 2.8") was tested. Melo et
al. [4] carried out a subjective evaluation of six different TMO
algorithms for video using three displays (HDR, LDR and
tablet) and did not find major differences between SSDs and
LDR displays. Their work was limited to video and the testing was
based on only one tablet. In both studies, the Quality of
Experience (QoE) of the end user was not taken into
consideration in the experiments.
QoE-driven multimedia systems have increasingly come
into focus in both research and industry. Capturing the end
user’s aesthetic expectations is the aim, rather than simply
delivering content based on a technology-centric approach.
HDR is one of the important new developments that provide
end users with an enhanced, realistic viewing experience, thus
improving the QoE [6].
QoE assessments are traditionally performed in laboratories
under controlled viewing conditions. However, the Web is now
considered as an important platform for uncontrolled QoE
assessments with large numbers of participants. It also helps to
create a realistic test environment, as the assessment is done
directly on the participants’ devices. However, it is not clear
whether different mobile devices have a differential impact on
the QoE of HDR images and, if so, how large that impact is
compared to conventional displays.
In this paper, we investigate the impact of different mobile
devices and resolutions in assessing QoE of tone-mapping
operators and address a number of major concerns regarding
TMOs, e.g.: Are the TMOs which were successful for
traditional displays also successful for SSDs? Do different
device sizes/resolutions affect the QoE?
The rest of the paper is organized as follows. Section II
briefly reviews the related work in evaluating TMOs and
Section III discusses the experimental framework. Section IV
presents the experimental results. Section V discusses the
objective quality metrics and their results. In Section VI, we
evaluate the performance indices between the four subjective tests
on the one hand and between subjective and objective tests on
the other. Conclusions and future work are given in Section
VII.
II. RELATED WORK
Error metrics and psychophysical experiments are the two
main methodologies for evaluating TMOs. Error metrics are
objective methods that compute quality indices by comparing
images [7]. The comparison can be made based on differences
in the physical quantities of the images or by attempting to
simulate the HVS in order to identify which aspects of the
image would be perceived by the HVS as being different.
Psychophysical experiments are subjective and based on
human participants. These experiments are conducted in
controlled environments and can make use of a number of
evaluation methods for comparing images such as rating,
pairwise comparison or ranking. Several psychophysical
experiments have previously been conducted. Čadík et al. [8]
adopted a direct-rating, Full Reference comparison of
tone-mapped images with real scenes, and a subjective ranking of
tone-mapped images without a real reference. They applied 14
methods to three typical real-world HDR scenes. More
recently, Salih [9] compared six tone-mapping operators using visual
rating of prints and of images on LDR display devices.
The study concluded that the Photographic Reproduction TMO is
the best in terms of visual preference. The work of Urbano et al. [3] was one
of the first studies aimed specifically at SSDs. They evaluated
several TMOs on displays of different sizes using a pairwise
comparison test of the processed images with a reference of
real scenes. Three different displays were used: two 17" displays and
one 2.8" display, with resolutions of 1024×682 and 240×320,
respectively. The authors concluded that the order of
preference for TMOs differed between the displays and
that, for mobile devices, content offering stronger detail
reproduction, more saturated colors and an overall brighter image
appearance was preferred.
Despite a large body of research devoted to the evaluation
of TMOs, there is no standard methodology for performing
such studies. The choice of method depends on the application
and what is relevant to the study. In this study, we employ both
Non-Reference (NR) and Full Reference (FR) methods, since in
many end-user viewing applications there is no need for
comparison with a “perfect” or “reference” image. In FR
image quality evaluation, the task is to determine the quality of
reproduction with reference to the original image, which has to
be available. In NR evaluation, the original is not available and
image quality features can be used instead [7].
III. EXPERIMENTAL FRAMEWORK
Two sets of subjective visual quality assessments were
conducted using the same dataset in generic environments. Sixty
observers were involved and the viewing conditions included
indoor and outdoor environments, with natural and artificial
light. Participants were free to look at the images on the
websites in whatever way they felt comfortable. Typically,
subjective quality assessment involves quality rating, and the
final result is expressed as a Mean Opinion Score (MOS), that
is, the average of the individual scores.
Fig. 1. Experimental setup (a) computer test (b) mobile test
Two experimental setups were designed for this study (see
Fig. 1). In the first experiment, a website [11] was designed and
accessed from the LDR displays of personal computers. The
instructions for the test were made available on the website.
We chose a discrete, five-level rating scale for ITU-R
quality ratings. This is more suitable for naïve observers
(non-experts in image processing) as it is easier for them to quantify
the quality from “bad” (1) to “excellent” (5) [15]. Gamma
correction of 2.2 was applied to the tone-mapped images as the
last step of the tone-mapping algorithms in order to
compensate for the non-linearity of display devices [1]. The
experiment has two tests, test 1 and test 2, which are FR and NR,
respectively. Two websites were created for each test to present all TM
images, and the MOS results were submitted to a database at
the end of each test (continuous test). Participants were asked
to read the instructions and then view 30 images (divided across two
websites, 15 images per website).
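The gamma-correction step described above can be sketched as follows. This is a minimal illustration, assuming the tone-mapped pixel values are linear and normalised to [0, 1]; the function names are hypothetical and are not taken from the paper or from any toolbox.

```python
def gamma_encode(value, gamma=2.2):
    """Gamma-encode one linear value, assumed to lie in [0, 1]."""
    # Clip first: tone-mapped output is assumed to be normalised to [0, 1].
    clipped = min(max(value, 0.0), 1.0)
    # Raising to 1/gamma compensates for the display's power-law response.
    return clipped ** (1.0 / gamma)


def to_8bit(value, gamma=2.2):
    """Quantise a gamma-encoded value to an 8-bit code for an LDR display."""
    return round(255 * gamma_encode(value, gamma))
```

With gamma = 2.2, mid-range linear values are brightened (a linear value of 0.5 is encoded above 0.5), while 0 and 1 are left unchanged.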
The website for the second experiment was designed to be
accessed from SSDs, i.e. smartphones and tablets [11], as
shown in Fig. 2. The instructions for the second experiment
were sent to participants in a recruitment email. The rating scale in
this case is an eleven-grade quality scale (10 = “no further
improvement is possible” and 0 = “a worse quality cannot be
imagined”) [15]. There were two tests in this experiment, FR
and NR. Each test has three websites for the TM images. For
each test, participants submitted their scores, individually, for
each image. “Next” and “Previous” buttons allow participants to
evaluate the next image or to review previous images. Participants
can also swipe the screen to move forwards and backwards. A
progress bar appears below the TM images as an indicator of
the percentage of progress so far (see Fig. 2).

Fig. 2. Mobile website
A. Participants
The total number of participants for the entire study was 60.
All of the participants were between 20 and 50 years old,
had normal or corrected vision, and were non-experts in HDR but
had a clear understanding of the test.
B. Devices
In the mobile experiment, five different devices were used by a total
of 30 participants, as shown in TABLE I. In
the computer experiment, a total of 30 participants also
took part; TM images were displayed on a Philips Brilliance
221P3LPYES, a 21.5-inch LED-backlit LCD panel display with
a native display resolution of 1920×1080.
TABLE I. DEVICES FOR MOBILE EXPERIMENT
Device                   No. of users   Display features                   Resolution (pixels)
iPhone 6                 9              4.7-inch Retina HD display         1334×750
iPhone 5S                7              4-inch Retina display              1136×640
Samsung Galaxy Note II   5              5.5-inch Super AMOLED display      1280×720
Samsung Galaxy S4        3              5-inch HD Super AMOLED display     1920×1080
iPad mini 3              6              7.9-inch IPS LCD                   2048×1536
C. Considered TMOs
In this study, we used ten well-known local and global
tone-mapping operators: Ashikhmin AL1, Ferwerda AL2, Adaptive
Logarithmic Mapping AL3, iCAM06 AL4, Fattal AL5,
Pattanaik AL6, Photographic Reproduction AL7, Tumblin
Rushmeier AL8, Ward AL9 and Bilateral Filtering AL10
[8,9,13,14].
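To give a flavour of what a global operator in this list does, the sketch below implements a simplified global form of the photographic operator (scaling by the log-average luminance, then compressing with L/(1+L)). It is an illustration only: it omits the local dodging-and-burning stage, the `key` value of 0.18 is an assumed default, and it is not the HDR Toolbox implementation used in the study.

```python
import math


def photographic_global(luminances, key=0.18, eps=1e-6):
    """Simplified global photographic tone mapping on a flat list of
    scene luminances; returns display luminances in [0, 1)."""
    # Log-average luminance characterises the overall brightness of the scene.
    log_avg = math.exp(
        sum(math.log(eps + lw) for lw in luminances) / len(luminances)
    )
    mapped = []
    for lw in luminances:
        scaled = key * lw / log_avg            # map the log-average to the chosen key
        mapped.append(scaled / (1.0 + scaled))  # compress high luminances towards 1
    return mapped
```

The compression is monotonic, so brighter scene luminances always map to brighter display values, and every output stays below 1 however large the input range is.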
D. Dataset
The dataset consists of three HDR images and 30 tone-mapped
images obtained from the ten tone-mapping algorithms
(computed using Banterle’s HDR Toolbox for MATLAB and the
iCAM06 source code, both freely available, with the
default operator settings as presented in the
respective papers) [13,14]. The images were selected for this
study based on their visual content, quality and the dynamic
range of the content. We used an existing HDR image
database. The indoor scene is Oxford Church, Author: Banterle,
Resolution: 840×886; the dynamic range of the image is about
$10^{0}$ : $10^{3}$ cd/m$^2$. The outdoor scene is Warwick, Author:
Banterle, Resolution: 1189×598; the dynamic range is about
$10^{-1}$ : $10^{1}$ cd/m$^2$. The indoor and outdoor scene is Office, Resolution:
1165×751; the dynamic range of the image is about
$10^{-2}$ : $10^{1}$ cd/m$^2$.
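The paper does not say how the dynamic ranges above were measured. One common way to estimate such a figure is sketched below, assuming linear RGB pixel values, Rec. 709 luminance weights, and a small floor to avoid division by zero; the function names and the floor value are hypothetical.

```python
import math


def luminance(r, g, b):
    """Relative luminance of a linear RGB pixel (Rec. 709 weights)."""
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def dynamic_range_log10(pixels, floor=1e-6):
    """Orders of magnitude between the brightest and darkest pixel."""
    # The floor keeps fully black pixels from producing a division by zero.
    lums = [max(luminance(*p), floor) for p in pixels]
    return math.log10(max(lums) / min(lums))
```

A range quoted as $10^{0}$ : $10^{3}$ corresponds to a value of about 3 from this function. In practice, percentiles are often used instead of the strict minimum and maximum to reduce sensitivity to noisy pixels.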
IV. EXPERIMENTAL RESULTS
The first step of the analysis of the results is the calculation
of the mean opinion score. The raw subjective scores were
converted into a corresponding MOS for each image with a
95% confidence interval. In each test, the quality score values
were converted to the range [1, 10] by mapping the lowest and
highest quality score values to 1 and 10, respectively.
Intermediate values were scaled proportionally.
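The MOS calculation and rescaling described above can be sketched as follows. This is an illustration under stated assumptions: the confidence interval uses the normal approximation (z = 1.96), and the "lowest and highest" values are taken to be the endpoints of each rating scale (1 and 5, or 0 and 10); the paper does not state either choice, and the function names are hypothetical.

```python
import math
import statistics


def mos_with_ci(scores, z=1.96):
    """Return (MOS, half-width of the approximate 95% confidence interval)."""
    mean = statistics.mean(scores)
    if len(scores) < 2:
        return mean, 0.0  # no spread can be estimated from a single score
    half_width = z * statistics.stdev(scores) / math.sqrt(len(scores))
    return mean, half_width


def rescale(score, low, high, new_low=1.0, new_high=10.0):
    """Linearly map a score from [low, high] to [new_low, new_high]."""
    return new_low + (score - low) * (new_high - new_low) / (high - low)
```

For example, `rescale(3, 1, 5)` and `rescale(5, 0, 10)` both give 5.5, so mid-scale ratings from the five-level computer test and the eleven-grade mobile test land on the same point of the common range. With only a few observers per device, a Student's t quantile would give a wider and more appropriate interval than z = 1.96.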
Fig. 3 (a) and (b) show the results of the mobile
experiments. In (a), iCAM06 AL4 and Bilateral Filtering AL10
had the best performance from the observers’ point of view,
with very good MOS values between 8.2 and 8.8 for the three
images. These two operators preserve details well compared to
the reference image. Adaptive Logarithmic Mapping AL3
obtained an MOS of less than 8, while the worst TMOs were
Pattanaik AL6, with an MOS of 1 for all images, and Ferwerda
AL2. In (b), the NR test, iCAM06 AL4 and
Bilateral Filtering AL10 again performed best, and Adaptive
Logarithmic Mapping AL3 and Ashikhmin AL1 obtained good
results, with MOS between 7 and 8. Pattanaik AL6 again had
the lowest MOS of 1.
The results of the computer experiment are illustrated in
Fig. 4. The FR test (a) shows the results for the three images:
Church, Warwick and Office. Photographic Reproduction AL7
had the best performance from the observers’ point of view,
with very good MOS values of around 9 for the three images,
while Adaptive Logarithmic Mapping AL3 and iCAM06 AL4
also performed well, with MOS between 8 and 9. The global
Drago TMO (Adaptive Logarithmic Mapping AL3) is based on
logarithmic compression of luminance. The strong performance of
the local operator of Reinhard (Photographic Reproduction AL7)
comes from applying the dodging-and-burning
technique, which provides an efficient way of compressing
the dynamic range while reducing halo artefacts [8]. Fewer halos
result in very good overall image quality. On the other hand,
Fattal AL5 and Pattanaik AL6 were the worst TMOs. The
reason behind the low performance of Pattanaik is that it
uses a multiscale decomposition of the image according to
comprehensive psychophysically derived filter banks.
However, it may still produce halos, which affect the overall
quality of the image [8]. In the NR test (b), the ranking of the TMOs
was the same as in the FR test, but with different MOS values
for both the best TMOs and the worst one. Fattal et al.
treat HDR images with a gradient attenuation method; their
method is very good at increasing local contrast without
creating halo artefacts [17].
Comparing the results of the
computer and mobile experiments in Fig. 3 and Fig. 4, in (a),
the FR test, we can see that in the computer test the results came close to each
other, with less variation between the MOS of subjects. The results
of Pattanaik AL6 in the mobile test had the lowest MOS
compared with the other TMOs for the three images (MOS = 1);
in the computer test it also had the lowest MOS, but with
an average of MOS = 2.5, which is significantly higher than
in the mobile results. In (b), the NR test, we can see that less variation in
the results appears for both tests. In the mobile test AL4 performed
better, while in the computer test AL7 had better MOS results.
Moreover, the results show that the difference between the
highest- and lowest-performing TMOs is very clear
between SSDs and LDR displays.
Fig. 3. MOS for Mobile experiment (a) FR test (b) NR test
Different mobile and tablet devices have different display
features (TABLE I). The devices used in this study
have screen sizes varying between 4 and 7.9 inches and
different screen resolutions. Fig. 5 shows the results of SSD
behavior in uncontrolled viewing conditions for (a) the FR test and (b) the NR
test. In both (a) and (b), the results suggest that screen resolution
and size are particularly important for higher MOS results.
The iPad mini 3 gave the most favorable results compared to the other
devices; the iPhone 6 behaved very well and the iPhone 5S came in
third place. The Samsung Galaxy Note II had the lowest results
of all device types. Overall, the results indicate
that SSD resolution has the primary effect;
moreover, there is no large difference in mobile device
performance in uncontrolled viewing conditions for HDR
image evaluation between the NR and FR tests.
V. OBJECTIVE QUALITY METRICS
Subjective rating may be a reliable evaluation method, but it
is expensive and time consuming and, more importantly, it is
difficult to embed into optimization frameworks. The
goal of objective image quality assessment research is to
provide quality metrics that can predict perceived image
quality automatically [10].
Fig. 4. MOS for Computer experiment (a) FR test (b) NR test.
Fig. 5. SSDs behavior in uncontrolled viewing conditions tests (a) FR (b) NR
As there is no established standard for evaluating HDR image
quality [6][7][10][18], we chose to use three error metrics:
Shannon Entropy (E), the Multi-Exposure Peak Signal-to-Noise
Ratio (mPSNR) and the visual difference predictor for HDR
images (HDR-VDP-2). Entropy is used to measure the salient
features of the image. Large entropy means that the fused
image contains more information and implies a better image
fusion [10]. The mPSNR metric is an extension of the peak
signal-to-noise ratio (PSNR) metric to the HDR domain.
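As an illustration of the first metric, Shannon entropy can be computed from the histogram of grey levels as E = -sum(p_i * log2(p_i)), where p_i is the fraction of pixels at level i. The sketch below assumes the image has already been converted to a flat sequence of quantised grey levels (for example 0 to 255); the function name is hypothetical and the paper's exact implementation is not given.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Entropy in bits of a flat sequence of quantised grey levels."""
    counts = Counter(pixels)  # histogram: grey level -> number of pixels
    total = len(pixels)
    # Sum -p * log2(p) over the grey levels that actually occur.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

A constant image has an entropy of 0 bits, while an 8-bit image that uses all 256 levels equally often reaches the maximum of 8 bits.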
