A No-Reference Metric for Perceived Ringing
Artifacts in Images
Hantao Liu, Student Member, IEEE, Nick Klomp, and Ingrid Heynderickx
Abstract—A novel no-reference metric that can automatically
quantify ringing annoyance in compressed images is presented.
In the first step a recently proposed ringing region detection
method extracts the regions which are likely to be impaired by
ringing artifacts. To quantify ringing annoyance in these detected
regions, the visibility of ringing artifacts is estimated, and is
compared to the activity of the corresponding local background.
The local annoyance score calculated for each individual ringing
region is averaged over all ringing regions to yield a ringing
annoyance score for the whole image. A psychovisual experiment
is carried out to measure ringing annoyance subjectively and to
validate the proposed metric. The performance of our metric is
compared to existing alternatives in the literature and is shown
to be highly consistent with subjective data.
Index Terms—Human vision model, image quality assessment,
objective metric, ringing artifact annoyance.
I. Introduction
Objective metrics aim to automatically provide a quantitative
measure for image quality aspects, and to eventually serve as a
computational alternative to expensive image quality assessments
by human observers. They are
of fundamental importance to a broad range of applications,
such as the optimization of digital imaging systems, bench-
marking of image and video coding, and quality monitoring
and control in displays [1]. They are generally classified into
full-reference (FR) metrics and no-reference (NR) metrics,
depending on the use of the original image or video. FR
metrics are based on measuring the similarity or fidelity
between the distorted image and its original version, which
is considered as a distortion-free reference. The most widely
used FR metrics are mean squared error and peak signal-to-
noise ratio. These metrics, however, have long been criticized
for their poor correlation with perceived image quality [1].
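For reference, both measures are simple to compute; a minimal sketch for 8-bit images (illustrative only, not part of any metric discussed here):

```python
import numpy as np

def mse(ref: np.ndarray, dist: np.ndarray) -> float:
    # Mean squared error between a reference and a distorted image.
    return float(np.mean((ref.astype(np.float64) - dist.astype(np.float64)) ** 2))

def psnr(ref: np.ndarray, dist: np.ndarray, peak: float = 255.0) -> float:
    # Peak signal-to-noise ratio in dB; peak = 255 for 8-bit images.
    m = mse(ref, dist)
    return float("inf") if m == 0.0 else 10.0 * np.log10(peak ** 2 / m)
```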
A lot of research effort has been devoted to the development of FR metrics that reflect the way human beings perceive image quality [2]. Improved alternatives of FR metrics include the structural similarity index [3] and the visual information fidelity index [4]. Since FR metrics require access to the original, which is not always available in real-world applications, they are usually employed as tools for in-lab testing of image and video processing algorithms. NR metrics, instead, are more practical because the quality prediction is based on the distorted image only. However, designing NR metrics is still an academic challenge, mainly due to the limited understanding of the human visual system (HVS).

Manuscript received March 12, 2009; revised June 23, 2009 and August 7, 2009. First version published November 3, 2009; current version published April 2, 2010. This paper was recommended by Associate Editor X. Li.
H. Liu and N. Klomp are with the Department of Mediamatics, Delft University of Technology, Delft 2628 CD, The Netherlands (e-mail: hantao.liu@tudelft.nl; n.c.r.klomp@tudelft.nl).
I. Heynderickx is with Philips Research Laboratories, Eindhoven 5656 AE, The Netherlands. She is also with Delft University of Technology, Delft 2628 CD, The Netherlands (e-mail: ingrid.heynderickx@philips.com).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TCSVT.2009.2035848
In recent decades, considerable progress has been made on
the development of NR metrics, and some successful methods
have been reported in the literature [5]–[19]. In [5], natural scene
statistics are used to blindly measure the quality of images
compressed by JPEG2000. The approach in [5] relies on the
assumption that typical natural images exhibit strong statistical
regularities, and therefore, reside in a tiny area of the space
containing all possible images. Based on this assumption it
quantifies image quality by detecting variations in statistical
image features in the wavelet domain. In [6] and [7], NR
image quality assessment is formulated as a machine learning
problem, in which the HVS is treated as a black box whose
input–output relationship, such as the one between image
characteristics and the quality rating, is to be learned. After
appropriate training with subjective data, these models proved
to be able to consistently predict the perceived quality of JPEG
compressed or otherwise distorted images.
A large number of NR metrics, proposed, e.g., in [8]–[19],
are based on directly measuring a specific type of artifact
created by a specific image distortion process, such as blur
caused by acquisition systems, sensor noise, and compression
artifacts. In such a scenario, the design of the NR metric can
make use of the specific characteristics of the artifact, and
therefore, generally predicts perceived quality degradation
more reliably [1]. Fortunately, in many practical applications,
the distortion processes involved are known, and thus,
the design of specific NR metrics turns out to be much more
realistic and useful. They can, for example, be combined to
predict the overall perceived quality. Various examples of this
approach are given in the literature. A blockiness metric (see, e.g.,
[8]–[11]) can be combined with a flatness metric (see, e.g.,
[12] and [13]) to evaluate the quality of images or video after
block-based compression. A ringing metric and a blur metric
are often combined to assess the image quality of wavelet-
based compression (see, e.g., [14]–[16]). In [17] and [18], mul-
tiple artifact metrics are adopted to predict the overall quality
of still images or video. In addition to assessing the overall
image quality, these specific artifact metrics individually are
beneficial for optimizing real-time digital imaging systems
[20]–[22]. In the video chain of current television (TV) sets,
various NR metrics, which quantify the quality of the incoming
video based on the occurrence of individual artifacts, are used
to adapt the parameter settings of the video enhancement algo-
rithms accordingly (see, e.g., [23] and [24]). To optimize the
performance of both applications mentioned above, reliably
modeling specific types of artifacts has clear added value.
With the widespread use of compression, research on NR
metrics has mainly been dedicated to compression artifacts and
transmission errors [25]. In particular, the blocking artifact,
which is one of the most annoying artifacts introduced by
block-based compression algorithms [26], such as JPEG or
MPEG/H.263, has received a lot of attention. Another compression artifact, especially
visible at relatively high-bit rates of block-based compression
[21], [26], but also in wavelet compression [27], is ringing.
Unlike the blocking artifact, whose spatial location is very
regular and thus easily predictable, the location of ringing is
edge dependent, and as such also image content dependent.
This makes the task of quantifying ringing annoyance much
more difficult. In this paper, we present our recent efforts to
develop a NR ringing metric, validate its performance using
a subjective study of ringing annoyance in JPEG compressed
images, and compare its performance against existing ringing
metrics. Before discussing our approach (Section III) and its
performance (Sections IV and V), a more extended explanation
of the occurrence and visibility of ringing, and an overview
of existing ringing metrics are given in Section II.
II. Background
A. Perceived Ringing Artifacts
1) Physical Structure: Current image and video coding
techniques are based on lossy data compression, which con-
tains an inherent irreversible information loss. This loss is
due to coarse quantization of the image’s representation in
the frequency domain. The loss within a certain spectral band
of the signal in the transform domain reveals itself most
prominently at those spatial locations where the contribution
from this spectral band to the overall signal power is significant
(see [26], [27], and [38]). Since the high-frequency compo-
nents play a significant role in the representation of an edge,
coarse quantization in this frequency range (i.e., truncation
of the high-frequency transform coefficients) consequently
results in apparent irregularities around edges in the spatial
domain, which are usually referred to as ringing artifacts.
More specifically, ringing artifacts manifest themselves in the
form of ripples or oscillations around high-contrast edges in
compressed images. They can range from imperceptible to
very annoying, depending on the data source, target bit rate, or
underlying compression scheme [38]. As an example, Fig. 1
illustrates ringing artifacts induced by JPEG compression on
a natural image.
Fig. 1. Illustration of ringing artifacts. (a) Natural image compressed with JPEG (MATLAB's imwrite function with "quality" of 30). (b) Gray-scale intensity profile along one row of the compressed image [indicated by the solid double arrowhead line in (a)]. Dashed lines "e1," "e2," and "e3" indicate the position of the sharp intensity transitions (i.e., edges) along that arrow. Ringing can be perceived as fluctuations in the gray-scale values around the edges at "e1," "e2," and "e3," while the image content here should be uniform.
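The underlying mechanism is easy to reproduce: truncating the high-frequency transform coefficients of a step edge reintroduces it with ripples. A minimal 1-D sketch, with plain truncation standing in for coarse quantization:

```python
import numpy as np
from scipy.fft import dct, idct

# A 1-D step edge, analogous to the sharp transitions marked in Fig. 1(b).
signal = np.concatenate([np.full(32, 50.0), np.full(32, 200.0)])

coeffs = dct(signal, norm="ortho")
coeffs[16:] = 0.0  # drop high-frequency coefficients (stand-in for coarse quantization)
reconstructed = idct(coeffs, norm="ortho")

# The reconstruction now oscillates around the step: ringing.
print(np.round(reconstructed[24:40], 1))
```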
The occurrence of ringing spreads out to a finite region
surrounding the edges, depending on the specific implementa-
tion of the coding technique. For example, in discrete cosine
transform (DCT) coding, ringing appears outwards from the
edge up to the encompassing block’s boundary [26]. An
example of how to calculate the extent of the ringing region
in a particular codec is given in [38]. In addition to the edge
location dependency, the behavior of ringing also depends on
the strength of the edges. It is found in [14], [29], and [38]
that, over a wide range of compression ratios, the variance
of the ringing artifacts is proportional to the contrast of the
associated edge. These important findings have great potential
in the design of a reliable ringing metric, and therefore, are
explicitly adopted in our algorithm.
2) Masking of the HVS: Taking into account the way
the HVS perceives artifacts, while removing perceptual re-
dundancies, can be greatly beneficial for matching objective
artifact measurement to the human perception of artifacts
[39]. Masking designates the reduction in the visibility of one
stimulus due to the simultaneous presence of another, and
it is strongest when both stimuli have the same or similar
frequency, orientation, and location [41]. It is basically due to
the limitations in sensitivity of a certain cell or neuron at the
retina in relation to the activity of its surrounding cells and
neurons. There are two fundamental visual masking effects
highly relevant to the perception of ringing artifacts [28]–[31].
The first one is luminance masking, which refers to the effect
that the visibility of a distortion (such as ringing) is maximum
for medium background intensity, and it is reduced when the
distortion occurs against a very low or very high intensity
background [40]. This masking phenomenon happens because
of the brightness sensitivity of the HVS, where the average
brightness of the surrounding background alters the visibility
threshold of a distortion [42]. The second masking effect
is texture masking, which refers to the observation that a
distortion (such as ringing) is more visible in homogeneous
areas than in textured or detailed areas [40]. In textured
image regions, small variations in the texture are masked by
the macro properties of genuine high-frequency details, and
therefore, are not perceived by the HVS [38]. The effect
of luminance and texture masking on ringing artifacts is
illustrated in Figs. 2 and 3, respectively.
Fig. 2. Example of luminance masking on ringing artifacts. (a) Image patch compressed with JPEG (MATLAB's imwrite function with "quality" of 30). (b) Pixel intensity profile along one row of the compressed image patch [indicated by the solid double arrowhead line in (a)]. The original image includes two adjacent parts with different gray-scale levels (i.e., 5 for "a1" and 127 for "a2"). Note that although both sides of a step edge exhibit ringing artifacts, the visibility of ringing differs.
Fig. 3. Example of texture masking on ringing artifacts. (a) Image patch extracted from a JPEG compressed image of bit rate 0.59 bits per pixel (b/p). (b) Pixel intensity profile along one row of the compressed image patch [indicated by the solid double arrowhead line in (a)]. Dashed line "e" indicates the object boundary edge. Note that although both sides of the edge at "e" exhibit ringing artifacts, the visibility of ringing differs.
B. Existing Ringing Metrics
Until recently, only a limited amount of research effort was
devoted to the development of a ringing metric. Some of these
metrics are FR, others NR. An FR approach presented in [14]
starts by finding important edges in the original image (noise
and insignificant edges are removed by applying a threshold to
the Sobel gradient image), and then measures ringing around
each edge by calculating the difference between the processed
image and the reference. Since this metric needs the original
image, it has its limitations, e.g., for the application in a TV
chain. The NR ringing metric proposed in [17] performs an
anisotropic diffusion on the image and measures the noise
spectrum filtered out by the anisotropic diffusion process. The
basic idea behind this metric is that due to the effectiveness
of anisotropic diffusion on deringing, the artifacts would be
mostly assimilated into the spectrum of the filtered noise. The
NR ringing metric described in [16] identifies the ringing
regions around strong edges in the compressed image, and
defines ringing as the ratio of the activity in middle-low
over middle-high frequencies in these ringing regions. An
obvious shortcoming of the metrics defined in [14], [16], and
[17] is the absence of masking, typically occurring in the
HVS, with the consequence that these metrics do not always
reflect perceived ringing. Typical masking characteristics, such
as luminance and texture masking, are explicitly considered
in the metrics defined in [28] and [29], in which ringing
regions are no longer simply assumed to surround all strong
edges in an image, but are determined by a model of the
HVS. Including an HVS model in an objective metric might
improve its accuracy, but often is computationally intensive
for real-time applications. For example, the HVS model used
in the metric presented in [28] largely depends on a parameter
estimation procedure, which requires a number of calculations
to achieve an optimal selection. The model described in [29] is
based on a computationally heavy clustering scheme, including
both color clustering and texture clustering. From a practical
point of view, it is highly desirable to reduce the complexity
of the HVS-based metric without compromising its overall
performance.
The essential idea behind most of the existing metrics
mentioned so far (see, e.g., in [14], [16], and [28]) is that
they consist of a two-step approach. The first step identifies
the spatial location, where perceived ringing occurs, and the
second step quantifies the visibility or annoyance of ringing
in the detected regions. This approach intrinsically avoids the
estimation of ringing in irrelevant regions in an image, thus
making the quantification of ringing annoyance more reliable,
and the calculation more efficient. Additionally, a local de-
termination of the artifact metric provides a spatially varying
quality degradation profile within an image, which is useful
in, e.g., video chain optimization as mentioned in Section I.
Since ringing occurs near sharp edges, where it is not visually
masked by local texture or luminance, the detection of ringing
regions largely relies on an edge detection method followed by
an HVS model. Existing methods (see, e.g., [14], [16], [28],
and [29]) usually employ an ordinary edge detector, where a
threshold is applied to the gradient image to capture strong
edges. Depending on the choice of the threshold, this runs
the risk of omitting obvious ringing regions near nondetected
edges (e.g., in case of a high threshold) or of increasing
the computational cost by modeling the rather complex HVS
near irrelevant edges (e.g., in case of a low threshold). This
implies that to ensure a reliable detection of perceived ringing
while maintaining low complexity for real-time applications,
an efficient approach for both detecting relevant edges and
modeling the HVS is needed. Quantification of the annoyance
of ringing in the detected areas can be easily achieved by
calculating the signal difference between the ringing regions
and their corresponding reference, as used in the FR approach
described in [14]. However, for a NR ringing metric, the
quantification of ringing becomes more challenging mainly
due to the lack of a reference. Metrics in the literature (such as
in [16] and [28]) estimate the visibility of ringing artifacts
from the local variance in intensity around each pixel within
the detected ringing regions, and average these local variances
over all ringing regions to obtain an overall annoyance score.
This approach, however, has limited reliability, since it does
not include background texture in the ringing regions, which
might affect ringing visibility.
To validate the performance of a ringing metric, its predicted
quality degradation should be evaluated against subjectively
perceived image quality. To prove whether a ringing metric is
robust against different compression levels and different image
content, the correlation between its objective predictions and
subjective ringing ratings must be calculated. Unfortunately,
only the performance of the metric reported in [14] is evaluated
against subjective data of perceived ringing. For all other
metrics (such as the ones in [15]–[17] and [28]) nothing can
be concluded with respect to their performance in predicting
perceived ringing. Since we had no access to the data used
in [14] for our metric evaluation, we performed our own
subjective experiment.¹
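Once per-image subjective ratings are available, such a robustness check reduces to computing correlation coefficients; a sketch with made-up numbers (not data from the experiment reported here):

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

# Hypothetical values: one metric prediction and one mean subjective
# ringing-annoyance rating per test image.
objective = np.array([0.12, 0.35, 0.48, 0.55, 0.71, 0.90])
subjective = np.array([1.1, 2.4, 3.0, 3.2, 4.1, 4.6])

plcc, _ = pearsonr(objective, subjective)    # linear correlation
srocc, _ = spearmanr(objective, subjective)  # rank-order correlation
print(f"PLCC = {plcc:.3f}, SROCC = {srocc:.3f}")
```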
In this paper, we propose a NR ringing metric based on
the same two-step approach mentioned above. For the first
step, we rely on our ringing region detection method (see [30]
and [31]), the performance of which in terms of extracting
regions with perceived ringing has been shown to be promising
[31]. Therefore, we consider this part of the metric readily
applicable for the second step, in which the ringing annoyance
is quantified. To quantify ringing annoyance, we consider each
detected ringing region as a perceptual element, in which the
local visibility of ringing artifacts is estimated. The contrast
in activity between each ringing region and its corresponding
background is calculated as the local annoyance score, which
is then averaged over all ringing regions to yield an overall
ringing annoyance score. It should be noted that the proposed
metric is built upon the luminance component of images only
in order to reduce the computational load. The performance
of the NR metric is evaluated against subjective ringing
annoyance in JPEG compression.
III. Proposed NR Ringing Metric
A. Perceived Ringing Region Detection
For the design of our ringing region detection method (see
[30] and [31]), we explicitly exploited the specific physical
structure of ringing artifacts and some properties of the
HVS. The overall proposed algorithm is schematically shown
in Fig. 4, which mainly consists of two processing steps:
1) extraction of edges relevant for ringing, which results in
a perceptual edge map (PEM), and 2) detection of perceived
ringing regions, which yields a computational ringing region
(CRR) map. This method is already described in more detail
in [30] and [31], and is only briefly repeated here.
Fig. 4. Schematic overview of the proposed ringing region detection method. In the PEM, each perceptually relevant LS is labeled in a different color. In the CRR map, the white areas indicate the detected perceived ringing regions, and the spatial location of these regions is illustrated in a separate image by green areas.
¹The data collected from this experiment are available to the image quality assessment community on the website http://mmi.tudelft.nl/ingrid/ringing.html.
To extract the most relevant edges for the purpose of ringing
detection, an advanced edge detector is used. It adopts a
bilateral filter [32] to largely smooth “irrelevant edges” (i.e.,
in textured areas), while the position of the “relevant edges”
(e.g., contours of objects) is retained. Subsequently, a Canny
edge detector [33] is applied to the filtered image to obtain the
"relevant edges." The detected edges are combined into line
segments [hereafter referred to as line segments (LSs)], which
are defined as elements of connected edge pixels. These LSs
are constructed over the Canny edge map by a simple grouping
process, including skeletonizing, edge linking, noise removal,
and LS labeling. Fig. 4 shows the extracted PEM, which is
formed by a set of these LSs. It clearly illustrates the selection
of the edges more relevant for ringing (i.e., the contours of the
leopard) in combination with the avoidance of the irrelevant
edges (i.e., the texture in the skin of the leopard).
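A minimal sketch of this PEM extraction with OpenCV, assuming illustrative filter and threshold parameters rather than the values tuned in [30] and [31]:

```python
import cv2

gray = cv2.imread("leopard.jpg", cv2.IMREAD_GRAYSCALE)  # hypothetical input file

# Bilateral filter: smooths texture ("irrelevant edges") while the position
# of object contours ("relevant edges") is retained.
smoothed = cv2.bilateralFilter(gray, d=9, sigmaColor=50, sigmaSpace=5)

# Canny on the filtered image keeps mainly the ringing-relevant edges.
edges = cv2.Canny(smoothed, threshold1=50, threshold2=150)

# Crude stand-in for the LS grouping step: label connected edge pixels
# (the actual method also skeletonizes, links edges, and removes noise).
num_ls, ls_labels = cv2.connectedComponents(edges)
```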
To select the edges around which ringing is actually perceived,
each LS of the PEM is examined individually for the occurrence
of perceived ringing. To this end, the region around
a LS is divided into three zones: 1) the edge region (i.e.,
EdReg); 2) the detection region (i.e., DeReg); and 3) the
feature extraction region (i.e., FeXReg). First, the level of
texture or detail is estimated from the FeXReg, and those parts
of the DeReg, in which the visibility of ringing is masked by
texture, are discarded. Subsequently, the average luminance in
each remaining part of the DeReg is calculated and those parts
with a value above or below a certain threshold are discarded.
In this way, only those regions around each LS, in which
ringing is visible, are extracted, and then accumulated in the
CRR map as illustrated in Fig. 4.
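A sketch of this visibility test for a single part of a DeReg; the threshold values below are placeholders, not the calibrated ones from [30] and [31]:

```python
import numpy as np

TEXTURE_THR = 30.0            # assumed activity level above which texture masks ringing
LUM_LOW, LUM_HIGH = 20, 235   # assumed bounds: very dark/bright backgrounds mask ringing

def ringing_visible(dereg_part: np.ndarray, fexreg_part: np.ndarray) -> bool:
    # Texture masking: discard the part if its local background is textured.
    if np.var(fexreg_part) > TEXTURE_THR:
        return False
    # Luminance masking: discard the part if its mean luminance is extreme.
    mean_lum = float(dereg_part.mean())
    return LUM_LOW <= mean_lum <= LUM_HIGH
```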
B. Ringing Annoyance Estimation
The CRR map indicates the spatial location of perceived
ringing, but it does not give any information yet on how
annoying the ringing artifacts in the detected region are. To
quantify ringing annoyance, we first split up the detected
region in the CRR map into so-called ringing objects (ROs).
Fig. 5 illustrates the definition of an RO. It starts from the
LSs of the PEM, shown in Fig. 4. Each LS is considered
to be split up into a set of connected components (i.e., objects)
depending on the local level of texture and averaged luminance
in its DeReg (as defined in [30] and [31]). Then, by using the
model of the HVS, the visibility of ringing in each object
is determined. By removing the objects, in which ringing is
invisible due to masking, the remaining objects are defined as
ROs. As an example, illustrated in Fig. 5(b), LS1 of the
PEM in Fig. 4 is split up into two ROs, while LS2 remains
one RO. Some of the LSs, e.g., LS5, LS6, LS8, and LS9,
do not result in an RO, since no visible ringing is detected
around them based on the HVS. So, each RO intrinsically
is a single cluster resulting from the application of the human
vision model to the LSs of the PEM. Hence, the definition of
an RO fully relies on the local image content, and as such, is
independent of scaling or cropping the image. Once the ROs
are defined [as illustrated in Fig. 5(c)], a ringing annoyance
score (RAS) is calculated for each of them, and the overall
annoyance score for the image is simply the mean of the RAS
over all ROs.
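The final pooling step is then a simple average; a sketch, where `ringing_objects` and `compute_ras` are placeholder names for the RO list of the CRR map and the local scoring developed in the following subsections:

```python
import numpy as np

def overall_ringing_annoyance(ringing_objects, compute_ras) -> float:
    # Image-level score: the mean of the local RAS values over all ROs.
    scores = [compute_ras(ro) for ro in ringing_objects]
    # Returning 0.0 for an image without detected ROs is an assumption here.
    return float(np.mean(scores)) if scores else 0.0
```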
Fig. 5. Illustration of the definition of an RO. (a) Original JPEG image and
two (out of ten) of its detected LSs (i.e., LS1 and LS2 of the PEM in Fig. 4).
(b) Implementation of the human vision model to LS1 and LS2, resulting in
two separate ROs for LS1 and one RO for LS2. (c) All detected ROs as a
result of applying the human vision model to the whole PEM (i.e., ten LSs);
they are indicated with different colors.
Fig. 6. Illustration of region assignment. (a) RO [see “RO3” in Fig. 5(b)]
with its corresponding LS and feature extraction region (FeXReg).
(b) Corresponding edge of LS covered by the dilated RO is assigned as the
Sub-LS. (c) Corresponding region of FeXReg covered by the dilated RO is
assigned as the Sub-FeXReg. (d) Results of region assignment.
The approach taken to quantify perceived ringing is inspired
by the basic idea used in the FR metric [14], and is accom-
plished by the following two steps: 1) calculating the activity
of each RO; and 2) comparing that activity to the activity in
the neighboring background to which the RO belongs.
1) Region Assignment: To implement the two steps men-
tioned above, we first assign two relevant components to each
RO in the CRR map: 1) the edge corresponding to each
LS (i.e., referred to as Sub-LS), which is used to determine
whether a pixel in the RO is a visible ringing pixel, and
2) the corresponding FeXReg region (i.e., referred to as Sub-
FeXReg), which is employed as the reference for the RO. The
FeXReg is located far away from the LS, and thus unlikely
to be impaired by ringing artifacts. This region assignment is
implemented by thickening an RO with a dilation operation.
The corresponding LS and FeXReg which are covered by
the RO during the dilation process are referred to as the
Sub-LS and Sub-FeXReg, respectively. Fig. 6 illustrates this
procedure. A specific RO (i.e., “RO3” in the CRR map of
Fig. 5) with its corresponding LS and FeXReg are shown
in Fig. 6(a). When dilating the RO with a square structuring
element of 5 pixels width (e.g., for an image of 256 × 384
(height × width) pixels), the region of LS which is covered by
the expanded RO is assigned as the Sub-LS [i.e., the yellow
region in Fig. 6(b)]. The Sub-FeXReg [i.e., the purple region
in Fig. 6(c)] is assigned in the same way by dilating the
RO with a square structuring element of 9 pixels width. The
resulting Sub-LS and Sub-FeXReg are shown in Fig. 6(d). It
is noted that the size of the structuring element should be
linearly scaled with the image size. The region assignment
mentioned above is performed for each RO in the CRR map
to eventually obtain a list of coordinates, which indicates the
spatial location of each individual RO and its corresponding
Sub-LS and Sub-FeXReg. Fig. 7 indicates the format of such a
resulting list of coordinates. This way of working intrinsically
facilitates the subsequent local analysis and processing of
image characteristics.
Fig. 7. Illustration of the list of coordinates as the result of region assignment (the total number of ROs in the CRR map is n).
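A sketch of this assignment with binary masks and morphological dilation, using the 5- and 9-pixel square structuring elements quoted above for a 256 × 384 image:

```python
import numpy as np
from scipy.ndimage import binary_dilation

def assign_regions(ro_mask, ls_mask, fexreg_mask, ls_width=5, fex_width=9):
    # All inputs are boolean masks. Thicken the RO and intersect with the
    # LS map: pixels of the LS covered by the dilated RO form the Sub-LS.
    sub_ls = ls_mask & binary_dilation(ro_mask, np.ones((ls_width, ls_width), bool))
    # The same procedure with a wider structuring element yields the Sub-FeXReg.
    sub_fexreg = fexreg_mask & binary_dilation(ro_mask, np.ones((fex_width, fex_width), bool))
    # Both widths should be scaled linearly with the image size.
    return sub_ls, sub_fexreg
```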
2) Local Visibility of Ringing Pixels: Since ringing man-
ifests itself in the form of artificial oscillations in the spatial
domain, its local behavior can be reasonably described as the
intensity variance of pixels in the neighborhood [28], [29]. In
this paper, determining whether a pixel in an RO is a visible
ringing pixel is based on calculating the local variance (LV)
in intensity in its 3 × 3 neighborhood, which is formulated as
$$\mathrm{LV}(i,j)=\frac{1}{9}\sum_{k=i-1}^{i+1}\sum_{l=j-1}^{j+1}\left[I(k,l)-\frac{1}{9}\sum_{k=i-1}^{i+1}\sum_{l=j-1}^{j+1}I(k,l)\right]^{2},\quad (i,j)\in\mathrm{Coord}\{\mathrm{RO}_{n}\}\tag{1}$$

where $\mathrm{LV}(i,j)$ denotes the local variance computed over a 3 × 3 template, centered at pixel $(i,j)$ having an intensity $I(i,j)$, within the $n$th ringing object (i.e., $\mathrm{RO}_{n}$).
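Equation (1) is a standard 3 × 3 local variance and can be evaluated for the whole image at once before reading out the RO coordinates; a sketch:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def local_variance(intensity: np.ndarray, size: int = 3) -> np.ndarray:
    # Var = E[I^2] - E[I]^2 over a size x size window, equivalent to (1).
    img = intensity.astype(np.float64)
    mean = uniform_filter(img, size)
    mean_sq = uniform_filter(img * img, size)
    return np.maximum(mean_sq - mean ** 2, 0.0)  # clip tiny negative round-off

# Read out only at the coordinates of the nth RO, e.g.:
# lv_values = local_variance(image)[ro_rows, ro_cols]
```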
The LV only yields an accurate result in case the RO is
originally smooth around the edge; otherwise, the LV can be
high due to the activity of a textured or edge pixel.
One would expect that the issue of considering texture as
ringing is efficiently avoided by the application of a texture
masking model in the ringing region detection phase (see
[30] and [31]). However, we observed that the dilation
operation used in the human vision model may misclassify
certain edge or texture components into an RO. In addition,
there might be pixels in the RO exhibiting no or a very small
intensity variance in their neighborhoods, which means they
are not impaired by ringing artifacts (e.g., in higher bit rate
compression). This implies that an RO still possibly contains
spurious ringing pixels, which manifest themselves either as
“noisy pixels” (i.e., misclassified edge or texture pixels) or
as “unimpaired pixels” (i.e., pixels with a very low variance
in intensity in the neighborhood). Fig. 8 gives an example
of the image content underneath a detected RO (i.e., “RO2”
as illustrated in Fig. 5), where noisy pixels and unimpaired
pixels coexist with real ringing pixels. Calculating the LV
References

Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, "Image quality assessment: From error visibility to structural similarity," IEEE Trans. Image Process., vol. 13, no. 4, pp. 600–612, Apr. 2004.

J. Canny, "A computational approach to edge detection," IEEE Trans. Pattern Anal. Mach. Intell., vol. PAMI-8, no. 6, pp. 679–698, Nov. 1986.

C. Tomasi and R. Manduchi, "Bilateral filtering for gray and color images," in Proc. IEEE Int. Conf. Computer Vision (ICCV), 1998, pp. 839–846.

H. R. Sheikh and A. C. Bovik, "Image information and visual quality," IEEE Trans. Image Process., vol. 15, no. 2, pp. 430–444, Feb. 2006.

D. S. Taubman and M. W. Marcellin, JPEG2000: Image Compression Fundamentals, Standards and Practice. Boston, MA: Kluwer, 2002.