Q: What future work is proposed in "Characterizing perceptual artifacts in compressed video streams"?

The current work also lays out a work plan for future studies. First, objective VQA methods need to be reexamined and further developed to detect each compression artifact reliably and efficiently. Second, video encoders may be designed to eliminate or minimize the impact of these perceptual artifacts.

(Open Access) Characterizing perceptual artifacts in compressed video streams (2014) | Kai Zeng

Q: What are the contributions mentioned in the paper "Characterizing perceptual artifacts in compressed video streams"?

In this paper, the authors reexamine the perceptual artifacts created by standard video compression, summarizing commonly observed spatial and temporal perceptual distortions in compressed video, with emphasis on the perceptual temporal artifacts that have not been well identified or accounted for in previous studies. Furthermore, a floating effect detection method is proposed that not only detects the existence of floating, but also segments the spatial regions where floating occurs.

Characterizing Perceptual Artifacts in Compressed Video Streams

Kai Zeng, Tiesong Zhao, Abdul Rehman and Zhou Wang

Dept. of Electrical & Computer Engineering, University of Waterloo, Waterloo, ON, Canada

ABSTRACT

To achieve optimal video quality under bandwidth and power constraints, modern video coding techniques employ lossy coding schemes, which often create compression artifacts that may lead to degradation of perceptual video quality. Understanding and quantifying such perceptual artifacts play important roles in the development of effective video compression, streaming and quality enhancement systems. Moreover, the characteristics of compression artifacts evolve over time due to the continuous adoption of novel coding structures and strategies during the development of new video compression standards. In this paper, we reexamine the perceptual artifacts created by standard video compression, summarizing commonly observed spatial and temporal perceptual distortions in compressed video, with emphasis on the perceptual temporal artifacts that have not been well identified or accounted for in previous studies. Furthermore, a floating effect detection method is proposed that not only detects the existence of floating, but also segments the spatial regions where floating occurs.∗

Keywords: video compression, video quality assessment, compression artifact, H.264-MPEG4/AVC, HEVC, flickering, floating detection

1. INTRODUCTION

The demand for high-performance network video communications has been increasing exponentially in recent years. According to the Cisco Visual Networking Index, the sum of all forms of video (TV, VoD, Internet, and P2P) will constitute approximately 90 percent of global consumer traffic by 2015. A high-performance video compression technology is critical for current industrial video communication systems to keep up with such increasing demand. A fundamental issue in the design of video compression systems is to achieve an optimal compromise between the availability of resources (i.e., bandwidth, power, and time) and the perceptual quality of the compressed video. The constraint in available resources often leads to degradations of perceptual quality by introducing compression artifacts in the decoded video. For example, a large quantization step could reduce power consumption, encoding time, and the bandwidth needed to encode the video but, unfortunately, also results in video quality degradation.

Consumers’ expectations for better Quality-of-Experience (QoE) are nowadays higher than ever before. Despite the fast technological development in telecommunication and display devices, poor video quality originating from compression and streaming processes has disappointed a large volume of consumers, resulting in major revenue loss in the digital media communication industry. Based on a recent viewer experience study, “In 2012, global premium content brands lost $2.16 billion of revenue due to poor quality video streams and are expected to miss out on an astounding $20 billion through 2017”. Poor video quality keeps challenging the viewers’ patience and has become a core threat to the video service ecosystem. According to the same study, roughly 60% of all video streams experienced quality degradation in 2012. In another recent study, 90.4% of interviewees reported end-user video quality monitoring as either “critical”, “very important”, or “important” to their video initiatives, and almost half of the customer phone calls are related to video quality problems in Video-on-Demand (VOD) services and HDTV. Additionally, even though 58.1% of the interviewed subjects reported that end-user QoE is “critical” and needs to be monitored, only 31% said they use network monitoring tools to discover quality problems. Therefore, there is an urgent need for effective and efficient objective video quality assessment (VQA) tools in current media network communication systems that can provide reliable quality measurement of end users’ visual QoE.

Further author information: (Send correspondence to Kai Zeng)

Kai Zeng: E-mail: kzeng@uwaterloo.ca, Telephone: 1 519 888 4567 ext. 31449

Tiesong Zhao: E-mail: ztiesong@uwaterloo.ca, Telephone: 1 519 888 4567 ext. 31448

Abdul Rehman: E-mail: abdul.rehman@uwaterloo.ca, Telephone: 1 519 888 4567 ext. 31449

Zhou Wang: E-mail: zhou.wang@uwaterloo.ca, Telephone: 1 519 888 4567 ext. 35301

∗ Image and video examples that demonstrate various types of spatial and temporal compression artifacts are available at https://ece.uwaterloo.ca/~z70wang/research/compression_artifacts/.

Presented at: IS&T/SPIE Annual Symposium on Electronic Imaging, San Francisco, CA, Feb. 2-6, 2014

Published in: Human Vision and Electronic Imaging XIX, Proc. SPIE, vol. 9014. @SPIE

Since compression is a major source of video quality degradation, we focus on perceptual artifacts generated by standard video compression techniques in the current work. Various types of artifacts created by standard compression schemes have been summarized previously. Objective VQA techniques have also been designed to automatically evaluate the perceptual quality of compressed video streams. However, recent studies suggest that widely recognized VQA models (though promising) achieve only limited success in predicting the perceptual coding gain between state-of-the-art video coding techniques, and problems often occur when specific temporal artifacts appear in the compressed video streams. This is likely due to the adoption of novel coding structures and strategies in the latest video compression standards such as H.264/AVC and High Efficiency Video Coding (HEVC). This motivates us to reexamine the perceptual artifacts created by video compression, with emphasis on the perceptual temporal artifacts that have not been well identified or accounted for in previous studies.

In this paper, we first attempt to elaborate on the various spatial and temporal artifacts originating from standard video compression. These include both conventional artifacts and those that have emerged recently in the new coding standards, such as various flickering and floating effects. Examples are provided to demonstrate the artifacts in different categories. Possible reasons and consequences of these artifacts, together with their perceptual impact, are discussed in the context of compression. Finally, an objective floating artifact detection scheme is proposed, which not only detects the existence of floating, but also indicates the location of floating regions in each video frame.

2. PERCEPTUAL ARTIFACTS IN COMPRESSED VIDEO

A diagram that summarizes various types of compression artifacts is given in Fig. 1. Both spatial and temporal artifacts may exist in compressed video, where spatial artifacts refer to the distortions that can be observed in individual frames while temporal artifacts can only be seen during video playback. Both spatial and temporal artifacts can be further divided into categories and subcategories of more specific distortion types. A detailed description of the appearance and causes of each type of perceptual compression artifact will be given in the following sections. In addition to these artifacts, there are a number of other perceptual video artifacts that are often seen in real-world visual communication applications. These include artifacts generated during video acquisition (e.g., camera noise, camera motion blur, and line/frame jittering), during video transmission in error-prone networks (e.g., video freezing, jittering, and erroneously decoded blocks caused by packet loss and delay), and during video post-processing and display (e.g., post deblocking and noise filtering, spatial scaling, retargeting, chromatic aberration, and pincushion distortion). Since compression is not the main cause of these artifacts, they are beyond the major focus of the current paper.

2.1 Spatial Artifacts

Block-based video coding schemes create various spatial artifacts due to block partitioning and quantization. These artifacts include blurring, blocking, ringing, basis pattern effect, and color bleeding. They are detected without reference to temporally neighboring frames, and thus can be better identified when the video is paused. Due to the complexity of modern compression techniques, these artifacts are interrelated with each other, and the classification here is mainly based on their visual appearance.

2.1.1 Blurring

[Diagram: compression artifacts divide into spatial artifacts (blurring; blocking, comprising mosaicing effect, staircase effect, and false edge; ringing; basis pattern effect; color bleeding) and temporal artifacts (flickering, comprising mosquito noise, fine-granularity flickering, and coarse-granularity flickering; floating, comprising texture floating and edge neighborhood floating; jerkiness).]

Figure 1. Categorization of perceptual artifacts created by video compression

All modern video compression methods involve a frequency transform step followed by a quantization process that often removes small amplitude transform coefficients. Since the energy of natural visual signals concentrates at low frequencies, quantization reduces high frequency energy in such signals, resulting in a significant blurring effect in the reconstructed signals. Perceptually, blurring typically manifests itself as a loss of spatial details or sharpness at edges or texture regions in the image. Since in block-based coding schemes, frequency transformation and quantization are usually conducted within individual image blocks, blurring caused by such processes is often created inside the blocks.
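The loss of high-frequency energy described above can be sketched on a single one-dimensional block. This is a minimal illustration, not any codec's actual transform path: the 8-sample block (a ramp plus a fine alternating texture), the uniform quantization step of 24, and the helper names are all assumptions chosen for the example.

```python
import math

def dct(x):
    """Orthonormal 1-D DCT-II of a list of samples."""
    n = len(x)
    return [
        (math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n))
        * sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        for k in range(n)
    ]

def idct(c):
    """Inverse of dct() (orthonormal DCT-III)."""
    n = len(c)
    return [
        sum(
            (math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n))
            * c[k] * math.cos(math.pi * (i + 0.5) * k / n)
            for k in range(n)
        )
        for i in range(n)
    ]

def quantize(coeffs, step):
    """Uniform quantization: small coefficients round to zero."""
    return [round(c / step) * step for c in coeffs]

def high_freq_energy(coeffs):
    """Energy in the upper half of the coefficient indices."""
    half = len(coeffs) // 2
    return sum(c * c for c in coeffs[half:])

# Hypothetical block: smooth ramp plus a fine alternating texture.
block = [100 + 4 * i + 3 * (-1) ** i for i in range(8)]
coeffs = dct(block)
quantized = quantize(coeffs, step=24)   # step size is an assumed value
reconstructed = idct(quantized)

print("high-frequency energy before:", round(high_freq_energy(coeffs), 2))
print("high-frequency energy after: ", round(high_freq_energy(quantized), 2))
```

The fine texture lives almost entirely in small high-index coefficients, so it is the first thing a coarse quantizer discards, which is the detail loss perceived as blur.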

Another source of blurring is in-loop de-blocking filtering, which is employed to reduce the blocking artifact across block boundaries, and is adopted as an option by state-of-the-art video coding standards such as H.264/AVC and HEVC. The de-blocking operators are essentially spatially adaptive low-pass filters that smooth the block boundaries, and thus produce a perceptual blurring effect.
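To make the idea of a spatially adaptive low-pass filter concrete, here is a deliberately simplified sketch. It is not the H.264/AVC or HEVC de-blocking filter: the function name `deblock_row`, the 3-tap kernel, and the threshold of 10 are assumptions used only to show smoothing that is applied at block boundaries with small jumps and skipped at strong (likely genuine) edges.

```python
def deblock_row(row, block=8, threshold=10):
    """Smooth the two pixels at each block boundary when the jump is small.

    A small jump is treated as a coding artifact and low-pass filtered;
    a large jump is treated as a real edge and left untouched.
    """
    out = list(row)
    for b in range(block, len(row), block):
        p, q = row[b - 1], row[b]          # pixels on either side of the boundary
        if abs(p - q) < threshold:
            out[b - 1] = (row[b - 2] + 2 * p + q) / 4
            out[b] = (p + 2 * q + row[b + 1]) / 4
    return out

# Three flat blocks: a small artificial jump (100 -> 106) and a real edge (106 -> 200).
row = [100] * 8 + [106] * 8 + [200] * 8
filtered = deblock_row(row)

print(filtered[7], filtered[8])    # boundary pixels after smoothing
print(filtered[15], filtered[16])  # strong edge is preserved
```

The boundary discontinuity shrinks because its neighbors are averaged together, which is also why the filter removes some legitimate detail near block boundaries.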

A visual example is given in Fig. 2, where the left picture is a reference frame extracted from the original video, and the middle and right pictures are two decoded H.264/AVC frames with the de-blocking filter turned off and on, respectively. It can be observed that without de-blocking filtering, the majority of blur occurs within each block while the blocking artifact across the block boundaries is quite severe, for example, in the marked rectangular region in Fig. 2(b). When the de-blocking filter is turned on, much smoother luminance transition is observed in the same region, as shown in Fig. 2(c), but the overall appearance of the picture is more blurry.

Figure 2. An example of spatial artifacts created by video compression. (a) Reference frame; (b) Compressed frame with de-blocking filter turned off; (c) Compressed frame with de-blocking filter turned on.

2.1.2 Blocking

Blocking artifact, or blockiness, is a very common type of distortion in reconstructed video produced by video compression standards, which use blocks of various sizes as the basic units for frequency transformation, quantization and motion estimation/compensation, thus producing false discontinuities across block boundaries. Although all blocking effects arise for the similar reasons mentioned above, their visual appearance may differ depending on the region where blockiness occurs. Therefore, we further classify the blocking effects into three subcategories.

Figure 3. An example of blocking artifacts. (a) Reference frame; (b) Compressed frame with three types of blocking artifacts: mosaic effect (elliptical region); staircase effect (rectangular region); false edge (triangular region).

• Mosaic effect usually occurs when there are luminance transitions in large low-energy regions (e.g., walls, black/white boards, and desk surfaces). Due to quantization within each block, nearly all AC coefficients are quantized to zero, and thus each block is reconstructed as a constant DC block, where the DC values vary from block to block. When all blocks are put together, the mosaic effect manifests as abrupt luminance changes from one block to another across the space. The mosaic effect is highly visible and annoying to the visual system, because the visual masking effect (the reduced visibility of one image component due to the existence of another neighboring image component) is the weakest in smooth regions. An example is shown in the marked elliptical region in Fig. 3(b).

• Staircase effect typically happens along a diagonal line or curve, which, when mixed with the false horizontal and vertical edges at block boundaries, creates fake staircase structures. In Fig. 3(b), an example of staircase effect is highlighted in the marked rectangular region.

• False edge is a fake edge that appears near a true edge. This is often created by a combination of motion estimation/compensation based inter-frame prediction and blocking effect in the previous frame, where blockiness in the previous frame is transferred to the current frame via motion compensation as artificial edges. An example is given in the marked triangular region in Fig. 3(b).
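The mosaic effect in the first item can be reproduced with a few lines of arithmetic. The sketch below assumes that every AC coefficient has already been quantized to zero, so each block collapses to its quantized DC value; the gradient image, the DC step of 16, and the function name are illustrative assumptions rather than parameters from the paper.

```python
BLOCK = 8

def dc_only_reconstruction(image, dc_step):
    """Rebuild each 8x8 block from its quantized DC coefficient alone.

    Assumes all AC coefficients were quantized to zero. For an orthonormal
    8x8 DCT, the DC coefficient equals 8 times the block mean.
    """
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for by in range(0, height, BLOCK):
        for bx in range(0, width, BLOCK):
            pixels = [image[y][x]
                      for y in range(by, by + BLOCK)
                      for x in range(bx, bx + BLOCK)]
            dc = BLOCK * sum(pixels) / len(pixels)
            dc_q = round(dc / dc_step) * dc_step   # quantized DC coefficient
            value = dc_q / BLOCK                   # constant block after inverse DCT
            for y in range(by, by + BLOCK):
                for x in range(bx, bx + BLOCK):
                    out[y][x] = value
    return out

# Hypothetical smooth region: a gentle horizontal luminance gradient, two blocks wide.
image = [[100 + 0.5 * x for x in range(16)] for _ in range(8)]
recon = dc_only_reconstruction(image, dc_step=16)

original_jump = image[0][8] - image[0][7]   # step between neighboring pixels
mosaic_jump = recon[0][8] - recon[0][7]     # step across the block boundary
print(original_jump, mosaic_jump)
```

The gradual transition inside each block is flattened, and the whole luminance change is concentrated at the block boundary, where nothing in the smooth surroundings masks it.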

2.1.3 Ringing

Sharp transitions in images such as strong edges and lines are transformed to many coefficients in frequency domain representations. The quantization process results in partial loss or distortion of these coefficients. When the remaining coefficients are combined to reconstruct the edges or lines, artificial wave-like or ripple structures are created in nearby regions, known as the ringing artifacts. Such ringing artifacts are most significant when the edges or lines are sharp and strong, and when the regions near the edges or lines are smooth, where the visual masking effect is the weakest. Fig. 4(b) shows an example of ringing artifacts. It is worth noting that when the ringing effect is combined with object motion in consecutive video frames, a special temporal artifact called mosquito noise is observed, which will be discussed later.
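A one-dimensional step edge is enough to see these ripples. In the sketch below, dropping all but the six lowest-frequency DCT coefficients is used as a crude stand-in for the coefficient loss caused by quantization; the 16-sample edge, the cutoff of six, and the helper names are assumptions made for illustration.

```python
import math

def dct(x):
    """Orthonormal 1-D DCT-II."""
    n = len(x)
    return [
        (math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n))
        * sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        for k in range(n)
    ]

def idct(c):
    """Inverse of dct() (orthonormal DCT-III)."""
    n = len(c)
    return [
        sum(
            (math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n))
            * c[k] * math.cos(math.pi * (i + 0.5) * k / n)
            for k in range(n)
        )
        for i in range(n)
    ]

# A sharp edge between two perfectly flat regions.
edge = [0.0] * 8 + [1.0] * 8
coeffs = dct(edge)

# Crude stand-in for quantization loss: keep only the 6 lowest frequencies.
kept = [c if k < 6 else 0.0 for k, c in enumerate(coeffs)]
recon = idct(kept)

print([round(v, 3) for v in recon])
```

The reconstruction overshoots above 1 and undershoots below 0 on the flat sides of the edge, which is the wave-like structure seen as ringing.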

Figure 4. An example of ringing artifact. (a) Reference frame; (b) Compressed frame with ringing artifact.

2.1.4 Basis pattern effect

The origin of the basis pattern effect is similar to that of the ringing effect, but the spatial regions where the basis pattern effect occurs are not restricted to sharp edges or lines. More specifically, in certain texture regions with moderate energy, when the transform coefficients are quantized, there is a possibility that only one transform coefficient remains (while all other coefficients are quantized to zero or nearly zero). As a result, when the image signal is reconstructed using a single coefficient, the basis pattern (e.g., a DCT basis) associated with the coefficient is created as a representation of the image structure. An example is shown in Fig. 5(b), in which the basis pattern effect is highlighted in the marked rectangular regions. Since the basis pattern effect usually occurs in texture regions, its visibility depends on the nature of the texture region. If the region is in the foreground and attracts visual attention, the basis pattern effect will have a strong impact on perceived video quality. By contrast, if the region is in the background and does not attract visual attention, then the effect is often ignored by human observers.

Figure 5. An example of basis pattern effect. (a) Reference frame; (b) Compressed frame with basis pattern effect.
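The single-surviving-coefficient mechanism behind the basis pattern effect can be sketched with an 8x8 DCT. Everything specific here is an assumption for illustration: the zero-mean texture block (one dominant oscillation plus weak fine detail), the quantization step of 40, and the helper names.

```python
import math

N = 8

def alpha(k):
    """Normalization factor of the orthonormal DCT."""
    return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

def basis(u, v, x, y):
    """Value at pixel (x, y) of the 2-D DCT basis with frequencies (u, v)."""
    return (alpha(u) * alpha(v)
            * math.cos(math.pi * (2 * x + 1) * u / (2 * N))
            * math.cos(math.pi * (2 * y + 1) * v / (2 * N)))

def dct2(block):
    """Forward 2-D DCT; result is indexed as coeffs[v][u]."""
    return [[sum(block[y][x] * basis(u, v, x, y)
                 for y in range(N) for x in range(N))
             for u in range(N)] for v in range(N)]

def idct2(coeffs):
    """Inverse 2-D DCT; result is indexed as block[y][x]."""
    return [[sum(coeffs[v][u] * basis(u, v, x, y)
                 for v in range(N) for u in range(N))
             for x in range(N)] for y in range(N)]

# Hypothetical zero-mean texture: one dominant oscillation plus weak fine detail.
texture = [[30 * math.cos(math.pi * (2 * x + 1) * 3 / 16)
               * math.cos(math.pi * (2 * y + 1) * 1 / 16)
            + ((3 * x + y) % 4 - 1.5)
            for x in range(N)] for y in range(N)]

step = 40  # assumed coarse quantization step
quantized = [[round(c / step) * step for c in row] for row in dct2(texture)]
survivors = [(u, v) for v in range(N) for u in range(N) if quantized[v][u] != 0]
recon = idct2(quantized)

print("surviving coefficients (u, v):", survivors)
```

With one coefficient left, the decoded block is exactly a scaled copy of that coefficient's basis function, so the original texture is replaced by a regular cosine pattern.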

2.1.5 Color bleeding

Color bleeding is a result of inconsistent image rendering across the luminance and chromatic channels. For example, in the most popular YCbCr 4:2:0 video format, the color channels Cb and Cr have half the resolution of the luminance channel Y in both horizontal and vertical dimensions. After compression, all luminance and chromatic channels exhibit various types of distortions (such as blurring, blocking and ringing described earlier), and more importantly, these distortions are inconsistent across color channels. Moreover, because of the lower resolution in the chromatic channels, the rendering processes inevitably involve interpolation operations, leading to additional inconsistent color spreading in the rendering result. In the literature, it was shown that chromatic distortion is helpful in color image quality assessment, but how color bleeding affects the overall perceptual quality of compressed video is still an unsolved problem. An example of color bleeding is given in the highlighted elliptical region in Fig. 6(b).
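The resolution mismatch alone is enough to spread color across an edge, even before any quantization. The sketch below works on a single row and only in the horizontal direction, which is a simplification of 4:2:0; the sample values, pair averaging for downsampling, and sample replication for upsampling are all assumptions chosen to keep the example short.

```python
def subsample(chroma):
    """Halve the chroma resolution by averaging each pair of samples."""
    return [(chroma[i] + chroma[i + 1]) / 2 for i in range(0, len(chroma), 2)]

def upsample(chroma_half):
    """Return to full resolution by replicating each sample (nearest neighbor)."""
    return [c for c in chroma_half for _ in range(2)]

# One row with an object edge between pixel 2 and pixel 3.
luma = [40, 40, 40, 180, 180, 180, 180, 180]   # kept at full resolution
cb = [90, 90, 90, 200, 200, 200, 200, 200]     # chroma before subsampling

cb_rendered = upsample(subsample(cb))
print(cb_rendered)
```

Luma still places the edge between pixels 2 and 3, but both of those pixels now carry an intermediate chroma value, so color from each side leaks onto the other.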

2.2 Temporal Artifacts

Temporal artifacts refer to those distortion effects that are not observed when the video is paused but appear during video playback. Temporal artifacts are of particular interest to us for two reasons. First, as compared to spatial artifacts, temporal artifacts evolve more significantly with the development of video coding techniques.
