
Efficient Automatic Detection of 3D Video Artifacts
Mohan Liu¹, Ioannis Mademlis², Patrick Ndjiki-Nya¹, Jean-Charles Le Quintrec³, Nikos Nikolaidis², Ioannis Pitas²

¹ Interactive Media - Human Factors Department, Fraunhofer Institute for Telecommunication - HHI
Einsteinufer 37, 10629 Berlin, Germany
{mohan.liu, patrick.ndjiki-nya}@hhi.fraunhofer.de

² Department of Informatics, Aristotle University of Thessaloniki - AUTH
Box 451, 54124 Thessaloniki, Greece
{imademlis, nikolaid, pitas}@aiia.csd.auth.gr

³ ARTE G.E.I.E., Association Relative à la Télévision Européenne
4 Quai Du Chanoine Winterer CS 20035, 67080 Strasbourg Cedex, France
jean-charles.lequintrec@arte.tv
Abstract—This paper summarizes some common artifacts in stereo video content. These artifacts lead to a poor or even uncomfortable 3D viewing experience. Efficient approaches for detecting three typical artifacts, sharpness mismatch, synchronization mismatch and stereoscopic window violation, are presented in detail. Sharpness mismatch is estimated by measuring the width deviations of edge pairs in depth planes. Synchronization mismatch is detected based on the motion inconsistencies of feature points between the stereoscopic channels within a short time frame. Stereoscopic window violation is detected, using connected component analysis, when objects hit the vertical frame boundaries while being in front of the virtual screen. For the experiments, test sequences were created in a professional studio environment and state-of-the-art metrics were used to evaluate the proposed approaches. The experimental results show that our algorithms are considerably robust in detecting 3D defects.
I. INTRODUCTION
Three-dimensional (3D) videos are not only a big success in cinemas, but have also entered ordinary households. The 3D experience is strongly tied to the quality of the 3D content. Thus, quality assessment of 3D videos has become a growing research topic. In comparison to two-dimensional (2D) image quality assessment, 3D quality additionally depends on depth perception and visual comfort. Although techniques for automatic quality assessment of 2D images have been extensively developed for years, prior research has shown that 2D quality metrics cannot be directly used to estimate 3D quality features [1].

There are two major ways to create 3D sequences: filming with stereo cameras and converting from 2D videos using depth maps. Quality control is an important task in such workflows. Although 3D quality assessment has been widely studied for several years, there is still no formal definition of 3D defects. In the MSU project [2], eight measures for 3D artifacts are proposed. Similarly, fifteen 3D quality issues have been identified in the Certifi3D project [3].

This paper proposes offline automatic analysis methods for the detection and assessment of 3D artifacts. Three important artifacts, i.e. Sharpness Mismatch (SM), SYNchronization Mismatch (SYNM) and Stereoscopic Window Violation (SWV), are further detailed. The proposed SM detection framework is based on measuring the width deviations between edge pairs in valid depth planes. SYNM is estimated based on statistics of motion inconsistencies of object feature points across the stereoscopic channels. SWVs happen when objects appearing in front of the virtual screen hit the vertical frame boundaries; connected component analysis is used for detecting them. A robust disparity estimation algorithm [4], which computes Disparity Maps (DM) in the horizontal and vertical directions without rectification, is integrated into the proposed algorithms. The remainder of this paper is organized as follows: Section 2 describes the proposed approaches in detail; Section 3 discusses the experimental results; Section 4 concludes the paper and describes future work.
II. PROPOSED APPROACHES
For observers, stereo 3D artifacts are not only undesirable but sometimes also painful. Common defects in real stereo 3D videos include (similar definitions of some defects are also introduced in [2] and [3]):
• Vertical misalignment: For the depth illusion, vertical disparity is unwanted. Imperfect horizontal alignment of the stereo cameras can cause this defect.
• Sharpness mismatch: Sharpness mismatch can be caused, among others, by focus/aperture setup errors, inconsistent lighting, compression or denoising.
• Colorimetric mismatch: Common causes of colorimetric mismatches are different points of view, changing light conditions, a malfunctioning or non-calibrated acquisition system, or even bad color grading in post-processing.
• Synchronization mismatch: Non-genlocked cameras or bad post-processing can cause asynchronism artifacts between the stereo channels.
• Hyper divergence/convergence: Excessive positive or negative parallaxes on inappropriate viewing devices can lead to these artifacts.
• Cross-talk level: An artifact caused by imperfect view separation, such that one view can be partially seen in the other view.
• Stereoscopic window violation: Objects that appear in front of the virtual screen in theatre space and hit the left or right frame boundary cause retinal rivalry, which is erroneously interpreted by the viewer as occlusion.
• Bent window effect: Sometimes, objects appearing in front of the virtual screen in theatre space extend vertically across the entire frame and hit both the top and bottom frame boundaries. This is interpreted by the brain as an occlusion cue, causing the perception of the stereoscopic window as being bent towards the viewer.
• Depth jump cut: During editing, video cuts between two shots with very different average depth cause a temporary loss of the viewer's 3D perception.
There are also other, less common defects, e.g. view reversal and reflections, as well as 2D-to-3D conversion defects, e.g. depth mismatch and visual mismatch. In this paper, techniques developed for sharpness mismatch, synchronization mismatch and stereoscopic window violation detection are presented. There is no significant connection among these three artifacts.
A. Disparity map correction
The automatically estimated DMs are usually noisy and must be corrected before further use. A valid disparity mapping in the horizontal direction from the Left view to the Right view (L2R) can be defined as

|DM_{L2R}(i, j) − DM_{R2L}(i, j + DM_{L2R}(i, j))| ≤ δ,   (1)

where δ denotes the disparity estimation error tolerance and (i, j) denotes the pixel coordinates in the disparity map. The validation of disparity maps from the Right view to the Left view (R2L) is analogous to eq. 1.
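
As an illustration, a minimal NumPy sketch of this consistency check follows; the function name, the array representation of the maps and the border handling are our assumptions, not part of the paper.

```python
import numpy as np

def valid_disparity_mask(dm_l2r, dm_r2l, delta=1):
    """Left-right consistency check of eq. 1 (sketch).

    dm_l2r, dm_r2l: integer horizontal disparity maps, shape (H, W).
    Returns a boolean mask of pixels whose L2R disparity is confirmed
    by the R2L map within the tolerance delta.
    """
    h, w = dm_l2r.shape
    jj = np.arange(w)[None, :] + dm_l2r   # j + DM_L2R(i, j)
    inside = (jj >= 0) & (jj < w)         # mappings leaving the frame are invalid
    jj = np.clip(jj, 0, w - 1)
    ii = np.arange(h)[:, None]
    # NOTE: assumes both maps are stored with the same sign convention
    diff = np.abs(dm_l2r - dm_r2l[ii, jj])
    return inside & (diff <= delta)
```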
B. Sharpness mismatch
The proposed approach estimates the SM by analyzing the width deviations of corresponding edge pairs. According to epipolar geometry, the edge widths of edge pairs in a depth plane are consistent between the LR views when the focuses of the stereo cameras are well calibrated. SM usually leads to a cross-talk/ghosting effect and unexpected blurriness, which can impair the 3D experience for observers.

We use Sobel filters to extract the edge pixels E. They are further segmented into depth planes based on the estimated disparities as

Ẽ_d(i, j) = E(i, j) & (DM(i, j) = d),   (2)

where d denotes an estimated disparity value.
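
A compact realization of this segmentation, under the assumption of integer-valued disparity maps and a hypothetical gradient-magnitude threshold (the paper does not specify these details), could look as follows:

```python
import cv2
import numpy as np

def edges_per_depth_plane(gray, dm, edge_thresh=64.0):
    """Sobel edge pixels E segmented by disparity value d (eq. 2)."""
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0, ksize=3)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1, ksize=3)
    edges = np.hypot(gx, gy) > edge_thresh            # E(i, j)
    # one boolean mask E~_d per estimated disparity value d
    return {int(d): edges & (dm == d) for d in np.unique(dm)}
```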
Only the vertical edge widths are measured in this work, since the major disparities occur in the horizontal direction. For each edge pixel, a pixel sequence centered at the target edge pixel is selected. The width of a pixel sequence is set to 64 pixels, motivated by human visual acuity: the fovea region covers about 2% of the visual angle [5], while the central visual field covers approximately 30° [5]. On a high-definition (HD) image viewed at a comfortable distance, the width of such an activity region can be approximated as 64 pixels. The method proposed in [6] is used to measure the edge width by locating the pixel positions of the local minimum and the local maximum of luminance intensity centered on the target edge pixel within the allocated pixel sequence. The edge width is then computed as the Euclidean distance between the positions of the local minimum and the local maximum pixels.
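
The following sketch measures one such edge width; the exact extremum localization of [6] is simplified here (global extrema of the window rather than the nearest local ones), and the 64-pixel window size follows the description above.

```python
import numpy as np

def edge_width(row, j, half=32):
    """Approximate width of a vertical edge at column j of one image row.

    A 64-pixel luminance sequence centred on the edge pixel is taken,
    and the distance between the positions of its minimum and maximum
    is returned (simplified version of the method in [6]).
    """
    lo = max(0, j - half)
    hi = min(len(row), j + half)
    seg = np.asarray(row[lo:hi], dtype=np.float32)
    p_min = int(np.argmin(seg))
    p_max = int(np.argmax(seg))
    return abs(p_max - p_min)     # 1-D Euclidean distance in pixels
```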
The perceived blurriness is also related to the local contrast [7]. The edge width of just noticeable blur, w_JNB [7], is estimated in a 64 × 64 pixel block. Only if the deviation of the edge widths is larger than w_JNB can the SM artifact be noticed. The cumulative probability of noticeable SM is calculated as

P_sm = (1/N) ∑_{i=1}^{N} I_{dw > w_JNB},   (3)

where N denotes the number of edge pixels and dw denotes the width deviation of an edge pair between the stereoscopic views. I_{dw > w_JNB} is an indicator function that equals 1 if the condition dw > w_JNB is met and 0 otherwise. However, the cumulative probability must be corrected to account for edge pixels lost to disparity estimation errors. The Probability of Sharpness Mismatch (PSM) is therefore estimated and smoothed with a correction coefficient k, which is calculated from the number of valid disparity mappings, as

PSM = 1 − exp(−(P_sm² + k²) / (2 · σ̃)),   (4)

where σ̃ denotes the standard deviation between P_sm and k.
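
Eqs. 3 and 4 translate directly into a few lines of NumPy; in this sketch the inputs (the width deviations, the per-pair w_JNB values, the coefficient k and σ̃) are assumed to be precomputed as described above.

```python
import numpy as np

def psm_score(dw, w_jnb, k, sigma_tilde):
    """Probability of Sharpness Mismatch (eqs. 3 and 4, sketch).

    dw:          width deviations of matched edge pairs, shape (N,)
    w_jnb:       just-noticeable-blur widths per pair (or a scalar) [7]
    k:           correction coefficient from the valid-mapping count
    sigma_tilde: standard deviation between P_sm and k
    """
    p_sm = float(np.mean(dw > w_jnb))                              # eq. 3
    return 1.0 - np.exp(-(p_sm**2 + k**2) / (2.0 * sigma_tilde))   # eq. 4
```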
C. Synchronization mismatch
The annoyance of the synchronization distortion depends on the 3D scene. SYNM can be very annoying in a shot with strong object motion; if the scene is still, SYNM is almost imperceptible. Thus, the first step of our approach is to analyze the perceptibility of SYNM. Furthermore, shot detection is required for the framework.

Single-valued spatial and temporal perceptual information, as defined by the ITU [8], is computed to estimate the perceptibility of SYNM. The Spatial Information (SI) describes the level of spatial detail in textured images and is calculated as the maximum standard deviation of each Sobel-filtered frame within a time duration. The Temporal Information (TI) describes the strength of the motion in a sequence and is computed as the maximum standard deviation of the pixel luminance differences at the same location between two neighboring frames. The SI and TI values for a shot are calculated separately for the L and R channels. The reversed perceptibility score (RPS) is defined as

RPS = max(SI_c / TI_c),  c = {L, R}.   (5)

If RPS is smaller than a threshold, it is necessary to detect the synchronization mismatch.
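
A possible implementation of the SI/TI computation and of the RPS score of eq. 5 is sketched below; the grayscale input format and the eps guard against division by zero for still scenes are our assumptions.

```python
import cv2
import numpy as np

def si_ti(frames):
    """Spatial and temporal information of a shot, per the ITU definition [8].

    frames: iterable of grayscale frames (same size).
    """
    si, ti, prev = 0.0, 0.0, None
    for f in frames:
        gx = cv2.Sobel(f, cv2.CV_32F, 1, 0)
        gy = cv2.Sobel(f, cv2.CV_32F, 0, 1)
        si = max(si, float(np.hypot(gx, gy).std()))      # max std of Sobel frame
        cur = f.astype(np.float32)
        if prev is not None:
            ti = max(ti, float((cur - prev).std()))      # max std of frame diff
        prev = cur
    return si, ti

def rps(frames_left, frames_right, eps=1e-6):
    """Eq. 5: maximum SI_c / TI_c over both channels c in {L, R}."""
    scores = []
    for ch in (frames_left, frames_right):
        si, ti = si_ti(ch)
        scores.append(si / max(ti, eps))
    return max(scores)
```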

SYNM is detected at the frame and shot levels. If a frame is synchronous at time point t, the object motions in a depth plane between the LR views are consistent. Conversely, motion inconsistency can be observed between two asynchronous views. The computation of motion consistency is based on the relative displacements of feature point pairs, which are extracted and matched using SIFT [9] as well as RANSAC [10]. The matched feature points are segmented according to the validated depth planes. The relative displacement d̃ of a matched feature point fp(i) in a depth plane D_j, j ∈ [0, 255], is determined as

d̃(fp(i)) = P_L − DM_{L2R}(P_L) − P_R,  fp(i) ∈ D_j,   (6)

where P_L and P_R denote the coordinates of fp(i) in the L and R views, respectively. The variances of the relative displacements of all feature points are calculated to describe the motion consistency. Please note that if there is only slight motion in depth but no noticeable motion in the horizontal (H) and vertical (V) directions between neighboring frames, the corresponding SI is significantly larger than TI.
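
The following sketch shows how such matched feature points and their relative displacements (eq. 6) might be obtained with OpenCV; the ratio test, the homography model for RANSAC and the integer sampling of the disparity map are our assumptions, not specified in the paper.

```python
import cv2
import numpy as np

def matched_points(img_l, img_r, ratio=0.75):
    """SIFT [9] matches between the views, filtered with RANSAC [10].

    Returns two (N, 2) arrays of corresponding (x, y) coordinates.
    Assumes enough matches exist for homography fitting (>= 4).
    """
    sift = cv2.SIFT_create()
    k1, d1 = sift.detectAndCompute(img_l, None)
    k2, d2 = sift.detectAndCompute(img_r, None)
    matches = cv2.BFMatcher().knnMatch(d1, d2, k=2)
    good = [m for m, n in matches if m.distance < ratio * n.distance]
    p_l = np.float32([k1[m.queryIdx].pt for m in good])
    p_r = np.float32([k2[m.trainIdx].pt for m in good])
    _, inliers = cv2.findHomography(p_l, p_r, cv2.RANSAC, 3.0)
    inliers = inliers.ravel().astype(bool)
    return p_l[inliers], p_r[inliers]

def relative_displacement(p_l, p_r, dm_l2r):
    """Eq. 6: d~ = P_L - DM_L2R(P_L) - P_R for each matched pair."""
    x = p_l[:, 0].astype(int)
    y = p_l[:, 1].astype(int)
    pred = p_l.copy()
    pred[:, 0] -= dm_l2r[y, x]    # shift x by the horizontal disparity
    return pred - p_r
```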
To measure the synchronization of frame f(t) at time point t, the two neighboring frames f(t−1) and f(t+1) are required. The synchronism probability is measured from the left view (f_L(t)) to the right views (f_R(t−1), f_R(t), f_R(t+1)) and from the right view (f_R(t)) to the left views (f_L(t−1), f_L(t), f_L(t+1)), respectively. The motion-related displacements of the feature points are decomposed in the H and V directions. Then, the H and V synchronization probabilities are estimated separately, based on the variances of the motion displacements of the matched feature points. Hence, six histograms of the displacement variances in all valid depth planes are constructed to rank the synchronization probabilities between the frame f(t) of one view and the frames f(t−1), f(t), f(t+1) of the other view. Z_α, which is computed considering a confidence level α, is used as the threshold for the estimation of the synchronization probability. The synchronization probability in one direction, p_r, is calculated as

p_r = (∑_{i=1}^{n} h_i) / n,  h_i ≤ Z_α,  r = {H, V},   (7)

where n is the total number of variance histograms h. For statistical accuracy, the outliers of the variance histogram are first detected and removed. The overall synchronization probability p is computed as the geometric mean of the p_r, since the geometric mean is more robust than the arithmetic mean when outliers are present in the test samples [11]. A frame is judged as synchronous when the maximum p between the views occurs at time point t in both estimation directions:

max{p(f(t)_L, f(t′)_R)} = p(f(t)_L, f(t)_R),
max{p(f(t)_R, f(t′)_L)} = p(f(t)_R, f(t)_L),   (8)

where t′ ∈ {t−1, t, t+1}. If most of the frames within a shot are estimated to be asynchronous, the shot is judged as asynchronous.
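
One plausible reading of eqs. 7 and 8 in code is sketched below; treating each histogram as a flat array of variance values, interpreting p_r as the fraction of entries within the Z_α threshold, and the (t−1, t, t+1) ordering convention are all our assumptions.

```python
import numpy as np

def direction_probability(h, z_alpha):
    """Eq. 7 (one reading): fraction of displacement-variance entries
    not exceeding the confidence threshold Z_alpha, for r in {H, V}."""
    h = np.asarray(h, dtype=np.float32)
    return float(np.mean(h <= z_alpha))

def overall_probability(p_h, p_v):
    """Geometric mean of the H and V probabilities; more robust than
    the arithmetic mean in the presence of outliers [11]."""
    return float(np.sqrt(p_h * p_v))

def frame_synchronous(p_l2r, p_r2l):
    """Eq. 8: frame f(t) is synchronous if, in both estimation
    directions, the co-timed pair scores highest.

    p_l2r, p_r2l: probabilities for t' = (t-1, t, t+1), in that order,
    so index 1 is the co-timed pair.
    """
    return int(np.argmax(p_l2r)) == 1 and int(np.argmax(p_r2l)) == 1
```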
D. Stereoscopic window violation
In 3D cinematography, we observe the 3D world through the so-called Stereoscopic Window (SW) [12], namely the TV or cinema screen. In other words, the viewer watches objects floating in a space defined by the screen edges. If the left disparity of a 3D point is positive/zero/negative, the eyes converge to a point behind the screen, on the screen, or within the theatre space (in front of the screen), respectively. Retinal rivalry occurs at the left or right frame edges when object regions positioned close to the left image's left or right border have no correspondence (are not displayed) in the right frame, and vice versa. For objects with zero disparity, no retinal rivalry is observed. When an object region is cut off by the edge of the display, the result is the so-called Stereoscopic Window Violation (SWV), which is interpreted as occlusion by the viewer.

An SWV does not create any problems when it occurs behind the screen (i.e., for objects with positive left disparity), because both the disparity and occlusion cues dictate that the object lies behind the screen. However, when an SWV involves objects perceived as appearing in front of the screen (i.e., objects with negative left disparity), the occlusion cue conflicts with the disparity cue. Generally, as occlusion supersedes the disparity cue, the object is finally perceived as lying behind the screen plane [12]. The above holds for a mild SWV, where only a small region of the object that interferes with the left or right frame edge is missing from the other image. In a severe SWV, the missing object region is so extended that the human brain cannot fuse the images and eventually loses the 3D percept.

An SWV at negative disparities is not only undesirable, but may also prove painful. The rule regarding SWV states that a cinematographer has to avoid breaking the stereoscopic window while an object is being filmed with negative left disparity. There is one notable exception, related to object speed [13]. Objects entering or exiting the frame in no more than half a second cause no problem, since, by the time the brain localizes the object in front of the screen, the entire object is either fully visible in the frame or has disappeared, respectively.
A simple yet effective algorithm that detects the Stereoscopic Window Violation using disparity maps has been developed in this work. We assume the existence of left and right dense disparity maps for each stereoscopic video frame, i.e., DM_{L2R}(u, v) and DM_{R2L}(u, v), u = 0, …, W−1, v = 0, …, H−1, where W and H are the width and height of the video frame (in pixels). In the first step of the algorithm, pixels (u, v) are selected having left disparity DM_{L2R}(u, v) < −T_1 and right disparity DM_{R2L}(u, v) > T_1. In order to exclude objects that do not appear in front of the screen, we set the threshold T_1 to a suitable value and perform connected component analysis with an 8-point neighbourhood to extract objects (connected components) that are displayed significantly in front of the screen. A value of T_1 = 0.0025W worked well in our experiments. To reduce noise, objects with small width (less than T_w) or height (less than T_h) are rejected. Threshold values of T_w = 0.02W and T_h = 0.04H have been found to work well. The detected objects are then enclosed in rectangular bounding boxes (Regions of Interest, ROIs). Thus, two sets of ROIs, R^r = {R^r_1, R^r_2, …, R^r_n} and R^l = {R^l_1, R^l_2, …, R^l_k}, are created for the right and left channel, respectively. These ROIs are represented by their upper-left and lower-right coordinates [X^j_{i,min}, Y^j_{i,min}]^T and [X^j_{i,max}, Y^j_{i,max}]^T, where j = {r, l} and i is the ROI index.
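
A sketch of this extraction step with OpenCV's connected component analysis follows; the sign convention ("negative disparity = in front of the screen") and the returned (label, bounding box) representation are our assumptions.

```python
import cv2
import numpy as np

def foreground_rois(dm, t1, t_w, t_h):
    """ROIs of objects displayed significantly in front of the screen.

    dm:       dense (left) disparity map of one view
    t1:       disparity threshold (0.0025 * W in the paper)
    t_w, t_h: minimum ROI width/height (0.02 * W, 0.04 * H in the paper)
    """
    mask = (dm < -t1).astype(np.uint8)   # assumed sign convention
    n, labels, stats, _ = cv2.connectedComponentsWithStats(mask, connectivity=8)
    rois = []
    for lab in range(1, n):              # label 0 is the background
        x, y, w, h, _area = stats[lab]
        if w >= t_w and h >= t_h:        # reject small/noisy components
            rois.append((lab, (x, y, x + w - 1, y + h - 1)))
    return rois, labels
```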
Two types of disturbing SWVs can be defined. In the first type, namely the left SWV, the violation occurs at the left frame border, since there is a region in the left image which is missing from the right one. Its detection is performed as follows. If one or more object ROIs R^r_i, with disparity characteristics such as those previously described, lie on the left border of the right image, that is, if X^r_{i,min} = 0, an SWV is present. This is because X^l_{i,min} = X^r_{i,min} + DM_{R2L}(i, j) > 0 and, thus, the region [0, DM_{R2L}(i, j)] in the left image is not present in the right one. In order to reduce false alarms arising from inaccuracies in the disparity maps, another condition is introduced. The number of pixels that belong to the object in the two leftmost ROI columns must be greater than a threshold T_2, expressed as a percentage of the ROI height, to decide that this object signals an SWV. In our experiments, T_2 is set to 0.3 h_ROI, where h_ROI is the ROI height.
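
Combining the border test with the T_2 column-occupancy condition might look as follows, reusing the (label, box) output of the previous sketch; the exact counting of object pixels is our interpretation.

```python
import numpy as np

def left_swv(rois, labels, t2_ratio=0.3):
    """Left-SWV test on ROIs extracted from the right view.

    rois:   (label, (x_min, y_min, x_max, y_max)) tuples
    labels: label image from the connected component analysis
    A ROI on the left frame border signals a violation only if the
    object fills enough of its two leftmost columns (T2 = 0.3 h_ROI).
    """
    for lab, (x_min, y_min, y_maxless := None, _)[:2] in []:
        pass  # placeholder removed below

def left_swv(rois, labels, t2_ratio=0.3):
    for lab, (x_min, y_min, x_max, y_max) in rois:
        if x_min != 0:                     # must touch the left border
            continue
        h_roi = y_max - y_min + 1
        cols = labels[y_min:y_max + 1, :2]  # two leftmost ROI columns
        if np.count_nonzero(cols == lab) > t2_ratio * h_roi:
            return True
    return False
```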
A similar procedure is followed for the detection of a right SWV. In this case, a region appearing at the rightmost border of the right image is absent from the left image. Thus, if one or more object ROIs detected in the left disparity map, R^l_i, lie on the right border of the left image, i.e., if X^l_{i,max} = W − 1, an SWV is present. This is because X^r_{i,max} = X^l_{i,max} + DM_{L2R}(i, j) < W − 1. Therefore, the region [W + DM_{L2R}(i, j), W − 1] in the right image is not present in the left one. The false alarm reduction approach regarding small regions (noise) is applied to right SWV detection as well.

When a left or right SWV of duration d_SWV frames is detected, the condition d_SWV > fps/2 is checked, where fps is the video frame rate, to determine whether the violation is perceived as annoying or not. The satisfaction of this condition implies that the duration of the violation is more than half a second.
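
In code, this final temporal gate is a one-line check (a minimal sketch; the frame-count bookkeeping is assumed to happen elsewhere):

```python
def swv_is_annoying(d_swv, fps):
    """A detected violation is reported only if it lasts longer than
    half a second, i.e. d_SWV > fps / 2 (speed exception of [13])."""
    return d_swv > fps / 2.0
```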
III. EXPERIMENTAL RESULTS
In order to evaluate the performance of the proposed approaches, several HD 3D sequences (cf. TABLE I) were used in the experiments. The disparity maps of all test sequences were automatically estimated by the algorithm proposed in [4]. δ in eq. 1 was set to 1 in the experiments, to account for rounding between integer and floating-point disparity values.

S1 - S3 were produced in a professional studio using two Sony HDC 1400 cameras and broadcast by ARTE (Association Relative à la Télévision Européenne). The cameras were set in convergence mode. Focuses were manually calibrated with a sharpness chart. Luminance and color parameters were corrected with a Remote Control Panel (RCP) and the use of a waveform monitor and a video monitor. Adobe Premiere was used to remove the unusable parts at the beginning and the end of each sequence. The focus of the left camera for S2 and S3 was manually modified to generate global sharpness mismatches: it was set to +2 m for S2 and to +3 m for S3 in comparison to the right camera. The right camera setup is identical across S1, S2 and S3, and there is no camera motion in these sequences. S4 - S6 contain densely meshed textures and local motion types. S7 is a sequence containing SWVs.
TABLE I
ORIGINAL TEST SEQUENCES USED IN THE EXPERIMENTS

ID  Sequence name           ID  Sequence name
S1  ARTE studio setup       S2  ARTE left camera +2m
S3  ARTE left camera +3m    S4  Badminton
S5  BeergardenNoFlag        S6  BeergardenFlag
S7  The Magician
A. Results of sharpness mismatch estimation
The proposed SM framework was evaluated with two kinds of experiments. The first experiment used S1 - S3 to evaluate the performance on global SMs. Fig. 1 (a) - (c) show example views with focus mismatches. S3 contains some local object motion. The aim of the second experiment is to measure the performance on depth-of-field (DOF) mismatches. For this, DOF mismatches were generated in the right channels of S4 - S6 by applying Gaussian low-pass filters from a specific depth plane on (cf. Fig. 1 (d) - (f)), since defocus effects of lens aberrations can be modeled as a Gaussian blur [14]. The standard deviation σ of the Gaussian low-pass filter was varied from 0.0 to 6.0 with a step width of 0.4; the images are completely distorted for σ > 6.0. The radius of the Gaussian filters was set to 3σ. Deviations (CPBD_d) of the sharpness scores estimated by the sharpness metric CPBD [15] were used as a reference in the experiment. CPBD is a well-performing 2D objective no-reference sharpness metric for Gaussian blur distortions.
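
The DOF-mismatch distortion can be reproduced along these lines; the hard depth mask without feathering and the >= comparison for "from a depth plane on" are our assumptions about the test-data generation.

```python
import cv2
import numpy as np

def add_dof_mismatch(img, dm, depth_from, sigma):
    """Blur one view from a given depth plane on, as in the second
    experiment: Gaussian low-pass with radius 3*sigma models defocus [14].
    """
    if sigma <= 0:
        return img.copy()
    k = int(2 * round(3 * sigma) + 1)      # odd kernel, radius 3*sigma
    blurred = cv2.GaussianBlur(img, (k, k), sigma)
    mask = dm >= depth_from                # depth planes to defocus
    out = img.copy()
    out[mask] = blurred[mask]
    return out

# sigma swept from 0.0 to 6.0 in steps of 0.4, as in the experiments
sigmas = np.arange(0.0, 6.0 + 1e-9, 0.4)
```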
Mean SM scores for S1 - S3 are shown in TABLE II. It can be observed that both the proposed approach (PSM) and CPBD_d can detect slight SMs between the stereo cameras, as the SM scores of both metrics increase with growing focus deviations. Although S3 contains object motion, the variances of the SM scores of both metrics are very small. Thus, local motion does not noticeably affect the SM predictions of PSM and CPBD_d. However, PSM is more sensitive to SMs than CPBD_d, as can be seen in TABLE II. The range of both metrics is [0, 1].

The results of the second experiment are shown in Fig. 2. The SM scores of PSM increase monotonically with the standard deviation of the Gaussian low-pass filters. Significant changes can be observed for σ = [1.2, 4.4]. However, PSM is not very sensitive to slight blur distortions (e.g. σ < 1.2). Significant changes of the SM scores of S6 can be observed already from σ = 1.6, since S6 contains homogeneous textures and strong local motion blur. CPBD

REFERENCES
[7] R. Ferzli and L. J. Karam, "A no-reference objective image sharpness metric based on the notion of just noticeable blur (JNB)," IEEE Transactions on Image Processing, vol. 18, no. 4, pp. 717-728, 2009.
[9] D. G. Lowe, "Object recognition from local scale-invariant features," in Proc. IEEE International Conference on Computer Vision (ICCV), 1999.
[10] M. A. Fischler and R. C. Bolles, "Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography," Communications of the ACM, vol. 24, no. 6, pp. 381-395, 1981.
[11] W. Feller, An Introduction to Probability Theory and Its Applications. New York: Wiley.