
HAL Id: hal-01059986
https://hal.archives-ouvertes.fr/hal-01059986
Submitted on 15 Sep 2014
Saliency Detection for Stereoscopic Images
Yuming Fang, Junle Wang, Manish Narwaria, Patrick Le Callet, Weisi Lin
To cite this version:
Yuming Fang, Junle Wang, Manish Narwaria, Patrick Le Callet, Weisi Lin. Saliency Detection for
Stereoscopic Images. IEEE Transactions on Image Processing, Institute of Electrical and Electronics
Engineers, 2014, 23 (6), pp. 2625-2636. doi:10.1109/TIP.2014.2305100. hal-01059986

Saliency Detection for Stereoscopic Images
Yuming Fang, Member, IEEE, Junle Wang, Manish Narwaria, Patrick Le Callet, Member, IEEE,
and Weisi Lin, Senior Member, IEEE
Abstract: Many saliency detection models for 2D images have
been proposed for various multimedia processing applications
during the past decades. Currently, the emerging applications
of stereoscopic display require new saliency detection models
for salient region extraction. Different from saliency detection
for 2D images, the depth feature has to be taken into account
in saliency detection for stereoscopic images. In this paper, we
propose a novel stereoscopic saliency detection framework based
on the feature contrast of color, luminance, texture, and depth.
Four types of features, namely color, luminance, texture, and
depth, are extracted from discrete cosine transform coefficients
for feature contrast calculation. A Gaussian model of the spatial
distance between image patches is adopted for consideration
of local and global contrast calculation. Then, a new fusion
method is designed to combine the feature maps to obtain the
final saliency map for stereoscopic images. In addition, we adopt
the center bias factor and human visual acuity, the important
characteristics of the human visual system, to enhance the final
saliency map for stereoscopic images. Experimental results on
eye tracking databases show the superior performance of the
proposed model over other existing methods.
Index Terms: Stereoscopic image, 3D image, stereoscopic
saliency detection, visual attention, human visual acuity.
I. INTRODUCTION
Visual attention is an important characteristic of the Human Visual System (HVS) for visual information processing. Faced with a large amount of visual information, visual attention selectively processes the important parts and filters out the others to reduce the complexity of scene analysis. This important visual information is also termed salient regions or Regions of Interest (ROIs) in natural images. There
are two different approaches to the visual attention mechanism: bottom-up and top-down. The bottom-up approach, which is data-driven and task-independent, is a perceptual process of automatic salient region selection in natural scenes [1]–[8], while the top-down approach is task-dependent cognitive processing affected by the task being performed, the feature distribution of targets, etc. [9]–[11].
Manuscript received June 17, 2013; revised November 14, 2013 and
January 7, 2014; accepted January 26, 2014. Date of publication February 6,
2014; date of current version May 9, 2014. The associate editor coordi-
nating the review of this manuscript and approving it for publication was
Prof. Damon M. Chandler.
Y. Fang is with the School of Information Technology, Jiangxi Uni-
versity of Finance and Economics, Nanchang 330032, China (e-mail:
fa0001ng@e.ntu.edu.sg).
J. Wang, M. Narwaria, and P. Le Callet are with LUNAM Université,
Université de Nantes, Nantes Cedex 3 44306, France (e-mail:
wang.junle@gmail.com; mani0018@e.ntu.edu.sg; patrick.lecallet@
univ-nantes.fr).
W. Lin is with the School of Computer Engineering, Nanyang Technological
University, Singapore 639798 (e-mail: wslin@ntu.edu.sg).
Color versions of one or more of the figures in this paper are available
online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TIP.2014.2305100
Over the past decades, many studies have tried to pro-
pose computational models of visual attention for var-
ious multimedia processing applications, such as visual
retargeting [5], visual quality assessment [9], [13], visual
coding [14], etc. In these applications, the salient regions extracted by saliency detection models are processed specifically, since they attract much more human attention than other regions. Currently, many bottom-
up saliency detection models have been proposed for
2D images/videos [1]–[8].
Today, with the development of stereoscopic display, there
are various emerging applications for 3D multimedia such
as 3D video coding [31], 3D visual quality assessment
[32], [33], 3D rendering [20], etc. In the study [33], the
authors introduced the conflicts met by the HVS while watching 3D-TV, how these conflicts might be limited, and how visual comfort might be improved by visual attention models.
The study also described some other visual attention based
3D multimedia applications, which exist in different stages
of a typical 3D-TV delivery chain, such as 3D video cap-
ture, 2D to 3D conversion, reframing and depth adapta-
tion, etc. Chamaret et al. adopted ROIs for 3D rendering
in the study [20]. Overall, the emerging demand for visual attention based applications for 3D multimedia increases the need for computational saliency detection models for 3D multimedia content.
Compared with various saliency detection models proposed
for 2D images, only a few studies on 3D saliency detection exist currently [18]–[27]. Different from saliency detection for 2D images, the depth factor has to be considered in saliency detection for 3D images. To achieve depth
perception, binocular depth cues (such as binocular disparity)
are introduced and merged together with others (such as
monocular disparity) in an adaptive way based on the viewing
space conditions. However, this change of depth perception
also largely influences the human viewing behavior [39].
Therefore, how to estimate the saliency from depth cues and
how to combine the saliency from depth with those from other
2D low-level features are two important factors in designing
3D saliency detection models.
In this paper, we propose a novel saliency detection model
for 3D images based on feature contrast from color, luminance,
texture, and depth. The features of color, luminance, texture
and depth are extracted from DCT (Discrete Cosine Trans-
form) coefficients of image patches. It is well accepted that the
DCT is a superior representation for energy compaction and
most of the signal information is concentrated on a few low-
frequency components [34]. Due to its energy compactness
property, the DCT has been widely used in various signal

processing applications in the past decades. Our previous study
has also demonstrated that DCT coefficients can be adopted
for effective feature representation in saliency detection [5].
Therefore, we use DCT coefficients for feature extraction for
image patches in this study.
In essence, the input stereoscopic image and depth map are
firstly divided into small image patches. Color, luminance and
texture features are extracted based on DCT coefficients of
each image patch from the original image, while depth feature
is extracted based on DCT coefficients of each image patch in
the depth map. Feature contrast is calculated based on center-
surround feature difference, weighted by a Gaussian model of
spatial distances between image patches for the consideration
of local and global contrast. A new fusion method is designed
to combine the feature maps to obtain the final saliency map
for 3D images. Additionally, inspired by the viewing influence
from centre bias and the property of human visual acuity in
the HVS, we propose to incorporate the centre bias factor
and human visual acuity into the proposed model to enhance
the saliency map. The Centre-Bias Map (CBM) calculated
based on centre bias factor and a statistical model of human
visual sensitivity in [38] are adopted to enhance the saliency
map for obtaining the final saliency map of 3D images.
Existing 3D saliency detection models usually adopt depth
information to weight the traditional 2D saliency map [19],
[20], or simply combine the depth saliency map and the traditional 2D saliency map [21], [23] to obtain the saliency map of 3D images. Different from these existing methods,
the proposed model adopts the low-level features of color,
luminance, texture and depth for saliency calculation in a
whole framework and designs a novel fusion method to obtain
the saliency map from feature maps. Experimental results on
eye-tracking databases demonstrate the superior performance
of the proposed model over other existing methods.
The remainder of this paper is organized as follows.
Section II introduces the related work in the literature.
In Section III, the proposed model is described in detail.
Section IV provides the experimental results on eye tracking
databases. The final section concludes the paper.
II. RELATED WORK
As introduced in the previous section, many computa-
tional models of visual attention have been proposed for
various 2D multimedia processing applications. Itti et al.
proposed one of the earliest computational saliency detec-
tion models based on the neuronal architecture of the pri-
mates’ early visual system [1]. In that study, the saliency
map is calculated by feature contrast from color, intensity
and orientation. Later, Harel et al. extended Itti’s model
by using a more accurate measure of dissimilarity [2].
In that study, the graph-based theory is used to mea-
sure saliency from feature contrast. Bruce et al. designed
a saliency detection algorithm based on information max-
imization [3]. The basic theory for saliency detection is
Shannon’s self-information measure [3]. Le Meur et al.
proposed a computational model of visual attention based
on characteristics of the HVS including contrast sensitivity
functions, perceptual decomposition, visual masking, and
center-surround interactions [12].
Hou et al. proposed a saliency detection method by the con-
cept of Spectral Residual [4]. The saliency map is computed
by log spectra representation of images from Fourier Trans-
form. Based on Hou’s model, Guo et al. designed a saliency
detection algorithm based on phase spectrum, in which the
saliency map is calculated by Inverse Fourier Transform on
a constant amplitude spectrum and the original phase spec-
trum [14]. Yan et al. introduced a saliency detection algorithm
based on sparse coding [8]. Recently, some saliency detection
models have been proposed based on patch-based contrast and obtain promising performance for salient region extraction [5]–[7].
Goferman et al. introduced a context-aware saliency detection
model based on feature contrast from color and intensity in
image patches [7]. A saliency detection model in compressed
domain is designed by Fang et al. for the application of image
retargeting [5].
Besides 2D saliency detection models, several studies have
explored the saliency detection for 3D multimedia content.
In [18], Bruce et al. proposed a stereo attention framework by
extending an existing attention architecture to the binocular
domain. However, there is no computational model proposed
in that study [18]. Zhang et al. designed a stereoscopic visual
attention algorithm for 3D video based on multiple perceptual
stimuli [19]. Chamaret et al. built a Region of Interest (ROI)
extraction method for adaptive 3D rendering [20]. Both stud-
ies [19] and [20] adopt the depth map to weight the 2D saliency map to calculate the final saliency map for 3D images. Another
class of 3D saliency detection models is built by incorporating a depth saliency map into traditional 2D saliency detection methods. In [21], Ouerhani et al. extended a 2D saliency
detection model to 3D saliency detection by taking depth cues
into account. Potapova introduced a 3D saliency detection
model for robotics tasks by incorporating the top-down cues
into the bottom-up saliency detection [22]. Lang et al. con-
ducted eye tracking experiments over 2D and 3D images for
depth saliency analysis and proposed 3D saliency detection
models by extending previous 2D saliency detection mod-
els [26]. Niu et al. explored the saliency analysis for stereo-
scopic images by extending a 2D image saliency detection
model [25]. Ciptadi et al. used the features of color and depth
to design a 3D saliency detection model for the application
of image segmentation [27]. Recently, Wang et al. proposed
a computational model of visual attention for 3D images by
extending the traditional 2D saliency detection methods. In
the study [23], the authors provided a public database with ground-truth eye-tracking data.
From the above description, the key issue for a 3D saliency detection model is how to exploit depth cues besides the traditional 2D low-level features such as color, intensity, orientation, etc. Previous studies in neuroscience indicate that the depth feature draws human attention to salient regions, just as other low-level features such as color, intensity, and motion do [15]–[17]. Therefore,
an accurate 3D saliency detection model should take depth
contrast into account as well as contrast from other common
2D low-level features. Accordingly, we propose a saliency

Fig. 1. The framework of the proposed model.
detection framework based on the feature contrast from low-
level features of color, luminance, texture and depth. A new
fusion method is designed to combine the feature maps for the
saliency estimation. Furthermore, the centre bias factor and the
human visual acuity are adopted to enhance the saliency map
for 3D images. The proposed 3D saliency detection model
can obtain promising performance for saliency estimation for
3D images, as shown in the experiment section.
III. THE PROPOSED MODEL
The framework of the proposed model is depicted in Fig. 1.
Firstly, the color, luminance, texture, and depth features are
extracted from the input stereoscopic image. Based on these
features, the feature contrast is calculated for the feature map
calculation. A fusion method is designed to combine the
feature maps into the saliency map. Additionally, we use the
centre bias factor and a model of human visual acuity to
enhance the saliency map based on the characteristics of the
HVS. We will describe each step in detail in the following
subsections.
A. Feature Extraction
In this study, the input image is divided into small image
patches and then the DCT coefficients are adopted to represent
the energy for each image patch. Our experimental results show that the proposed model achieves promising performance with patch sizes subtending a visual angle of [0.14, 0.21] degrees. In this paper, we use a patch size of 8 × 8 (a visual angle within the range of [0.14, 0.21] degrees) for the saliency calculation. This patch size is also the same as the DCT block size in JPEG compressed images. The
input RGB image is converted to YCbCr color space due to its
perceptual property. In YCbCr color space, the Y component
represents the luminance information, while Cb and Cr are
two color-opponent components. For the DCT coefficients,
DC coefficients represent the average energy over all pixels in
the image patch, while AC coefficients represent the detailed
frequency properties of the image patch. Thus, we use the
DC coefficient of Y component to represent the luminance
feature for the image patch as $L = Y_{DC}$ ($Y_{DC}$ is the DC coefficient of the Y component), while the DC coefficients of the Cb and Cr components are adopted to represent the color features as $C_1 = Cb_{DC}$ and $C_2 = Cr_{DC}$ ($Cb_{DC}$ and $Cr_{DC}$ are the DC coefficients from the Cb and Cr components, respectively).
Since the Cr and Cb components mainly include the color
information and little texture information is included in these
two channels, we use AC coefficients from only Y component
to represent the texture feature of the image patch. In a DCT block, most of the energy is contained in the first several low-frequency coefficients in the upper-left corner of the block. As there is little energy in the high-frequency coefficients in the lower-right corner of the block, we use only the first few AC coefficients to represent the texture feature of image patches. The existing study in [35] demonstrates that the first 9 low-frequency AC coefficients in zig-zag scanning order can represent most of the energy of the detailed frequency information in one 8 × 8 image patch. Based on the study [35], we use the first 9 low-frequency AC coefficients to represent the texture feature for each image patch as $T = \{Y_{AC1}, Y_{AC2}, \ldots, Y_{AC9}\}$.
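To make this step concrete, the following is a minimal illustrative sketch (not the authors' code) of the patch-level feature extraction: luminance $L$, color $C_1$, $C_2$ and texture $T$ from the 8 × 8 block DCT of a YCbCr image. The use of OpenCV, the helper name, and the zig-zag index list are assumptions made for illustration.

import numpy as np
import cv2

# zig-zag positions of the first 9 AC coefficients in an 8x8 DCT block
ZIGZAG_AC9 = [(0, 1), (1, 0), (2, 0), (1, 1), (0, 2), (0, 3), (1, 2), (2, 1), (3, 0)]

def extract_2d_features(bgr_image, patch=8):
    ycc = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2YCrCb).astype(np.float32)
    Y, Cr, Cb = cv2.split(ycc)                   # contiguous single-channel planes
    H = Y.shape[0] - Y.shape[0] % patch          # crop to a multiple of the patch size
    W = Y.shape[1] - Y.shape[1] % patch
    L, C1, C2, T = [], [], [], []
    for y in range(0, H, patch):
        for x in range(0, W, patch):
            dY  = cv2.dct(Y[y:y + patch, x:x + patch])
            dCb = cv2.dct(Cb[y:y + patch, x:x + patch])
            dCr = cv2.dct(Cr[y:y + patch, x:x + patch])
            L.append(dY[0, 0])                                   # luminance: DC of Y
            C1.append(dCb[0, 0])                                 # color: DC of Cb
            C2.append(dCr[0, 0])                                 # color: DC of Cr
            T.append([dY[i, j] for i, j in ZIGZAG_AC9])          # texture: first 9 ACs of Y
    return np.array(L), np.array(C1), np.array(C2), np.array(T)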
For the depth feature, we assume that a depth map provides
the information of the perceived depth for the scene. In a
stereoscopic display system, depth information is usually
represented by a disparity map which shows the parallax of
each pixel between the left-view and the right-view images.
The disparity is usually measured in units of pixels for display
systems. In this study, the depth map M of perceived depth
information is computed based on the disparity as [23]:
$$M = \frac{V}{1 + \frac{d \cdot H}{P \cdot W}} \qquad (1)$$
where V represents the viewing distance of the observer;
d denotes the interocular distance; P is the disparity between
pixels; W and H represent the width (in cm) and horizontal
resolution of the display screen, respectively. We set the
parameters based on the experimental studies in [23].
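As a rough illustration, Eq. (1) can be evaluated per pixel on a disparity map. The sketch below is an assumption-laden example: the function name and the default parameter values are illustrative placeholders, not the settings used in [23].

import numpy as np

def perceived_depth(P, V=80.0, d=6.3, W=93.0, H=1920):
    """Eq. (1): perceived depth M from pixel disparity P.
    V: viewing distance (cm), d: interocular distance (cm),
    W: screen width (cm), H: horizontal resolution (pixels).
    The numeric defaults are illustrative, not the values from [23]."""
    P = np.asarray(P, dtype=np.float64)
    P = np.where(P == 0, 1e-6, P)      # avoid a zero denominator at zero disparity
    return V / (1.0 + (d * H) / (P * W))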
Similar to the feature extraction for color and luminance, we adopt the DC coefficients of patches in the depth map calculated in Eq. (1) as the depth feature $D = M_{DC}$ ($M_{DC}$ represents the DC coefficient of the image patch in depth map $M$).
As described above, we can extract five features of color, luminance, texture and depth $(L, C_1, C_2, T, D)$ for the input
stereoscopic image. We will introduce how to calculate the
feature map based on these extracted features in the next
subsection.

B. Feature Map Calculation
As we have explained before, salient regions in visual scenes
pop out due to their feature contrast from their surrounding
regions. Thus, a direct method to extract salient regions in
visual scenes is to calculate the feature contrast between image
patches and their surrounding patches in visual scenes. In this
study, we estimate the saliency value of each image patch
based on the feature contrast between this image patch and
all the other patches in the image. Here, we use a Gaussian
model of spatial distance between image patches to weight the
feature contrast for saliency calculation. The saliency value $F_i^k$ of image patch $i$ from feature $k$ can be calculated as:
$$F_i^k = \sum_{j \neq i} \frac{1}{\sigma\sqrt{2\pi}} e^{-l_{ij}^2/(2\sigma^2)} \, U_{ij}^k \qquad (2)$$
where $k$ represents the feature and $k \in \{L, C_1, C_2, T, D\}$; $l_{ij}$ denotes the spatial distance between image patches $i$ and $j$; $U_{ij}^k$ represents the feature difference between image patches $i$ and $j$ from feature $k$; $\sigma$ is the parameter of
the Gaussian model and it determines the degree of local
and global contrast for the saliency estimation. σ is set as
5 based on the experiments of the previous work [5]. For any
image patch i, its saliency value is calculated based on the
center-surround differences between this patch and all other
patches in the image. The weighting for the center-surround
differences is determined by the spatial distances (within the
Gaussian model) between image patches. The differences from
nearer image patches will contribute more to the saliency value
of patch i than those from farther image patches. Thus, we
consider both local and global contrast from different features
in the proposed saliency detection model.
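A compact way to realize Eq. (2) is to precompute the pairwise patch distances and feature differences. The sketch below is a hypothetical NumPy helper (not part of the paper) showing the Gaussian-weighted center-surround summation with $\sigma = 5$.

import numpy as np

def feature_map(U, positions, sigma=5.0):
    """Eq. (2). U: (N, N) pairwise feature differences U_ij for one feature;
    positions: (N, 2) patch coordinates (in patch units).
    Returns the N saliency values F_i for that feature."""
    diff = positions[:, None, :] - positions[None, :, :]
    dists = np.linalg.norm(diff, axis=-1)                       # l_ij
    weights = np.exp(-dists ** 2 / (2.0 * sigma ** 2)) / (sigma * np.sqrt(2.0 * np.pi))
    np.fill_diagonal(weights, 0.0)                              # exclude j == i
    return (weights * U).sum(axis=1)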
The feature difference $U_{ij}^k$ between image patches $i$ and $j$ is computed differently for each feature $k$ because of the different
feature representation method. Since the color, luminance and
depth features are represented by one DC coefficient for each
image patch, the feature contrast from these features (lumi-
nance, color and depth) between two image patches i and j can
be calculated as the difference between two DC coefficients
of two corresponding image patches as follows.
$$U_{ij}^m = \frac{|B_i^m - B_j^m|}{B_i^m + B_j^m} \qquad (3)$$
where $B^m$ represents the feature and $B^m \in \{L, C_1, C_2, D\}$;
the denominator is used to normalize the feature contrast.
Since texture feature is represented as 9 low-frequency
AC coefficients, we calculate the feature contrast from texture
by the L2 norm. The feature contrast $U_{ij}$ from the texture feature between two image patches $i$ and $j$ can be computed as follows:
$$U_{ij} = \frac{\sum_t (B_i^t - B_j^t)^2}{\sum_t (B_i^t + B_j^t)} \qquad (4)$$
where $t$ indexes the AC coefficients and $t \in \{1, 2, \ldots, 9\}$; $B^t$ represents the texture feature component; the denominator is adopted
to normalize the feature contrast.
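For completeness, the two difference measures of Eqs. (3) and (4) can be written in vectorized form as below; this is an illustrative sketch, and the small epsilon guard against zero denominators is an implementation assumption, not part of the paper.

import numpy as np

def dc_feature_diff(dc, eps=1e-8):
    """Eq. (3). dc: (N,) DC-based feature (L, C1, C2 or D) of all patches."""
    num = np.abs(dc[:, None] - dc[None, :])
    den = dc[:, None] + dc[None, :]
    return num / np.maximum(np.abs(den), eps)

def texture_feature_diff(ac, eps=1e-8):
    """Eq. (4). ac: (N, 9) first 9 AC coefficients (texture feature T) of all patches."""
    num = ((ac[:, None, :] - ac[None, :, :]) ** 2).sum(axis=-1)
    den = (ac[:, None, :] + ac[None, :, :]).sum(axis=-1)
    return num / np.maximum(np.abs(den), eps)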
C. Saliency Estimation from Feature Map Fusion
After calculating feature maps indicated in Eq. (2), we fuse
these feature maps from color, luminance, texture and depth
to compute the final saliency map. It is well accepted that
different visual dimensions in natural scenes are competing
with each other during the combination for the final saliency
map [40], [41]. Existing studies have shown that a stimulus
from several saliency features is generally more conspicuous
than that from only one single feature [1], [41]. The differ-
ent visual features interact and contribute simultaneously to
the saliency of visual scenes. Currently, existing studies of
3D saliency detection (e.g. [23]) use simple linear combination
to fuse the feature maps to obtain the final saliency map. The
weighting of the linear combination is set as constant values
and is the same for all images. To address the drawbacks from
ad-hoc weighting of linear combination for different feature
maps, we propose a new fusion method to assign adaptive
weighting for the fusion of feature maps in this study.
Generally, the salient regions in a good saliency map should
be small and compact, since the HVS always focuses on some
specific interesting regions in images. Thus, a good feature
map should detect small and compact regions in the image.
During the fusion of different feature maps, we can assign
more weighting for those feature maps with small and compact
salient regions and less weighting for others with more spread
salient regions. Here, we define the measure of compactness
by the spatial variance of feature maps. The spatial variance
$\upsilon_k$ of feature map $F_k$ can be computed as follows:
$$\upsilon_k = \frac{\sum_{(i,j)} \left[ (i - E_{i,k})^2 + (j - E_{j,k})^2 \right] \cdot F_k(i,j)}{\sum_{(i,j)} F_k(i,j)} \qquad (5)$$
where $(i, j)$ is the spatial location in the feature map; $k$ represents the feature channel and $k \in \{L, C_1, C_2, T, D\}$; $(E_{i,k}, E_{j,k})$ is the average spatial location weighted by the feature response, which is calculated as:
$$E_{i,k} = \frac{\sum_{(i,j)} i \cdot F_k(i,j)}{\sum_{(i,j)} F_k(i,j)} \qquad (6)$$
$$E_{j,k} = \frac{\sum_{(i,j)} j \cdot F_k(i,j)}{\sum_{(i,j)} F_k(i,j)} \qquad (7)$$
We use the normalized $\upsilon_k$ values to represent the compactness property of the feature maps. With larger spatial variance values, the feature map is supposed to be less compact. We calculate the compactness $\beta_k$ of the feature map $F_k$ as follows:
$$\beta_k = 1 / e^{\upsilon_k} \qquad (8)$$
where $k$ represents the feature channel and $k \in \{L, C_1, C_2, T, D\}$.
Based on compactness property of feature maps calculated
in Eq. (8), we fuse the feature maps for the saliency map as
follows.
$$S_f = \sum_k \beta_k \cdot F_k + \sum_{p \neq q} \beta_p \cdot \beta_q \cdot F_p \cdot F_q \qquad (9)$$
The first term in Eq. (9) represents the linear combination of feature maps weighted by their corresponding compactness properties, while the second term is adopted to strengthen locations that are salient in several feature maps simultaneously.
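To make the fusion step concrete, the sketch below implements Eqs. (5)-(9) on 2D feature maps. It is a hedged illustration rather than the authors' implementation: normalizing the spatial variances by their maximum before the exponential in Eq. (8) is an assumption, since the normalization is not spelled out here.

import numpy as np

def spatial_variance(F):
    """Eqs. (5)-(7). F: a 2D feature map."""
    ii, jj = np.indices(F.shape)
    total = F.sum() + 1e-8
    Ei = (ii * F).sum() / total                                   # Eq. (6)
    Ej = (jj * F).sum() / total                                   # Eq. (7)
    return (((ii - Ei) ** 2 + (jj - Ej) ** 2) * F).sum() / total  # Eq. (5)

def fuse_feature_maps(maps):
    """Eq. (9). maps: dict {'L': F_L, 'C1': ..., 'C2': ..., 'T': ..., 'D': ...}."""
    v = {k: spatial_variance(F) for k, F in maps.items()}
    vmax = max(v.values()) + 1e-8
    beta = {k: 1.0 / np.exp(val / vmax) for k, val in v.items()}  # Eq. (8), normalized variance
    S = sum(beta[k] * F for k, F in maps.items())                 # linear term
    for p in maps:                                                # pairwise interaction term
        for q in maps:
            if p != q:
                S = S + beta[p] * beta[q] * maps[p] * maps[q]
    return S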
