Saliency Detection via Absorbing Markov Chain
Bowen Jiang¹, Lihe Zhang¹, Huchuan Lu¹, Chuan Yang¹, and Ming-Hsuan Yang²
¹ Dalian University of Technology
² University of California at Merced
Abstract
In this paper, we formulate saliency detection via an absorbing Markov chain on an image graph model. We jointly consider the appearance divergence and spatial distribution of salient objects and the background. The virtual boundary nodes are chosen as the absorbing nodes in a Markov chain, and the absorbed time from each transient node to the boundary absorbing nodes is computed. The absorbed time of a transient node measures its global similarity with all absorbing nodes, and thus salient objects can be consistently separated from the background when the absorbed time is used as a metric. Since the time from a transient node to the absorbing nodes depends on the weights along the path and on their spatial distance, a background region at the center of the image may be detected as salient. We further exploit the equilibrium distribution in an ergodic Markov chain to reduce the absorbed time in long-range smooth background regions. Extensive experiments on four benchmark datasets demonstrate the robustness and efficiency of the proposed method against the state-of-the-art methods.
1. Introduction
Saliency detection in computer vision aims to find the most informative and interesting region in a scene. It has been effectively applied to numerous computer vision tasks such as content-based image retrieval [32], image segmentation [30], object recognition [24] and image adaptation [21]. Existing methods are developed with bottom-up visual cues [19, 10, 26, 34] or top-down models [4, 36].
All bottom-up saliency methods rely on some prior knowledge about salient objects and backgrounds, such as contrast, compactness, etc. Different saliency methods characterize the prior knowledge from different perspectives. Itti et al. [16] extract center-surround contrast at multiple spatial scales to find the prominent region. Bruce et al. [6] exploit Shannon's self-information measure in a local context to compute saliency. However, local contrast does not consider the global influence and only stands out at object boundaries. Region contrast based methods [8, 17] first segment the image and then compute the global contrast of those segments as saliency, which can usually highlight the entire object. Fourier spectrum analysis has also been used to detect visual saliency [15, 13]. Recently, Perazzi et al. [25] unify the contrast and saliency computation into a single high-dimensional Gaussian filtering framework. Wei et al. [33] exploit background priors and geodesic distance for saliency detection. Yang et al. [35] cast saliency detection as a graph-based ranking problem, which performs label propagation on a sparsely connected graph to characterize the overall differences between the salient object and the background.
In this work, we reconsider the properties of Markov random walks and their relationship with saliency detection. Existing random walk based methods consistently use the equilibrium distribution in an ergodic Markov chain [9, 14] or its extensions, e.g. the site entropy rate [31] and the hitting time [11], to compute saliency, and each has achieved success in its own respect. However, these models still have certain limitations. Typically, a saliency measure using the hitting time often highlights some particular small regions in objects or backgrounds. In addition, equilibrium distribution based saliency models only highlight the boundaries of the salient object, while the object interior still has a low saliency value. To address these issues, we investigate the properties of absorbing Markov chains in this work. Given an image graph as a Markov chain and some absorbing nodes, we compute the expected time to absorption (i.e. the absorbed time) for each transient node. Nodes which have a similar appearance (i.e. large transition probabilities) and a small spatial distance to the absorbing nodes are absorbed faster. As salient objects seldom occupy all four image boundaries [33, 5] and the background regions often have appearance connectivity with the image boundaries, when we use the boundary nodes as absorbing nodes, a random walk starting in background nodes can easily reach the absorbing nodes. In contrast, since object regions often have great contrast to the image background, it is difficult for a random walk from object nodes to reach these absorbing nodes (represented by boundary nodes). Hence, the absorbed time starting from object nodes is longer than that from background nodes. In addition, in the long run, the absorbed times of similar starting nodes are roughly the same. Inspired by these observations, we formulate saliency detection as a random walk problem in the absorbing Markov chain.

Figure 1. The time property of the absorbing Markov chain and the ergodic Markov chain. From left to right: input image with superpixels as nodes; the minimum hitting time of each node to all boundary nodes in the ergodic Markov chain; the absorbed time of each node into all boundary nodes in the absorbing Markov chain. Each kind of time is normalized as a saliency map respectively.

The absorbed time is not always effective, especially when there are long-range smooth background regions near the image center. We further explore the effect of the equilibrium probability in saliency detection, and exploit it to regulate the absorbed time, thereby suppressing the saliency of such regions.
2. Related Work
Previous works that model saliency detection with random walks include [9, 14, 11, 31]. Costa et al. [9] identify the salient region based on the frequency of visits to each node at the equilibrium of the random walk. Harel et al. [14] extend the above method by defining a dissimilarity measure to model the transition probability between two nodes. In [31], Wang et al. introduce the entropy rate and incorporate the equilibrium distribution to measure the average information transmitted from a node to the others at one step, which is used to predict visual attention. A major problem with using the equilibrium distribution is that this approach often highlights only the texture and boundary regions rather than the entire object, as the equilibrium probability in a cluttered region is larger than in a homogeneous region when the dissimilarity of two nodes is used to represent their transition probability. Furthermore, the main objective in [9, 14, 31] is to predict human fixations on natural images, as opposed to identifying salient regions that correspond to objects, which is the focus of this paper.
The approach most related to ours is Gopalakrishnan et al. [11], which exploits the hitting time on the fully connected graph and the sparsely connected graph to find the most salient seed, based on which some background seeds are then determined. They then use the difference of the hitting times to the two kinds of seeds to compute the saliency of each node. While this alleviates the problem of using the equilibrium distribution to measure saliency, the identification of the salient seed is difficult, especially for scenes with complex salient objects. More importantly, the hitting time based saliency measure tends to highlight the globally rare regions and does not suppress the backgrounds very well, thereby decreasing the overall saliency of objects (see Figure 1). This can be explained as follows. The hitting time is the expected time taken to reach a node if the Markov chain is started in another node. The ergodic Markov chain does not have a mechanism that can jointly consider the relationships between a node and multiple specific nodes (e.g. seed nodes). In [11], to describe the relevance of a node to the background seeds, they use the minimum hitting time to take all the background seeds into account. The minimum time itself is sensitive to noisy regions in the image.
Different from the above methods, we consider the absorbing Markov random walk, which includes two kinds of nodes (i.e. absorbing nodes and transient nodes), to measure saliency. For an absorbing chain started in a transient node, the probability of absorption in an absorbing node indicates the relationship between the two nodes, and the absorption time therefore reflects the selective relationships between this transient node and all the absorbing nodes. Since the boundary nodes usually contain the global characteristics of the image background, by using them as absorbing nodes, the absorbed time of each transient node can reflect its overall similarity with the background, which helps to distinguish salient nodes from background nodes. Moreover, as the absorbed time is the expected time to all the absorbing nodes, it covers the effect of all the boundary nodes, which can alleviate the influence of particular regions and encourage similar nodes in a local context to have similar saliency, thereby overcoming the defects of using the equilibrium distribution [9, 14, 11, 31]. Different from [9, 14], which directly use the equilibrium distribution to simulate human attention, we exploit it to weight the absorbed time, thereby suppressing the saliency of long-range background regions with homogeneous appearance.
3. Principle of Markov Chain
Given a set of states $S = \{s_1, s_2, \ldots, s_m\}$, a Markov chain can be completely specified by the $m \times m$ transition matrix $\mathbf{P}$, in which $p_{ij}$ is the probability of moving from state $s_i$ to state $s_j$. This probability does not depend on which states the chain was in before the current state. The chain starts in some state and moves from one state to another successively.
3.1. Absorbing Markov Chain
The state $s_i$ is absorbing when $p_{ii} = 1$, which means $p_{ij} = 0$ for all $i \neq j$. A Markov chain is absorbing if it has at least one absorbing state and it is possible to go from every transient state to some absorbing state, not necessarily in one step. Considering an absorbing chain with $r$ absorbing states and $t$ transient states, renumber the states so that the transient states come first; then the transition matrix $\mathbf{P}$ has the following canonical form,

$$\mathbf{P} \rightarrow \begin{pmatrix} \mathbf{Q} & \mathbf{R} \\ \mathbf{0} & \mathbf{I} \end{pmatrix}, \tag{1}$$

where the first $t$ states are transient and the last $r$ states are absorbing. $\mathbf{Q} \in [0, 1]^{t \times t}$ contains the transition probabilities between any pair of transient states, while $\mathbf{R} \in [0, 1]^{t \times r}$ contains the probabilities of moving from any transient state to any absorbing state. $\mathbf{0}$ is the $r \times t$ zero matrix and $\mathbf{I}$ is the $r \times r$ identity matrix.
For an absorbing chain, we can derive its fundamental matrix $\mathbf{N} = (\mathbf{I} - \mathbf{Q})^{-1}$ (here $\mathbf{I}$ is the $t \times t$ identity matrix), where $n_{ij}$ can be interpreted as the expected number of times that the chain spends in the transient state $j$ given that the chain starts in the transient state $i$, and the sum $\sum_j n_{ij}$ gives the expected number of steps before absorption (into any absorbing state). Thus, we can compute the absorbed time for each transient state, that is,

$$\mathbf{y} = \mathbf{N} \times \mathbf{c}, \tag{2}$$

where $\mathbf{c}$ is a $t$-dimensional column vector all of whose elements are 1.
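As a minimal sketch of Eq. 2 (not the authors' implementation), the following NumPy snippet computes the absorbed time for a toy chain. The five-state random walk and the helper name `absorbed_time` are illustrative assumptions: states 0 to 2 are transient and form a line, while states 3 and 4 are absorbing and sit at the two ends.

```python
import numpy as np

def absorbed_time(P, t):
    """Expected number of steps to absorption for the first t states.

    P is assumed to be in the canonical form of Eq. 1, with the t
    transient states listed before the absorbing ones.
    """
    Q = P[:t, :t]                        # transient-to-transient block
    N = np.linalg.inv(np.eye(t) - Q)     # fundamental matrix N = (I - Q)^-1
    return N @ np.ones(t)                # Eq. 2: row sums of N

# Toy chain (hypothetical): a walk on a line of three transient states
# (0, 1, 2) with an absorbing state at each end (3 next to 0, 4 next to 2).
P = np.array([
    [0.0, 0.5, 0.0, 0.5, 0.0],
    [0.5, 0.0, 0.5, 0.0, 0.0],
    [0.0, 0.5, 0.0, 0.0, 0.5],
    [0.0, 0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 1.0],
])
y = absorbed_time(P, t=3)
print(y)  # approximately [3. 4. 3.]
```

The middle state has the largest absorbed time, which mirrors the observation that nodes farther from the absorbing nodes are absorbed more slowly.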
3.2. Ergodic Markov Chain
An ergodic Markov chain is one in which it is possible to go from every state to every state, not necessarily in one step. An ergodic chain with any starting state always reaches equilibrium after a certain time, and the equilibrium state is characterized by the equilibrium distribution $\boldsymbol{\pi}$, which satisfies the equation

$$\boldsymbol{\pi} \mathbf{P} = \boldsymbol{\pi}, \tag{3}$$

where $\mathbf{P}$ is the ergodic transition matrix. $\boldsymbol{\pi}$ is a strictly positive probability vector, where $\pi_i$ describes the expected probability of the chain staying in state $s_i$ in equilibrium. When the chain starts in state $s_i$, the mean recurrent time $h_i$ (i.e., the expected number of steps to return to state $s_i$) can be derived from the equilibrium distribution $\boldsymbol{\pi}$. That is,

$$h_i = \frac{1}{\pi_i}, \tag{4}$$

where $i$ indexes all the states in the ergodic Markov chain. The more states similar to state $s_i$ there are nearby, the smaller $h_i$ is. The derivation details and proofs can be found in [12].
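Eqs. 3 and 4 can be checked numerically on a small example. The sketch below is an illustration under assumptions, not part of the original method: the three-state matrix and the function name `equilibrium_distribution` are made up, and the equilibrium distribution is obtained by solving the stationarity equations together with the constraint that the probabilities sum to one, in the least-squares sense.

```python
import numpy as np

def equilibrium_distribution(P):
    """Solve pi P = pi with sum(pi) = 1 for an ergodic transition matrix P."""
    m = P.shape[0]
    # Stack the stationarity equations (P^T - I) pi = 0 with the
    # normalisation constraint 1^T pi = 1 and solve by least squares.
    lhs = np.vstack([P.T - np.eye(m), np.ones((1, m))])
    rhs = np.concatenate([np.zeros(m), [1.0]])
    pi, *_ = np.linalg.lstsq(lhs, rhs, rcond=None)
    return pi

# Hypothetical ergodic chain on three states.
P = np.array([
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.50, 0.50],
])
pi = equilibrium_distribution(P)
h = 1.0 / pi          # Eq. 4: mean recurrent time of each state
print(pi, h)          # pi is about [0.25 0.5 0.25], h about [4. 2. 4.]
```

The state that is visited most often in equilibrium (the middle one here) has the shortest recurrent time.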
3.3. Saliency Measure
Given an input image represented as a Markov chain and some background absorbing states, the saliency of each transient state is defined as the expected number of steps before being absorbed into the absorbing nodes, computed by Eq. 2. In this work, the transition matrix is constructed on a sparsely connected graph, where each node corresponds to a state. Because we compute the full resolution saliency map, some virtual nodes are added to the graph as absorbing states, which is detailed in the next section.

Figure 2. Illustration of the absorbing nodes. The superpixels outside the yellow bounding box are the duplicated boundary superpixels, which are used as the absorbing nodes.

In conventional absorbing Markov chain problems, the absorbing nodes are manually labelled with the ground truth. However, as the absorbing nodes for saliency detection are selected by the proposed algorithm, some of them may be incorrect. They have insignificant effects on the final results, as explained in the following sections.
4. Graph Construction
We construct a single-layer graph $G(V, E)$ with superpixels [3] as nodes $V$ and the links between pairs of nodes as edges $E$. Because salient objects seldom occupy all image borders [33], we duplicate the boundary superpixels around the image borders as the virtual background absorbing nodes, as shown in Figure 2. On this graph, each node (transient or absorbing) is connected to the transient nodes which neighbour it or share common boundaries with its neighbouring nodes. This means that no pair of absorbing nodes is connected. In addition, we enforce that all the transient nodes around the image borders (i.e., boundary nodes) are fully connected with each other, which can reduce the geodesic distance between similar superpixels. The weights of the edges encode nodal affinity, such that nodes connected by an edge with a high weight are considered to be strongly connected and edges with low weights represent nearly disconnected nodes. In this work, the weight $w_{ij}$ of the edge $e_{ij}$ between adjacent nodes $i$ and $j$ is defined as

$$w_{ij} = e^{-\frac{\|x_i - x_j\|}{\sigma^2}}, \tag{5}$$

where $x_i$ and $x_j$ are the mean colors of the two nodes in the CIE LAB color space, and $\sigma$ is a constant that controls the strength of the weight. We first renumber the nodes so that the first $t$ nodes are transient nodes and the last $r$ nodes are absorbing nodes, then define the affinity matrix $\mathbf{A}$, which represents the relevance of nodes, as

$$a_{ij} = \begin{cases} w_{ij} & j \in N(i),\ 1 \leq i \leq t \\ 1 & \text{if } i = j \\ 0 & \text{otherwise} \end{cases} \tag{6}$$

where $N(i)$ denotes the nodes connected to node $i$. The degree matrix that records the sum of the weights connected to each node is written as

$$\mathbf{D} = \mathrm{diag}\Big(\sum_j a_{ij}\Big). \tag{7}$$

Finally, the transition matrix $\mathbf{P}$ on the sparsely connected graph is given as

$$\mathbf{P} = \mathbf{D}^{-1} \times \mathbf{A}, \tag{8}$$

which is actually the row-normalized $\mathbf{A}$. As the nodes are locally connected, $\mathbf{P}$ is a sparse matrix with a small number of nonzero elements.
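Eqs. 5 to 8 can be sketched as follows. This is only an illustration: the function name `transition_matrix`, the toy feature vectors and the edge list are assumptions, and a real implementation would take the mean CIE LAB colors and the adjacency from a superpixel segmentation and would likely store the affinity matrix in a sparse format.

```python
import numpy as np

def transition_matrix(x, edges, t, sigma2=0.1):
    """Build P = D^-1 A (Eq. 8) for nodes ordered transient-first.

    x      : (n, d) array of mean colour features, one row per node
    edges  : iterable of undirected links (i, j) between connected nodes
    t      : number of transient nodes (indices 0 .. t-1)
    sigma2 : the constant sigma^2 of Eq. 5
    """
    n = len(x)
    A = np.eye(n)                                   # a_ii = 1 (Eq. 6)
    for i, j in edges:
        w = np.exp(-np.linalg.norm(x[i] - x[j]) / sigma2)   # Eq. 5
        if i < t:                                   # only transient rows get weights
            A[i, j] = w
        if j < t:
            A[j, i] = w
    D = A.sum(axis=1)                               # Eq. 7: node degrees
    return A / D[:, None]                           # Eq. 8: row-normalised A

# Toy graph (hypothetical): nodes 0 and 1 are transient, node 2 is absorbing.
x = np.array([[0.0, 0.0, 0.0],
              [0.1, 0.0, 0.0],
              [0.1, 0.0, 0.0]])
P = transition_matrix(x, edges=[(0, 1), (1, 2)], t=2)
print(P)
```

Because absorbing rows keep only their diagonal entry, row normalization leaves them as rows of the identity, which gives the canonical form of Eq. 1 directly.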
The sparsely connected graph restricts the random walk to move only within a local region in each step, hence the expected time spent to move from a transient node $v_t$ to an absorbing node $v_a$ is determined by two major factors. One is the spatial distance between the two nodes: the larger their distance, the longer the expected time. The other is the transition probabilities of the nodes along the different paths from $v_t$ to $v_a$: large probabilities shorten the expected time to absorption. Given a starting node $v_t$, the shorter the time is, the larger the probability of absorption in node $v_a$ is in the long run.
5. Saliency Detection
Given the transition matrix $\mathbf{P}$ by Eq. 8, we can easily extract the matrix $\mathbf{Q}$ by Eq. 1, based on which the fundamental matrix $\mathbf{N}$ is computed. Then, we obtain the saliency map $\mathbf{S}$ by normalizing the absorbed time $\mathbf{y}$ computed by Eq. 2 to the range between 0 and 1, that is

$$\mathbf{S}(i) = \bar{\mathbf{y}}(i), \quad i = 1, 2, \ldots, t, \tag{9}$$

where $i$ indexes the transient nodes on the graph, and $\bar{\mathbf{y}}$ denotes the normalized absorbed time vector.
Most saliency maps generated by the normalized absorbed time $\bar{\mathbf{y}}$ are effective, but some background nodes near the image center may not be adequately suppressed when they are in a long-range homogeneous region, as shown in Figure 3. This can be explained as follows. Most nodes in this kind of background region have large transition probabilities, which means that the random walk may transfer many times among these nodes before reaching the absorbing nodes. Because of the sparse connectivity of the graph, the background nodes near the image center have a longer absorbed time than similar nodes near the image boundaries. Consequently, the background regions near the image center may present saliency comparable to that of salient objects, thereby decreasing the contrast between objects and backgrounds in the resulting saliency maps. To alleviate this problem, we update the saliency map by using a weighted absorbed time $\mathbf{y}_w$, which can be denoted as:

$$\mathbf{y}_w = \mathbf{N} \times \mathbf{u}, \tag{10}$$

where $\mathbf{u}$ is the weighting column vector. In this work, we use the normalized recurrent time of an ergodic Markov chain, of which the transition matrix is the row-normalized $\mathbf{Q}$, as the weight $\mathbf{u}$.

Figure 3. Examples showing the benefits of the update processing. From left to right: input images, results without and with the update processing.

The equilibrium distribution $\boldsymbol{\pi}$ for the ergodic Markov chain can be computed from the affinity matrix $\mathbf{A}$ as

$$\pi_i = \frac{\sum_j a_{ij}}{\sum_{ij} a_{ij}}, \tag{11}$$

where $i$, $j$ index all the transient nodes. Since we define the edge weight $w_{ij}$ as the similarity between two nodes, the nodes within a homogeneous region have a large weighted sum $\sum_j a_{ij}$. This means the recurrent time in this kind of region is small, as shown in Figure 3. For this reason, we use the average recurrent time $h_j$ of each node $j$ to weight the corresponding element $n_{ij}$ (i.e., the expected time spent in node $j$ before absorption given starting node $i$) in each row of the fundamental matrix $\mathbf{N}$. Precisely, given the equilibrium distribution $\boldsymbol{\pi}$, $h_j$ is computed by Eq. 4 and the weighting vector $\mathbf{u}$ is computed as:

$$u_j = \frac{h_j}{\sum_k h_k}, \tag{12}$$

where $j$ and $k$ index all the transient nodes on the graph.
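A minimal sketch of Eqs. 10 to 12, assuming an affinity matrix laid out as in Eq. 6 with the transient nodes first. The function name `weighted_absorbed_time` and the four-node matrix are hypothetical. Eq. 11 gives the equilibrium distribution in closed form because the transient block of the affinity matrix is symmetric; the snippet checks this numerically rather than taking it for granted.

```python
import numpy as np

def weighted_absorbed_time(A, t):
    """Return (y_w, pi) for an affinity matrix A with t transient nodes first."""
    P = A / A.sum(axis=1, keepdims=True)        # Eq. 8
    N = np.linalg.inv(np.eye(t) - P[:t, :t])    # fundamental matrix
    A_tt = A[:t, :t]                            # affinities among transient nodes
    pi = A_tt.sum(axis=1) / A_tt.sum()          # Eq. 11
    h = 1.0 / pi                                # Eq. 4: recurrent time
    u = h / h.sum()                             # Eq. 12: normalised weights
    return N @ u, pi                            # Eq. 10

# Hypothetical graph: nodes 0..2 transient, node 3 absorbing.
A = np.array([
    [1.0, 0.8, 0.0, 0.5],
    [0.8, 1.0, 0.8, 0.0],
    [0.0, 0.8, 1.0, 0.5],
    [0.0, 0.0, 0.0, 1.0],
])
y_w, pi = weighted_absorbed_time(A, t=3)

# pi is stationary for the row-normalised transient block of A.
P_erg = A[:3, :3] / A[:3, :3].sum(axis=1, keepdims=True)
print(np.allclose(pi @ P_erg, pi))  # True
```

Since the entries of the weight vector sum to one, each weighted absorbed time is smaller than the corresponding unweighted one, and the reduction is strongest where the walk spends most of its time in nodes with a short recurrent time.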
By the update processing, the saliency of the long-range homogeneous regions near the image center can be suppressed, as Figure 3 illustrates. However, if such a region belongs to the salient object, its saliency will also be incorrectly suppressed. Therefore, we define a principle to decide which maps need to be further updated. We find that object regions have great global contrast to background regions in good saliency maps, while this is not the case in defective maps such as the examples in Figure 3, which consistently contain a number of regions with mid-level saliency. Hence, given a saliency map, we first calculate its gray histogram $\mathbf{g}$ with ten bins, and then define a metric $score$ to characterize this kind of tendency as follows:

$$score = \sum_{b=1}^{10} g(b) \times \min(b, (11 - b)), \tag{13}$$

where $b$ indexes all the bins. A larger $score$ means that there are longer-range regions with mid-level saliency in the saliency map.

Figure 4. Examples in which the salient objects appear at the image boundaries. From top to bottom: input images, our saliency maps.

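Eq. 13 could be computed as in the sketch below. It assumes, since the text does not state it explicitly, that the histogram is normalized to sum to one and that saliency values lie in [0, 1]; the function name `midlevel_score` is made up for the illustration.

```python
import numpy as np

def midlevel_score(saliency, bins=10):
    """Score of Eq. 13: large when many values have mid-level saliency."""
    # Assumption: g is normalised so that its entries sum to 1.
    counts, _ = np.histogram(saliency, bins=bins, range=(0.0, 1.0))
    g = counts / counts.sum()
    b = np.arange(1, bins + 1)                  # bin indices 1..10
    weights = np.minimum(b, bins + 1 - b)       # min(b, 11 - b)
    return float(np.sum(g * weights))

clean = np.array([0.0, 0.0, 1.0, 1.0])          # well separated map
murky = np.array([0.45, 0.45, 0.45, 0.45])      # everything mid-level
print(midlevel_score(clean), midlevel_score(murky))  # 1.0 5.0
```

Under this normalization the score ranges from 1 (all values in the two extreme bins) to 5 (all values in the two central bins).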
It should be noted that the absorbing nodes may include object nodes when the salient objects touch the image boundaries, as shown in Figure 4. These imprecise background absorbing nodes may cause the object regions close to the boundary to be suppressed. However, the absorbed time considers the effect of all boundary nodes and depends on two factors: the edge weights on the path and the spatial distance, so the parts of the object which are far from or different from the boundary absorbing nodes can be highlighted correctly. The main procedure of the proposed method is summarized in Algorithm 1.
Algorithm 1 Saliency detection based on Markov random walk
Input: An image and required parameters.
1. Construct a graph $G$ with superpixels as nodes, and use boundary nodes as absorbing nodes;
2. Compute the affinity matrix $\mathbf{A}$ by Eq. 6 and the transition matrix $\mathbf{P}$ by Eq. 8;
3. Extract the matrix $\mathbf{Q}$ from $\mathbf{P}$ by Eq. 1, and compute the fundamental matrix $\mathbf{N} = (\mathbf{I} - \mathbf{Q})^{-1}$ and the map $\mathbf{S}$ by Eq. 9;
4. Compute the $score$ by Eq. 13; if $score < \gamma$, output $\mathbf{S}$ and return;
5. Compute the recurrent time $\mathbf{h}$ by Eq. 11 and 4, and the weight $\mathbf{u}$ by Eq. 12, then compute the saliency map $\mathbf{S}$ by Eq. 10 and 9;
Output: the full resolution saliency map.
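Steps 2 to 5 of Algorithm 1 can be sketched at node level as follows. This is a hedged illustration rather than the authors' code: graph construction from superpixels (step 1) is assumed to have produced the affinity matrix already, the score is computed on node saliencies instead of on the full resolution map, the histogram is assumed to be normalized, and the names `normalize` and `saliency_from_affinity` are hypothetical.

```python
import numpy as np

def normalize(v):
    """Rescale to [0, 1] (Eq. 9); a constant vector maps to zeros."""
    span = v.max() - v.min()
    return (v - v.min()) / span if span > 0 else np.zeros_like(v)

def saliency_from_affinity(A, t, gamma=2.0):
    """Node saliency from an Eq. 6 affinity matrix with t transient nodes first."""
    P = A / A.sum(axis=1, keepdims=True)               # step 2, Eq. 8
    N = np.linalg.inv(np.eye(t) - P[:t, :t])           # step 3
    S = normalize(N @ np.ones(t))                      # Eq. 2 and Eq. 9
    g, _ = np.histogram(S, bins=10, range=(0.0, 1.0))  # step 4, Eq. 13
    b = np.arange(1, 11)
    score = np.sum(g / g.sum() * np.minimum(b, 11 - b))
    if score < gamma:
        return S                                       # map is good enough
    A_tt = A[:t, :t]
    pi = A_tt.sum(axis=1) / A_tt.sum()                 # step 5, Eq. 11
    h = 1.0 / pi                                       # Eq. 4
    return normalize(N @ (h / h.sum()))                # Eq. 12, 10 and 9

# Toy graph (hypothetical): a chain of five transient nodes with unit
# weights; absorbing node 5 is linked to node 0 and node 6 to node 4.
t = 5
A = np.eye(7)
for i in range(t - 1):
    A[i, i + 1] = A[i + 1, i] = 1.0
A[0, 5] = 1.0
A[4, 6] = 1.0
S = saliency_from_affinity(A, t)
print(S)  # approximately [0. 0.75 1. 0.75 0.]
```

On this toy chain the node farthest from both absorbing nodes receives the highest saliency, and the score stays below the default threshold, so the update of step 5 is skipped.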
6. Experimental Results
We evaluate the proposed method on four benchmark datasets. The first one is the MSRA dataset [18], which contains 5,000 images with the ground truth marked by bounding boxes. The second one is the ASD dataset, a subset of the MSRA dataset, which contains 1,000 images with accurate human-labelled ground truth provided by [2]. The third one is the SED dataset [28], which contains two parts: the single object sub-dataset SED1 and the two objects sub-dataset SED2. Each sub-dataset contains 100 images and has accurate human-labelled ground truth. The fourth one is the most challenging SOD dataset, which contains 300 images from the Berkeley segmentation dataset [22]. This dataset was first used for salient object segmentation evaluation in [23], where seven subjects are asked to label the foreground salient object masks. For each object mask of each subject, a consistency score is computed based on the labels of the other six subjects. We select and combine the object masks whose consistency scores are higher than 0.7 as the final ground truth, as done in [33]. We compare our method with fifteen state-of-the-art saliency detection algorithms: the IT [16], MZ [20], LC [37], GB [14], SR [15], AC [1], FT [2], SER [31], CA [27], RC [8], CB [17], SVO [7], SF [25], LR [29] and GS [33] methods.
Experimental Setup: We set the number of superpixel nodes $N = 250$ in all the experiments. There are two parameters in the proposed algorithm: the edge weight parameter $\sigma$ in Eq. 5, which controls the strength of the weight between a pair of nodes, and the threshold $\gamma$ of $score$ in Eq. 13, which indicates the quality of the saliency map. These two parameters are empirically chosen as $\sigma^2 = 0.1$ and $\gamma = 2$ for all the test images in the experiments.
Evaluation Metrics: We evaluate all methods by precision, recall and F-measure. The precision is defined as the ratio of correctly assigned salient pixels to all the pixels of the extracted regions. The recall is defined as the ratio of detected salient pixels to the number of ground-truth salient pixels. The F-measure is an overall performance measurement computed as the weighted harmonic mean of precision and recall:

$$F_\beta = \frac{(1 + \beta^2)\, Precision \times Recall}{\beta^2\, Precision + Recall}. \tag{14}$$

We set $\beta^2 = 0.3$ to stress precision more than recall, the same as in [2, 8, 25]. As in previous works, two evaluation criteria are used in our experiments. First, we bi-segment the saliency map using every threshold in the range [0 : 0.05 : 1], and compute the precision and recall at each value of the threshold to plot the precision-recall curve. Second, we compute the precision, recall and F-measure with an adaptive threshold proposed in [2], which is defined as twice the mean saliency of the image.
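The two evaluation criteria can be sketched as below. This is an illustration under assumptions rather than the evaluation code used in the experiments: the function names are hypothetical, pixels are treated as salient when their value is greater than or equal to the threshold, and empty masks are given a score of zero.

```python
import numpy as np

def precision_recall(binary, gt):
    """Precision and recall of a binary mask against a binary ground truth."""
    tp = np.logical_and(binary, gt).sum()
    precision = tp / binary.sum() if binary.sum() > 0 else 0.0
    recall = tp / gt.sum() if gt.sum() > 0 else 0.0
    return float(precision), float(recall)

def f_measure(precision, recall, beta2=0.3):
    """Eq. 14 with beta^2 = 0.3 by default."""
    denom = beta2 * precision + recall
    return (1 + beta2) * precision * recall / denom if denom > 0 else 0.0

def pr_curve(saliency, gt):
    """Precision and recall at the thresholds 0, 0.05, ..., 1."""
    thresholds = np.linspace(0.0, 1.0, 21)
    return [precision_recall(saliency >= thr, gt) for thr in thresholds]

def adaptive_scores(saliency, gt):
    """Scores at the adaptive threshold: twice the mean saliency."""
    p, r = precision_recall(saliency >= 2.0 * saliency.mean(), gt)
    return p, r, f_measure(p, r)

# Toy 2x2 saliency map and ground truth (hypothetical values).
saliency = np.array([[1.0, 0.5], [0.25, 0.25]])
gt = np.array([[True, True], [False, False]])
print(adaptive_scores(saliency, gt))  # (1.0, 0.5, 0.8125)
```

In the toy example the adaptive threshold is 1.0, so only one of the two ground-truth pixels is detected: precision is perfect, recall is one half, and the F-measure leans toward precision because of the small weight on recall.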
References
[2] Frequency-tuned salient region detection.
[8] Global contrast based salient region detection.
[16] A model of saliency-based visual attention for rapid scene analysis.
[22] A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics.