An efficient algorithm for Co-segmentation
Dorit S. Hochbaum
Haas School of Business, and Industrial Eng. and Operations Research, Univ. of California, Berkeley
www.ieor.berkeley.edu/hochbaum

Vikas Singh
Biostatistics & Medical Informatics, and Computer Sciences, Univ. of Wisconsin–Madison
www.biostat.wisc.edu/vsingh
Abstract
This paper is focused on the Co-segmentation problem [1] - where the objective is to segment a similar object from a pair of images. The background in the two images may be arbitrary; therefore, simultaneous segmentation of both images must be performed with a requirement that the appearance of the two sets of foreground pixels in the respective images is consistent. Existing approaches [1, 2] cast this problem as a Markov Random Field (MRF) based segmentation of the image pair with a regularized difference of the two histograms - assuming a Gaussian prior on the foreground appearance [1] or by calculating the sum of squared differences [2]. Both are interesting formulations but lead to difficult optimization problems, due to the presence of the second (histogram difference) term. The model proposed here bypasses measurement of the histogram differences in a direct fashion; we show that this enables obtaining efficient solutions to the underlying optimization model. Our new algorithm is similar to the existing methods in spirit, but differs substantially in that it can be solved to optimality in polynomial time using a maximum flow procedure on an appropriately constructed graph. We discuss our ideas and present promising experimental results.
1. Introduction
The idea of co-segmentation, first introduced in [1], refers to the simultaneous segmentation of two images. The problem is well illustrated by the example in Fig. 1, where the same (or similar) object appears in two different images, and we seek to perform a segmentation of only the similar regions in both views. This problem was partly motivated in [1] by the need for computing meaningful similarity measures between images of the same subject but with different (and unrelated) backdrops in image retrieval applications [3]. A related goal was to facilitate segmentation of an object (or a region of interest) by providing minimal additional information (such as just one additional image).

Figure 1. A similar object in two images in rows 1-2. The histogram of the foreground (of row 2 images) is shown in row 3.

The idea has been utilized in a number of other concurrent foreground extraction tasks using multiple images [4], images acquired with/without camera flash [5], image sequences [6], and for identifying individuals using image collections [7]. Later in the paper, we discuss how the idea may be applied to pathology identification problems in biomedical images. The purpose of this paper, however, is to investigate efficient means of solving (the underlying optimization problem of) co-segmentation.

The identification of similar objects in more than one image is a fundamental problem in computer vision and has relied on user annotation or construction of models [8, 9]. A number of recent techniques [1, 10, 11, 12], however, have preferred an unsupervised (or semi-supervised) approach to the problem and obtained good overall performance. Co-segmentation belongs to this second category. The key idea adopted in [1] was to apply an MRF segmentation on both images with an additional term that penalizes the variation in the histograms of the foreground regions in the two images, see Fig. 1. The energy function is expressed as
$$\min E(x_1, x_2) = E_1(x_1, x_2) + E_{\text{global}}(\hat{h}_1, \hat{h}_2), \qquad (1)$$

where $x_1, x_2 \in \{0, 1\}$ are binary variables indicating the assignment of pixels to the background or foreground, $E_1(\cdot, \cdot)$ denotes the sum of the image-wise MRF energies, and $E_{\text{global}}(\hat{h}_1, \hat{h}_2)$ measures the difference between a Gaussian model of the two foreground histograms, $\hat{h}_1$ and $\hat{h}_2$. The authors [1, 2] note that efficient optimization of (1) is not tractable. Therefore, the proposed procedure initializes the segmentations, and incrementally improves one of the segmentations keeping the other fixed (and vice-versa)
until no further improvements are possible. In subsequent work [6], a generative model was proposed for performing co-segmentation in image sequences, and a locally maximal marginal log posterior estimate was obtained using an expectation maximization (EM) algorithm with certain convergence criteria. Later, the authors in [4] extended many of these ideas further by incorporating local context, i.e., patterns characterizing the local color and edge configurations. This led to improved results relative to [1, 13]. However, the technique of [4] was focused on empirical performance, and less effort was devoted toward better means of optimizing the co-segmentation cost function in [1]. Recently, [2] proposed addressing some of these difficulties by replacing the second term with the squared difference of the two histograms. This approach no longer requires the histograms to be Gaussian, and leads to a quadratic pseudoboolean optimization model [14]. The authors prove that their formulation yields half-integral solutions (i.e., values in $\{0, \frac{1}{2}, 1\}$) to the optimization problem. However, the problem still remains hard (and cannot be solved optimally), and obtaining provably good quality guarantees in the general case is difficult. On the practical side, solving the linear program in [2] for large images may be computationally intensive.
Carrot or Stick? The simplest interpretation of the co-segmentation model, as discussed above, is that it encourages good and coherent segmentation of both images with an additional requirement of consistency between the foreground histograms. To enforce this requirement, variations between foreground histograms are penalized in [1, 4, 2], but this leads to intractable optimization problems. On the other hand, notice that a similar effect may also be achieved by rewarding consistency in the two foreground histograms (rather than explicitly penalizing their difference): the carrot or the stick¹. While this seems logical once we choose a suitable measure of histogram consistency, the key question is: what is the benefit of adopting this second approach? In the following sections, we will show that this modification leads to a polynomial time algorithm for co-segmentation. This is the primary contribution of this paper.

¹ A mechanism of offering rewards (e.g., carrot) or threatening punishment (e.g., stick) to induce a desired behavior.
2. Preliminaries
In the co-segmentation setting, we are given two images for segmentation: $I^{(1)}$ and $I^{(2)}$. The images are of the same size, consisting of $n$ pixels each, where the $j$-th pixel in the $q$-th image is denoted by $I^{(q)}_j$, for $j = 1, \ldots, n$ and $q = 1, 2$. We are also given a classification of each pixel in each image into one of $K$ ‘buckets’ in a histogram for each image. Let the histogram buckets (each bucket corresponds to an intensity range $H_k$) be given as $h_1, h_2, \ldots, h_K$. For each image $I^{(q)}$, $q = 1, 2$, this may be specified in terms of a matrix $B^{(q)}$ of size $n \times K$ such that for pixel $j$ and bucket $h_k$,

$$B^{(q)}_{j,k} = \begin{cases} 1 & \text{if } I^{(q)}_j \in H_k; \\ 0 & \text{otherwise.} \end{cases} \qquad (2)$$

That is, the entry $B^{(q)}_{j,k}$ is 1 if the intensity of pixel $I^{(q)}_j$ falls in the intensity range of bucket $h_k$, where $q$ refers to either the first or the second image.
A segmentation of each image will partition the set of pixels into foreground versus background pixels. Our interest is to ensure that the foregrounds in the two images are similar. Toward this goal, the objective is to get (1) the number of pixels that are in the foreground, and (2) the number of pixels in $H_k$, to be approximately similar in both images. One strategy is to define similarity between all pairs of pixels $I^{(1)}_i$ and $I^{(2)}_j$. We can say that the pair $i, j$ is similar if both belong to $H_k$, and designate a similarity weight $s_{ij}$ to be equal to 1 if that happens. Formally,

$$s_{ij} = \begin{cases} 1 & \text{if } \exists k \text{ such that } B^{(1)}_{i,k} = B^{(2)}_{j,k} = 1 \\ 0 & \text{otherwise.} \end{cases}$$
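Since each row of $B^{(q)}$ is one-hot, the whole table of similarity weights is just a matrix product, $s_{ij} = \sum_k B^{(1)}_{i,k} B^{(2)}_{j,k}$. A minimal sketch (our own helper name, not from the paper):

```python
import numpy as np

def similarity_weights(B1, B2):
    """s_ij = 1 iff pixel i of image 1 and pixel j of image 2 fall in
    the same histogram bucket. With one-hot rows, B1 @ B2.T yields
    exactly the 0/1 weights s_ij of the definition above."""
    return B1.astype(np.int64) @ B2.astype(np.int64).T
```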
Let $x^{(q)}_j$ be a binary variable indicating whether pixel $I^{(q)}_j$ is classified in the foreground:

$$x^{(q)}_j = \begin{cases} 1 & \text{if } I^{(q)}_j \text{ is classified as foreground} \\ 0 & \text{if } I^{(q)}_j \text{ is classified as background.} \end{cases}$$

The number of pixels in the foreground of $I^{(1)}$ that belong to $H_k$ is denoted by $a_k = \sum_{j=1}^{n} B^{(1)}_{j,k} x^{(1)}_j$, and the number in $I^{(2)}$ that belong to $H_k$ is denoted by $b_k = \sum_{j=1}^{n} B^{(2)}_{j,k} x^{(2)}_j$.
Let the total number of foreground pixels in $I^{(1)}$ and $I^{(2)}$ be $F_1$ and $F_2$ respectively. We model a measure of similarity of the two foreground features as the optimal solution to

$$\max \sum_{k=1}^{K} a_k b_k \qquad (3)$$
$$\text{subject to} \quad \sum_{k=1}^{K} a_k = |F_1|, \qquad \sum_{k=1}^{K} b_k = |F_2|.$$
Our rationale is that for each pixel $p$ assigned as foreground in the first image, we offer a reward for also selecting (as part of the foreground in the second image) a pixel $q$ which is similar to $p$. Similarity, specified by the binary weight $s_{pq}$, depends on whether $p$ and $q$ belong to the same bucket, and may be allowed to vary in $[0, 1]$ as a function of the likelihood of the match $(p \to q)$ detected by some feature extraction method. For $|F_1|, |F_2|$ fixed, the optimization process seeks to maximize the number of pixel pairs (one from each image) with identical histogram buckets. We note that, treating $|F_1| = |F_2|$ as normalization constants for $a$ and $b$ respectively, this is also similar to the Hellinger affinity (see [15], pg. 24), frequently used in computer vision [16].
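For a candidate pair of segmentations, the reward in (3) can be evaluated directly from the indicator matrices. The sketch below (our own helper, not part of the paper's method) computes $a_k$, $b_k$, and $\sum_k a_k b_k$:

```python
import numpy as np

def histogram_reward(B1, B2, x1, x2):
    """Evaluate the similarity measure of (3) for given binary
    foreground indicators x1, x2: a_k and b_k count the foreground
    pixels of each image in bucket k; the reward is sum_k a_k * b_k."""
    a = B1.T @ x1          # K-vector: foreground bucket counts, image 1
    b = B2.T @ x2          # K-vector: foreground bucket counts, image 2
    return int(a @ b)
```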
3. Problem Statement
Maximizing the similarity of histograms as in (3) is, by itself, not sufficient to obtain meaningful segmentations, because co-segmentation must also take the spatial homogeneity of the images into account. This may be achieved by introducing the adjacency relationship between neighboring pixels as an additional bias in the maximization in (3). Another option, which we adopt here, is to segment both images while using the similarity in (3) as a bias term.
3.1. MRF segmentation
We formulate the task of segmenting both images as a binary labeling of a Markov Random Field (MRF) on the graphs corresponding to the input images [17, 18]. That is, in each image $I$, we find the assignment of either a foreground or background label to every pixel. This is represented by a binary variable $x_j$ assigned to each pixel $j$, equal to 1 if the pixel is assigned to the foreground. The assignment is such that the total deviation and separation penalties are minimized. The deviation (or data) penalty, $d_j$, is charged for a pixel that is set in the foreground although there is a-priori information indicating it should be in the background. The separation or smoothness penalty $w_{pq}$ measures the cost of assigning different labels to two neighboring pixels $p$ and $q$. As in [7, 4], we set $w_{pq} = \exp(-\beta \|p - q\|^2)$, where $\beta$ is a constant. The MRF formulation for one image is then:

$$\min \sum_j d_j x_j + \sum_{ij} w_{ij} y_{ij} \qquad (4)$$
$$\text{subject to} \quad x_i - x_j \le y_{ij}, \qquad x_j - x_i \le y_{ji},$$
$$x_i, y_{ij} \text{ binary for } i, j = 1, \ldots, n.$$
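For concreteness, objective (4) can be scored for any candidate labeling as below. This is our own sketch with an explicit edge list; it only evaluates a labeling, it does not perform the minimization (which is what the graph cut is for):

```python
def mrf_energy(d, w_edges, x):
    """Objective (4) for a binary labeling x: each foreground pixel j
    (x[j] == 1) pays its deviation penalty d[j], and each neighboring
    pair (i, j) with different labels pays its separation penalty w_ij.
    w_edges is a list of (i, j, w_ij) neighbor pairs."""
    deviation = sum(dj for dj, xj in zip(d, x) if xj == 1)
    separation = sum(w for i, j, w in w_edges if x[i] != x[j])
    return deviation + separation
```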
3.2. Co-segmentation
Our model attempts to simultaneously minimize the separation and deviation terms in the MRF model for each image as well as maximize the similarity (rather than minimize the difference [1, 2]) between the foreground features in the two images as specified in (3). As these are two conflicting and incompatible goals, we use a linear combination of the two objectives (treating the second term as a bias). Let $\lambda$ be a coefficient expressing the relative weights of the two objectives [1, 2]: when the value of $\lambda$ is high, similarity is the most important requirement, and when it is low, the MRF penalties are dominant. Let $z_{ij}$ be a variable equal to 1 if $I^{(1)}_i \in F_1$ and $I^{(2)}_j \in F_2$. Our objective function minimizes a combination of the penalties incurred by the MRF optimization in each image, and subtracts the similarity measure of the number of pairs of the same histogram buckets in the resulting two foreground features. Since we have a minimization in (4), a high similarity in the foreground features serves as a reward, exactly as desired.

In this formulation, we seek an assignment of each pixel to the foreground or the background. So we may simplify the notation: $d^{(1)}_j, d^{(2)}_j$ are the deviation penalties charged for placing pixel $j$ in the foreground of images 1 and 2 respectively. These penalties can be positive or negative. The minimization objective includes terms representing the MRF optimization in both images,

$$\sum_j d^{(1)}_j x^{(1)}_j + \sum_{ij} w_{ij} y^{(1)}_{ij} + \sum_j d^{(2)}_j x^{(2)}_j + \sum_{ij} w_{ij} y^{(2)}_{ij}.$$
Simultaneously, we also wish to maximize the benefit of high similarities between corresponding histogram buckets in both images, represented as:

$$\sum_{i \in I^{(1)},\, j \in I^{(2)}} s_{ij} z_{ij}.$$

Since $s_{ij}$ is equal to 1 only for “matching” histogram buckets, this latter term can also be written as:

$$\sum_{k=1}^{K} \sum_{i \in I^{(1)} \cap H_k,\; j \in I^{(2)} \cap H_k} z_{ij}. \qquad (5)$$
Notice that (5) is equivalent to the requirement specified in (3). Our formulation of the co-segmentation problem is then a linear combination of the MRF minimization and similarity maximization objectives as follows:

$$\min \sum_j d^{(1)}_j x^{(1)}_j + \sum_{ij} w_{ij} y^{(1)}_{ij} + \sum_j d^{(2)}_j x^{(2)}_j + \sum_{ij} w_{ij} y^{(2)}_{ij} - \lambda \sum_{k=1}^{K} \sum_{i \in I^{(1)} \cap H_k,\; j \in I^{(2)} \cap H_k} z_{ij}$$

(Co-seg)
$$\text{subject to} \quad z_{ij} \le x^{(1)}_i \quad \text{for } i \in I^{(1)}$$
$$z_{ij} \le x^{(2)}_j \quad \text{for } j \in I^{(2)}$$
$$x^{(q)}_i - x^{(q)}_j \le y^{(q)}_{ij} \quad \text{for } q = 1, 2$$
$$x^{(q)}_j - x^{(q)}_i \le y^{(q)}_{ji} \quad \text{for } q = 1, 2$$
$$x^{(q)}_i, y^{(q)}_{ij}, z_{ij} \text{ binary for } q = 1, 2, \quad i, j = 1, \ldots, n.$$
Correctness. To verify correctness, observe that the first set of constraints on $z_{ij}$ ensures that the binary variable $z_{ij}$ can be equal to 1 only if both pixels $i$ and $j$, in the first and second images respectively, are selected in the foreground. The second set of constraints guarantees that if adjacent pixels $i$ and $j$ in one of the images are assigned such that one is in the foreground and the other is in the background, then the separation penalty for that neighboring pair is charged. Notice that we make use of the objective, which drives $z_{ij}$ to be as large as possible (that is, 1) and $y_{ij}$ to be as small as possible (that is, 0). It is not difficult to verify that:

Property 3.1 The model of the (Co-seg) problem is defined on monotone constraints [19] and with a totally unimodular constraint matrix.

Due to Property 3.1, we can make use of a construction of an $s, t$ graph $G$, where the solution to the $s, t$-cut problem will provide an optimal solution to the (Co-seg) problem.
4. The graph construction
We now show the construction of the $s, t$ graph $G$ which will be used to solve (Co-seg). For each of the two images, the graph contains a grid of nodes, called here pixel-nodes, one corresponding to each pixel. To achieve only the MRF segmentation for both images specified as (4), we can use a graph construction similar to the one described in [17], with either the 4-neighbor, the 8-neighbor, or any other form of neighborhood topology used to describe the adjacency relationship between pixel-nodes. But to make it suitable for co-segmentation, the graph will be modified, details of which will be described shortly.

We denote the pixel-nodes in the graph by $V_x$, as each corresponds to a variable $x_i$. The graph $G$ contains the “dummy” nodes $s$ and $t$. Each pixel-node $j$ has a weight $d_j$ associated with it, as shown in (4). If $d_j > 0$, then there is an arc $(j, t)$ of capacity $d_j$. If $d_j < 0$, then there is an arc $(s, j)$ of capacity $|d_j|$. We partition $V_x$ into $V_x^+ \cup V_x^- \cup V^0$, where for each node $j$ in $V_x^+$, $d_j > 0$, and for each node $j$ in $V_x^-$, $d_j < 0$. For each pair of adjacent nodes $i$ and $j$ there is a capacity $w_{ij}$ on both directed arcs $(i, j)$ and $(j, i)$.

We now outline the key modifications. In addition to the nodes for the pixels in the two images, there is a similarity node, or $z$-node, for each pair $(i^1_k, i^2_k)$ such that $i^1_k \in I^{(1)} \cap H_k$ and $i^2_k \in I^{(2)} \cap H_k$. This node corresponds to the variable $z_{i^1_k, i^2_k}$ in the (Co-seg) model. We denote the set of similarity nodes by $V_z$, and link each such node to both $i^1_k$ and $i^2_k$ with arcs of infinite capacity. We then link this node to the source with an arc $(s, (i^1_k, i^2_k))$ of capacity $\lambda$ (the weight of the bias).

The constructed graph is $G = (V \cup \{s, t\}, A)$ with $V = V_x \cup V_z$ and $A$ the set of arcs. The set of arcs $A$ is the union of: the set of adjacency arcs in $I^{(1)}$, $A_1$; the set of adjacency arcs in $I^{(2)}$, $A_2$; the set of arcs $(j, t)$ directed to the sink from all nodes $j \in V_x^+$; the set of arcs $(s, j)$ from the source to all nodes $j \in V_x^-$; one arc $(s, z)$ for each node $z \in V_z$; and two arcs from each $z$ to the respective pixel nodes. An illustration of the graph is shown in Figure 2.
Figure 2. The construction of the graph $G$ with two dummy nodes, the set of pixels in the two images $I^{(1)}$ and $I^{(2)}$, and the set of similarity nodes $V_z$. Some nodes and arcs are annotated to show the graph structure.
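To make the construction concrete, the sketch below is our own small-scale illustration using `networkx` (the node naming, edge-list input format, and pair enumeration are our assumptions; a dense image would call for a specialized max-flow code). It builds the graph of Section 4 and reads the segmentation off the minimum cut: pixel nodes on the source side are foreground.

```python
import networkx as nx

def cosegmentation_cut(d1, d2, edges1, edges2, pairs, lam):
    """Build the s-t graph of Section 4 and solve (Co-seg) by min cut.
    d1, d2: deviation penalties per pixel; edges1, edges2: lists of
    (p, q, w_pq) neighbor pairs per image; pairs: (i, j) same-bucket
    pixel pairs across the two images; lam: the bias weight lambda."""
    G = nx.DiGraph()
    for img, d, edges in ((1, d1, edges1), (2, d2, edges2)):
        for j, dj in enumerate(d):
            if dj > 0:                       # arc (j, t), capacity d_j
                G.add_edge((img, j), 't', capacity=dj)
            elif dj < 0:                     # arc (s, j), capacity |d_j|
                G.add_edge('s', (img, j), capacity=-dj)
        for p, q, w in edges:                # separation arcs, both ways
            G.add_edge((img, p), (img, q), capacity=w)
            G.add_edge((img, q), (img, p), capacity=w)
    for i, j in pairs:                       # one z-node per similar pair
        z = ('z', i, j)
        G.add_edge('s', z, capacity=lam)     # reward arc, capacity lambda
        G.add_edge(z, (1, i))                # no capacity attribute:
        G.add_edge(z, (2, j))                # networkx treats as infinite
    _, (S, _) = nx.minimum_cut(G, 's', 't')
    fg1 = sorted(n[1] for n in S if isinstance(n, tuple) and n[0] == 1)
    fg2 = sorted(n[1] for n in S if isinstance(n, tuple) and n[0] == 2)
    return fg1, fg2
```

On a tiny two-pixel-per-image example, a small $\lambda$ leaves the high-penalty pixels in the background, while a large $\lambda$ pulls the matched pair into both foregrounds, illustrating the bias effect discussed in Section 6.2.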
5. The algorithm (Co-Seg)
For a finite cut $(S \cup \{s\}, T \cup \{t\})$ of $G$, we refer to the set of nodes in $V_x^+ \cap S$ as $S_x^+$, and we let $S_x = V_x \cap S$ and $S_z = V_z \cap S$. The analogous notation is used for those sets intersecting $T$. We can now show the following result.

Theorem 5.1 Let $(S \cup \{s\}, T \cup \{t\})$ be the minimum $s, t$-cut in the graph $G$ obtained using a max-flow algorithm. Then the optimal solution to (Co-Seg) is achieved by setting $x_i = 1$ for each pixel node in the source set $S$ and $z_{i^1_k, i^2_k} = 1$ for each similarity node in the source set $S$.
Proof: The graph $G$ has at least one finite capacity cut, $(\{s\}, V \cup \{t\})$. Let $(S \cup \{s\}, T \cup \{t\})$ be a partition of $V \cup \{s, t\}$ forming a finite $s, t$-cut in $G$. Such a cut corresponds to a feasible solution since if $z_{i^1_k, i^2_k} = 1$ then also $x_{i^1_k} = 1$ and $x_{i^2_k} = 1$, as otherwise an infinite capacity arc would contribute to the capacity of the cut, which violates its finiteness. Other than satisfying this constraint, any setting of the values of the variables $x$ is feasible. The values of the variables $y^{(q)}_{ij}$ are determined so that they are equal to 1 if the respective arc in the graph is directed from a node in the source set (an $S$ node) to a node in the sink set (a $T$ node). Therefore, the constraints for the $y$ variables are satisfied as well. This shows that a finite cut corresponds to a feasible solution to the problem (Co-Seg). We now compute this cut's capacity:

$$C(S \cup \{s\}, T \cup \{t\}) = \sum_{i \in S_x^+} d_i + \sum_{j \in T_x^-} (-d_j) + \sum_{i \in S_x,\, j \in T_x} w_{ij} + \lambda |T_z|.$$

We note that

$$\sum_{j \in T_x^-} d_j = \sum_{j \in V_x^-} d_j - \sum_{j \in S_x^-} d_j,$$

and also $\lambda |T_z| = \lambda |V_z| - \lambda |S_z|$. Therefore, the cut value is

$$C(S \cup \{s\}, T \cup \{t\}) = -\sum_{j \in V_x^-} d_j + \lambda |V_z| + \sum_{i \in S_x} d_i + \sum_{i \in S_x,\, j \in T_x} w_{ij} - \lambda |S_z|.$$

The first two terms in the sum are constant. Thus, minimizing $C(S \cup \{s\}, T \cup \{t\})$ is equivalent to minimizing $\sum_{i \in S_x} d_i + \sum_{i \in S_x,\, j \in T_x} w_{ij} - \lambda |S_z|$, which is precisely the objective value of the (Co-Seg) problem, when setting the $x$ and $z$ variables with corresponding nodes in $S$ to 1.
6. Experimental Results
In this section, we discuss our experiments for evaluating the performance of our algorithm qualitatively and relative to earlier approaches. Later, we present experiments to assess the running time of the algorithm on standard image sizes, and look at the dependence of the results on some user-tunable parameters. In the experiments described here, we used histograms derived from the image intensity values and Gabor filter based texture features [20]. Our method is agnostic to the underlying appearance model (i.e., the parameterization of its distribution), and other texture representations [21] can easily be used, if desired. For comparisons with existing techniques, we used an implementation of the algorithm in [1]: we start with a segmentation of the two images using graph cuts, and then incrementally force the foregrounds to be consistent with one another (in an alternating fashion). This requires solving a sequence of graph cuts, and the process terminates once the algorithm has converged or the maximum number of iterations has been reached (set to 10 in our experiments).
6.1. Qualitative and quantitative analysis
We first present results obtained by the proposed algorithm on a set of images from [1] in Figs. 3-4. In the first pair of images (stone), we see that a graph cuts segmentation works well on the second image, but on the first image the lower part of the stone is not properly segmented. The two co-segmentation algorithms, however, successfully distinguish the object from the background in both images. In the second set of images (banana), graph cuts oversegments the second image. The first image, however, is easier to segment, and this characteristic is exploited by the co-segmentation algorithms to significantly reduce the number of misclassified pixels. Similarly, the first image in the third pair is particularly difficult to segment because of negligible contrast variation between the object of interest and the background. As a result, a graph cuts segmentation does not perform satisfactorily. Co-segmentation exploits the stronger discontinuity between the object and the background to correctly segment the first image as well. We continue the results in Fig. 4, where graph cuts oversegments (and undersegments) the first (and second) image pair respectively. Both co-segmentation algorithms can successfully identify the region of interest from the background. The performance of the algorithms on the remaining five images in our dataset was similar. We found that co-segmentation improves upon the graph-cuts segmentation by 3-8%. Expectedly, the improvements are more prominent when one of the images is “easy” - this allows the process to utilize the additional information to segment an otherwise difficult second image. While there were small variations in the solutions from our algorithm and [1] (see misclassification error in Figs. 3-4), these differences (w.r.t. accuracy) were not significant. In general, on image pairs suitable for co-segmentation the performance of both algorithms is comparable, which provides empirical evidence (following up the discussion in §2-3.2) that the proposed model is suitable for the problem. A practical advantage offered by our solution is that it is non-iterative and requires only one max-flow procedure (discussed in detail in Section 6.3).
Illumination and Scale. Figure 5 shows a few additional images collected from image hosting websites (such as Flickr) where co-segmentation is useful. These examples illustrate that, by using good histogram features, co-segmentation is relatively invariant to moderate changes in illumination. In addition, due to our choice of rewarding similarity in histogram features (see (5)), small changes in the scale of the object (between images) do not have a significant impact on the empirical performance of the algorithm.
6.2. Dependencies and variations
Bias magnitude. In Fig. 6, we illustrate the effect on the final segmentation of varying the magnitude of the introduced bias (which favors histogram similarity). When $\lambda$ is too small, a number of additional pixels become part of the foreground due to the strong influence of the separation penalty. For a larger $\lambda$ value (and a large number of $z$-nodes, see Fig. 2), the cumulative histogram similarity reward for assigning additional pixels to the foreground outweighs the corresponding MRF penalty. In general, a “sweet spot” for $\lambda$ depends less on the specific image, and more on the number of buckets chosen to specify the histogram (i.e., the number of $z$-nodes). Therefore, a suitable value can be determined using cross-validation (in our experiments, $\lambda = 0.001$ worked well). We note, however, that it is easy to make this procedure rigorous if desired. This involves parameterizing $\lambda$, solving a single parametric max-flow procedure [22, 23], and finding the correlation coefficient of the two foreground regions for each breakpoint.
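The selection idea above can be sketched generically. The helper below is our own illustration (the callback interface is an assumption; it simply sweeps a candidate list rather than using the parametric max-flow of [22, 23]): given any routine that solves (Co-seg) for a fixed $\lambda$, pick the $\lambda$ whose two foreground histograms are most correlated.

```python
import numpy as np

def select_lambda(solve_coseg, B1, B2, lambdas):
    """Choose lambda by the correlation coefficient of the two
    foreground histograms. solve_coseg(lam) must return the binary
    foreground indicator vectors (x1, x2) for that lambda."""
    best_lam, best_corr = None, -2.0
    for lam in lambdas:
        x1, x2 = solve_coseg(lam)
        a, b = B1.T @ x1, B2.T @ x2       # foreground bucket counts
        if a.std() == 0 or b.std() == 0:  # correlation undefined
            continue
        corr = np.corrcoef(a, b)[0, 1]
        if corr > best_corr:
            best_lam, best_corr = lam, corr
    return best_lam
```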
Number of histogram buckets. The number of histogram buckets should be chosen such that the corresponding similarity nodes ($z$-nodes) in Fig. 2 are sensitive as well as specific. That is, the number should