Proceedings of the International Conference on Computer Vision, Kerkyra, Greece, September 1999, vol. I, p. 377
Fast Approximate Energy Minimization via Graph Cuts
Yuri Boykov Olga Veksler Ramin Zabih
Computer Science Department
Cornell University
Ithaca, NY 14853
Abstract
In this paper we address the problem of minimizing a large class of energy functions that occur in early vision. The major restriction is that the energy function's smoothness term must only involve pairs of pixels. We propose two algorithms that use graph cuts to compute a local minimum even when very large moves are allowed. The first move we consider is an α-β swap: for a pair of labels α, β, this move exchanges the labels between an arbitrary set of pixels labeled α and another arbitrary set labeled β. Our first algorithm generates a labeling such that there is no swap move that decreases the energy. The second move we consider is an α-expansion: for a label α, this move assigns an arbitrary set of pixels the label α. Our second algorithm, which requires the smoothness term to be a metric, generates a labeling such that there is no expansion move that decreases the energy. Moreover, this solution is within a known factor of the global minimum. We experimentally demonstrate the effectiveness of our approach on image restoration, stereo and motion.
1 Energy minimization in early vision
Many early vision problems require estimating some spatially varying quantity (such as intensity or disparity) from noisy measurements. Such quantities tend to be piecewise smooth; they vary smoothly at most points, but change dramatically at object boundaries. Every pixel p ∈ P must be assigned a label in some set L; for motion or stereo, the labels are disparities, while for image restoration they represent intensities. The goal is to find a labeling f that assigns each pixel p ∈ P a label f_p ∈ L, where f is both piecewise smooth and consistent with the observed data.
These vision problems can be naturally formulated in terms of energy minimization. In this framework, one seeks the labeling f that minimizes the energy

    E(f) = E_smooth(f) + E_data(f).
Here E_smooth measures the extent to which f is not piecewise smooth, while E_data measures the disagreement between f and the observed data. Many different energy functions have been proposed in the literature. The form of E_data is typically
    E_data(f) = Σ_{p∈P} D_p(f_p),

where D_p measures how appropriate a label is for the pixel p given the observed data. In image restoration, for example, D_p(f_p) is typically (f_p − i_p)², where i_p is the observed intensity of the pixel p.
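To make the objective concrete, the energy of a candidate labeling can be evaluated directly from these two terms. The following Python sketch is our illustration, not from the paper; it uses the restoration data term D_p(f_p) = (f_p − i_p)² and a 4-connected grid of neighbors:

    import numpy as np

    def energy(f, image, V):
        """Evaluate E(f) = E_data(f) + E_smooth(f) for 2D label arrays.

        f     : 2D integer array of labels (here, restored intensities)
        image : 2D array of observed intensities i_p
        V     : function V(l1, l2), the penalty for adjacent labels l1, l2
        """
        # Data term: sum over pixels of D_p(f_p) = (f_p - i_p)^2.
        e_data = float(np.sum((f.astype(float) - image.astype(float)) ** 2))
        # Smoothness term: sum of V(f_p, f_q) over 4-connected neighbor pairs.
        e_smooth = 0.0
        rows, cols = f.shape
        for r in range(rows):
            for c in range(cols):
                if c + 1 < cols:
                    e_smooth += V(f[r, c], f[r, c + 1])
                if r + 1 < rows:
                    e_smooth += V(f[r, c], f[r + 1, c])
        return e_data + e_smooth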
The choice of E_smooth is a critical issue, and many different functions have been proposed. For example, in standard regularization-based vision [6], E_smooth makes f smooth everywhere. This leads to poor results at object boundaries. Energy functions that do not have this problem are called discontinuity-preserving. A large number of discontinuity-preserving energy functions have been proposed (see for example [7]). Geman and Geman's seminal paper [3] gave a Bayesian interpretation of many energy functions, and proposed a discontinuity-preserving energy function based on Markov Random Fields (MRF's).
The major difficulty with energy minimization for early vision lies in the enormous computational costs. Typically these energy functions have many local minima (i.e., they are non-convex). Worse still, the space of possible labelings has dimension |P|, which is many thousands. There have been numerous attempts to design fast algorithms for energy minimization. Simulated annealing was popularized in computer vision by [3], and is widely used since it can optimize an arbitrary energy function. Unfortunately, minimizing an arbitrary energy function requires exponential time, and as a consequence simulated annealing is very slow. In practice, annealing is inefficient partly because at each step it changes the value of a single pixel.
The energy functions that we consider in this paper arise in a variety of different contexts, including the Bayesian labeling of MRF's. We allow D_p to be arbitrary, and consider smoothing terms of the form
    E_smooth = Σ_{{p,q}∈N} V_{p,q}(f_p, f_q),    (1)
where N is the set of pairs of adjacent pixels. In special cases such energies can be minimized exactly. If the number of possible labels is |L| = 2 then the exact solution can be found in polynomial time by computing a minimum cost cut on a certain graph [4]. If L is a finite 1D set and the interaction potential is V(f_p, f_q) = |f_p − f_q| then the exact minimum can also be found efficiently via graph cuts [5, 2]. In general, however, the problem is NP-hard [8].
In this paper we develop algorithms that approximately minimize the energy E(f) for an arbitrary finite set of labels L under two fairly general classes of interaction potentials V: semi-metric and metric. V is called a semi-metric on the space of labels L if for any pair of labels α, β ∈ L it satisfies two properties: V(α, β) = V(β, α) ≥ 0 and V(α, β) = 0 ⇔ α = β. If V also satisfies the triangle inequality

    V(α, β) ≤ V(α, γ) + V(γ, β)    (2)

for any α, β, γ in L, then V is called a metric. Note that both semi-metrics and metrics include important cases of discontinuity-preserving interaction potentials. For example, the truncated L₂ distance V(α, β) = min(K, ||α − β||) and the Potts interaction penalty V(α, β) = δ(α ≠ β) are both metrics.
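For a finite label set these properties can be checked by brute force. A small sketch (our addition, not from the paper) verifying that the two examples above are metrics:

    def is_semimetric(V, labels, eps=1e-12):
        # V(a,b) = V(b,a) >= 0, and V(a,b) = 0 exactly when a = b.
        return all(abs(V(a, b) - V(b, a)) < eps and V(a, b) >= 0
                   and ((V(a, b) < eps) == (a == b))
                   for a in labels for b in labels)

    def is_metric(V, labels, eps=1e-12):
        # A semi-metric that also satisfies the triangle inequality (2).
        return is_semimetric(V, labels) and all(
            V(a, b) <= V(a, c) + V(c, b) + eps
            for a in labels for b in labels for c in labels)

    potts = lambda a, b: float(a != b)          # Potts penalty
    trunc = lambda a, b: min(4.0, abs(a - b))   # truncated distance, K = 4
    assert is_metric(potts, range(10)) and is_metric(trunc, range(10))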
The algorithms described in this paper generalize the approach that we originally developed for the case of the Potts model [2]. In particular, we compute a labeling which is a local minimum even when very large moves are allowed. We begin with an overview of our energy minimization algorithms, which are based on graph cuts. Our first algorithm, described in section 3, is based on α-β swap moves and works for any semi-metric V_{p,q}'s. Our second algorithm, described in section 4, is based on more interesting α-expansion moves but works only for metric V_{p,q}'s (i.e., the additional triangle inequality constraint is required). Note that α-expansion moves produce a solution within a known factor of the global minimum of E. A proof of this can be found in [8].
2 Energy minimization via graph cuts
The most important property of these methods is that they produce a local minimum even when large moves are allowed. In this section, we discuss the moves we allow, which are best described in terms of partitions. We sketch the algorithms and list their basic properties. We then formally introduce the notion of a graph cut, which is the basis for our methods.
1. Start with an arbitrary labeling f
2. Set success := 0
3. For each pair of labels {α, β} ⊂ L
   3.1. Find f̂ = arg min E(f′) among f′ within one α-β swap of f (see Section 3)
   3.2. If E(f̂) < E(f), set f := f̂ and success := 1
4. If success = 1 goto 2
5. Return f

1. Start with an arbitrary labeling f
2. Set success := 0
3. For each label α ∈ L
   3.1. Find f̂ = arg min E(f′) among f′ within one α-expansion of f (see Section 4)
   3.2. If E(f̂) < E(f), set f := f̂ and success := 1
4. If success = 1 goto 2
5. Return f

Figure 1: Our swap move algorithm (top) and expansion move algorithm (bottom).
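In code, both procedures share the same outer structure; only the move space enumerated in step 3 differs. A minimal sketch of the swap-move loop, assuming a hypothetical helper best_swap_move that returns the optimal labeling within one α-β swap (the graph construction developed in Section 3):

    from itertools import combinations

    def swap_algorithm(f, labels, energy, best_swap_move):
        """Figure 1 (top): cycle over swap moves until no move helps."""
        success = True
        while success:                                    # steps 2-4: one cycle
            success = False
            for alpha, beta in combinations(labels, 2):   # step 3: label pairs
                f_hat = best_swap_move(f, alpha, beta)    # step 3.1: one cut
                if energy(f_hat) < energy(f):             # step 3.2
                    f, success = f_hat, True
        return f                                          # step 5: local minimum

The expansion-move loop is identical except that step 3 ranges over single labels α ∈ L and calls the optimal-expansion step of Section 4.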
2.1 Partitions and move spaces
Any labeling f can be uniquely represented by a partition of image pixels P = {P_l | l ∈ L}, where P_l = {p ∈ P | f_p = l} is the subset of pixels assigned label l. Since there is an obvious one-to-one correspondence between labelings f and partitions P, we can use these notions interchangeably.
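The correspondence is mechanical to compute; for instance, with a labeling stored as a dictionary mapping each pixel to its label (a sketch in our notation):

    def partition_of(f):
        """P_l = {p in P : f_p = l}, grouped by label."""
        P = {}
        for p, l in f.items():
            P.setdefault(l, set()).add(p)
        return P

    def labeling_of(P):
        """Inverse map: recover the labeling f from a partition P."""
        return {p: l for l, pixels in P.items() for p in pixels}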
Given a pair of labels α, β, a move from a partition P (labeling f) to a new partition P′ (labeling f′) is called an α-β swap if P_l = P′_l for any label l ≠ α, β. This means that the only difference between P and P′ is that some pixels that were labeled α in P are now labeled β in P′, and some pixels that were labeled β in P are now labeled α in P′.
Given a label α, a move from a partition P (labeling f) to a new partition P′ (labeling f′) is called an α-expansion if P_α ⊂ P′_α and P′_l ⊂ P_l for any label l ≠ α. In other words, an α-expansion move allows any set of image pixels to change their labels to α.
Note that a move which gives an arbitrary label α to a single pixel is both an α-β swap and an α-expansion. As a consequence, the standard move space used in annealing is a special case of our move spaces.
2.2 Algorithms and properties
We have developed two energy minimization algorithms, which are shown in Figure 1. The structure of the algorithms is quite similar. We will call a single execution of steps 3.1-3.2 an iteration, and an execution of steps 2-4 a cycle. In each cycle, the algorithm performs an iteration for every label (expansion move algorithm) or for every pair of labels (swap move algorithm), in a certain order that can be fixed or random. A cycle is successful if a strictly better labeling is found at any iteration. The algorithm stops after the first unsuccessful cycle since no further improvement is possible. Obviously, a cycle in the swap move algorithm takes |L|² iterations, and a cycle in the expansion move algorithm takes |L| iterations.
These algorithms have several important properties. First, the algorithms are guaranteed to terminate in a finite number of cycles; in fact, under fairly general assumptions we can prove termination in O(|P|) cycles [8]. However, in the experiments we report in section 5, the algorithm stops after a few cycles and most of the improvements occur during the first cycle. Second, once the algorithm has terminated, the energy of the resulting labeling is a local minimum with respect to a swap or an expansion move. Finally, the expansion move algorithm produces a labeling f such that E(f*) ≤ E(f) ≤ 2k · E(f*), where f* is the global minimum and k = max{V(α,β) : α≠β} / min{V(α,β) : α≠β} (see [8]).
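For example, the factor k is easy to evaluate for the metrics from the introduction (a quick illustration, our code):

    def bound_factor(V, labels):
        """k = max V(a,b) / min V(a,b) over pairs of distinct labels."""
        vals = [V(a, b) for a in labels for b in labels if a != b]
        return max(vals) / min(vals)

    # Potts penalty: k = 1, so expansion moves are within a factor 2 of optimal.
    print(bound_factor(lambda a, b: 1.0, range(10)))                   # 1.0
    # Truncated distance min(4, |a-b|) on labels 0..9: k = 4, a factor-8 bound.
    print(bound_factor(lambda a, b: min(4.0, abs(a - b)), range(10)))  # 4.0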
2.3 Graph cuts
The key part of each algorithm is step 3.1, where graph cuts are used to efficiently find f̂. Let G = ⟨V, E⟩ be a weighted graph with two distinguished vertices called the terminals. A cut C ⊂ E is a set of edges such that the terminals are separated in the induced graph G(C) = ⟨V, E − C⟩. In addition, no proper subset of C separates the terminals in G(C). The cost of the cut C, denoted |C|, equals the sum of its edge weights. The minimum cut problem is to find the cut with smallest cost. There are many algorithms for this problem with low-order polynomial complexity [1]; in practice they run in near-linear time for our graphs.

Step 3.1 uses a single minimum cut on a graph whose size is O(|P|). The graph is dynamically updated after each iteration. The details of this minimum cut are quite different for the swap move and the expansion move algorithms, as described in the next two sections.
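Any off-the-shelf max-flow/min-cut routine can play the role of this inner solver. A toy sketch using the networkx library (our choice for illustration, not the solver used in the paper):

    import networkx as nx

    # A tiny two-terminal graph: terminals 's' and 't', pixels 'p' and 'q'.
    G = nx.Graph()
    G.add_edge('s', 'p', capacity=3.0)   # t-link
    G.add_edge('s', 'q', capacity=1.0)   # t-link
    G.add_edge('p', 't', capacity=2.0)   # t-link
    G.add_edge('q', 't', capacity=4.0)   # t-link
    G.add_edge('p', 'q', capacity=1.0)   # n-link

    cost, (S, T) = nx.minimum_cut(G, 's', 't')
    print(cost)    # |C|: sum of the weights of the severed edges (here 4.0)
    print(S, T)    # the two sides of the cut; severed edges run between them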
3 Finding the optimal swap move
Given an input labeling f (partition P) and a pair of labels α, β, we wish to find a labeling f̂ that minimizes E over all labelings within one α-β swap of f. This is the critical step in the algorithm given at the top of Figure 1. Our technique is based on computing a labeling corresponding to a minimum cut on a graph G_αβ = ⟨V_αβ, E_αβ⟩. The structure of this graph is dynamically determined by the current partition P and by the labels α, β.

[Figure 2: An example of the graph G_αβ for a 1D image. The set of pixels in the image is P_αβ = P_α ∪ P_β, where P_α = {p, r, s} and P_β = {q, . . . , w}.]
This section is organized as follows. First we describe the construction of G_αβ for a given f (or P). We show that cuts C on G_αβ correspond in a natural way to labelings f^C which are within one α-β swap move of f. Theorem 1 shows that the cost of a cut is |C| = E(f^C) plus a constant. A corollary from this theorem states our main result that the desired labeling f̂ equals f^C, where C is a minimum cut on G_αβ.
The structure of the graph is illustrated in Figure 2. For legibility, this figure shows the case of a 1D image. For any image the structure of G_αβ will be as follows. The set of vertices includes the two terminals α and β, as well as image pixels p in the sets P_α and P_β (that is, f_p ∈ {α, β}). Thus, the set of vertices V_αβ consists of α, β, and P_αβ = P_α ∪ P_β. Each pixel p ∈ P_αβ is connected to the terminals α and β by edges t^α_p and t^β_p, respectively. For brevity, we will refer to these edges as t-links (terminal links). Each pair of pixels {p, q} ⊂ P_αβ which are neighbors (i.e. {p, q} ∈ N) is connected by an edge e_{p,q} which we will call an n-link (neighbor link). The set of edges E_αβ thus consists of ⋃_{p∈P_αβ} {t^α_p, t^β_p} (the t-links) and ⋃_{{p,q}∈N, p,q∈P_αβ} e_{p,q} (the n-links). The weights assigned to the edges are
    edge      weight                                            for
    t^α_p     D_p(α) + Σ_{q∈N_p, q∉P_αβ} V_{p,q}(α, f_q)        p ∈ P_αβ
    t^β_p     D_p(β) + Σ_{q∈N_p, q∉P_αβ} V_{p,q}(β, f_q)        p ∈ P_αβ
    e_{p,q}   V_{p,q}(α, β)                                     {p,q} ∈ N, p,q ∈ P_αβ
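Assembling these weights and decoding the cut (equation (3) below) takes only a few lines. The sketch below is our illustration, with hypothetical helper signatures and networkx as a stand-in min-cut solver, not the paper's implementation:

    import networkx as nx

    def best_swap_move(f, D, V, neighbors, alpha, beta):
        """Optimal alpha-beta swap via one minimum cut on G_ab (a sketch).

        f: dict pixel -> label; D(p, l): data cost; V(p, q, l1, l2):
        smoothness cost; neighbors[p]: pixels adjacent to p. Pixels must
        be comparable, e.g. (row, col) tuples.
        """
        t_a, t_b = ('terminal', 'alpha'), ('terminal', 'beta')
        P_ab = [p for p in f if f[p] in (alpha, beta)]
        G = nx.Graph()
        for p in P_ab:
            fixed = [q for q in neighbors[p] if f[q] not in (alpha, beta)]
            # t-links: data cost plus interactions with neighbors outside P_ab.
            G.add_edge(t_a, p, capacity=D(p, alpha)
                       + sum(V(p, q, alpha, f[q]) for q in fixed))
            G.add_edge(p, t_b, capacity=D(p, beta)
                       + sum(V(p, q, beta, f[q]) for q in fixed))
            # n-links between neighboring pixels inside P_ab (once per pair).
            for q in neighbors[p]:
                if f[q] in (alpha, beta) and p < q:
                    G.add_edge(p, q, capacity=V(p, q, alpha, beta))
        cost, (S, _) = nx.minimum_cut(G, t_a, t_b)
        # Per equation (3): cutting t^alpha_p assigns alpha, so a pixel left on
        # the alpha-terminal side has t^beta_p severed and receives beta.
        f_hat = dict(f)
        for p in P_ab:
            f_hat[p] = beta if p in S else alpha
        return f_hat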

Any cut C on G_αβ must sever (include) exactly one t-link for any pixel p ∈ P_αβ: if neither t-link were in C, there would be a path between the terminals; while if both t-links were cut, then a proper subset of C would be a cut. Thus, any cut leaves each pixel in P_αβ with exactly one t-link. This defines a natural labeling f^C corresponding to a cut C on G_αβ,

    f^C_p = { α,   if t^α_p ∈ C for p ∈ P_αβ;
            { β,   if t^β_p ∈ C for p ∈ P_αβ;
            { f_p, for p ∈ P, p ∉ P_αβ.        (3)
In other words, if the pixel p is in P_αβ then p is assigned label α when the cut C separates p from the terminal α; similarly, p is assigned label β when C separates p from the terminal β. If p is not in P_αβ then we keep its initial label f_p. This implies

Lemma 1 A labeling f^C corresponding to a cut C on G_αβ is one α-β swap away from the initial labeling f.
It is easy to show that a cut C severs an n-link e_{p,q} between neighboring pixels on G_αβ if and only if C leaves the pixels p and q connected to different terminals. Formally:

Property 1 For any cut C and for any n-link e_{p,q}:
a) If t^α_p, t^α_q ∈ C then e_{p,q} ∉ C.
b) If t^β_p, t^β_q ∈ C then e_{p,q} ∉ C.
c) If t^β_p, t^α_q ∈ C then e_{p,q} ∈ C.
d) If t^α_p, t^β_q ∈ C then e_{p,q} ∈ C.
These properties are illustrated in figure 3. The next lemma is a consequence of Property 1 and equation (3).

Lemma 2 For any cut C and for any n-link e_{p,q},

    |C ∩ e_{p,q}| = V_{p,q}(f^C_p, f^C_q).
Lemmas 1 and 2 plus Property 1 yield

Theorem 1 There is a one-to-one correspondence between cuts C on G_αβ and labelings that are one α-β swap from f. Moreover, the cost of a cut C on G_αβ is |C| = E(f^C) plus a constant.

Proof: The first part follows from the fact that the severed t-links uniquely determine the labels assigned to the pixels p and the n-links that must be cut. We now compute the cost of a cut C, which is

    |C| = Σ_{p∈P_αβ} |C ∩ {t^α_p, t^β_p}| + Σ_{{p,q}∈N, {p,q}⊂P_αβ} |C ∩ e_{p,q}|.    (4)
[Figure 3: Properties of a cut C on G_αβ for two pixels p, q ∈ N connected by an n-link e_{p,q}. Dotted lines show the edges cut by C and solid lines show the edges remaining in the induced graph G(C) = ⟨V, E − C⟩. The three panels illustrate Property 1(a), Property 1(b), and Property 1(c,d).]
Note that for p ∈ P_αβ we have

    |C ∩ {t^α_p, t^β_p}| = D_p(f^C_p) + Σ_{q∈N_p, q∉P_αβ} V_{p,q}(f^C_p, f_q).

Lemma 2 gives the second term in (4). Thus, the total cost of a cut C is

    |C| = Σ_{p∈P_αβ} D_p(f^C_p) + Σ_{{p,q}∈N, p or q ∈ P_αβ} V_{p,q}(f^C_p, f^C_q).

This can be rewritten as |C| = E(f^C) − K, where

    K = Σ_{p∉P_αβ} D_p(f_p) + Σ_{{p,q}∈N, {p,q}∩P_αβ=∅} V_{p,q}(f_p, f_q)

is the same constant for all cuts C.
Corollary 1 The optimal α-β swap from f is f̂ = f^C, where C is the minimum cut on G_αβ.
4 Finding the optimal expansion move
Given an input labeling f (partition P) and a label α, we wish to find a labeling f̂ that minimizes E over all labelings within one α-expansion of f. This is the critical step in the algorithm given at the bottom of Figure 1. In this section we describe a technique that solves the problem assuming that each V_{p,q} is a metric, and thus satisfies the triangle inequality (2). Some important examples of metrics are given in the introduction. Our technique is based on computing a labeling corresponding to a minimum cut on a graph G_α = ⟨V_α, E_α⟩. The structure of this graph is determined by the current partition P and by the label α.

[Figure 4: An example of G_α for a 1D image. The set of pixels in the image is P = {p, q, r, s} and the current partition is P = {P_1, P_2, P_α}, where P_1 = {p}, P_2 = {q, r}, and P_α = {s}. Two auxiliary nodes a = a_{p,q} and b = a_{r,s} are introduced between neighboring pixels separated in the current partition. Auxiliary nodes are added at the boundary of sets P_l.]
As before, the graph dynamically changes after each iteration.
This section is organized as follows. First we describe the construction of G_α for a given f (or P) and α. We show that cuts C on G_α correspond in a natural way to labelings f^C which are within one α-expansion move of f. Then, based on a number of simple properties, we define a class of elementary cuts. Theorem 2 shows that elementary cuts are in one-to-one correspondence with labelings that are within one α-expansion of f, and also that the cost of an elementary cut is |C| = E(f^C). A corollary from this theorem states our main result that the desired labeling f̂ equals f^C, where C is a minimum cut on G_α.
The structure of the graph is illustrated in Figure 4. For legibility, this figure shows the case of a 1D image. The set of vertices includes the two terminals α and ᾱ, as well as all image pixels p ∈ P. In addition, for each pair of neighboring pixels {p, q} ∈ N separated in the current partition (i.e. f_p ≠ f_q) we create an auxiliary vertex a_{p,q}. Auxiliary nodes are introduced at the boundaries between partition sets P_l for l ∈ L. Thus, the set of vertices is

    V_α = { α, ᾱ, P, ⋃_{{p,q}∈N, f_p≠f_q} a_{p,q} }.
Each pixel p ∈ P is connected to the terminals α and ᾱ by t-links t^α_p and t^ᾱ_p, correspondingly. Each pair of neighboring pixels {p, q} ∈ N which are not separated by the partition P (i.e. f_p = f_q) is connected by an n-link e_{p,q}. For each pair of neighboring pixels {p, q} ∈ N such that f_p ≠ f_q we create a triplet of edges E_{p,q} = { e_{p,a}, e_{a,q}, t^ᾱ_a }, where a = a_{p,q} is the corresponding auxiliary node. The edges e_{p,a} and e_{a,q} connect pixels p and q to a_{p,q}, and the t-link t^ᾱ_a connects the auxiliary node a_{p,q} to the terminal ᾱ. So we can write the set of all edges as

    E_α = { ⋃_{p∈P} {t^α_p, t^ᾱ_p}, ⋃_{{p,q}∈N, f_p≠f_q} E_{p,q}, ⋃_{{p,q}∈N, f_p=f_q} e_{p,q} }.
The weights assigned to the edges are

    edge      weight               for
    t^ᾱ_p     ∞                    p ∈ P_α
    t^ᾱ_p     D_p(f_p)             p ∉ P_α
    t^α_p     D_p(α)               p ∈ P
    e_{p,a}   V_{p,q}(f_p, α)
    e_{a,q}   V_{p,q}(α, f_q)      {p,q} ∈ N, f_p ≠ f_q
    t^ᾱ_a     V_{p,q}(f_p, f_q)
    e_{p,q}   V_{p,q}(f_p, α)      {p,q} ∈ N, f_p = f_q
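As with the swap move, the construction and the decoding of equation (5) below fit in a short routine. This sketch is our illustration, with hypothetical helpers, a large constant standing in for the infinite t-link weight, and networkx as the min-cut solver:

    import networkx as nx

    BIG = 1e18   # stands in for the infinite t-link weight on pixels in P_alpha

    def best_expansion_move(f, D, V, neighbors, alpha):
        """Optimal alpha-expansion via one minimum cut on G_a (a sketch).

        Assumes each V(p, q, ., .) is a metric; pixels must be comparable.
        """
        t_a, t_bar = ('terminal', 'alpha'), ('terminal', 'not-alpha')
        G = nx.Graph()
        for p in f:
            # t-links: t^alpha_p costs D_p(alpha); t^abar_p costs D_p(f_p),
            # or BIG when f_p = alpha so those pixels cannot lose label alpha.
            G.add_edge(t_a, p, capacity=D(p, alpha))
            G.add_edge(p, t_bar, capacity=BIG if f[p] == alpha else D(p, f[p]))
            for q in neighbors[p]:
                if not p < q:
                    continue                       # handle each pair once
                if f[p] == f[q]:
                    G.add_edge(p, q, capacity=V(p, q, f[p], alpha))    # n-link
                else:
                    a = ('aux', p, q)              # auxiliary node a_{p,q}
                    G.add_edge(p, a, capacity=V(p, q, f[p], alpha))    # e_{p,a}
                    G.add_edge(a, q, capacity=V(p, q, alpha, f[q]))    # e_{a,q}
                    G.add_edge(a, t_bar, capacity=V(p, q, f[p], f[q])) # t^abar_a
        cost, (S, _) = nx.minimum_cut(G, t_a, t_bar)
        # Per equation (5): p takes label alpha exactly when the cut severs
        # t^alpha_p, i.e. when p lands on the not-alpha side of the partition.
        return {p: (f[p] if p in S else alpha) for p in f}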
As in section 3, any cut C on G_α must sever (include) exactly one t-link for any pixel p ∈ P. This defines a natural labeling f^C corresponding to a cut C on G_α. Formally,

    f^C_p = { α,   if t^α_p ∈ C;
            { f_p, if t^ᾱ_p ∈ C,      for all p ∈ P.        (5)

In other words, a pixel p is assigned label α if the cut C separates p from the terminal α, and p is assigned its old label f_p if C separates p from ᾱ. Note that for p ∉ P_α the terminal ᾱ represents the labels assigned to pixels in the initial labeling f. Clearly we have
Lemma 3 A labeling f^C corresponding to a cut C on G_α is one α-expansion away from the initial labeling f.

It is also easy to show that a cut C severs an n-link e_{p,q} between neighboring pixels {p, q} ∈ N such that f_p = f_q if and only if C leaves the pixels p and q connected to different terminals. In other words, Property 1 holds when we substitute ᾱ for β. We will refer to this as Property 1(ᾱ). Analogously, we can show that Property 1(ᾱ) and equation (5) establish Lemma 2 for the n-links e_{p,q} on G_α.
