Multi-label MRF Optimization via a Least Squares s-t Cut

26 Nov 2009, pp. 1055-1066

Abstract: We approximate the k-label Markov random field optimization by a single binary (s-t) graph cut. Each vertex in the original graph is replaced by only ceil($\log_2(k)$) new vertices and the new edge weights are obtained via a novel least squares solution approximating the original data and label interaction penalties. The s-t cut produces a binary "Gray" encoding that is unambiguously decoded into any of the original k labels. We analyze the properties of the approximation and present quantitative and qualitative image segmentation results, one of the several computer vision applications of multi-label MRF optimization.


Ghassan Hamarneh
School of Computing Science, Simon Fraser University, Canada
1 Introduction
Many visual computing tasks can be formulated as graph labeling problems, e.g. segmentation and stereo-reconstruction [1], in which one out of k labels is assigned to each graph vertex. This may be formulated as a k-way cut problem: Given graph $G(V,E)$ with $|V|$ vertices $v_j \in V$ and $|E|$ edges $e_{v_i,v_j} = e_{ij} \in E \subseteq V \times V$ with weights $w(e_{ij}) = w_{ij} > 0$, find an optimal k-cut $C^* \subset E$ with minimal cost, $C^* = \arg\min_C |C|$, where $|C| = \sum_{e_{ij} \in C} w_{ij}$, such that $E \setminus C$ breaks the graph into k groups of labelled vertices. This k-cut formulation encodes the semantics of the problem at hand (e.g. segmentation) into $w_{ij}$. However, if the optimal label assigned to a vertex depends on the labels assigned to other vertices (e.g. to regularize the label field), setting $w_{ij}\ \forall i,j$ becomes less straightforward. The Markov random field (MRF) formulation captures this desired label interaction via an energy $\xi(l)$ to be minimized with respect to the vertex labels $l$.

$$\xi(l) = \sum_{v_i \in V} D_i(l_i) + \lambda \sum_{(v_i,v_j) \in E} V_{ij}(l_i, l_j, d_i, d_j) \qquad (1)$$
where $D_i(l_i)$ penalizes labeling $v_i$ with $l_i$, and $V_{ij}$, aka prior, penalizes assigning labels $(l_i, l_j)$ to neighboring vertices$^1$. $V_{ij}$ may be influenced by the data value $d_i$ at $v_i$ (e.g. image intensity). $\lambda$ controls the relative importance of $D_i$ and $V_{ij}$.
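To make the roles of the two terms in (1) concrete, the following is a minimal sketch of evaluating $\xi(l)$ for a given labeling. It is an illustration only: the function name `mrf_energy`, the toy costs, and the Potts-style prior are assumptions and do not come from the paper. The prior is passed as a callable so that it may also depend on the data values $d_i$, $d_j$ through the vertex indices.

```python
def mrf_energy(labels, data_cost, edges, prior, lam):
    """Evaluate (1): sum_i D_i(l_i) + lam * sum_{(i,j) in E} V_ij(l_i, l_j).

    labels    : labels[i] is the label index assigned to vertex v_i
    data_cost : data_cost[i][l] plays the role of D_i(l)
    edges     : list of neighboring vertex pairs (i, j)
    prior     : callable (i, j, l_i, l_j) -> interaction penalty V_ij
    lam       : relative weight of the prior term
    """
    data_term = sum(data_cost[i][l] for i, l in enumerate(labels))
    prior_term = sum(prior(i, j, labels[i], labels[j]) for i, j in edges)
    return data_term + lam * prior_term


def potts(i, j, l_i, l_j):
    """Hypothetical prior: unit penalty whenever neighbors disagree."""
    return 0.0 if l_i == l_j else 1.0


# Toy problem: 3 vertices in a chain, k = 3 labels (values are made up).
data_cost = [[0.0, 2.0, 5.0], [3.0, 0.5, 4.0], [6.0, 1.0, 0.0]]
edges = [(0, 1), (1, 2)]

energy_mixed = mrf_energy([0, 1, 2], data_cost, edges, potts, lam=2.0)
energy_flat = mrf_energy([1, 1, 1], data_cost, edges, potts, lam=2.0)
```

With these toy numbers the constant labeling has the lower energy, which shows how a large $\lambda$ favors smooth label fields over the per-vertex optimum.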
For labeling a P-pixel image, typically a graph $G$ is constructed with $|V| = P$. To encode $D_i(l_i)$, $G$ may be augmented with k new terminal vertices $\{t_j\}_{j=1}^{k}$, each representing one of the k labels (Figure 2(a)), and $w_{v_i,t_j}$ set inversely proportional to $D_i(l_j)$. When $V_{ij} = V_{ij}(d_i, d_j)$, i.e. independent of $l_i$ and $l_j$, $V_{ij}$ may be encoded by $w_{v_i,v_j} \propto V_{ij}(d_i, d_j)$. The random walker [2] globally solves a labeling problem of this type, i.e. disregarding label interaction. Solving multi-label MRF optimization for any interaction penalty remains an active research area. In [3], the globally optimal binary (k = 2) labeling is found using min-cut max-flow. For k > 2 with convex prior, the global minimizer is attained by replacing each single k-label variable with k [4] or by using k − 1 [5] boolean variables. However, convex priors tend to over-smooth the label field. For k > 2 with metric or semi-metric priors, Boykov et al. performed range moves using binary cuts to expand or swap labels [1]. Other range moves were proposed in [6,7]. More recent approaches to multi-label MRF optimization were proposed based on linear programming relaxation using primal-dual [8], message passing or belief propagation [9], and partial optimality [10] (see [11] for a recent survey).

$^1$ Higher order priors, e.g. 3rd order $V_{ijk}(l_i, l_j, l_k)$, are also possible.

G. Bebis et al. (Eds.): ISVC 2009, Part I, LNCS 5875, pp. 1055–1066, 2009. © Springer-Verlag Berlin Heidelberg 2009
In this paper, we focus on optimal encoding of the k-label MRF energy solely into the edge weights of a graph. We impose no restrictions on k, or on the order (2nd or higher) or type (e.g. non-convex, non-metric, or spatially varying) of the label interaction penalty. The calculated edge weights are optimal in the sense that they minimize the least squares (LS) error when solving a linear system of equations capturing the original MRF penalties. Further, we transform the multi-labelling problem to a binary s-t cut, in which each vertex in the original graph is replaced by the most compact boolean representation; only ceil($\log_2(k)$) vertices represent each k-label variable. In [12], a general framework for converting multi-label problems to binary ones is presented. In contrast to our work, [12] solved a system of equations to find the boolean encoding function (not the edge weights), they did not use LS, and their resulting binary problem can still include label interaction. We perform a single (non-iterative and initialization-independent) s-t cut to obtain a “Gray” binary encoding, which is then unambiguously decoded into the k labels. Besides its optimality features, LS enables offline pre-computation of pseudoinverse matrices that can be re-used for different graphs.
2 Method
2.1 Reformulating the Multi-label MRF as an s-t Cut
Given a graph $G(V,E)$, the objective is to label each vertex $v_i \in V$ with a label $l_i \in \mathcal{L}_k = \{l_0, l_1, \ldots, l_{k-1}\}$. Rather than labeling $v_i$ with $l_i \in \mathcal{L}_k$, we replace $v_i$ with $b$ vertices $(v_{ij})_{j=1}^{b}$, and binary-label them with $(l_{ij})_{j=1}^{b}$, i.e. $l_{ij} \in \mathcal{L}_2 = \{l_0, l_1\}$. $b$ is chosen such that $2^b \geq k$ or $b = \mathrm{ceil}(\log_2(k))$, i.e. a long enough sequence of bits to be decoded into $l_i \in \mathcal{L}_k$$^2$. To this end, we transform $G(V,E)$ into a new graph $G_2(V_2,E_2)$ with additional source $s$ and sink $t$ nodes, i.e. $|V_2| = b|V| + 2$. $E_2$ includes terminal links $E_2^{tlinks} = E_2^{t} \cup E_2^{s}$ where $|E_2^{t}| = |E_2^{s}| = |V_2|$; neighborhood links $E_2^{nlinks} = E_2^{ns} \cup E_2^{nf}$ where $|E_2^{nlinks}| = b^2|E|$, $|E_2^{ns}| = b|E|$, and $|E_2^{nf}| = (b^2 - b)|E|$; and intra-links $E_2^{intra}$ where $|E_2^{intra}| = \binom{b}{2}|V|$. Figure 1 shows these different types of edges. Following an s-t cut on $G_2$, vertices $v_{ij}$ that remain connected to $s$ are assigned label 0, and the remaining are connected to $t$ and assigned label 1. The string of $b$ binary labels $l_{ij} \in \mathcal{L}_2$ assigned to $v_{ij}$ are then decoded back into a decimal number indicating the label $l_i \in \mathcal{L}_k$ assigned to $v_i$ (Figure 2).

$^2$ We distinguish between the decimal (base 10) and binary (base 2) encoding of the labels using the notation $(l_i)_{10}$ and $(l_i)_2 = (l_{i1}, l_{i2}, \cdots, l_{ib})_2$, respectively.

Fig. 1. Edge types in the s-t graph. Shown are seven groups of vertex quadruplets, b = 4, and only sample edges from $E_2^{t}$, $E_2^{s}$, $E_2^{ns}$, $E_2^{nf}$, and $E_2^{intra}$.

Fig. 2. Reformulating the multi-label problem as an s-t cut. (a) Labeling vertices $\{v_i\}_{i=1}^{5}$ with labels $\{l_j\}_{j=0}^{3}$ (only t-links are shown). (b) New graph with 2 terminal nodes {s, t}, b = 2 new vertices ($v_{i1}$ and $v_{i2}$ inside the dashed circles) replacing each $v_i$ in (a), and 2 terminal edges for each $v_{ij}$. An s-t cut on (b) is depicted as the green curve. (c) Labeling $v_i$ in (a) is based on the s-t cut in (b): Pairs of $(v_{i1}, v_{i2})$ assigned to (s, s) are labeled with binary string 00, (s, t) with 01, (t, s) with 10, and (t, t) with 11. The binary encodings {00, 01, 10, 11} in turn reflect the original 4 labels $\{l_j\}_{j=0}^{3}$.
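The bookkeeping of this construction can be sketched in a few lines. The helper names below (`num_bits`, `graph_sizes`, `decode_cut`) are hypothetical, the intra-link count assumes one link per unordered pair of the $b$ vertices in a group, and the decoding follows the convention of Fig. 2(c), where $v_{i1}$ carries the most significant bit. How unused codes are handled when $k$ is not a power of two is not covered here.

```python
from math import comb


def num_bits(k):
    """Smallest b with 2**b >= k, i.e. ceil(log2(k)), using integer arithmetic."""
    return max(1, (k - 1).bit_length())


def graph_sizes(num_vertices, num_edges, b):
    """Vertex and link counts of the s-t graph for an original graph G(V, E)."""
    return {
        "vertices": b * num_vertices + 2,      # b copies per vertex, plus s and t
        "n_links": b * b * num_edges,          # |E_ns| + |E_nf|
        "E_ns": b * num_edges,
        "E_nf": (b * b - b) * num_edges,
        "intra_links": comb(b, 2) * num_vertices,  # assumption: one per vertex pair
    }


def decode_cut(sides):
    """Turn the cut result into decimal labels.

    sides[i] lists, for v_i1 ... v_ib, the terminal ('s' or 't') each vertex
    stays connected to; 's' reads as bit 0 and 't' as bit 1.
    """
    labels = []
    for group in sides:
        value = 0
        for terminal in group:
            value = 2 * value + (1 if terminal == "t" else 0)
        labels.append(value)
    return labels


b = num_bits(4)                       # two bits suffice for four labels
sizes = graph_sizes(num_vertices=5, num_edges=4, b=b)
labels = decode_cut([("s", "s"), ("s", "t"), ("t", "s"), ("t", "t")])
```

The four pairs in the usage example reproduce the codes 00, 01, 10, and 11 of Fig. 2(c), decoded to label indices 0 through 3.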
It is important to set the edge weights of $E_2$ in such a way that decoding the binary labels resulting from the s-t cut of $G_2$ results in optimal (or close to optimal) labels for the original multi-label problem. To achieve this, we derive a system of linear equations capturing the relation between the original multi-label MRF penalties and the s-t cut cost incurred when generating different label configurations. We then calculate the weights of $E_2$ as the LS error solution to these equations. The next sections expose the details.
2.2 Data Term Penalty: Severing T-Links and Intra-Links
The 1st order penalty $D_i(l_i)$ in (1) is the cost of assigning $l_i$ to $v_i$ in $G$, which entails assigning a corresponding sequence of binary labels $(l_{ij})_{j=1}^{b}$ to $(v_{ij})_{j=1}^{b}$ in $G_2$. To assign $(l_i)_2$ to a string of $b$ vertices, appropriate terminal links must be cut. To assign a 0 (resp. 1) label to $v_{ij}$, the edge connecting $v_{ij}$ to $t$ (resp. $s$) must be severed (Figure 3). Therefore, the cost of severing t-links in $G_2$ to assign $l_i$ to vertex $v_i$ in $G$ is calculated as

$$D_i^{tlinks}(l_i) = \sum_{j=1}^{b} \left( l_{ij}\, w_{v_{ij},s} + \bar{l}_{ij}\, w_{v_{ij},t} \right) \qquad (2)$$

where $\bar{l}_{ij}$ denotes the unary complement (NOT) of $l_{ij}$. The $G_2$ s-t cut severing the t-links, as per (2), will also result in severing edges in $E_2^{intra}$ (Figure 1). In particular, $e_{im,in} \in E_2^{intra}$ will be severed iff the s-t cut leaves $v_{im}$ connected to one terminal, say $s$ (resp. $t$), while $v_{in}$ remains connected to the other terminal $t$ (resp. $s$). If this condition holds, then $w_{v_{im},v_{in}}$ will contribute to the cost. Therefore, the cost of severing intra-links in $G_2$ to assign $l_i$ to vertex $v_i$ in $G$ is

$$D_i^{intra}(l_i) = \sum_{m=1}^{b} \sum_{n=m+1}^{b} (l_{im} \oplus l_{in})\, w_{v_{im},v_{in}} \qquad (3)$$

where $\oplus$ denotes binary XOR. The total data penalty is the sum of (2) and (3),

$$D_i(l_i) = D_i^{tlinks}(l_i) + D_i^{intra}(l_i). \qquad (4)$$

Fig. 3. The $2^b$ ways of cutting through $\{v_{ij}\}_{j=1}^{b}$ for b = 2 (left) and b = 3 (right) with the resulting binary codes {00, 01, 10, 11} and {000, 001, ⋯, 111}.
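Equations (2)-(4) translate directly into code. The sketch below is illustrative only: the function name `data_cut_cost`, the 0-based indexing of the bits, and the example weights are assumptions, not values from the paper.

```python
from itertools import combinations


def data_cut_cost(bits, w_s, w_t, w_intra):
    """Cut cost (4) for one group of b vertices replacing v_i.

    bits       : binary labels (l_i1, ..., l_ib); 0 = stays with s, 1 = stays with t
    w_s, w_t   : w_s[j], w_t[j] are the weights of edges (v_ij, s) and (v_ij, t)
    w_intra    : w_intra[(m, n)] is the weight of edge (v_im, v_in), m < n (0-based)
    """
    # (2): label 1 severs the link to s, label 0 severs the link to t.
    t_cost = sum(l * w_s[j] + (1 - l) * w_t[j] for j, l in enumerate(bits))
    # (3): an intra-link is severed iff its two endpoints get different bits.
    intra_cost = sum((bits[m] ^ bits[n]) * w_intra[(m, n)]
                     for m, n in combinations(range(len(bits)), 2))
    return t_cost + intra_cost


# Hypothetical weights for b = 2.
w_s, w_t, w_intra = [1.0, 2.0], [3.0, 4.0], {(0, 1): 5.0}
costs = {code: data_cut_cost(code, w_s, w_t, w_intra)
         for code in [(0, 0), (0, 1), (1, 0), (1, 1)]}
```

Only the mixed codes (0, 1) and (1, 0) pay the intra-link weight, which is the XOR condition of (3).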
2.3 Prior Term Penalty: Severing N-Links
The interaction penalty $V_{ij}(l_i, l_j, d_i, d_j)$ for assigning $l_i$ to $v_i$ and $l_j$ to neighboring $v_j$ in $G$ must equal the cost of assigning a sequence of binary labels $(l_{im})_{m=1}^{b}$ to $(v_{im})_{m=1}^{b}$ and $(l_{jn})_{n=1}^{b}$ to $(v_{jn})_{n=1}^{b}$ in $G_2$. The cost of this cut can be calculated as (Figure 4)

$$V_{ij}(l_i, l_j, d_i, d_j) = \sum_{m=1}^{b} \sum_{n=1}^{b} (l_{im} \oplus l_{jn})\, w_{v_{im},v_{jn}}. \qquad (5)$$

This effectively adds the edge weight between $v_{im}$ and $v_{jn}$ to the cut cost iff the cut leaves one vertex of the edge connected to one terminal ($s$ or $t$) and the other vertex connected to the other terminal ($t$ or $s$). Note that we impose no restrictions on the left hand side of (5), e.g. it could reflect non-convex or non-metric priors, spatially-varying, or even higher order label interaction.
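The right hand side of (5) can be evaluated with a double loop over the two bit strings. This is a minimal sketch under assumed names (`prior_cut_cost`, `w_n`) and made-up weights; it only illustrates which n-links a given pair of codes severs.

```python
def prior_cut_cost(bits_i, bits_j, w_n):
    """Right hand side of (5) for neighboring vertices v_i and v_j.

    bits_i, bits_j : binary labels (l_i1..l_ib) and (l_j1..l_jb)
    w_n            : w_n[m][n] is the weight of the n-link (v_im, v_jn), 0-based
    """
    return sum((l_m ^ l_n) * w_n[m][n]
               for m, l_m in enumerate(bits_i)
               for n, l_n in enumerate(bits_j))


# Hypothetical n-link weights for b = 2.
w_n = [[1.0, 2.0],
       [3.0, 4.0]]

same_zero = prior_cut_cost((0, 0), (0, 0), w_n)   # no link is severed
opposite = prior_cut_cost((0, 0), (1, 1), w_n)    # all four links are severed
same_mixed = prior_cut_cost((0, 1), (0, 1), w_n)  # cross links are still severed
```

The last case shows that two vertices with identical codes can still incur a cost through the links between different bit positions, so the weights must be chosen jointly rather than link by link.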

Fig. 4. Severing n-links between neighboring vertices $v_i$ and $v_j$ for b = 2 (four examples are shown in the top row) and b = 3 (three examples in the bottom row). The cut is depicted as a red curve. In the last two examples for b = 3, the colored vertices are translated while maintaining the n-links in order to clearly show that the severed n-links for each case follow (5).
2.4 Edge Weight Approximation with Least Squares
Equations (4) and (5) dictate the relationship between the penalty terms ($D_i$ and $V_{ij}$) of the original multi-label problem and the severed edge weights $w_{ij,mn}$; $e_{ij,mn} \in E_2$ of the s-t graph $G_2$. What remains missing before applying the s-t cut, however, is to find these edge weights.
Edge weights of t-links and intra-links. For b = 1 (i.e. binary labelling), (3) simplifies to $D_i^{intra}(l_i) = 0$ and (4) simplifies to $D_i(l_i) = l_{i1}\, w_{v_{i1},s} + \bar{l}_{i1}\, w_{v_{i1},t}$. With $l_i = l_{i1}$ for b = 1, substituting the two possible values for $l_i = \{l_0, l_1\}$, we obtain

$$\begin{aligned}
l_i = l_0 &\Rightarrow D_i(l_0) = l_0\, w_{v_{i1},s} + \bar{l}_0\, w_{v_{i1},t} = 0\, w_{v_{i1},s} + 1\, w_{v_{i1},t} \\
l_i = l_1 &\Rightarrow D_i(l_1) = l_1\, w_{v_{i1},s} + \bar{l}_1\, w_{v_{i1},t} = 1\, w_{v_{i1},s} + 0\, w_{v_{i1},t}
\end{aligned} \qquad (6)$$

which can be written in matrix form $A_1 X_1^i = B_1^i$ as

$$\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} w_{v_{i1},s} \\ w_{v_{i1},t} \end{pmatrix} = \begin{pmatrix} D_i(l_0) \\ D_i(l_1) \end{pmatrix}$$

where $X_1^i$ is the vector of unknown edge weights connecting vertex $v_{i1}$ to $s$ and $t$, $B_1^i$ is the data penalty for $v_i$, and $A_1$ is the matrix of coefficients. The subscript 1 in $A_1$, $X_1^i$, and $B_1^i$ indicates that this matrix equation is for b = 1. Clearly, the solution is trivial and expected: $w_{v_{i1},s} = D_i(l_1)$ and $w_{v_{i1},t} = D_i(l_0)$.
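As a quick check of the b = 1 case, the sketch below assigns the t-link weights according to the solution of (6) and confirms that severing the appropriate link reproduces each data penalty. The helper names and the penalty values are hypothetical.

```python
def binary_tlink_weights(d_l0, d_l1):
    """Solution of (6): the coefficient matrix A_1 only swaps the two penalties."""
    return {"s": d_l1, "t": d_l0}


def cut_cost_b1(label, w):
    """Cost (2) for b = 1: label 1 severs the s-link, label 0 severs the t-link."""
    return label * w["s"] + (1 - label) * w["t"]


# Hypothetical data penalties D_i(l_0) = 2.5 and D_i(l_1) = 0.75.
w = binary_tlink_weights(d_l0=2.5, d_l1=0.75)
costs = [cut_cost_b1(label, w) for label in (0, 1)]
cheapest_label = min((0, 1), key=lambda label: cut_cost_b1(label, w))
```

For an isolated vertex the cheaper cut therefore selects the label with the smaller data penalty, as expected for a binary problem.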
For b = 2, we address multi-label problems of k = {3, 4}, or $2^{b-1} = 2 < k \leq 2^b = 4$ labels. Substituting the $2^b = 4$ possible label values, ((0,0), (0,1), (1,0), and (1,1)), of $(l_i)_2 = (l_{i1}, l_{i2})$ in (4) we obtain

$$\begin{aligned}
(0,0) &\Rightarrow D_i(l_0) = 0\, w_{v_{i1},s} + 1\, w_{v_{i1},t} + 0\, w_{v_{i2},s} + 1\, w_{v_{i2},t} + 0\, w_{v_{i1},v_{i2}} \\
(0,1) &\Rightarrow D_i(l_1) = 0\, w_{v_{i1},s} + 1\, w_{v_{i1},t} + 1\, w_{v_{i2},s} + 0\, w_{v_{i2},t} + 1\, w_{v_{i1},v_{i2}} \\
(1,0) &\Rightarrow D_i(l_2) = 1\, w_{v_{i1},s} + 0\, w_{v_{i1},t} + 0\, w_{v_{i2},s} + 1\, w_{v_{i2},t} + 1\, w_{v_{i1},v_{i2}} \\
(1,1) &\Rightarrow D_i(l_3) = 1\, w_{v_{i1},s} + 0\, w_{v_{i1},t} + 1\, w_{v_{i2},s} + 0\, w_{v_{i2},t} + 0\, w_{v_{i1},v_{i2}}
\end{aligned} \qquad (7)$$

which can be written in matrix form $A_2 X_2^i = B_2^i$ as
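Reading the coefficients of (7) row by row gives a 4 × 5 system (four label codes, five unknown weights). The following minimal sketch, which is an illustration and not the paper's implementation, generates such rows for any b from (2)-(4) and solves for the weights with a pseudoinverse. The function `coefficient_matrix` and the penalty values in `D` are hypothetical, and NumPy is assumed to be available.

```python
from itertools import combinations, product

import numpy as np


def coefficient_matrix(b):
    """Coefficient rows implied by (2)-(4), one row per b-bit code.

    Columns are ordered as in (7): for each bit j the weights of (v_ij, s)
    and (v_ij, t), followed by the intra-link weights for all pairs m < n.
    """
    rows = []
    for code in product((0, 1), repeat=b):      # (0,...,0) up to (1,...,1)
        row = []
        for bit in code:
            row += [bit, 1 - bit]               # l_ij and its complement
        row += [code[m] ^ code[n] for m, n in combinations(range(b), 2)]
        rows.append(row)
    return np.array(rows, dtype=float)


A2 = coefficient_matrix(2)
A2_pinv = np.linalg.pinv(A2)            # depends only on b, so it can be cached

D = np.array([4.0, 1.0, 3.0, 6.0])      # made-up penalties D_i(l_0) ... D_i(l_3)
X = A2_pinv @ D                         # least squares (minimum norm) weights
residual = np.linalg.norm(A2 @ X - D)   # how well the cut costs match D
```

Because `A2` depends only on b, its pseudoinverse can be computed once and reused for every vertex, in line with the offline pre-computation mentioned in the introduction. The sketch does not constrain the resulting weights to be non-negative.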


References

Book
01 Jan 1983-

34,706 citations


Journal ArticleDOI
Stuart Geman1, Donald Geman2Institutions (2)
TL;DR: The analogy between images and statistical mechanics systems is made and the analogous operation under the posterior distribution yields the maximum a posteriori (MAP) estimate of the image given the degraded observations, creating a highly parallel ``relaxation'' algorithm for MAP estimation.
Abstract: We make an analogy between images and statistical mechanics systems. Pixel gray levels and the presence and orientation of edges are viewed as states of atoms or molecules in a lattice-like physical system. The assignment of an energy function in the physical system determines its Gibbs distribution. Because of the Gibbs distribution, Markov random field (MRF) equivalence, this assignment also determines an MRF image model. The energy function is a more convenient and natural mechanism for embodying picture attributes than are the local characteristics of the MRF. For a range of degradation mechanisms, including blurring, nonlinear deformations, and multiplicative or additive noise, the posterior distribution is an MRF with a structure akin to the image model. By the analogy, the posterior distribution defines another (imaginary) physical system. Gradual temperature reduction in the physical system isolates low energy states (``annealing''), or what is the same thing, the most probable states under the Gibbs distribution. The analogous operation under the posterior distribution yields the maximum a posteriori (MAP) estimate of the image given the degraded observations. The result is a highly parallel ``relaxation'' algorithm for MAP estimation. We establish convergence properties of the algorithm and we experiment with some simple pictures, for which good restorations are obtained at low signal-to-noise ratios.

18,328 citations


Journal ArticleDOI
Jianbo Shi1, Jitendra Malik2Institutions (2)
TL;DR: This work treats image segmentation as a graph partitioning problem and proposes a novel global criterion, the normalized cut, for segmenting the graph, which measures both the total dissimilarity between the different groups as well as the total similarity within the groups.
Abstract: We propose a novel approach for solving the perceptual grouping problem in vision. Rather than focusing on local features and their consistencies in the image data, our approach aims at extracting the global impression of an image. We treat image segmentation as a graph partitioning problem and propose a novel global criterion, the normalized cut, for segmenting the graph. The normalized cut criterion measures both the total dissimilarity between the different groups as well as the total similarity within the groups. We show that an efficient computational technique based on a generalized eigenvalue problem can be used to optimize this criterion. We applied this approach to segmenting static images, as well as motion sequences, and found the results to be very encouraging.

13,025 citations


Journal ArticleDOI
01 Jul 1945-Ecology

9,129 citations


Journal ArticleDOI
Yuri Boykov1, Olga Veksler1, Ramin Zabih2Institutions (2)
TL;DR: This work presents two algorithms based on graph cuts that efficiently find a local minimum with respect to two types of large moves, namely expansion moves and swap moves that allow important cases of discontinuity preserving energies.
Abstract: Many tasks in computer vision involve assigning a label (such as disparity) to every pixel. A common constraint is that the labels should vary smoothly almost everywhere while preserving sharp discontinuities that may exist, e.g., at object boundaries. These tasks are naturally stated in terms of energy minimization. The authors consider a wide class of energies with various smoothness constraints. Global minimization of these energy functions is NP-hard even in the simplest discontinuity-preserving case. Therefore, our focus is on efficient approximation algorithms. We present two algorithms based on graph cuts that efficiently find a local minimum with respect to two types of large moves, namely expansion moves and swap moves. These moves can simultaneously change the labels of arbitrarily large sets of pixels. In contrast, many standard algorithms (including simulated annealing) use small moves where only one pixel changes its label at a time. Our expansion algorithm finds a labeling within a known factor of the global minimum, while our swap algorithm handles more general energy functions. Both of these algorithms allow important cases of discontinuity preserving energies. We experimentally demonstrate the effectiveness of our approach for image restoration, stereo and motion. On real data with ground truth, we achieve 98 percent accuracy.

7,060 citations

