What energy functions can be minimized via graph cuts

01 Jan 2004-Vol. 26, Iss: 2, pp 147-159
TL;DR: This work gives a precise characterization of what energy functions can be minimized using graph cuts, among the energy functions that can be written as a sum of terms containing three or fewer binary variables.
Abstract: In the last few years, several new algorithms based on graph cuts have been developed to solve energy minimization problems in computer vision. Each of these techniques constructs a graph such that the minimum cut on the graph also minimizes the energy. Yet, because these graph constructions are complex and highly specific to a particular energy function, graph cuts have seen limited application to date. In this paper, we give a characterization of the energy functions that can be minimized by graph cuts. Our results are restricted to functions of binary variables. However, our work generalizes many previous constructions and is easily applicable to vision problems that involve large numbers of labels, such as stereo, motion, image restoration, and scene reconstruction. We give a precise characterization of what energy functions can be minimized using graph cuts, among the energy functions that can be written as a sum of terms containing three or fewer binary variables. We also provide a general-purpose construction to minimize such an energy function. Finally, we give a necessary condition for any energy function of binary variables to be minimized by graph cuts. Researchers who are considering the use of graph cuts to optimize a particular energy function can use our results to determine if this is possible and then follow our construction to create the appropriate graph. A software implementation is freely available.

Summary (3 min read)

1 Introduction and summary of results

  • Many of the problems that arise in early vision can be naturally expressed in terms of energy minimization.
  • Researchers typically use general purpose global optimization techniques such as simulated annealing [3, 11] , which is extremely slow in practice.
  • The experimental results produced by these algorithms are also quite good; for example, two recent evaluations of stereo algorithms using real imagery with ground truth found that a graph cut method gave the best overall performance [23, 25] .
  • Minimizing an energy function via graph cuts, however, remains a technically difficult problem.
  • The authors' results provide a significant generalization of the energy minimization methods used in [4-6, 8, 13, 17, 24], and show how to minimize an interesting new class of energy functions.

1.1 Summary of our results

  • The main result in this paper is a precise characterization of the functions in F 3 that can be minimized using graph cuts, together with a graph construction for minimizing such functions.
  • Note that in this paper the authors only consider binary-valued variables.
  • As an example, the authors will show in section 4.1 how to use their results to solve the pixel-labeling problem, even though the pixels have many possible labels.
  • The authors also identify an interesting class of energy functions that have not yet been minimized using graph cuts.
  • In the language of Markov Random Fields [11, 19] , these methods consider first-order MRF's.

1.2 Organization

  • Section 5 contains their main theorems for other classes.
  • Detailed proofs of their theorems, together with the graph constructions, are deferred to section 6.

2 Overview of graph cuts

  • The minimum s-t-cut problem is to find a cut C with the smallest cost.
  • Due to the theorem of Ford and Fulkerson [10] this is equivalent to computing the maximum flow from the source to sink.
  • There are many algorithms which solve this problem in polynomial time with small constants [1, 12] .

3 Defining graph representability

  • Each cut on G has some cost; therefore, G represents the energy function mapping from all cuts on G to the set of nonnegative real numbers.
  • Thus a natural question to ask is what is the class of energy functions for which the authors can construct a graph that represents it.
  • Above the authors used each node (except the source and the sink) for encoding one binary variable.
  • The authors will summarize graph constructions that they allow in the following definition.

4.1 Example: pixel-labeling via expansion moves

  • In this section the authors show how to apply this theorem to solve the pixel-labeling problem.
  • The authors will show how their method can be used to derive the expansion move algorithm developed in [8] .
  • Note that the key technical step in this algorithm can be naturally expressed as minimizing an energy function involving binary variables.
  • In their paper, it is not clear whether this is an accidental property of the construction (i.e., they leave open the possibility that a more clever graph cut construction may overcome this restriction).
  • Using their results, the authors can easily show this is not the case.

6.2 Proof of theorems 3 and 6: the constructive part

  • In this section the authors will give the constructive part of the proof: given a regular energy function from class F 3 they will show how to construct a graph which represents it.
  • First the authors will consider regular functions of two variables, then regular functions of three variables and finally regular functions of the form given in theorem 6.
  • This will also prove the constructive part of theorem 3.
  • Indeed, suppose a function is from the class F 2 and each term in the sum satisfies the condition given in theorem 3 (i.e., is regular).
  • Then each term is graph-implementable (as the authors will show in this section) and, hence, the function is graph-implementable as well according to lemma 10.

Functions of two variables Let E(x 1 , x 2 ) be a function of two variables represented by a table

  • Now the authors can easily construct a graph G which represents this function.
  • Note that the authors did not introduce any additional nodes for representing binary interactions of binary variables.
  • This is in contrast to the construction in [8] which added auxiliary nodes for representing energies that the authors just considered.
  • The authors' construction yields a smaller graph and, thus, the minimum cut can potentially be computed faster.
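The construction itself is not reproduced in this summary, so the following Python sketch is only a reconstruction under stated assumptions. It uses the commonly cited decomposition of a regular two-variable table with entries A = E(0,0), B = E(0,1), C = E(1,0), D = E(1,1) into two unary terms and one edge between the variable nodes, with no auxiliary nodes. The function names `build_pairwise_graph` and `energy_from_graph`, and the node names `"s"`, `"t"`, `"v1"`, `"v2"`, are hypothetical.

```python
from itertools import product


def build_pairwise_graph(A, B, C, D):
    """Sketch: edges for a regular table E(0,0)=A, E(0,1)=B, E(1,0)=C, E(1,1)=D.

    Returns (edges, constant) such that E(x1, x2) = constant + cut cost,
    where x_i = 0 means v_i is on the source side and x_i = 1 the sink side.
    """
    if A + D > B + C:
        raise ValueError("term is not regular: A + D > B + C")
    edges = {}
    constant = A

    def add_edge(u, v, w):
        # Zero-weight edges are omitted; all remaining weights are positive.
        if w > 0:
            edges[(u, v)] = edges.get((u, v), 0) + w

    def add_unary(node, w):
        # Unary cost w paid when the variable is 1 (node on the sink side).
        nonlocal constant
        if w >= 0:
            add_edge("s", node, w)
        else:
            # Pay -w when the variable is 0 instead, and absorb w in the constant.
            add_edge(node, "t", -w)
            constant += w

    add_unary("v1", C - A)
    add_unary("v2", D - C)
    # Paid only for x1 = 0, x2 = 1; nonnegative exactly when the term is regular.
    add_edge("v1", "v2", B + C - A - D)
    return edges, constant


def energy_from_graph(edges, constant):
    """Brute-force check: recover the table from the cut costs."""
    table = {}
    for x1, x2 in product((0, 1), repeat=2):
        side = {"s": 0, "t": 1, "v1": x1, "v2": x2}
        cut = sum(w for (u, v), w in edges.items()
                  if side[u] == 0 and side[v] == 1)
        table[(x1, x2)] = constant + cut
    return table


edges, constant = build_pairwise_graph(3, 5, 1, 2)
print(edges, constant)
print(energy_from_graph(edges, constant))
```

The brute-force check is there only to confirm that the three edges reproduce the table; in practice the edges would be added to a larger graph and handed to a max-flow solver.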

Functions of three variables Now let us consider a regular function E of three variables. Let us represent it as a table

  • It's easy to check that these transformations preserve the functional π.
  • The authors need to show that all terms here are graph-representable, then lemma 10 will imply that E is graph-representable as well.
  • The first three terms are regular functions depending only on two variables and thus are graph-representable as was shown in the previous section.
  • The graph G that represents this term can be constructed as follows.

Functions of many variables Finally let us consider a regular function E

  • Each term in the sum need not necessarily be regular.
  • This can be done using the following lemma and a trivial induction argument.
  • Therefore the authors did not introduce any nonregular projections for these terms.

6.3 Proof of theorem 7

  • Hence, all terms E i,j are regular, i.e., they satisfy the condition in theorem 3.
  • The following sequence of operations shows one possible way to push the maximum flow through this graph.


What Energy Functions can be Minimized via
Graph Cuts?
Vladimir Kolmogorov and Ramin Zabih
Computer Science Department, Cornell University, Ithaca, NY 14853
vnk@cs.cornell.edu, rdz@cs.cornell.edu
Abstract. Many problems in computer vision can be naturally phrased
in terms of energy minimization. In the last few years researchers have
developed a powerful class of energy minimization methods based on
graph cuts. These techniques construct a specialized graph, such that
the minimum cut on the graph also minimizes the energy. The mini-
mum cut in turn is efficiently computed by max flow algorithms. Such
methods have been successfully applied to a number of important vision
problems, including image restoration, motion, stereo, voxel occupancy
and medical imaging. However, each graph construction to date has been
highly specific for a particular energy function. In this paper we address
a much broader problem, by characterizing the class of energy functions
that can be minimized by graph cuts, and by giving a general-purpose
construction that minimizes any energy function in this class. Our results
generalize several previous vision algorithms based on graph cuts, and
also show how to minimize an interesting new class of energy functions.
1 Introduction and summary of results
Many of the problems that arise in early vision can be naturally expressed in
terms of energy minimization. The computational task of minimizing the energy
is usually quite difficult, as it generally requires minimizing a non-convex func-
tion in a space with thousands of dimensions. If the functions have a special
form they can be solved efficiently using dynamic programming [2]. However,
researchers typically use general purpose global optimization techniques such as
simulated annealing [3, 11], which is extremely slow in practice.
In the last few years, however, researchers have developed a new approach
based on graph cuts. The basic technique is to construct a specialized graph
for the energy function to be minimized, such that the minimum cut on the
graph in turn minimizes the energy. The minimum cut in turn can be computed
very efficiently by max flow algorithms. These methods have been successfully
used for a wide variety of vision problems including image restoration [7, 8, 16,
13], stereo and motion [4, 7, 8, 15, 18, 21, 22], voxel occupancy [24] and medical
imaging [6, 5, 17]. The output of these algorithms is generally a solution with
some interesting theoretical quality guarantee. In some cases [7, 15, 16, 13, 21] it
is the global minimum, in other cases a local minimum in a strong sense [8] that is
within a known factor of the global minimum. The experimental results produced
by these algorithms are also quite good; for example, two recent evaluations of
stereo algorithms using real imagery with ground truth found that a graph cut
method gave the best overall performance [23, 25].
Minimizing an energy function via graph cuts, however, remains a techni-
cally difficult problem. Each paper constructs its own graph specifically for its
individual energy function, and in some of these cases (especially [18, 8]) the
construction is fairly complex. One consequence is that researchers sometimes
use heuristic methods for optimization, even in situations where the exact global
minimum can be computed via graph cuts [14, 20, 9]. The goal of this paper is
to precisely characterize the class of energy functions that can be minimized via
graph cuts, and to give a general-purpose graph construction that minimizes any
energy function in this class. Our results provide a significant generalization of
the energy minimization methods used in [4–6, 8, 13, 17, 24], and show how to
minimize an interesting new class of energy functions.
1.1 Summary of our results
In this paper we consider two classes of energy functions. Let
$\{x_1, \ldots, x_n\}$, $x_i \in \{0, 1\}$, be a set of binary-valued variables.
We define the class $\mathcal{F}^2$ to be functions of the form
$$E(x_1, \ldots, x_n) = \sum_i E^i(x_i) + \sum_{i<j} E^{i,j}(x_i, x_j). \tag{1}$$
We define the class $\mathcal{F}^3$ to be functions of the form
$$E(x_1, \ldots, x_n) = \sum_i E^i(x_i) + \sum_{i<j} E^{i,j}(x_i, x_j) + \sum_{i<j<k} E^{i,j,k}(x_i, x_j, x_k). \tag{2}$$
Obviously, the class $\mathcal{F}^2$ is a strict subset of the class $\mathcal{F}^3$.
The main result in this paper is a precise characterization of the functions in
$\mathcal{F}^3$ that can be minimized using graph cuts, together with a graph
construction for minimizing such functions. Moreover, we give a necessary
condition for all other classes which must be met for a function to be
minimized via graph cuts.
Note that in this paper we only consider binary-valued variables. Most of the
previous work with graph cuts cited above considers energy functions that involve
variables with more than 2 possible values. For example, the work on stereo, mo-
tion and image restoration described in [8] addresses the standard pixel-labeling
problem in early vision. In these pixel-labeling problems, the variables represent
individual pixels, and the possible values for an individual variable represent its
possible displacements or intensities. However, many of the graph cut methods
that handle multiple possible values actually consider a pair of labels at a time.
As a consequence, even though we only address binary-valued variables, our re-
sults generalize the algorithms given in [4–6, 8, 13, 17, 24]. As an example, we will
show in section 4.1 how to use our results to solve the pixel-labeling problem,
even though the pixels have many possible labels.
We also identify an interesting class of energy functions that have not
yet been minimized using graph cuts. All of the previous work with graph cuts
involves a neighborhood system that is defined on pairs of pixels. In the language
of Markov Random Fields [11, 19], these methods consider first-order MRFs.
The associated energy functions lie in $\mathcal{F}^2$. Our results allow for the
minimization of energy functions in the larger class $\mathcal{F}^3$, and thus for
neighborhood systems that involve triples of pixels.
1.2 Organization
The rest of the paper is organized as follows. In section 2 we give an overview of
graph cuts. In section 3 we formalize the problem that we want to solve. Section 4
contains our main theorem for the class of functions $\mathcal{F}^2$ and shows how
it can be used. Section 5 contains our main theorems for other classes. Detailed
proofs of our theorems, together with the graph constructions, are deferred to
section 6. A summary of the actual graph constructions is given in the appendix.
2 Overview of graph cuts
Suppose $G = (\mathcal{V}, \mathcal{E})$ is a directed graph with two special
vertices (terminals), namely the source $s$ and the sink $t$. An $s$-$t$-cut (or
just a cut, as we will refer to it later) $C = S, T$ is a partition of the vertices
in $\mathcal{V}$ into two disjoint sets $S$ and $T$, such that $s \in S$ and
$t \in T$. The cost of the cut is the sum of the costs of all edges that go from
$S$ to $T$:
$$c(S, T) = \sum_{u \in S,\, v \in T,\, (u,v) \in \mathcal{E}} c(u, v).$$
The minimum $s$-$t$-cut problem is to find a cut $C$ with the smallest cost. Due
to the theorem of Ford and Fulkerson [10] this is equivalent to computing the
maximum flow from the source to the sink. There are many algorithms which solve
this problem in polynomial time with small constants [1, 12].
It is convenient to denote a cut $C = S, T$ by a labeling $f$ mapping from the
set of nodes $\mathcal{V} - \{s, t\}$ to $\{0, 1\}$, where $f(v) = 0$ means that
$v \in S$, and $f(v) = 1$ means that $v \in T$. We will use this notation later.
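As an illustration of the equivalence between minimum cut and maximum flow, here is a minimal Python sketch. It uses the Edmonds-Karp variant of augmenting paths, which is a choice made for this example and not one the paper prescribes; the function names `max_flow_min_cut` and `cut_cost` and the small example graph are hypothetical. The labeling it returns follows the convention above: 0 for nodes in S, 1 for nodes in T.

```python
from collections import deque


def max_flow_min_cut(capacity, s, t):
    """Edmonds-Karp sketch. capacity maps (u, v) -> nonnegative edge cost.

    Returns (flow value, labeling f) with f[v] = 0 for v in S, 1 for v in T.
    """
    residual, adj = {}, {s: set(), t: set()}
    for (u, v), c in capacity.items():
        residual[(u, v)] = residual.get((u, v), 0) + c
        residual.setdefault((v, u), 0)
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    def bfs():
        # Parent pointers of all nodes reachable from s in the residual graph.
        parent = {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in parent and residual[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        return parent

    flow = 0
    while True:
        parent = bfs()
        if t not in parent:
            break
        # Find the bottleneck along the path, then push that much flow.
        bottleneck, v = float("inf"), t
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[(parent[v], v)])
            v = parent[v]
        v = t
        while parent[v] is not None:
            u = parent[v]
            residual[(u, v)] -= bottleneck
            residual[(v, u)] += bottleneck
            v = u
        flow += bottleneck

    # Nodes still reachable from s after the last search form the set S.
    labeling = {v: 0 if v in parent else 1 for v in adj if v not in (s, t)}
    return flow, labeling


def cut_cost(capacity, s, t, labeling):
    """Sum of costs of edges going from S (label 0) to T (label 1)."""
    side = dict(labeling)
    side[s], side[t] = 0, 1
    return sum(c for (u, v), c in capacity.items()
               if side[u] == 0 and side[v] == 1)


cap = {("s", "a"): 3, ("s", "b"): 2, ("a", "b"): 1, ("a", "t"): 2, ("b", "t"): 3}
flow, f = max_flow_min_cut(cap, "s", "t")
print(flow, f, cut_cost(cap, "s", "t", f))
```

On this small graph the flow value and the cost of the returned cut coincide, as the Ford and Fulkerson theorem requires.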
3 Defining graph representability
Let us consider a graph $G = (\mathcal{V}, \mathcal{E})$ with terminals $s$ and
$t$, thus $\mathcal{V} = \{v_1, \ldots, v_n, s, t\}$.
Each cut on $G$ has some cost; therefore, $G$ represents the energy function
mapping from all cuts on $G$ to the set of nonnegative real numbers. Any cut can be
described by $n$ binary variables $x_1, \ldots, x_n$ corresponding to nodes in $G$
(excluding the source and the sink): $x_i = 0$ when $v_i \in S$, and $x_i = 1$
when $v_i \in T$.
Therefore, the energy $E$ that $G$ represents can be viewed as a function of $n$
binary variables: $E(x_1, \ldots, x_n)$ is equal to the cost of the cut defined by
the configuration $x_1, \ldots, x_n$ ($x_i \in \{0, 1\}$).
We can efficiently minimize $E$ by computing the minimum $s$-$t$-cut on $G$. Thus
a natural question to ask is what is the class of energy functions for which we
can construct a graph that represents it.
We can also generalize our construction. Above we used each node (except the
source and the sink) for encoding one binary variable. Instead we can specify a
subset $\mathcal{V}_0 = \{v_1, \ldots, v_k\} \subset \mathcal{V} - \{s, t\}$ and
introduce variables only for the nodes in this set. Then there may be several cuts
corresponding to a configuration $x_1, \ldots, x_k$. If we define the energy
$E(x_1, \ldots, x_k)$ as the minimum among costs of all such cuts then the minimum
$s$-$t$-cut on $G$ will again yield the configuration which minimizes $E$.
Finally, note that the configuration that minimizes E will not change if we
add a constant to E.
We will summarize graph constructions that we allow in the following defi-
nition.
Definition 1. A function $E$ of $n$ binary variables is called graph-representable
if there exists a graph $G = (\mathcal{V}, \mathcal{E})$ with terminals $s$ and $t$
and a subset of nodes
$\mathcal{V}_0 = \{v_1, \ldots, v_n\} \subset \mathcal{V} - \{s, t\}$ such that for
any configuration $x_1, \ldots, x_n$ the value of the energy $E(x_1, \ldots, x_n)$
is equal to a constant plus the cost of the minimum $s$-$t$-cut among all cuts
$C = S, T$ in which $v_i \in S$, if $x_i = 0$, and $v_i \in T$, if $x_i = 1$
($1 \le i \le n$). We say that $E$ is exactly represented by $G$, $\mathcal{V}_0$
if this constant is zero.
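To make Definition 1 concrete, the following Python sketch computes by brute force the energy that a small graph exactly represents: for each configuration of the nodes in $\mathcal{V}_0$ it takes the minimum cut cost over all placements of the remaining (auxiliary) nodes. The function name `represented_energy` and the three-edge example graph with one auxiliary node `"a"` are hypothetical and chosen only for illustration; enumeration is exponential and is meant for checking tiny constructions, not for minimization.

```python
from itertools import product


def represented_energy(nodes, edges, v0, s="s", t="t"):
    """Map each configuration of the nodes in v0 to the minimum cost among
    all s-t-cuts consistent with it (label 0 = source side, 1 = sink side)."""
    free = [v for v in nodes if v not in v0]  # auxiliary nodes, no variable
    energy = {}
    for x in product((0, 1), repeat=len(v0)):
        best = None
        for y in product((0, 1), repeat=len(free)):
            side = {s: 0, t: 1}
            side.update(zip(v0, x))
            side.update(zip(free, y))
            cost = sum(c for (u, v), c in edges.items()
                       if side[u] == 0 and side[v] == 1)
            best = cost if best is None else min(best, cost)
        energy[x] = best
    return energy


# Two variable nodes v1, v2 and one auxiliary node a.
edges = {("v1", "a"): 1, ("v2", "a"): 1, ("a", "t"): 1}
print(represented_energy(["v1", "v2", "a"], edges, ["v1", "v2"]))
```

In this example the represented energy is 1 for every configuration except $x_1 = x_2 = 1$, where it is 0, so the auxiliary node lets three edges encode a function that depends jointly on both variables.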
The following lemma is an obvious consequence of this definition.
Lemma 2. Suppose the energy function $E$ is graph-representable by a graph $G$
and a subset $\mathcal{V}_0$. Then it is possible to find the exact minimum of $E$
in polynomial time by computing the minimum $s$-$t$-cut on $G$.
In this paper we will give a complete characterization of the classes
$\mathcal{F}^2$ and $\mathcal{F}^3$ in terms of graph representability, and show
how to construct graphs for minimizing graph-representable energies within these
classes. Moreover, we will give a necessary condition for all other classes which
must be met for a function to be graph-representable. Note that it would suffice
to consider only the class $\mathcal{F}^3$ since
$\mathcal{F}^2 \subset \mathcal{F}^3$. However, the condition for $\mathcal{F}^2$
is simpler so we will consider it separately.
4 The class $\mathcal{F}^2$
Our main result for the class $\mathcal{F}^2$ is the following theorem.
Theorem 3. Let $E$ be a function of $n$ binary variables from the class
$\mathcal{F}^2$, i.e. it can be written as the sum
$$E(x_1, \ldots, x_n) = \sum_i E^i(x_i) + \sum_{i<j} E^{i,j}(x_i, x_j).$$
Then $E$ is graph-representable if and only if each term $E^{i,j}$ satisfies the
inequality
$$E^{i,j}(0, 0) + E^{i,j}(1, 1) \le E^{i,j}(0, 1) + E^{i,j}(1, 0).$$
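The condition of Theorem 3 is easy to test mechanically. The Python sketch below assumes each pairwise term is stored as a 2x2 table `term[a][b]` holding $E^{i,j}(a, b)$, keyed by the index pair $(i, j)$; this storage layout and the names `is_regular` and `violating_pairs` are assumptions made for the example.

```python
def is_regular(term):
    """True if the 2x2 table satisfies E(0,0) + E(1,1) <= E(0,1) + E(1,0)."""
    return term[0][0] + term[1][1] <= term[0][1] + term[1][0]


def violating_pairs(pairwise_terms):
    """Index pairs (i, j) whose term breaks the inequality of Theorem 3.

    An empty result means the energy is graph-representable; the unary
    terms play no role in the condition.
    """
    return [ij for ij, term in pairwise_terms.items() if not is_regular(term)]


terms = {
    (1, 2): [[0, 1], [1, 0]],  # penalizes disagreement: regular
    (2, 3): [[2, 0], [0, 2]],  # penalizes agreement: not regular
}
print(violating_pairs(terms))
```

Terms that reward equal values pass the test, while terms that reward different values fail it, which matches the intuition that graph cuts handle smoothness-type interactions.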
4.1 Example: pixel-labeling via expansion moves
In this section we show how to apply this theorem to solve the pixel-labeling
problem. In this problem, we are given the set of pixels $\mathcal{P}$ and the set
of labels $\mathcal{L}$. The goal is to find a labeling $l$ (i.e. a mapping from
the set of pixels to the set of labels) which minimizes the energy
$$E(l) = \sum_{p \in \mathcal{P}} D_p(l_p) + \sum_{p,q \in \mathcal{N}} V_{p,q}(l_p, l_q)$$
where $\mathcal{N} \subset \mathcal{P} \times \mathcal{P}$ is a neighborhood system
on pixels. Without loss of generality we can assume that $\mathcal{N}$ contains
only ordered pairs $(p, q)$ for which $p < q$ (since we can combine two terms
$V_{p,q}$ and $V_{q,p}$ into one term). We will show how our method can be used to
derive the expansion move algorithm developed in [8].
This problem is NP-hard if $|\mathcal{L}| > 2$ [8]. [8] gives an approximation
algorithm for minimizing this energy. A single step of this algorithm is an
operation called an $\alpha$-expansion. Suppose that we have some current
configuration $l^0$, and we are considering a label $\alpha \in \mathcal{L}$.
During the $\alpha$-expansion operation a pixel $p$ is allowed either to keep its
old label $l^0_p$ or to switch to a new label $\alpha$: $l_p = l^0_p$ or
$l_p = \alpha$. The key step in the approximation algorithm presented in [8] is to
find the optimal expansion operation, i.e. the one that leads to the largest
reduction in the energy $E$. This step is repeated until there is no choice of
$\alpha$ where the optimal expansion operation reduces the energy.
[8] constructs a graph which contains nodes corresponding to pixels in
$\mathcal{P}$. The following encoding is used: if $f(p) = 0$ (i.e., the node $p$
is in the source set) then $l_p = l^0_p$; if $f(p) = 1$ (i.e., the node $p$ is in
the sink set) then $l_p = \alpha$.
Note that the key technical step in this algorithm can be naturally expressed
as minimizing an energy function involving binary variables. The binary variables
correspond to pixels, and the energy we wish to minimize can be written formally
as
$$E(x_{p_1}, \ldots, x_{p_n}) = \sum_{p \in \mathcal{P}} D_p(l_p(x_p)) + \sum_{p,q \in \mathcal{N}} V_{p,q}(l_p(x_p), l_q(x_q)), \tag{3}$$
where
$$\forall p \in \mathcal{P} \quad l_p(x_p) = \begin{cases} l^0_p, & x_p = 0 \\ \alpha, & x_p = 1. \end{cases}$$
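Combining equation (3) with the inequality of Theorem 3 gives a direct test for a single pairwise term of an expansion move. The Python sketch below builds the 2x2 table of $V_{p,q}(l_p(x_p), l_q(x_q))$ and checks it; the function names, the integer labels, and the two interaction functions `potts` and `squared_difference` are illustrative assumptions and are not taken from the paper.

```python
def expansion_pairwise_term(V, old_p, old_q, alpha):
    """Table e[x_p][x_q] = V(l_p(x_p), l_q(x_q)) for one alpha-expansion,
    where x = 0 keeps the old label and x = 1 switches to alpha."""
    l_p = (old_p, alpha)
    l_q = (old_q, alpha)
    return [[V(l_p[xp], l_q[xq]) for xq in (0, 1)] for xp in (0, 1)]


def is_regular(term):
    """Inequality of Theorem 3 for a 2x2 table."""
    return term[0][0] + term[1][1] <= term[0][1] + term[1][0]


def potts(a, b):
    # Unit penalty for any pair of different labels (a metric).
    return 0 if a == b else 1


def squared_difference(a, b):
    # Violates the triangle inequality, so it is not a metric.
    return (a - b) ** 2


potts_term = expansion_pairwise_term(potts, old_p=1, old_q=2, alpha=3)
quad_term = expansion_pairwise_term(squared_difference, old_p=0, old_q=2, alpha=1)
print(potts_term, is_regular(potts_term))
print(quad_term, is_regular(quad_term))
```

In this example the Potts term passes the test while the squared-difference term fails it for old labels 0 and 2 with $\alpha = 1$, which illustrates how a non-metric $V_{p,q}$ can break the binary energy of an expansion move.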
We can demonstrate the power of our results by deriving an important re-
striction on this algorithm. In order for the graph cut construction of [8] to
work, the function V
p,q
is required to be a metric. In their paper, it is not clear
whether this is an accidental property of the construction (i.e., they leave open

Citations
Journal ArticleDOI
01 Aug 2004
TL;DR: A more powerful, iterative version of the optimisation of the graph-cut approach is developed and the power of the iterative algorithm is used to simplify substantially the user interaction needed for a given quality of result.
Abstract: The problem of efficient, interactive foreground/background segmentation in still images is of great practical importance in image editing. Classical image segmentation tools use either texture (colour) information, e.g. Magic Wand, or edge (contrast) information, e.g. Intelligent Scissors. Recently, an approach based on optimization by graph-cut has been developed which successfully combines both types of information. In this paper we extend the graph-cut approach in three respects. First, we have developed a more powerful, iterative version of the optimisation. Secondly, the power of the iterative algorithm is used to simplify substantially the user interaction needed for a given quality of result. Thirdly, a robust algorithm for "border matting" has been developed to estimate simultaneously the alpha-matte around an object boundary and the colours of foreground pixels. We show that for moderately difficult examples the proposed method outperforms competitive tools.

5,670 citations

Journal ArticleDOI
TL;DR: This paper compares the running times of several standard algorithms, as well as a new algorithm that is recently developed that works several times faster than any of the other methods, making near real-time performance possible.
Abstract: Minimum cut/maximum flow algorithms on graphs have emerged as an increasingly useful tool for exact or approximate energy minimization in low-level vision. The combinatorial optimization literature provides many min-cut/max-flow algorithms with different polynomial time complexity. Their practical efficiency, however, has to date been studied mainly outside the scope of computer vision. The goal of this paper is to provide an experimental comparison of the efficiency of min-cut/max flow algorithms for applications in vision. We compare the running times of several standard algorithms, as well as a new algorithm that we have recently developed. The algorithms we study include both Goldberg-Tarjan style "push-relabel" methods and algorithms based on Ford-Fulkerson style "augmenting paths." We benchmark these algorithms on a number of typical graphs in the contexts of image restoration, stereo, and segmentation. In many cases, our new algorithm works several times faster than any of the other methods, making near real-time performance possible. An implementation of our max-flow/min-cut algorithm is available upon request for research purposes.

4,463 citations


Cites background or methods from "What energy functions can be minimi..."

  • ...The question of what energy functions can be minimized via graph cuts was addressed in [25]....

    [...]

  • ...After [15, 31, 19, 8, 25, 5] minimum cut/maximum flow algorithms on graphs emerged as an increasingly useful tool for exact or approximate energy minimization in low-level vision....

    [...]

  • ...However, the results in [25] apply only to energy functions of binary variables with double and triple cliques....

    [...]

Journal ArticleDOI
TL;DR: The Multimodal Brain Tumor Image Segmentation Benchmark (BRATS) as mentioned in this paper was organized in conjunction with the MICCAI 2012 and 2013 conferences, and twenty state-of-the-art tumor segmentation algorithms were applied to a set of 65 multi-contrast MR scans of low and high grade glioma patients.
Abstract: In this paper we report the set-up and results of the Multimodal Brain Tumor Image Segmentation Benchmark (BRATS) organized in conjunction with the MICCAI 2012 and 2013 conferences. Twenty state-of-the-art tumor segmentation algorithms were applied to a set of 65 multi-contrast MR scans of low- and high-grade glioma patients (manually annotated by up to four raters) and to 65 comparable scans generated using tumor image simulation software. Quantitative evaluations revealed considerable disagreement between the human raters in segmenting various tumor sub-regions (Dice scores in the range 74%–85%), illustrating the difficulty of this task. We found that different algorithms worked best for different sub-regions (reaching performance comparable to human inter-rater variability), but that no single algorithm ranked in the top for all sub-regions simultaneously. Fusing several good algorithms using a hierarchical majority vote yielded segmentations that consistently ranked above all individual algorithms, indicating remaining opportunities for further methodological improvements. The BRATS image data and manual annotations continue to be publicly available through an online evaluation system as an ongoing benchmarking resource.

3,699 citations

Proceedings Article
12 Dec 2011
TL;DR: This paper considers fully connected CRF models defined on the complete set of pixels in an image and proposes a highly efficient approximate inference algorithm in which the pairwise edge potentials are defined by a linear combination of Gaussian kernels.
Abstract: Most state-of-the-art techniques for multi-class image segmentation and labeling use conditional random fields defined over pixels or image regions. While region-level models often feature dense pairwise connectivity, pixel-level models are considerably larger and have only permitted sparse graph structures. In this paper, we consider fully connected CRF models defined on the complete set of pixels in an image. The resulting graphs have billions of edges, making traditional inference algorithms impractical. Our main contribution is a highly efficient approximate inference algorithm for fully connected CRF models in which the pairwise edge potentials are defined by a linear combination of Gaussian kernels. Our experiments demonstrate that dense connectivity at the pixel level substantially improves segmentation and labeling accuracy.

3,233 citations


Cites methods from "What energy functions can be minimi..."

  • ... CRFs on these images [17]. The MCMC procedure was run for 36 hours and only partially converged for the bottom image. We have also experimented with graph cut inference in the fully connected models [11], but it did not converge within 72 hours. In contrast, a single-threaded implementation of our algorithm produces a detailed pixel-level labeling in 0.2 seconds, as shown in Figure 1(e). A quantitati...

    [...]

Book ChapterDOI
03 Sep 2001
TL;DR: The goal of this paper is to provide an experimental comparison of the efficiency of min-cut/max flow algorithms for applications in vision, comparing the running times of several standard algorithms, as well as a new algorithm that is recently developed.
Abstract: After [10, 15, 12, 2, 4] minimum cut/maximum flow algorithms on graphs emerged as an increasingly useful tool for exact or approximate energy minimization in low-level vision. The combinatorial optimization literature provides many min-cut/max-flow algorithms with different polynomial time complexity. Their practical efficiency, however, has to date been studied mainly outside the scope of computer vision. The goal of this paper is to provide an experimental comparison of the efficiency of min-cut/max flow algorithms for energy minimization in vision. We compare the running times of several standard algorithms, as well as a new algorithm that we have recently developed. The algorithms we study include both Goldberg-style "push-relabel" methods and algorithms based on Ford-Fulkerson style augmenting paths. We benchmark these algorithms on a number of typical graphs in the contexts of image restoration, stereo, and interactive segmentation. In many cases our new algorithm works several times faster than any of the other methods making near real-time performance possible.

3,099 citations

References
Journal ArticleDOI
TL;DR: The analogy between images and statistical mechanics systems is made and the analogous operation under the posterior distribution yields the maximum a posteriori (MAP) estimate of the image given the degraded observations, creating a highly parallel ``relaxation'' algorithm for MAP estimation.
Abstract: We make an analogy between images and statistical mechanics systems. Pixel gray levels and the presence and orientation of edges are viewed as states of atoms or molecules in a lattice-like physical system. The assignment of an energy function in the physical system determines its Gibbs distribution. Because of the Gibbs distribution, Markov random field (MRF) equivalence, this assignment also determines an MRF image model. The energy function is a more convenient and natural mechanism for embodying picture attributes than are the local characteristics of the MRF. For a range of degradation mechanisms, including blurring, nonlinear deformations, and multiplicative or additive noise, the posterior distribution is an MRF with a structure akin to the image model. By the analogy, the posterior distribution defines another (imaginary) physical system. Gradual temperature reduction in the physical system isolates low energy states (``annealing''), or what is the same thing, the most probable states under the Gibbs distribution. The analogous operation under the posterior distribution yields the maximum a posteriori (MAP) estimate of the image given the degraded observations. The result is a highly parallel ``relaxation'' algorithm for MAP estimation. We establish convergence properties of the algorithm and we experiment with some simple pictures, for which good restorations are obtained at low signal-to-noise ratios.

18,761 citations


"What energy functions can be minimi..." refers background in this paper

  • ...However, researchers typically have needed to rely on general purpose optimization techniques such as simulated annealing [3], [16], which requires exponential time in theory and is extremely slow in practice....

    [...]

  • ...Energy functions of the form (3) can be justified on Bayesian grounds using the well-known Markov Random Fields (MRF) formulation [16], [31]....

    [...]

Book
01 Jan 1993
TL;DR: In-depth, self-contained treatments of shortest path, maximum flow, and minimum cost flow problems, including descriptions of polynomial-time algorithms for these core models are presented.
Abstract: A comprehensive introduction to network flows that brings together the classic and the contemporary aspects of the field, and provides an integrative view of theory, algorithms, and applications. Presents in-depth, self-contained treatments of shortest path, maximum flow, and minimum cost flow problems, including descriptions of polynomial-time algorithms for these core models. Emphasizes powerful algorithmic strategies and analysis tools such as data scaling, geometric improvement arguments, and potential function arguments. Provides easy-to-understand descriptions of several important data structures, including d-heaps, Fibonacci heaps, and dynamic trees. Devotes a special chapter to conducting empirical testing of algorithms. Features over 150 applications of network flows to a variety of engineering, management, and scientific domains. Contains extensive reference notes and illustrations.

8,496 citations


"What energy functions can be minimi..." refers background in this paper

  • ...Energy functions of the form (3) can be justified on Bayesian grounds using the well-known Markov Random Fields (MRF) formulation [16], [31]....

    [...]

Journal ArticleDOI
TL;DR: This paper has designed a stand-alone, flexible C++ implementation that enables the evaluation of individual components and that can easily be extended to include new algorithms.
Abstract: Stereo matching is one of the most active research areas in computer vision. While a large number of algorithms for stereo correspondence have been developed, relatively little work has been done on characterizing their performance. In this paper, we present a taxonomy of dense, two-frame stereo methods designed to assess the different components and design decisions made in individual stereo algorithms. Using this taxonomy, we compare existing stereo methods and present experiments evaluating the performance of many different variants. In order to establish a common software platform and a collection of data sets for easy evaluation, we have designed a stand-alone, flexible C++ implementation that enables the evaluation of individual components and that can be easily extended to include new algorithms. We have also produced several new multiframe stereo data sets with ground truth, and are making both the code and data sets available on the Web.

7,458 citations

Journal ArticleDOI
TL;DR: This work presents two algorithms based on graph cuts that efficiently find a local minimum with respect to two types of large moves, namely expansion moves and swap moves; both algorithms allow important cases of discontinuity-preserving energies.
Abstract: Many tasks in computer vision involve assigning a label (such as disparity) to every pixel. A common constraint is that the labels should vary smoothly almost everywhere while preserving sharp discontinuities that may exist, e.g., at object boundaries. These tasks are naturally stated in terms of energy minimization. The authors consider a wide class of energies with various smoothness constraints. Global minimization of these energy functions is NP-hard even in the simplest discontinuity-preserving case. Therefore, our focus is on efficient approximation algorithms. We present two algorithms based on graph cuts that efficiently find a local minimum with respect to two types of large moves, namely expansion moves and swap moves. These moves can simultaneously change the labels of arbitrarily large sets of pixels. In contrast, many standard algorithms (including simulated annealing) use small moves where only one pixel changes its label at a time. Our expansion algorithm finds a labeling within a known factor of the global minimum, while our swap algorithm handles more general energy functions. Both of these algorithms allow important cases of discontinuity preserving energies. We experimentally demonstrate the effectiveness of our approach for image restoration, stereo and motion. On real data with ground truth, we achieve 98 percent accuracy.

7,413 citations

Book
01 Jan 1962
TL;DR: In this book, Ford and Fulkerson set the foundation for the study of network flow problems; their techniques spurred the development of powerful computational tools for solving and analyzing network flow models and furthered the understanding of linear programming.
Abstract: In this classic book, first published in 1962, L. R. Ford, Jr., and D. R. Fulkerson set the foundation for the study of network flow problems. The models and algorithms introduced in Flows in Networks are used widely today in the fields of transportation systems, manufacturing, inventory planning, image processing, and Internet traffic. The techniques presented by Ford and Fulkerson spurred the development of powerful computational tools for solving and analyzing network flow models, and also furthered the understanding of linear programming. In addition, the book helped illuminate and unify results in combinatorial mathematics while emphasizing proofs based on computationally efficient construction. Flows in Networks is rich with insights that remain relevant to current research in engineering, management, and other sciences. This landmark work belongs on the bookshelf of every researcher working with networks.

4,341 citations