
Comparison of Graph Cuts with Belief Propagation for Stereo, using Identical
MRF Parameters
Marshall F. Tappen William T. Freeman
Computer Science and Artificial Intelligence Laboratory
Massachusetts Institute of Technology
Cambridge, MA 02139
{mtappen,billf}@ai.mit.edu
Abstract
Recent stereo algorithms have achieved impressive results by modelling the disparity image as a Markov Random Field (MRF). An important component of an MRF-based approach is the inference algorithm used to find the most likely setting of each node in the MRF. Algorithms have been proposed which use Graph Cuts or Belief Propagation for inference. These stereo algorithms differ in both the inference algorithm used and the formulation of the MRF. It is unknown whether to attribute the responsibility for differences in performance to the MRF or the inference algorithm. We address this through controlled experiments by comparing the Belief Propagation algorithm and the Graph Cuts algorithm on the same MRF's, which have been created for calculating stereo disparities. We find that the labellings produced by the two algorithms are comparable. The solutions produced by Graph Cuts have a lower energy than those produced with Belief Propagation, but this does not necessarily lead to increased performance relative to the ground truth.
1. Introduction
Two of the more exciting recent results in computational vision have been the development of fast algorithms for approximate inference in Markov Random Fields (MRF's): Graph Cuts [5] and Belief Propagation [16]. Papers on both Graph Cuts and Belief Propagation have won recent academic recognition [8, 9, 16] and have been applied to a number of problems [6, 7]. In the realm of stereo, the top contenders for the best stereo shape estimation, on the most common comparison data, either use Belief Propagation [11] or Graph Cuts [3, 5]. Both algorithms allow fast, approximate solutions to MRF's, which are powerful tools for modelling vision problems but were, until recently, intractable to solve with reasonable speed. These algorithms may become the basis for new and powerful vision algorithms, so it is important to know how they compare against each other. The stereo problem provides a well-understood test-bed for comparison.
Unfortunately, the competing stereo algorithms use both a different inference algorithm and a different formulation of the MRF. This raises the question of how to understand differences in the systems' performance. Labelling an MRF has been shown to be NP-hard, so both Graph Cuts and Belief Propagation approximate the optimal solution. Should one system's improvement over the other be attributed to its choice of inference algorithm? Alternatively, does most of the improvement belong to the authors' unique formulation of the MRF?
The answer to these questions is important because advancing the field of computer vision and building on these two systems requires understanding what makes these algorithms different and how these differences affect the systems' performance. To answer this question, we present a controlled comparison of the Belief Propagation and Graph Cuts algorithms. The two algorithms are examined on identical MRF's, allowing us to measure the quality of the solutions produced by the two algorithms and to isolate the effects of the inference algorithms on system performance.
In Section 2 we discuss how the MRF model can be used to calculate stereo disparities. Section 3 explains the formulation of the MRF's used in our tests and the implementation of the Belief Propagation and Graph Cuts algorithms. The results of our comparison are presented in Section 4 and discussed in Section 5.
2. MRF Model for Stereo
Given a rectified stereo pair of images, the goal is to find the disparity of each pixel in the reference image. In [10], Scharstein and Szeliski point out that most stereo algorithms perform four basic steps:
1. Matching cost computation
2. Cost (or support) aggregation
3. Disparity optimization
4. Disparity refinement
In this section, we discuss how steps 1, 2, and 3 can be
accomplished by modelling the disparity image as a Markov
Random Field.
2.1. Matching Cost Computation
The true disparity of each pixel in the disparity image is a random variable, denoted x_p for the variable at pixel location p. Each variable can take one of N discrete states, which represent the possible disparities at that point. For each possible disparity value, there is a cost associated with matching the pixel to the corresponding pixel in the other stereo image at that disparity value. Typically, this cost is based on the intensity differences between the two pixels, y_p. This cost is reflected in the compatibility function Φ(x_p, y_p), which relates how compatible a disparity value is with the intensity differences observed in the image. Smaller intensity differences correspond to higher compatibilities, and vice versa.
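As an illustrative sketch (using a plain absolute intensity difference rather than the more robust matching cost used later in this paper), the per-pixel costs over all candidate disparities can be computed as follows; the inputs and the out-of-bounds penalty are made up:

```python
# Illustrative sketch only: per-pixel matching costs for each
# candidate disparity, using a plain absolute intensity difference.
# (The paper itself uses the Birchfield-Tomasi matching cost.)

def matching_costs(left, right, n_disp):
    """left, right: 2-D lists of intensities from a rectified pair.
    Returns cost[y][x][d] = |left[y][x] - right[y][x - d]|, with an
    arbitrary high cost where the match falls outside the image."""
    HIGH = 255
    h, w = len(left), len(left[0])
    cost = [[[HIGH] * n_disp for _ in range(w)] for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for d in range(n_disp):
                if x - d >= 0:
                    cost[y][x][d] = abs(left[y][x] - right[y][x - d])
    return cost

# One-row example: the bright patch sits one pixel to the left in
# the right image, so disparity 1 matches it with zero cost.
left = [[10, 10, 50, 50]]
right = [[10, 50, 50, 10]]
c = matching_costs(left, right, 2)
```

A smaller cost then maps to a higher compatibility Φ, as described above.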
2.2. Support Aggregation
The next step is to aggregate support for the candidate disparities. A standard sum-of-squared-differences algorithm accomplishes this by assuming a constant disparity over a small window surrounding each point and finding the best matching cost [10]. An MRF approach aggregates support by introducing a second compatibility function, Ψ(·), which expresses the compatibility between neighboring variables. Traditionally, only variables adjacent to a particular variable are considered its neighbors. Therefore, every Ψ(·) is of the form Ψ(x_p, x_n), where the location n is adjacent to p. This is known as a pairwise Markov Random Field. Typically, only pairwise Markov Random Fields are used for stereo problems because considering more neighbors quickly makes inference on the field computationally intractable. Although the compatibility functions only consider adjacent variables, each variable is still able to influence every other variable in the field via these pairwise connections.
2.3. Disparity Optimization
With the compatibility functions defined, the joint probability of the MRF can be written as [1]:

P(x_1, x_2, ..., x_N, y_1, y_2, ..., y_N) = ∏_{(i,j)} Ψ(x_i, x_j) ∏_p Φ(x_p, y_p)    (1)

where N is the number of nodes, (i, j) represents a pair of neighboring nodes, x_n is the variable at location n, and y_n is the variable representing the intensity differences. The y variables are observed and therefore fixed during optimization.
The disparity optimization step requires choosing an estimator for x_1 ... x_N. The two most common estimators are the Minimum Mean Squared Error (MMSE) estimator and the Maximum A Posteriori (MAP) estimator. The MMSE estimate of each x_i is the mean of the marginal distribution of x_i. The MAP estimate is the labelling of x_1 ... x_N that maximizes Equation 1. For this comparison, we use the MAP estimator because the Graph Cuts algorithm is designed to compute it. In Section 5.2, we discuss the advantages of using the MMSE estimator.
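The difference between the two estimators can be made concrete with a brute-force toy example; the three-node chain and its compatibilities below are invented for illustration and are not the stereo potentials of this paper:

```python
# Toy sketch: MAP vs. MMSE estimators by brute-force enumeration on
# a 3-node chain MRF with 2 states per node.  Phi and Psi are
# made-up compatibilities, not the stereo potentials of this paper.
from itertools import product

Phi = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]]   # Phi[node][state]
Psi = [[1.0, 0.3], [0.3, 1.0]]               # agreeing neighbours score higher

def joint(x):
    """Unnormalized joint probability, as in Equation 1."""
    p = 1.0
    for n, s in enumerate(x):
        p *= Phi[n][s]
    for a, b in zip(x, x[1:]):
        p *= Psi[a][b]
    return p

labelings = list(product([0, 1], repeat=3))
Z = sum(joint(x) for x in labelings)

# MAP: the single labeling that maximizes the joint probability.
x_map = max(labelings, key=joint)

# MMSE: each node's estimate is the mean of its marginal, so it can
# fall between the discrete disparity levels.
x_mmse = [sum(x[n] * joint(x) for x in labelings) / Z for n in range(3)]
```

Note that the MMSE estimate is generally fractional, which hints at why it behaves differently from the discrete MAP labelling discussed later.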
2.4. Equivalence to Energy Minimization
As posed above, the best disparities are found by maximizing a probability. Taking the negative log of Equation 1, we see that finding the MAP estimate is equivalent to minimizing a function of the form

E(x_1, x_2, ..., x_N, y_1, y_2, ..., y_N) = −∑_{(i,j)} log Ψ(x_i, x_j) − ∑_p log Φ(x_p, y_p)    (2)
In [5], this equation is expressed as

E(x_1, x_2, ..., x_N, y_1, y_2, ..., y_N) = ∑_{(i,j)} V(x_i, x_j) + ∑_p D(x_p, y_p)    (3)
The functions V(·) and D(·) are energy functions. The fact that maximizing the probability in Equation 1 is equivalent to minimizing the energy in Equation 3 is important because it means that the Belief Propagation and Graph Cuts algorithms are attempting to solve the same problem. Once the MRF has been formulated, one algorithm can be substituted for the other in the stereo algorithms.
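The equivalence can be checked numerically on a toy chain MRF: with V = −log Ψ and D = −log Φ, the labelling that maximizes the product form of Equation 1 also minimizes the additive energy of Equation 3 (the compatibilities below are made up for illustration):

```python
# Toy check of the equivalence: with V = -log(Psi) and D = -log(Phi),
# the labeling maximizing the product form (Eq. 1) is exactly the
# labeling minimizing the additive energy (Eq. 3).  Values are made up.
import math
from itertools import product

Phi = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]]
Psi = [[1.0, 0.3], [0.3, 1.0]]

def prob(x):
    p = 1.0
    for n, s in enumerate(x):
        p *= Phi[n][s]
    for a, b in zip(x, x[1:]):
        p *= Psi[a][b]
    return p

def energy(x):
    e = sum(-math.log(Phi[n][s]) for n, s in enumerate(x))
    e += sum(-math.log(Psi[a][b]) for a, b in zip(x, x[1:]))
    return e

labelings = list(product([0, 1], repeat=3))
assert max(labelings, key=prob) == min(labelings, key=energy)
```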
3. MRF Formulation
To determine whether using one algorithm presents a clear advantage over the other for the stereo problem, we compared Graph Cuts and Belief Propagation on identical MRF's. The comparison was made using the stereo framework created by Scharstein and Szeliski to compare a number of different stereo algorithms [10]. This framework can be found at http://www.middlebury.edu/stereo. In order to facilitate further experimentation, our implementation of the Belief Propagation algorithm and modifications to the stereo framework will be available at http://www.ai.mit.edu/~mtappen.
The MRF is defined in terms of energy functions, rather than compatibilities. The energy function D(x_p, y_p), which corresponds to the matching cost computation and to Φ(x_p, y_p), is computed using the Birchfield-Tomasi matching cost [2]. The cost function between nodes, V(x_i, x_j), which determines how support is aggregated and corresponds to Ψ(x_i, x_j), is computed in the same fashion as [12]:

V(x_i, x_j) = { 0          if x_i = x_j
              { ρ_I(ΔI)    otherwise        (4)
This type of energy function is known as a Potts model. The function ρ_I(·) is defined in terms of the image gradient between the pixels i and j, which is denoted as ΔI:

ρ_I(ΔI) = { P × s    if ΔI < T
          { s        otherwise        (5)

where T is a threshold, s is a penalty term for violating the smoothness constraint, and P is a factor that increases the penalty when the gradient has a small magnitude. Note that T, P, and s are constant over the whole image.
To use Belief Propagation, a cost C can be converted into a compatibility by calculating e^(−C). For numerical reasons, the cost is actually converted into a compatibility using e^(−C/D), where D is a constant.
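A sketch of these definitions in code, with arbitrary values for T, P, and s, and the scaling constant named `D_const` here to avoid clashing with the energy function D:

```python
# Sketch of Eqs. (4)-(5) and the cost-to-compatibility conversion.
# T, P, s and the divisor (named D_const here to avoid clashing with
# the energy function D) take arbitrary illustrative values.
import math

T, P, s = 4, 2, 50
D_const = 100

def rho(grad):
    """Smoothness penalty: stiffer (P * s) where the image gradient
    is small, weaker (s) across a likely intensity edge."""
    return P * s if grad < T else s

def V(xi, xj, grad):
    """Potts model: zero cost for equal labels, rho otherwise."""
    return 0 if xi == xj else rho(grad)

def compatibility(cost):
    """Convert a cost into a compatibility for Belief Propagation."""
    return math.exp(-cost / D_const)

assert V(3, 3, grad=1) == 0       # equal labels cost nothing
assert V(3, 4, grad=1) == 100     # flat region: P * s
assert V(3, 4, grad=9) == 50      # across an edge: s
```

Because the exponential is monotone, smaller costs always map to larger compatibilities, so the two views of the model agree.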
3.1. Choice of Belief Propagation Algorithm
To implement the Belief Propagation algorithm, two decisions must be made. First, either the sum-product algorithm or the max-product algorithm must be chosen. The sum-product algorithm computes the marginal distribution of each node, while the max-product algorithm computes the MAP estimate of the whole MRF. More information on these algorithms can be found in [6, 14, 16]. We use the max-product algorithm to find the MAP estimate for comparison with the Graph Cuts algorithm, which also computes the MAP estimate.
The second choice is the message update schedule. At each iteration, each node uses the messages it received in the previous iteration from neighboring nodes to calculate messages to send to those neighbors. If node i is to the right of node j, node i sends a message to j at each iteration of the algorithm. This message contains node i's belief about each possible state of node j, and is computed from the messages that i has received from its own neighbors. The message from i to j, denoted m_right(x_j) because it is the message that j receives from its right, is:

m_right(x_j) = max_{x_i} Ψ(x_i, x_j) Φ(x_i, y_i) m_right(x_i) m_up(x_i) m_down(x_i)    (6)

where m_right(x_i), m_up(x_i), and m_down(x_i) are the messages received by i from the nodes to its right, above, and below.
The message update schedule determines when a message sent to a node will be used by that node to compute messages for the node's neighbors. In a synchronous update schedule, each node first computes the message for each neighbor. Once every node has computed its messages, the messages are delivered and used to compute the next round of messages.
An alternative schedule is to propagate messages in one direction and update each node immediately. For instance, the first node in a row, i, would send a message to the node at its right, i + 1. Node i + 1 would then use this message immediately, along with the messages it had previously received from above and below, to compute a message to node i + 2. Once this has been completed for every row, the same procedure occurs in the up, down, and left directions. We refer to this style of updating as "accelerated" updating.
The advantage of this method is that information is quickly propagated across the field. For a synchronous update schedule on an image with width W, it would take W iterations for information from one side of the image to reach the other. The alternative schedule requires only one iteration for this information to be propagated. This feature of the "up-down-left-right" message-passing schedule causes the Belief Propagation algorithm to converge very quickly.
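As a minimal sketch of this update rule, the following applies max-product message passing to a single row of nodes (a 1-D chain), so only left and right messages exist; `Phi` and `Psi` are made-up compatibilities. On a chain there are no loops, so the per-node argmax of the beliefs recovers the exact MAP labelling, which the final check confirms against brute force:

```python
# Sketch: max-product message passing restricted to a single image
# row (a 1-D chain), with one sweep in each direction.  Phi and Psi
# are made-up compatibilities; on a chain (no loops) the per-node
# argmax of the beliefs recovers the exact MAP labeling.
from itertools import product

Phi = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8], [0.3, 0.7]]
Psi = [[1.0, 0.3], [0.3, 1.0]]
N, S = len(Phi), 2

# Left-to-right sweep: each node uses the message it just received
# immediately, as in the "accelerated" schedule.
m_from_left = [[1.0] * S for _ in range(N)]
for i in range(1, N):
    for xj in range(S):
        m_from_left[i][xj] = max(
            Psi[xi][xj] * Phi[i - 1][xi] * m_from_left[i - 1][xi]
            for xi in range(S))

# Right-to-left sweep (the m_right of Equation 6, restricted to 1-D).
m_from_right = [[1.0] * S for _ in range(N)]
for i in range(N - 2, -1, -1):
    for xj in range(S):
        m_from_right[i][xj] = max(
            Psi[xi][xj] * Phi[i + 1][xi] * m_from_right[i + 1][xi]
            for xi in range(S))

belief = [[Phi[i][x] * m_from_left[i][x] * m_from_right[i][x]
           for x in range(S)] for i in range(N)]
x_bp = [max(range(S), key=lambda x: belief[i][x]) for i in range(N)]

# Brute-force MAP for comparison; on a tree the two must agree.
def joint(x):
    p = 1.0
    for n, st in enumerate(x):
        p *= Phi[n][st]
    for a, b in zip(x, x[1:]):
        p *= Psi[a][b]
    return p

x_map = list(max(product(range(S), repeat=N), key=joint))
assert x_bp == x_map
```

On the full 2-D grid the same sweeps run in all four directions and the graph has loops, so the result is only approximate, as discussed below.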
When the max-product algorithm converges on a graph with loops, it returns an approximate solution for the most likely labelling of the graph. The probability of this solution is guaranteed to be greater than that of all other solutions in a large neighborhood around it [15]. Upper bounds on the difference between the probability of the true MAP solution and the approximate solution returned by Belief Propagation are shown in [13].
3.2. Graph Cuts Algorithm
We used the Graph Cuts algorithm provided in Scharstein and Szeliski's package. In particular, the package implements the "swap" algorithm described in [5]. Like the Belief Propagation algorithm, the Graph Cuts algorithm

(a) Map Image (b) Graph Cuts (c) Synchronous BP (d) Accelerated BP
Figure 1. Results produced by the three algorithms on the map image. The parameters used to
generate this field were s = 50, T = 4, P = 2. Graph Cuts returns the smoothest solution because it
is able to find a lower-energy labelling than the two Belief Propagation algorithms.
Energy of MRF Labelling Returned (×10^3)

Image      Ground-Truth   Graph Cuts   Synchronous Belief Prop   % Energy from Occluded Matching Costs
Map             757            383             442                        61%
Sawtooth       6591           1652            1713                        79%
Tsukuba        1852            663             775                        61%
Venus          5739           1442            1501                        76%

Figure 2. Field energies for the MRF labelled using ground-truth data, compared to the energies for the fields labelled using Graph Cuts and Belief Propagation. Notice that the solutions returned by the algorithms consistently have a much lower energy than the labellings produced from the ground truth, showing a mismatch between the MRF formulation and the ground truth. The final column contains the percentage of each ground-truth solution's energy that comes from matching costs of occluded pixels.
finds a local minimum by making local improvements. The "swap" algorithm makes local improvements by choosing two of the possible states, α and β, then finding those nodes labelled α whose labels should be changed to β, or vice versa, in order to minimize the energy in the field as much as possible. Using the min-cut/max-flow formulation, the optimal swap for the entire graph can be computed.
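The structure of the swap algorithm can be sketched as follows. This toy version finds each optimal swap by exhaustive enumeration rather than by a min-cut/max-flow computation, which is feasible only on tiny problems; the data costs `D`, Potts smoothness `V`, and the chain `edges` are made up:

```python
# Sketch of the alpha-beta swap move.  Here the optimal swap is found
# by exhaustive enumeration instead of a min-cut/max-flow computation,
# so this only illustrates the structure of the algorithm on a toy
# problem (made-up data costs D, Potts smoothness V, 4-node chain).
from itertools import combinations, product

def energy(labels, D, V, edges):
    e = sum(D[p][labels[p]] for p in range(len(labels)))
    e += sum(V(labels[p], labels[q]) for p, q in edges)
    return e

def swap_move(labels, a, b, D, V, edges):
    """Optimal relabeling of the nodes currently labeled a or b."""
    nodes = [p for p, l in enumerate(labels) if l in (a, b)]
    best = list(labels)
    for choice in product((a, b), repeat=len(nodes)):
        trial = list(labels)
        for p, l in zip(nodes, choice):
            trial[p] = l
        if energy(trial, D, V, edges) < energy(best, D, V, edges):
            best = trial
    return best

D = [[0, 5, 5], [4, 1, 5], [5, 1, 4], [5, 5, 0]]   # D[node][label]
V = lambda x, y: 0 if x == y else 2                # Potts smoothness
edges = [(0, 1), (1, 2), (2, 3)]

labels = [0, 0, 0, 0]
for _ in range(3):                 # sweep over all label pairs
    for a, b in combinations(range(3), 2):
        labels = swap_move(labels, a, b, D, V, edges)
```

In the real algorithm, each `swap_move` is solved exactly by a single min-cut, which is what makes large label sets tractable.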
4. Comparing Belief Propagation and Graph Cuts
We compared the Graph Cuts algorithm with the max-product Belief Propagation algorithm, using both synchronous updates and accelerated updates. For each of the four images used in [10], we generated 10 MRF fields by varying the T, s, and P parameters of Equation 5. We then used the Graph Cuts algorithm and the Belief Propagation algorithms to estimate the MAP solution of each field. To compare the algorithms, we collected the three statistics reported in [11], plus an additional statistic:
B_Ō: The percentage of pixels in non-occluded areas of the image with a disparity error greater than 1.
B_T̄: The percentage of pixels in textureless areas of the image with a disparity error greater than 1.
B_D: The percentage of pixels near discontinuities in the image with a disparity error greater than 1.
E: The energy of the solution.
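A statistic of this form can be computed as sketched below, where `mask` is a hypothetical binary map selecting the region (non-occluded, textureless, or near-discontinuity pixels) over which the percentage is taken:

```python
# Sketch: each B-statistic is the percentage of pixels, within a
# given binary mask, whose disparity error exceeds 1.  The mask
# (non-occluded, textureless, or near-discontinuity pixels) is a
# hypothetical input here.
def bad_pixel_pct(estimate, truth, mask):
    considered = bad = 0
    for est_row, gt_row, m_row in zip(estimate, truth, mask):
        for est, gt, m in zip(est_row, gt_row, m_row):
            if m:
                considered += 1
                if abs(est - gt) > 1:
                    bad += 1
    return 100.0 * bad / considered

est   = [[3, 3, 7, 0]]
truth = [[3, 5, 7, 4]]
mask  = [[1, 1, 1, 0]]    # last pixel excluded (e.g. occluded)
pct = bad_pixel_pct(est, truth, mask)   # one of three pixels is bad
```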
4.1. Results for Map Image
The table in Figure 8 summarizes the results of the three algorithms on the map image. The performance in terms of B_Ō, B_T̄, and B_D is nearly identical; neither algorithm has a clear advantage.
However, it is useful to examine the energy of the solution returned by each algorithm. When the error penalty, s, is 20, the energies of the solutions returned by Belief Propagation and Graph Cuts are nearly equal, although Graph Cuts consistently returns a smaller field energy. After s is raised to 50, the difference between the two solutions increases. The reason for this can be seen in Figure 1. The regions on the left side of the plane are smoother in the results returned by Graph Cuts than in those returned by Belief Propagation.
However, this extra smoothness does not translate into
better performance in terms of the ground-truth data. That

(a) Tsukuba Image (b) Graph Cuts (c) Synchronous BP (d) Accelerated BP
Figure 3. Results produced by the three algorithms on the Tsukuba image. The parameters used to
generate this field were s = 50, T = 4, P = 2. Again, Graph Cuts produces a much smoother solution.
Belief Propagation does maintain some structures that are lost in the Graph Cuts solution, such as
the camera and the face in the foreground.
(a) Sawtooth Image (b) Graph Cuts (c) Synchronous BP (d) Accelerated BP
Figure 4. Results produced by the three algorithms on the sawtooth image. The parameters used to
generate this field were s = 50, T = 4, P = 2. For this image, the output of the three algorithms is
comparable.
is because the ground-truth solution actually has a higher energy than either of the solutions returned by Belief Propagation or Graph Cuts. In Figure 2, the energy of the ground-truth solution for each image is shown for a specific setting of the parameters of ρ_I(·). The ground-truth labelling was produced by choosing the disparity level closest to the ground-truth disparity of each point. The energies of the labellings produced by Graph Cuts and Belief Propagation are significantly lower than the energy of the ground-truth labelling. The large energies for the ground-truth solution are caused by inaccurate matching costs in occluded areas. Since occluded pixels have no counterpart in the other image, the pixel at the correct disparity of an occluded pixel will likely have a different intensity, leading to a large matching cost. The significant effect of these matching costs can be observed in the last column of Figure 2. This column lists the percentage of the final energy for each of the solutions shown which can be attributed to matching costs for occluded pixels. These matching costs make up a significant majority of the final costs.
4.2. Results for Tsukuba Image
The table in Figure 8 lists the results of the three algorithms on the Tsukuba image. For this image, Graph Cuts is superior. The primary reason for this superiority appears to be that the Belief Propagation algorithm assigns portions of the background a very small disparity. An example of this can be seen in Figure 3. On the other hand, when the penalty, P, is higher, Belief Propagation does preserve some structures that Graph Cuts does not.
4.3. Results for Sawtooth Image
Figure 4 shows the output of the algorithms on the sawtooth image. In general, the results of the two algorithms on this image were comparable.
4.4. Results for Venus Image
Figure 5 shows a sample of the output of the algorithms on the venus image. Again, the Graph Cuts algorithm seemed to produce smoother results.
