
Comparison of Graph Cuts with Belief Propagation for Stereo, using Identical
MRF Parameters
Marshall F. Tappen William T. Freeman
Computer Science and Artificial Intelligence Laboratory
Massachusetts Institute of Technology
Cambridge, MA 02139
{mtappen,billf}@ai.mit.edu
Abstract
Recent stereo algorithms have achieved impressive results by modelling the disparity image as a Markov Random Field (MRF). An important component of an MRF-based approach is the inference algorithm used to find the most likely setting of each node in the MRF. Algorithms have been proposed which use Graph Cuts or Belief Propagation for inference. These stereo algorithms differ in both the inference algorithm used and the formulation of the MRF. It is unknown whether to attribute the responsibility for differences in performance to the MRF or the inference algorithm. We address this through controlled experiments by comparing the Belief Propagation algorithm and the Graph Cuts algorithm on the same MRF's, which have been created for calculating stereo disparities. We find that the labellings produced by the two algorithms are comparable. The solutions produced by Graph Cuts have a lower energy than those produced with Belief Propagation, but this does not necessarily lead to increased performance relative to the ground truth.
1. Introduction
Two of the more exciting recent results in computational vision have been the development of fast algorithms for approximate inference in Markov Random Fields (MRF's): Graph Cuts [5] and Belief Propagation [16]. Papers on both Graph Cuts and Belief Propagation have won recent academic recognition [8, 9, 16] and have been applied to a number of problems [6, 7]. In the realm of stereo, the top contenders for the best stereo shape estimation, on the most common comparison data, either use Belief Propagation [11] or Graph Cuts [3, 5]. Both algorithms allow fast, approximate solutions to MRF's, which are powerful tools for modelling vision problems but were, until recently, intractable to solve with reasonable speed. These algorithms may become the basis for new and powerful vision algorithms, so it is important to know how they compare against each other. The stereo problem provides a well-understood test-bed for comparison.
Unfortunately, the competing stereo algorithms use both a different inference algorithm and a different formulation of the MRF. This raises the question of how to understand differences in the systems' performance. Labelling an MRF has been shown to be NP-hard, so both Graph Cuts and Belief Propagation approximate the optimal solution. Should one system's improvement over the other be attributed to its choice of inference algorithm? Alternatively, does most of the improvement belong to the authors' unique formulation of the MRF?
The answer to these questions is important because advancing the field of computer vision and building on these two systems requires understanding what makes these algorithms different and how these differences affect the systems' performance. To answer this question, we present a controlled comparison of the Belief Propagation and Graph Cuts algorithms. The two algorithms are examined on identical MRF's, allowing us to measure the quality of the solutions produced by the two algorithms and to isolate the effects of the inference algorithms on system performance.
In Section 2 we discuss how the MRF model can be used to calculate stereo disparities. Section 3 explains the formulation of the MRF's used in our tests and the implementation of the Belief Propagation and Graph Cuts algorithms. The results of our comparison are presented in Section 4 and discussed in Section 5.
2. MRF Model for Stereo
Given a rectified stereo pair of images, the goal is to find the disparity of each pixel in the reference image. In [10], Scharstein and Szeliski point out that most stereo algorithms perform four basic steps:
1. Matching cost computation
2. Cost (or support) aggregation
3. Disparity optimization
4. Disparity refinement
In this section, we discuss how steps 1, 2, and 3 can be
accomplished by modelling the disparity image as a Markov
Random Field.
2.1. Matching Cost Computation
The true disparity of each pixel in the disparity image is a random variable, denoted x_p for the variable at pixel location p. Each variable can take one of N discrete states, which represent the possible disparities at that point. For each possible disparity value, there is a cost associated with matching the pixel to the corresponding pixel in the other stereo image at that disparity value. Typically, this cost is based on the intensity differences between the two pixels, y_p. This cost is reflected in the compatibility function Φ(x_p, y_p), which relates how compatible a disparity value is with the intensity differences observed in the image. Smaller intensity differences correspond to higher compatibilities, and vice versa.
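As an illustrative sketch (using a plain absolute intensity difference rather than the more robust matching cost used later in this paper), the per-pixel costs over all candidate disparities can be computed as follows; the inputs and the out-of-bounds penalty are made up:

```python
# Illustrative sketch only: per-pixel matching costs for each
# candidate disparity, using a plain absolute intensity difference.
# (The paper itself uses the Birchfield-Tomasi matching cost.)

def matching_costs(left, right, n_disp):
    """left, right: 2-D lists of intensities from a rectified pair.
    Returns cost[y][x][d] = |left[y][x] - right[y][x - d]|, with an
    arbitrary high cost where the match falls outside the image."""
    HIGH = 255
    h, w = len(left), len(left[0])
    cost = [[[HIGH] * n_disp for _ in range(w)] for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for d in range(n_disp):
                if x - d >= 0:
                    cost[y][x][d] = abs(left[y][x] - right[y][x - d])
    return cost

# One-row example: the bright patch sits one pixel to the left in
# the right image, so disparity 1 matches it with zero cost.
left = [[10, 10, 50, 50]]
right = [[10, 50, 50, 10]]
c = matching_costs(left, right, 2)
```

A smaller cost then maps to a higher compatibility Φ, as described above.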
2.2. Support Aggregation
The next step is to aggregate support for the candidate disparities. A standard sum-of-squared-differences algorithm accomplishes this by assuming a constant disparity over a small window surrounding each point and finding the best matching cost [10]. An MRF approach aggregates support by introducing a second compatibility function, Ψ(·), which expresses the compatibility between neighboring variables. Traditionally, only variables adjacent to a particular variable are considered its neighbors. Therefore, every Ψ(·) is of the form Ψ(x_p, x_n), where the location n is adjacent to p. This is known as a pairwise Markov Random Field. Typically, only pairwise Markov Random Fields are used for stereo problems because considering more neighbors quickly makes inference on the field computationally intractable. Although the compatibility functions only consider adjacent variables, each variable is still able to influence every other variable in the field via these pairwise connections.
2.3. Disparity Optimization
With the compatibility functions defined, the joint probability of the MRF can be written as [1]:

P(x_1, x_2, ..., x_N, y_1, y_2, ..., y_N) = ∏_{(i,j)} Ψ(x_i, x_j) ∏_p Φ(x_p, y_p)    (1)

where N is the number of nodes, (i, j) represents a pair of neighboring nodes, x_n is the variable at location n, and y_n is the variable representing the intensity differences. The y variables are observed and therefore fixed during optimization.
The disparity optimization step requires choosing an estimator for x_1 ... x_N. The two most common estimators are the Minimum Mean Squared Error (MMSE) estimator and the Maximum A Posteriori (MAP) estimator. The MMSE estimate of each x_i is the mean of the marginal distribution of x_i. The MAP estimate is the labelling of x_1 ... x_N that maximizes Equation 1. For this comparison, we use the MAP estimator because the Graph Cuts algorithm is designed to compute it. In Section 5.2, we discuss the advantages of using the MMSE estimator.
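The difference between the two estimators can be made concrete with a brute-force toy example; the three-node chain and its compatibilities below are invented for illustration and are not the stereo potentials of this paper:

```python
# Toy sketch: MAP vs. MMSE estimators by brute-force enumeration on
# a 3-node chain MRF with 2 states per node.  Phi and Psi are
# made-up compatibilities, not the stereo potentials of this paper.
from itertools import product

Phi = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]]   # Phi[node][state]
Psi = [[1.0, 0.3], [0.3, 1.0]]               # agreeing neighbours score higher

def joint(x):
    """Unnormalized joint probability, as in Equation 1."""
    p = 1.0
    for n, s in enumerate(x):
        p *= Phi[n][s]
    for a, b in zip(x, x[1:]):
        p *= Psi[a][b]
    return p

labelings = list(product([0, 1], repeat=3))
Z = sum(joint(x) for x in labelings)

# MAP: the single labeling that maximizes the joint probability.
x_map = max(labelings, key=joint)

# MMSE: each node's estimate is the mean of its marginal, so it can
# fall between the discrete disparity levels.
x_mmse = [sum(x[n] * joint(x) for x in labelings) / Z for n in range(3)]
```

Note that the MMSE estimate is generally fractional, which hints at why it behaves differently from the discrete MAP labelling discussed later.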
2.4. Equivalence to Energy Minimization
As posed above, the best disparities are found by maximizing a probability. Taking the negative log of Equation 1, we see that finding the MAP estimate is equivalent to minimizing a function of the form

E(x_1, x_2, ..., x_N, y_1, y_2, ..., y_N) = −∑_{(i,j)} log Ψ(x_i, x_j) − ∑_p log Φ(x_p, y_p)    (2)
In [5], this equation is expressed as

E(x_1, x_2, ..., x_N, y_1, y_2, ..., y_N) = ∑_{(i,j)} V(x_i, x_j) + ∑_p D(x_p, y_p)    (3)
The functions V(·) and D(·) are energy functions. The fact that maximizing the probability in Equation 1 is equivalent to minimizing the energy in Equation 3 is important because it means that the Belief Propagation and Graph Cuts algorithms are attempting to solve the same problem. Once the MRF has been formulated, one algorithm can be substituted for the other in the stereo algorithms.
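The equivalence can be checked numerically on a toy chain MRF: with V = −log Ψ and D = −log Φ, the labelling that maximizes the product form of Equation 1 also minimizes the additive energy of Equation 3 (the compatibilities below are made up for illustration):

```python
# Toy check of the equivalence: with V = -log(Psi) and D = -log(Phi),
# the labeling maximizing the product form (Eq. 1) is exactly the
# labeling minimizing the additive energy (Eq. 3).  Values are made up.
import math
from itertools import product

Phi = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]]
Psi = [[1.0, 0.3], [0.3, 1.0]]

def prob(x):
    p = 1.0
    for n, s in enumerate(x):
        p *= Phi[n][s]
    for a, b in zip(x, x[1:]):
        p *= Psi[a][b]
    return p

def energy(x):
    e = sum(-math.log(Phi[n][s]) for n, s in enumerate(x))
    e += sum(-math.log(Psi[a][b]) for a, b in zip(x, x[1:]))
    return e

labelings = list(product([0, 1], repeat=3))
assert max(labelings, key=prob) == min(labelings, key=energy)
```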
3. MRF Formulation
To determine whether using one algorithm presents a clear advantage over the other for the stereo problem, we compared Graph Cuts and Belief Propagation on identical MRF's. The comparison was made using the stereo framework created by Scharstein and Szeliski to compare a number of different stereo algorithms [10]. This framework can be found at http://www.middlebury.edu/stereo. In order to facilitate further experimentation, our implementation of the Belief Propagation algorithm and modifications to the stereo framework will be available at http://www.ai.mit.edu/~mtappen.
The MRF is defined in terms of energy functions, rather than compatibilities. The energy function D(x_p, y_p), which corresponds to the matching cost computation and to Φ(x_p, y_p), is computed using the Birchfield-Tomasi matching cost [2]. The cost function between nodes, V(x_i, x_j), which determines how support is aggregated and corresponds to Ψ(x_i, x_j), is computed in the same fashion as [12]:

V(x_i, x_j) = { 0          if x_i = x_j
              { ρ_I(ΔI)    otherwise        (4)
This type of energy function is known as a Potts model. The function ρ_I(·) is defined in terms of the image gradient between the pixels i and j, which is denoted as ΔI:

ρ_I(ΔI) = { P × s    if ΔI < T
          { s        otherwise        (5)

where T is a threshold, s is a penalty term for violating the smoothness constraint, and P is a factor that increases the penalty when the gradient has a small magnitude. Note that T, P, and s are constant over the whole image.
To use Belief Propagation, a cost C can be converted into a compatibility by calculating e^(−C). For numerical reasons, the cost is actually converted into a compatibility using e^(−C/D), where D is a constant.
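A sketch of these definitions in code, with arbitrary values for T, P, and s, and the scaling constant named `D_const` here to avoid clashing with the energy function D:

```python
# Sketch of Eqs. (4)-(5) and the cost-to-compatibility conversion.
# T, P, s and the divisor (named D_const here to avoid clashing with
# the energy function D) take arbitrary illustrative values.
import math

T, P, s = 4, 2, 50
D_const = 100

def rho(grad):
    """Smoothness penalty: stiffer (P * s) where the image gradient
    is small, weaker (s) across a likely intensity edge."""
    return P * s if grad < T else s

def V(xi, xj, grad):
    """Potts model: zero cost for equal labels, rho otherwise."""
    return 0 if xi == xj else rho(grad)

def compatibility(cost):
    """Convert a cost into a compatibility for Belief Propagation."""
    return math.exp(-cost / D_const)

assert V(3, 3, grad=1) == 0       # equal labels cost nothing
assert V(3, 4, grad=1) == 100     # flat region: P * s
assert V(3, 4, grad=9) == 50      # across an edge: s
```

Because the exponential is monotone, smaller costs always map to larger compatibilities, so the two views of the model agree.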
3.1. Choice of Belief Propagation Algorithm
To implement the Belief Propagation algorithm, two decisions must be made. First, either the sum-product algorithm or the max-product algorithm must be chosen. The sum-product algorithm computes the marginal distribution of each node, while the max-product algorithm computes the MAP estimate of the whole MRF. More information on these algorithms can be found in [6, 14, 16]. We use the max-product algorithm to find the MAP estimate for comparison with the Graph Cuts algorithm, which also computes the MAP estimate.
The second choice is the message update schedule. At each iteration, each node uses the messages it received in the previous iteration from neighboring nodes to calculate messages to send to those neighbors. If node i is to the right of node j, node i sends a message to j at each iteration of the algorithm. This message contains node i's belief about each possible state of node j, and is computed from the messages that i has received from its own neighbors. The message from i to j, denoted m_right(x_j) because it is the message that j receives from its right, is:

m_right(x_j) = max_{x_i} Ψ(x_i, x_j) Φ(x_i, y_i) m_right(x_i) m_up(x_i) m_down(x_i)    (6)

where m_right(x_i), m_up(x_i), and m_down(x_i) are the messages received by i from the nodes to its right, above, and below.
The message update schedule determines when a message sent to a node will be used by that node to compute messages for the node's neighbors. In a synchronous update schedule, each node first computes the message for each neighbor. Once every node has computed its messages, the messages are delivered and used to compute the next round of messages.
An alternative schedule is to propagate messages in one direction and update each node immediately. For instance, the first node in a row, i, would send a message to the node at its right, i + 1. Node i + 1 would then use this message immediately, along with the messages it had previously received from above and below, to compute a message to node i + 2. Once this has been completed for every row, the same procedure occurs in the up, down, and left directions. We refer to this style of updating as "accelerated" updating.
The advantage of this method is that information is quickly propagated across the field. For a synchronous update schedule on an image with width W, it would take W iterations for information from one side of the image to reach the other. The alternative schedule requires only one iteration for this information to be propagated. This feature of the "up-down-left-right" message-passing schedule causes the Belief Propagation algorithm to converge very quickly.
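As a minimal sketch of this update rule, the following applies max-product message passing to a single row of nodes (a 1-D chain), so only left and right messages exist; `Phi` and `Psi` are made-up compatibilities. On a chain there are no loops, so the per-node argmax of the beliefs recovers the exact MAP labelling, which the final check confirms against brute force:

```python
# Sketch: max-product message passing restricted to a single image
# row (a 1-D chain), with one sweep in each direction.  Phi and Psi
# are made-up compatibilities; on a chain (no loops) the per-node
# argmax of the beliefs recovers the exact MAP labeling.
from itertools import product

Phi = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8], [0.3, 0.7]]
Psi = [[1.0, 0.3], [0.3, 1.0]]
N, S = len(Phi), 2

# Left-to-right sweep: each node uses the message it just received
# immediately, as in the "accelerated" schedule.
m_from_left = [[1.0] * S for _ in range(N)]
for i in range(1, N):
    for xj in range(S):
        m_from_left[i][xj] = max(
            Psi[xi][xj] * Phi[i - 1][xi] * m_from_left[i - 1][xi]
            for xi in range(S))

# Right-to-left sweep (the m_right of Equation 6, restricted to 1-D).
m_from_right = [[1.0] * S for _ in range(N)]
for i in range(N - 2, -1, -1):
    for xj in range(S):
        m_from_right[i][xj] = max(
            Psi[xi][xj] * Phi[i + 1][xi] * m_from_right[i + 1][xi]
            for xi in range(S))

belief = [[Phi[i][x] * m_from_left[i][x] * m_from_right[i][x]
           for x in range(S)] for i in range(N)]
x_bp = [max(range(S), key=lambda x: belief[i][x]) for i in range(N)]

# Brute-force MAP for comparison; on a tree the two must agree.
def joint(x):
    p = 1.0
    for n, st in enumerate(x):
        p *= Phi[n][st]
    for a, b in zip(x, x[1:]):
        p *= Psi[a][b]
    return p

x_map = list(max(product(range(S), repeat=N), key=joint))
assert x_bp == x_map
```

On the full 2-D grid the same sweeps run in all four directions and the graph has loops, so the result is only approximate, as discussed below.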
When the max-product algorithm converges on a graph with loops, it returns an approximate solution for the most likely labelling of the graph. The probability of this solution is guaranteed to be greater than that of all other solutions in a large neighborhood around it [15]. Upper bounds on the difference between the probability of the true MAP solution and the approximate solution returned by Belief Propagation are shown in [13].
3.2. Graph Cuts Algorithm
We used the Graph Cuts algorithm provided in Scharstein and Szeliski's package. In particular, the package implements the "swap" algorithm described in [5]. Like the Belief Propagation algorithm, the Graph Cuts algorithm

(a) Map Image (b) Graph Cuts (c) Synchronous BP (d) Accelerated BP
Figure 1. Results produced by the three algorithms on the map image. The parameters used to
generate this field were s = 50, T = 4, P = 2. Graph Cuts returns the smoothest solution because it
is able to find a lower-energy labelling than the two Belief Propagation algorithms.
Energy of MRF Labelling Returned (×10^3)

Image      Ground-Truth   Graph Cuts   Synchronous Belief Prop   % Energy from Occluded Matching Costs
Map             757            383             442                        61%
Sawtooth       6591           1652            1713                        79%
Tsukuba        1852            663             775                        61%
Venus          5739           1442            1501                        76%

Figure 2. Field energies for the MRF labelled using ground-truth data, compared to the energies for the fields labelled using Graph Cuts and Belief Propagation. Notice that the solutions returned by the algorithms consistently have a much lower energy than the labellings produced from the ground truth, showing a mismatch between the MRF formulation and the ground truth. The final column contains the percentage of each ground-truth solution's energy that comes from matching costs of occluded pixels.
finds a local minimum by making local improvements. The "swap" algorithm makes local improvements by choosing two of the possible states, α and β, then finding those nodes labelled α whose labels should be changed to β, or vice versa, in order to minimize the energy in the field as much as possible. Using the min-cut/max-flow formulation, the optimal swap for the entire graph can be computed.
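The structure of the swap algorithm can be sketched as follows. This toy version finds each optimal swap by exhaustive enumeration rather than by a min-cut/max-flow computation, which is feasible only on tiny problems; the data costs `D`, Potts smoothness `V`, and the chain `edges` are made up:

```python
# Sketch of the alpha-beta swap move.  Here the optimal swap is found
# by exhaustive enumeration instead of a min-cut/max-flow computation,
# so this only illustrates the structure of the algorithm on a toy
# problem (made-up data costs D, Potts smoothness V, 4-node chain).
from itertools import combinations, product

def energy(labels, D, V, edges):
    e = sum(D[p][labels[p]] for p in range(len(labels)))
    e += sum(V(labels[p], labels[q]) for p, q in edges)
    return e

def swap_move(labels, a, b, D, V, edges):
    """Optimal relabeling of the nodes currently labeled a or b."""
    nodes = [p for p, l in enumerate(labels) if l in (a, b)]
    best = list(labels)
    for choice in product((a, b), repeat=len(nodes)):
        trial = list(labels)
        for p, l in zip(nodes, choice):
            trial[p] = l
        if energy(trial, D, V, edges) < energy(best, D, V, edges):
            best = trial
    return best

D = [[0, 5, 5], [4, 1, 5], [5, 1, 4], [5, 5, 0]]   # D[node][label]
V = lambda x, y: 0 if x == y else 2                # Potts smoothness
edges = [(0, 1), (1, 2), (2, 3)]

labels = [0, 0, 0, 0]
for _ in range(3):                 # sweep over all label pairs
    for a, b in combinations(range(3), 2):
        labels = swap_move(labels, a, b, D, V, edges)
```

In the real algorithm, each `swap_move` is solved exactly by a single min-cut, which is what makes large label sets tractable.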
4. Comparing Belief Propagation and Graph Cuts
We compared the Graph Cuts algorithm with the max-product Belief Propagation algorithm, using both synchronous updates and accelerated updates. For each of the four images used in [10], we generated 10 MRF fields by varying the T, s, and P parameters of Equation 5. We then used the Graph Cuts algorithm and the Belief Propagation algorithms to estimate the MAP solution of each field. To compare the algorithms, we collected the three statistics reported in [11], plus an additional statistic:
B_Ō: The percentage of pixels in non-occluded areas of the image with a disparity error greater than 1.
B_T̄: The percentage of pixels in textureless areas of the image with a disparity error greater than 1.
B_D: The percentage of pixels near discontinuities in the image with a disparity error greater than 1.
E: The energy of the solution.
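A statistic of this form can be computed as sketched below, where `mask` is a hypothetical binary map selecting the region (non-occluded, textureless, or near-discontinuity pixels) over which the percentage is taken:

```python
# Sketch: each B-statistic is the percentage of pixels, within a
# given binary mask, whose disparity error exceeds 1.  The mask
# (non-occluded, textureless, or near-discontinuity pixels) is a
# hypothetical input here.
def bad_pixel_pct(estimate, truth, mask):
    considered = bad = 0
    for est_row, gt_row, m_row in zip(estimate, truth, mask):
        for est, gt, m in zip(est_row, gt_row, m_row):
            if m:
                considered += 1
                if abs(est - gt) > 1:
                    bad += 1
    return 100.0 * bad / considered

est   = [[3, 3, 7, 0]]
truth = [[3, 5, 7, 4]]
mask  = [[1, 1, 1, 0]]    # last pixel excluded (e.g. occluded)
pct = bad_pixel_pct(est, truth, mask)   # one of three pixels is bad
```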
4.1. Results for Map Image
The table in Figure 8 summarizes the results of the three algorithms on the map image. The performance in terms of B_Ō, B_T̄, and B_D is nearly identical; neither algorithm has a clear advantage.
However, it is useful to examine the energy of the solution returned by each algorithm. When the error penalty, s, is 20, the energies of the solutions returned by Belief Propagation and Graph Cuts are nearly equal, although Graph Cuts consistently returns a smaller field energy. After s is raised to 50, the difference between the two solutions increases. The reason for this can be seen in Figure 1. The regions on the left side of the plane are smoother in the results returned by Graph Cuts than in those returned by Belief Propagation.
However, this extra smoothness does not translate into
better performance in terms of the ground-truth data. That

(a) Tsukuba Image (b) Graph Cuts (c) Synchronous BP (d) Accelerated BP
Figure 3. Results produced by the three algorithms on the Tsukuba image. The parameters used to
generate this field were s = 50, T = 4, P = 2. Again, Graph Cuts produces a much smoother solution.
Belief Propagation does maintain some structures that are lost in the Graph Cuts solution, such as
the camera and the face in the foreground.
(a) Sawtooth Image (b) Graph Cuts (c) Synchronous BP (d) Accelerated BP
Figure 4. Results produced by the three algorithms on the sawtooth image. The parameters used to
generate this field were s = 50, T = 4, P = 2. For this image, the output of the three algorithms is
comparable.
is because the ground-truth solution actually has a higher energy than either of the solutions returned by Belief Propagation or Graph Cuts. In Figure 2, the energy of the ground-truth solution for each image is shown for a specific setting of the parameters of ρ_I(·). The ground-truth labelling was produced by choosing the disparity level closest to the ground-truth disparity of each point. The energies of the labellings produced by Graph Cuts and Belief Propagation are significantly lower than the energy of the ground-truth labelling. The large energies for the ground-truth solution are caused by inaccurate matching costs in occluded areas. Since occluded pixels have no counterpart in the other image, the pixel at the correct disparity of an occluded pixel will likely have a different intensity, leading to a large matching cost. The significant effect of these matching costs can be observed in the last column of Figure 2. This column lists the percentage of the final energy for each of the solutions shown which can be attributed to matching costs for occluded pixels. These matching costs make up a significant majority of the final costs.
4.2. Results for Tsukuba Image
The table in Figure 8 lists the results of the three algorithms on the Tsukuba image. For this image, Graph Cuts is superior. The primary reason for this superiority appears to be that the Belief Propagation algorithm assigns portions of the background a very small disparity. An example of this can be seen in Figure 3. On the other hand, when the penalty, P, is higher, Belief Propagation does preserve some structures that Graph Cuts does not.
4.3. Results for Sawtooth Image
Figure 4 shows the output of the algorithms on the sawtooth image. In general, the results of the two algorithms on this image were comparable.
4.4. Results for Venus Image
Figure 5 shows a sample of the output of the algorithms on the venus image. Again, the Graph Cuts algorithm seemed to produce smoother results.
