Evaluation of Cost Functions for Stereo Matching
Heiko Hirschmüller
Institute of Robotics and Mechatronics Oberpfaffenhofen
German Aerospace Center (DLR)
heiko.hirschmueller@dlr.de
Daniel Scharstein
Middlebury College
Middlebury, VT, USA
schar@middlebury.edu
Abstract
Stereo correspondence methods rely on matching costs for computing the similarity of image locations. In this paper we evaluate the insensitivity of different matching costs with respect to radiometric variations of the input images. We consider both pixel-based and window-based variants and measure their performance in the presence of global intensity changes (e.g., due to gain and exposure differences), local intensity changes (e.g., due to vignetting, non-Lambertian surfaces, and varying lighting), and noise. Using existing stereo datasets with ground-truth disparities as well as six new datasets taken under controlled changes of exposure and lighting, we evaluate the different costs with a local, a semi-global, and a global stereo method.
1. Introduction and Related Work
All stereo correspondence algorithms have a way of measuring the similarity of image locations. Typically, a matching cost is computed at each pixel for all disparities under consideration. The simplest matching costs assume constant intensities at matching image locations, but more robust costs model (explicitly or implicitly) certain radiometric changes and/or noise. Common pixel-based matching costs include absolute differences, squared differences, sampling-insensitive absolute differences [2], or truncated versions, both on gray and color images. Common window-based matching costs include the sum of absolute or squared differences (SAD / SSD), normalized cross-correlation (NCC), and rank and census transforms [23]. Some window-based costs can be implemented efficiently using filters. For example, the rank transform can be computed using a rank filter followed by absolute differences of the filter results. Similarly, there are other filters that try to remove bias or gain changes, e.g., LoG and mean filters.
More complicated similarity measures are possible, including mutual information [7, 9, 11] and approximate segment-wise mutual information as used in the layered stereo approach of Zitnick et al. [24].
Recent stereo surveys [5, 17] and the Middlebury online evaluation [14] compare state-of-the-art stereo methods on test data with complex geometries and varied texture. Other evaluations focus on certain aspects like aggregation methods for real-time matching [21]. However, the insensitivity of matching costs is not evaluated since the stereo test sets are typically pairs of radiometrically very similar images.
The term radiometrically similar means that pixels that correspond to the same scene point have similar or ideally the same values in the images. Radiometric differences can be caused by the camera(s) due to slightly different settings, vignetting, image noise, etc. Further differences may be due to non-Lambertian surfaces, which make the amount of reflected light dependent on the viewing angle. Finally, the strength or positions of the light sources may change when images of a static scene are acquired at different times, as is the case when matching aerial or satellite images. In all cases, methods are required that can handle radiometric differences.
The scope of this paper is the evaluation and comparison of some widely used stereo matching costs on images with several common radiometric differences. The focus is on matching costs that explicitly or implicitly handle radiometric differences. This excludes popular methods like the correlation-based weighting according to proximity and color similarity [22], as this is an aggregation approach that uses the truncated absolute difference as matching cost. Furthermore, only methods that work on a single stereo pair with unknown radiometric distortions and light sources are evaluated, according to the considered applications. This excludes methods that explicitly handle non-Lambertian surfaces by taking at least two stereo images with different illuminations [6] or methods that require calibrated light sources.
2. Matching Costs and Stereo Methods
It is important to distinguish between matching costs and
methods that use these costs. In this paper we compare 6
costs and 3 stereo methods. We consider all possible com-
binations to fully evaluate the insensitivity of each cost.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Minneapolis, Minnesota, USA, June 18-23, 2007.

2.1. Matching Costs
Our first cost function is the commonly used absolute difference, which assumes brightness constancy for corresponding pixels, and which serves as a baseline performance measure of our evaluation. Local stereo methods usually aggregate the sum of absolute differences (SAD) over a window, while global methods use the differences pixel-wise. In both cases we use the sampling-insensitive calculation of Birchfield and Tomasi (BT) [2].
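The sampling-insensitive idea can be illustrated on a single scanline. The following is only a minimal sketch of the BT dissimilarity as it is commonly described, not the implementation used in the evaluation; the function names `bt_half` and `bt_cost`, the list-based scanlines, and the border clamping are assumptions made for illustration.

```python
def bt_half(a, b, x, y):
    """One-sided dissimilarity of a[x] against the interpolated range around b[y]."""
    c = b[y]
    # Linearly interpolated intensities half a pixel to either side
    # (indices are clamped at the scanline borders, an assumption).
    lo = (c + b[max(y - 1, 0)]) / 2.0
    hi = (c + b[min(y + 1, len(b) - 1)]) / 2.0
    i_min, i_max = min(lo, hi, c), max(lo, hi, c)
    # Zero if a[x] lies inside [i_min, i_max], else the distance to that range.
    return max(0.0, a[x] - i_max, i_min - a[x])


def bt_cost(left, right, x, d):
    """Symmetric BT cost for left pixel x at disparity d (right pixel x - d)."""
    y = x - d
    return min(bt_half(left, right, x, y), bt_half(right, left, y, x))


# An intensity ramp sampled with a half-pixel offset between the two scanlines.
left = [10, 10, 50, 90, 90]
right = [10, 30, 70, 90, 90]
print(abs(left[2] - right[2]), bt_cost(left, right, 2, 0))
```

In this example the plain absolute difference at the ramp is large purely because of the sampling offset, whereas the BT cost vanishes because the left intensity lies within the interpolated range of the right scanline.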
Our next three cost functions can be implemented as filters that are applied separately to the input images. The transformed images are then matched using the absolute difference. The first filter is the Laplacian of Gaussian (LoG), which is often used in local methods for removing noise and changes in bias [10, 13]. Here we use a LoG filter with a standard deviation of 1 pixel, which is applied by convolution with a 5×5 kernel. The second filter is the rank filter, which replaces the intensity of a pixel with its rank among all pixels within a certain neighborhood. It was originally proposed [23] for robustness to outliers within the neighborhood, which typically occur near depth discontinuities and lead to blurred object borders. Since the method only depends on the ordering of intensities and not their values, it compensates for all radiometric distortions that preserve this ordering. Here we use a rank filter with a square window of 15×15 pixels centered at the pixel of interest. While there are other rank-based matching methods [1, 16], we chose the rank transform since it can be efficiently implemented as a filter, without changing the stereo method itself. The third filter is a mean filter, which aims to compensate for a change in bias by subtracting the mean intensity of a certain neighborhood. We again use a square window of size 15×15 that is centered at the pixel of interest.
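To make the invariance properties of the rank and mean filters concrete, here is a minimal pure-Python sketch. It assumes images stored as nested lists, windows clipped at the image border, and the rank defined as the number of window pixels darker than the center; the helper names are hypothetical and a radius of 7 corresponds to the 15×15 window.

```python
def window(img, r, c, radius):
    """Pixel values in the square window around (r, c), clipped at the border."""
    rows, cols = len(img), len(img[0])
    return [img[i][j]
            for i in range(max(r - radius, 0), min(r + radius + 1, rows))
            for j in range(max(c - radius, 0), min(c + radius + 1, cols))]


def rank_filter(img, radius=7):
    """Replace each pixel by the number of window pixels with lower intensity."""
    return [[sum(v < img[r][c] for v in window(img, r, c, radius))
             for c in range(len(img[0]))] for r in range(len(img))]


def mean_filter(img, radius=7):
    """Subtract the mean intensity of the window from each pixel."""
    out = []
    for r in range(len(img)):
        row = []
        for c in range(len(img[0])):
            w = window(img, r, c, radius)
            row.append(img[r][c] - sum(w) / len(w))
        out.append(row)
    return out


img = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
darker = [[0.5 * v + 3 for v in row] for row in img]   # gain and bias change
shifted = [[v + 25 for v in row] for row in img]       # bias change only
print(rank_filter(img, 1) == rank_filter(darker, 1))
print(mean_filter(img, 1)[1][1], mean_filter(shifted, 1)[1][1])
```

The rank output is unchanged by any strictly increasing intensity mapping, while the mean-filtered output is unchanged only by a constant offset.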
Our next matching cost is mutual information (MI), a powerful method for handling complex radiometric relationships between two images [20]. The MI of two images is calculated by summing the entropy of the histograms of the overlapping parts of each image and subtracting the entropy of the joint histogram of pixel-wise correspondences. The MI value directly expresses how well images are registered. This follows from the observation that the joint histogram of well-registered images has just a few high peaks in contrast to poorly registered images where the joint histogram is rather flat. Thus, for well-registered images, the entropy of the joint histogram is low, while the entropy of the individual histograms changes little. MI has been used for local [7] and global [11] stereo methods. In the latter case, its calculation is changed by Taylor expansion for getting a pixel-wise matching cost. The costs are stored for each combination of intensities in a cost matrix. This lookup table is required for matching, but can only be created from known correspondences. The solution is an iterative design in which the disparity image of the previous loop serves for creating the cost matrix for matching intensities in the next loop [11]. The process is started with a random disparity image and typically requires only 3 to 4 iterations.
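The entropy-based definition of MI given above can be sketched directly from histograms. This is an illustrative computation on short intensity lists only; it omits the Taylor expansion, the smoothing, and the pixel-wise cost matrix of [11], and the function names are assumptions.

```python
from collections import Counter
from math import log2


def entropy(counts, n):
    """Shannon entropy (in bits) of a histogram with n samples in total."""
    return -sum((k / n) * log2(k / n) for k in counts.values())


def mutual_information(a, b):
    """MI = H(a) + H(b) - H(a, b) for two equally long lists of intensities."""
    n = len(a)
    return (entropy(Counter(a), n) + entropy(Counter(b), n)
            - entropy(Counter(zip(a, b)), n))


a = [0, 0, 1, 1, 2, 2, 3, 3]
inverted = [255 - 60 * v for v in a]      # consistent but non-trivial mapping
unrelated = [0, 1, 0, 1, 0, 1, 0, 1]      # no consistent mapping to a
print(mutual_information(a, inverted), mutual_information(a, unrelated))
```

A globally consistent intensity mapping, even an inverting one, gives a peaked joint histogram and high MI, whereas unrelated intensities flatten the joint histogram and drive MI towards zero.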
In this paper we use the efficient Hierarchical MI (HMI) method of [9], which works as follows. First, both input images are downscaled by a factor of 16 and MI is calculated by matching the stereo images using a random disparity image. The process is iterated a few times before the disparity is upscaled for serving as initial guess for matching at 1/8th of the full resolution. Upscaling and matching is repeated until the full resolution is reached. It should be noted that the disparity image of the lower-resolution level is used only for calculating the matching costs of the higher-resolution level, but not for restricting the disparity range. The hierarchical calculation has a runtime overhead of just 14% if the runtime of the stereo method depends linearly on the number of pixels and disparities [9].
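The 14% figure can be checked with a short calculation. The sketch below assumes that halving the resolution halves the image width, the image height, and the disparity range, so that each coarser level costs $1/2^3$ of the next finer one, and it assumes three matching passes at the coarsest level (the text only says "a few"); both assumptions are for illustration.

```latex
\[
  \underbrace{1}_{\text{full}}
  + \underbrace{\frac{1}{2^3}}_{1/2}
  + \underbrace{\frac{1}{4^3}}_{1/4}
  + \underbrace{\frac{1}{8^3}}_{1/8}
  + \underbrace{3 \cdot \frac{1}{16^3}}_{1/16,\ \text{three passes}}
  = 1 + 0.125 + 0.015625 + 0.001953 + 0.000732
  \approx 1.14
\]
```

Under these assumptions almost all of the overhead comes from the half-resolution level, so the exact number of passes at the coarsest level hardly matters.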
Finally, we also include normalized cross-correlation (NCC) in our evaluation. NCC is a standard method for matching two windows around a pixel of interest. The normalization within the window compensates for differences in gain and bias. NCC is statistically the optimal method for compensating Gaussian noise. However, NCC tends to blur depth discontinuities more than many other matching costs, because outliers lead to high errors within the NCC calculation. MNCC has been introduced as a common variant by Moravec [15]. We selected the standard NCC as MNCC gave slightly inferior results in our experiments. In contrast to all other matching costs we consider here, NCC can only be used with local methods due to its window-based design.
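The gain and bias compensation of NCC follows directly from its normalization, as the following minimal sketch shows. It uses the standard zero-mean NCC formula on flattened windows; the function name and the convention of returning 0 for a textureless window are assumptions.

```python
from math import sqrt


def ncc(w1, w2):
    """Normalized cross-correlation of two equally sized windows (flat lists)."""
    n = len(w1)
    m1, m2 = sum(w1) / n, sum(w2) / n
    num = sum((a - m1) * (b - m2) for a, b in zip(w1, w2))
    den = sqrt(sum((a - m1) ** 2 for a in w1) * sum((b - m2) ** 2 for b in w2))
    # A constant window has zero variance; return 0 by convention (assumption).
    return num / den if den > 0 else 0.0


w = [12, 40, 35, 90, 64, 7, 22, 51, 80]       # a 3x3 window, flattened
gain_bias = [0.4 * v + 30 for v in w]          # same window after gain and bias
print(ncc(w, gain_bias), ncc(w, list(reversed(w))))
```

A matching cost can then be taken as, for example, `1 - ncc(w1, w2)`, which is close to zero for windows that differ only by a gain and bias change.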
In all of the above costs, we only use the image intensity (luminance) and not the color for matching. The reason is that several of the considered costs (e.g., rank and MI) are naturally defined on intensity images, and for fairness we want to compare all costs on the same input data. However, we also found that those costs that easily extend to color only perform marginally better on our data sets. Clearly, future research is needed on robust color matching.
To summarize, we compare six costs: sampling-insensitive absolute differences (BT), three filter-based costs (LoG, Rank, and Mean), hierarchical mutual information (HMI), and normalized cross-correlation (NCC).
2.2. Stereo Algorithms
The performance of a matching cost can depend on the algorithm that uses the cost. We thus consider three different stereo algorithms: a local, correlation-based method (Corr), the semi-global method of [9] (SGM), and a global method using graph cuts [4] (GC). We implemented each of the six matching costs for each stereo method, except for NCC which is only used with the local method.
Figure 1. The left images of the Tsukuba, Venus, Teddy, and Cones stereo pairs.
Our local stereo method (Corr) is a simple window-based approach [10, 13, 17]. After aggregating the matching cost with a square window of 9×9 pixels, the disparity with the lowest aggregated cost is selected (winner-takes-all). This is followed by subpixel interpolation, a left-right consistency check for invalidating occlusions and mismatches, and invalidation of disparity segments smaller than 160 pixels [8]. Invalid disparity areas are filled by propagating neighboring small (i.e., background) disparity values. The reason we perform these post-processing steps, as opposed to comparing the “raw” results, is to reduce the overall errors, which in turn yields improved discrimination between costs.
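The aggregation and winner-takes-all steps can be sketched on a single scanline. This is a deliberately reduced illustration: it assumes a 1D window instead of the 9×9 window, plain absolute differences as the cost, and border clamping, and it leaves out subpixel interpolation, the consistency check, and segment filtering. The function name is hypothetical.

```python
def wta_scanline(left, right, max_disp, radius=1):
    """Winner-takes-all disparities for one scanline using SAD over a 1D window."""
    n = len(left)
    disparities = []
    for x in range(n):
        best_d, best_cost = 0, float("inf")
        for d in range(min(max_disp, x) + 1):
            cost = 0
            for k in range(-radius, radius + 1):
                xl = min(max(x + k, 0), n - 1)        # clamp at the borders
                xr = min(max(x + k - d, 0), n - 1)
                cost += abs(left[xl] - right[xr])
            if cost < best_cost:                       # keep the lowest cost
                best_d, best_cost = d, cost
        disparities.append(best_d)
    return disparities


right = [5, 9, 2, 7, 1, 8, 3, 6, 4, 0]
left = [11, 13] + right[:8]        # the right scanline shifted by 2 pixels
print(wta_scanline(left, right, max_disp=4))
```

Away from the borders the recovered disparity equals the shift between the two scanlines; the border pixels are where post-processing such as the consistency check becomes relevant.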
Our second stereo algorithm is the semi-global matching (SGM) method [9]. We selected it as an approach in between local and global matching. There are other approaches in this category, e.g., dynamic programming (DP), but SGM outperforms DP and yields no streaking artefacts. SGM aims to minimize a global 2D energy function E(D) by solving a large number of 1D minimization problems. Following [9], the actual energy used is
$$E(D) = \sum_{p} \Bigl( C(p, D_p) + \sum_{q \in N_p} P_1 \, T[|D_p - D_q| = 1] + \sum_{q \in N_p} P_2 \, T[|D_p - D_q| > 1] \Bigr) \qquad (1)$$
The first term of (1) calculates the sum of a pixel-wise matching cost $C(p, D_p)$ (as defined in Section 2.1) for all pixels $p$ at their disparities $D_p$. The function $T[\,]$ is defined to return 1 if its argument is true and 0 otherwise. Thus, the second term of the energy function penalizes small disparity differences of neighboring pixels $N_p$ of $p$ with the cost $P_1$. Similarly, the third term penalizes larger disparity steps (i.e., discontinuities) with a higher penalty $P_2$. The value of $P_2$ is adapted to the local intensity gradient by $P_2 = P_2' / |I_{bp} - I_{bq}|$ for the neighboring pixels $p$ and $q$. This results in sharper depth discontinuities as they mostly coincide with intensity variations.
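The three terms of energy (1) can be evaluated directly for a given disparity image. The sketch below is illustrative only: it assumes a 4-connected neighborhood, visits each neighboring pair once from each side (as a sum over all pixels and their neighbors would), and uses a constant second penalty instead of the gradient-adapted one. The function name and data layout are assumptions.

```python
def sgm_energy(cost, disp, p1, p2):
    """Evaluate the energy for disparity image disp.

    cost[r][c][d] is the pixel-wise matching cost C(p, d) and disp[r][c] is D_p.
    A 4-connected neighborhood and a constant P2 are assumed for illustration.
    """
    rows, cols = len(disp), len(disp[0])
    energy = 0.0
    for r in range(rows):
        for c in range(cols):
            energy += cost[r][c][disp[r][c]]                  # data term
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    step = abs(disp[r][c] - disp[rr][cc])
                    if step == 1:
                        energy += p1                          # small step
                    elif step > 1:
                        energy += p2                          # discontinuity
    return energy


zero_cost = [[[0] * 4 for _ in range(3)]]      # 1 x 3 image, 4 disparities
print(sgm_energy(zero_cost, [[0, 1, 3]], p1=1, p2=4))
```

With zero data cost, the example pays the small penalty for the step from 0 to 1 and the large penalty for the step from 1 to 3, each counted from both sides.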
SGM calculates E(D) along 1D paths from 8 directions towards each pixel of interest using dynamic programming. The costs of all paths are summed for each pixel and disparity. The disparity is then determined by winner-takes-all. Subpixel interpolation is performed as well as a left-right consistency check. Disparity segments below the size of 20 pixels are invalidated for getting rid of small patches of outliers. Invalid disparities are again interpolated.
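The dynamic programming along one path can be sketched as follows. The recurrence is the one commonly given for SGM in [9] and is not spelled out in this paper, so treat it as an assumption here; constant penalties are used, and the full method would sum such path costs over all 8 directions before the winner-takes-all step. The function name is hypothetical.

```python
def aggregate_path(costs, p1, p2):
    """Aggregate matching costs along one 1D path.

    costs[i][d] is C(p_i, d) for the i-th pixel on the path. Returns the path
    costs L[i][d], each normalized by the minimum of the previous pixel.
    """
    inf = float("inf")
    prev = list(costs[0])
    out = [prev]
    for c in costs[1:]:
        m = min(prev)
        cur = []
        for d in range(len(c)):
            best = min(prev[d],                                   # same disparity
                       prev[d - 1] + p1 if d > 0 else inf,        # step of -1
                       prev[d + 1] + p1 if d + 1 < len(c) else inf,  # step of +1
                       m + p2)                                    # larger jump
            cur.append(c[d] + best - m)
        out.append(cur)
        prev = cur
    return out


costs = [[0, 5, 5], [5, 0, 5], [5, 5, 0]]
print(aggregate_path(costs, p1=1, p2=3))
```

Subtracting the previous minimum keeps the path costs bounded without changing which disparity is cheapest at each pixel.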
Finally, we use a graph-cuts (GC) stereo algorithm as a representative of a global method [3, 4, 12]. Our implementation is based on the MRF library provided by [19]. We tried to use the same energy function E(D) as for SGM. However, we found that for GC it gives better results to adapt the cost $P_2$ not linearly with the intensity gradient, but rather to double the value of $P_2$ for gradients below a given threshold. Like SGM, GC only approximates the global minimum of E(D), but it utilizes the full 2D connectivity for the smoothness term in contrast to SGM, which optimizes separately along 1D paths. Our GC implementation, unlike Corr and SGM, neither includes subpixel interpolation nor accounts for occlusions.
We manually tuned the smoothness parameters of SGM and GC individually for each cost using images without radiometric differences. After the tuning phase, all parameters were kept constant for all images and experiments. This approach allows us to concentrate on the performance of the matching cost rather than the stereo method.
3. Evaluation
We tested all combinations of all matching costs with the local, semi-global, and global stereo algorithms on images with simulated and real radiometric changes.
3.1. Simulated Radiometric Changes
For our first set of experiments, we use the standard Middlebury stereo data sets Tsukuba, Venus, Teddy, and Cones [17, 18]. Figure 1 shows the left images of each set. All images were carefully taken in a laboratory with the same camera settings and under the same lighting conditions. Therefore radiometric changes are expected to be minimal. We used a disparity range of 16 pixels for Tsukuba, 32 pixels for Venus, and 64 pixels for Teddy and Cones.
The first experiments consist of artificially changing the global brightness linearly (i.e., gain change) and non-linearly (e.g., gamma change). Only the right stereo images were changed, while leaving the left images untouched. Furthermore, we applied a local brightness change that mimics a vignetting effect, i.e., the brightness decreases proportionally with the distance to the image center. This transformation was performed on both stereo images. Finally, we contaminated both stereo images with different levels of Gaussian noise.
After computing disparity images for all transformations and all combinations of matching costs and stereo algorithms, we evaluate the results by counting the number of pixels with disparities that differ by more than 1 from the ground truth. In our statistics we ignore occluded areas because the GC implementation does not consider occlusions (in contrast to Corr and SGM). For the correlation results we also ignore an area of 4 pixels (half of the correlation window) at the image border. Our final error measure is the mean error percentage over all four datasets. Figure 2 plots these errors as a function of the amount of intensity change for each combination of matching cost and stereo method. We now discuss the individual results.
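The error measure described above can be written down in a few lines. This is a minimal sketch assuming disparity images, ground truth, and an occlusion mask stored as nested lists of equal shape; the function name and the mask representation are assumptions, and the border exclusion used for the correlation results is left out.

```python
def error_percentage(disp, truth, occluded, threshold=1.0):
    """Percentage of non-occluded pixels whose disparity error exceeds threshold."""
    bad = total = 0
    for d_row, t_row, o_row in zip(disp, truth, occluded):
        for d, t, occ in zip(d_row, t_row, o_row):
            if occ:
                continue                       # occluded pixels are ignored
            total += 1
            bad += abs(d - t) > threshold      # counts True as 1
    return 100.0 * bad / total if total else 0.0


disp = [[10, 12, 20, 7]]
truth = [[10, 10, 21, 30]]
occluded = [[False, False, False, True]]
print(error_percentage(disp, truth, occluded))
```

The per-dataset percentages would then be averaged over the datasets to obtain the values plotted in Figure 2.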
Figure 2a compares the matching costs when used with correlation on images with decreasing brightness. The errors of BT increase very quickly with decreasing brightness. This can be expected, because the absolute difference is based on the assumption that corresponding pixels have the same values, which is violated. The Mean and LoG filters can compensate some of the differences, but degrade quickly when the scale factor s < 0.5. Both filters are designed for compensating a bias (i.e., constant offset), but not a gain (i.e., scale) change. NCC, HMI and Rank show a quite constant performance, until the errors suddenly increase. Theoretically, all three methods should be able to fully compensate the brightness change. The reason for the increased error is that the transformed images are stored into 8 bits. Thus, there is also an information loss with low values of s.
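The information loss caused by 8-bit storage can be made tangible by counting how many distinct intensity levels survive a scale change. The sketch assumes that scaled values are truncated to integers; the paper does not state how values were rounded, so this is only an illustration, and the function name is hypothetical.

```python
def distinct_levels(scale):
    """Number of distinct 8-bit values left after scaling intensities 0..255."""
    # Truncation to an integer is an assumption about the quantization step.
    return len({int(v * scale) for v in range(256)})


for s in (1.0, 0.5, 0.25):
    print(s, distinct_levels(s))
```

Halving the scale factor roughly halves the number of distinguishable intensities, so neighboring values collapse and even order-based or normalized costs lose information at low s.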
Moving on to the next two plots, one can see that SGM and GC (Figure 2b–c) generally perform better than correlation. The relative performance of the different matching costs remains similar, although for SGM the LoG cost is now slightly better than Rank on the non-transformed images (i.e., for a scale factor of 1). A more important observation is that HMI performs worse than Rank with correlation, but much better with SGM and GC. The likely reason is that Rank also reduces the effect of outliers near depth discontinuities. This is important for a window-based method, but less so for pixel-based methods like SGM and GC. It is interesting that on the non-transformed images, HMI performs better than BT, especially for SGM and GC (Figure 2b–c). One might assume that BT should be best for images without any radiometric differences. However, even though the images have been taken under controlled conditions, some radiometric differences are inherent and surfaces are not Lambertian, and the brightness constancy assumption is still violated. HMI relaxes this assumption and only expects a globally consistent mapping.
The next three plots (Figure 2d–f) show the effect of a gamma change as an example of a non-linear change of brightness. The results are similar to the case of a linear change, although the performance of NCC degrades with increasing gamma changes.
The artificial vignetting effect (Figure 2g–i) gives very similar curves compared to the global brightness changes, except for HMI. The reason for the rather bad performance of HMI is that its cost is explicitly based on the assumption of a complex, but global radiometric transformation. The vignetting effect locally changes the brightness. The filter solutions LoG and Mean also assume global changes, but only inside their rather small windows. Furthermore, Rank only requires an unchanged order, which is maintained. Therefore, the filter solutions and especially Rank are best in case of strong local radiometric variations.
Finally, the results for additive Gaussian noise with varying signal-to-noise ratios (SNR) are shown in the last three plots (Figure 2j–l). Higher SNR numbers mean lower noise. For correlation the different costs perform quite similarly, probably since summing over a fixed window acts like averaging, which reduces the effect of Gaussian noise. The situation is different for SGM and GC, where LoG, Rank, and Mean perform even worse than BT. HMI performs consistently best for SGM and GC on all noise levels.
To summarize the above experiments, Rank appears to be the best matching cost for correlation-based methods. HMI appears to be best for pixel-based matching methods like SGM and GC in the presence of global brightness changes and noise. In the case of local brightness variations such as vignetting, Rank and LoG appear to be better alternatives than HMI.
3.2. Real Exposure and Light Source Changes
As noted in the introduction, existing stereo test datasets are unusually radiometrically “clean” and do not require the robust matching costs necessary for real-world stereo applications (unless, as in the previous section, changes are introduced synthetically). To remedy this situation we have created several new stereo datasets with ground truth using the structured lighting technique of [18], which are available at http://vision.middlebury.edu/stereo/data/. In this paper we use the six datasets shown in Figure 3: Art, Books, Dolls, Laundry, Moebius, and Reindeer. Each dataset consists of 7 rectified views taken from equidistant points along a line, as well as ground-truth disparity maps for viewpoints 2 and 6. In this paper we only consider binocular methods, so we use images 2 and 6 as left and right input images. Also, we downsample the original images to one third of their size, resulting in images of roughly 460×370 pixels with a disparity range of 80 pixels. When creating the datasets, we took each image using three different exposures and under three different configurations of the light sources. We thus have 9 different images from each viewpoint that exhibit significant radiometric differences. Figure 4 shows both exposure and lighting variations of the left image of the Art dataset.
Figure 2. Effect of applying radiometric changes or noise on the Tsukuba, Venus, Teddy, and Cones datasets. The columns correspond to the three stereo methods, while each row examines a different type of intensity change. [Plots not reproduced. Panels: (a) to (c) global scale change, (d) to (f) global gamma change, (g) to (i) vignetting, (j) to (l) adding Gaussian noise, each for Corr, SGM, and GC. Vertical axis: errors in unoccluded areas [%], 0 to 30. Horizontal axes: scale factor s (0 to 1), gamma factor g (0 to 6), scale factor at image border s (0 to 1), and signal-to-noise ratio (SNR) [dB] (10 to 50). Curves: BT, LoG/BT, Rank/BT, Mean/BT, HMI, and, for Corr only, NCC.]
We tested all combinations of matching costs and stereo algorithms over all 3×3 combinations of exposure and light changes. The total matching error is calculated as before as the mean percentage of outliers (disparity error > 1) over all six datasets. The resulting curves are shown in Figure 5. It should be noted that our new images are more challenging than the images used in Section 3.1, due to the increased