Proceedings ArticleDOI

Efficient variants of the ICP algorithm

01 May 2001, pp. 145-152
TL;DR: An implementation is demonstrated that is able to align two range images in a few tens of milliseconds, assuming a good initial guess, and has potential application to real-time 3D model acquisition and model-based tracking.
Abstract: The ICP (Iterative Closest Point) algorithm is widely used for geometric alignment of three-dimensional models when an initial estimate of the relative pose is known. Many variants of ICP have been proposed, affecting all phases of the algorithm from the selection and matching of points to the minimization strategy. We enumerate and classify many of these variants, and evaluate their effect on the speed with which the correct alignment is reached. In order to improve convergence for nearly-flat meshes with small features, such as inscribed surfaces, we introduce a new variant based on uniform sampling of the space of normals. We conclude by proposing a combination of ICP variants optimized for high speed. We demonstrate an implementation that is able to align two range images in a few tens of milliseconds, assuming a good initial guess. This capability has potential application to real-time 3D model acquisition and model-based tracking.

Summary (2 min read)

1 Introduction – Taxonomy of ICP Variants

  • The ICP (originally Iterative Closest Point, though Iterative Corresponding Point is perhaps a better expansion for the abbreviation) algorithm has become the dominant method for aligning three-dimensional models based purely on the geometry, and sometimes color, of the meshes.
  • The authors will look at variants in each of these six categories, and examine their effects on the performance of ICP.
  • Their comparisons suggest a combination of ICP variants that is able to align a pair of meshes in a few tens of milliseconds, significantly faster than most commonly-used ICP systems.
  • Next, they summarize several ICP variants in each of the above six categories, and compare their convergence performance.

2 Comparison Methodology

  • The authors' goal is to compare the convergence characteristics of several ICP variants.
  • In order to limit the scope of the problem, and avoid a combinatorial explosion in the number of possibilities, they adopt the methodology of choosing a baseline combination of variants, and examining performance as individual ICP stages are varied.
  • The baseline matches each selected point to the closest sample in the other mesh that has a normal within 45 degrees of the source normal.
  • In addition, to ensure fair comparisons among variants, the authors make the following assumptions: the number of source points selected is always 2,000.

2.1 Test Scenes

  • The “wave” scene is an easy case for most ICP variants, since it contains relatively smooth coarse-scale geometry.
  • The “incised plane” scene consists of two planes with Gaussian noise and grooves in the shape of an “X.”
  • Though these scenes certainly do not cover all possible classes of scanned objects, they are representative of surfaces encountered in many classes of scanning applications.
  • The motivation for using synthetic data for their comparisons is so that the authors know the correct transform exactly, and can evaluate the performance of ICP algorithms relative to this correct alignment.
  • The authors only present the results of one run for each tested variant.

3.1 Selection of Points

  • The authors begin by examining the effect of the selection of point pairs on the convergence of ICP.
  • In addition to these, the authors introduce a new sampling strategy: choosing points such that the distribution of normals among selected points is as large as possible.
  • Thus, one way to improve the chances that enough constraints are present to determine all the components of the transformation is to bucket the points according to the position of the normals in angular space, then sample as uniformly as possible across the buckets.
  • If the authors use a more “asymmetric” matching algorithm, such as projection or normal shooting (see Section 3.2), they see that sampling from both meshes appears to give slightly better results, especially during the early stages of the iteration when the two meshes are still far apart.

3.2 Matching Points

  • The next stage of ICP that the authors will examine is correspondence finding.
  • The authors will refer to this as “normal shooting.”
  • Since the authors are not analyzing variants that use color, the particular variants they will compare are: closest point, closest compatible point (normals within 45 degrees), normal shooting, normal shooting to a compatible point (normals within 45 degrees), projection, and projection followed by search.
  • The authors first look at performance for the “fractal” scene.
  • The authors see that although the projection algorithm does not offer the best convergence per iteration, each iteration is faster than an iteration of closest-point finding or normal shooting because it is performed in constant time, rather than involving a closest-point search (which, even when accelerated by a k-d tree, takes O(log n) time).

3.3 Weighting of Pairs

  • The authors now examine the effect of assigning different weights to the corresponding point pairs found by the previous two steps.
  • The result for a typical laser range scanner is that the uncertainty is lower, hence higher weight should be assigned, for surfaces tilted away from the range camera.
  • They first look at a version of the “wave” scene.
  • The authors see that even with the addition of extra noise, all of the weighting strategies have similar performance, with the “uncertainty” and “compatibility of normals” options having marginally better performance than the others.
  • The authors must be cautious when interpreting this result, since the uncertainty-based weighting assigns higher weights to points on the model that have normals pointing away from the range scanner.

3.4 Rejecting Pairs

  • Rejection of pairs whose point-to-point distance is larger than some multiple of the standard deviation of distances.
  • Rejection of pairs containing points on mesh boundaries [Turk 94].
  • Since its cost is usually low and in most applications its use has few drawbacks, the authors always recommend using this strategy, and in fact they use it in all the comparisons in this paper.
  • Thus, the authors conclude that outlier rejection, though it may have effects on the accuracy and stability with which the correct alignment is determined, in general does not improve the speed of convergence.

3.5 Error Metric and Minimization

  • The final pieces of the ICP algorithm that the authors will look at are the error metric and the algorithm for minimizing the error metric.
  • For an error metric of this form, there exist closed-form solutions for determining the rigid-body transformation that minimizes the error.
  • The above “point-to-point” metric, taking into account both the distance between points and the difference in colors [Johnson 97b].
  • The above iterative minimization, combined with extrapolation in transform space to accelerate convergence [Besl 92].
  • Here, the point-to-point algorithms are not able to reach the correct solution, since using the point-to-point error metric does not allow the planes to “slide over” each other as easily (the two metrics are written out below).
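
For reference, the two error metrics contrasted in these bullets can be written out as follows; this is the standard formulation, transcribed here for clarity rather than quoted from the paper. R and t are the rigid-body rotation and translation, (p_i, q_i) the corresponding point pairs, and n_i the surface normal at q_i:

    E_{point-to-point} = \sum_i \| R p_i + t - q_i \|^2
    E_{point-to-plane} = \sum_i \left( (R p_i + t - q_i) \cdot n_i \right)^2

The point-to-plane metric penalizes only displacement along the normal, which is why it lets two noisy planes “slide over” each other toward the correct in-plane alignment, while the point-to-point metric resists that motion.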

4 High-Speed Variants

  • The ability to have ICP execute in real time (e.g., at video rates) would permit significant new applications in computer vision and graphics.
  • If it were possible to align those scans as they are generated, the user could be presented with an up-to-date model in real time, making it easy to see and fill “holes” in the model.
  • With these goals in mind, the authors may now construct a high-speed ICP algorithm by combining some of the variants discussed above.
  • Also, because of the potential for overshoot, the authors avoid extrapolation of transforms.
  • Figure 17 shows an example of the algorithm on real-world data: two scanned meshes of an elephant figurine were aligned in approximately 30 ms.

5 Conclusion

  • The authors have classified and compared several ICP variants, focusing on the effect each has on convergence speed.
  • They have introduced a new sampling method that helps convergence for scenes with small, sparse features.
  • Finally, the authors have presented an optimized ICP algorithm that uses a constant-time variant for finding point pairs, resulting in a method that takes only a few tens of milliseconds to align two meshes.
  • In addition, a better analysis of the effects of various kinds of noise and distortion would yield further insights into the best alignment algorithms for real-world, noisy scanned data.




Efficient Variants of the ICP Algorithm
Szymon Rusinkiewicz
Marc Levoy
Stanford University
Abstract
The ICP (Iterative Closest Point) algorithm is widely used for ge-
ometric alignment of three-dimensional models when an initial
estimate of the relative pose is known. Many variants of ICP have
been proposed, affecting all phases of the algorithm from the se-
lection and matching of points to the minimization strategy. We
enumerate and classify many of these variants, and evaluate their
effect on the speed with which the correct alignment is reached.
In order to improve convergence for nearly-flat meshes with small
features, such as inscribed surfaces, we introduce a new variant
based on uniform sampling of the space of normals. We conclude
by proposing a combination of ICP variants optimized for high
speed. We demonstrate an implementation that is able to align
two range images in a few tens of milliseconds, assuming a good
initial guess. This capability has potential application to real-time
3D model acquisition and model-based tracking.
1 Introduction – Taxonomy of ICP Variants
The ICP (originally Iterative Closest Point, though Iterative Corre-
sponding Point is perhaps a better expansion for the abbreviation)
algorithm has become the dominant method for aligning three-
dimensional models based purely on the geometry, and sometimes
color, of the meshes. The algorithm is widely used for registering
the outputs of 3D scanners, which typically only scan an object
from one direction at a time. ICP starts with two meshes and
an initial guess for their relative rigid-body transform, and itera-
tively refines the transform by repeatedly generating pairs of cor-
responding points on the meshes and minimizing an error metric.
Generating the initial alignment may be done by a variety of meth-
ods, such as tracking scanner position, identification and index-
ing of surface features [Faugeras 86, Stein 92], “spin-image” sur-
face signatures [Johnson 97a], computing principal axes of scans
[Dorai 97], exhaustive search for corresponding points [Chen 98,
Chen 99], or user input. In this paper, we assume that a rough ini-
tial alignment is always available. In addition, we focus only on
aligning a single pair of meshes, and do not address the global reg-
istration problem [Bergevin 96, Stoddart 96, Pulli 97, Pulli 99].
Since the introduction of ICP by Chen and Medioni [Chen 91]
and Besl and McKay [Besl 92], many variants have been intro-
duced on the basic ICP concept. We may classify these variants
as affecting one of six stages of the algorithm:
1. Selection of some set of points in one or both meshes.
2. Matching these points to samples in the other mesh.
3. Weighting the corresponding pairs appropriately.
4. Rejecting certain pairs based on looking at each pair indi-
vidually or considering the entire set of pairs.
5. Assigning an error metric based on the point pairs.
6. Minimizing the error metric.
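Read as code, the six stages slot into one iteration of a select-match-minimize loop. The following C++ outline is our own structural sketch, not the authors' implementation; every stage function here is a hypothetical placeholder, declared only to show where each stage fits:

    #include <vector>

    struct Vec3       { float x, y, z; };
    struct RigidXform { float R[3][3]; Vec3 t; };          // rotation + translation
    struct Mesh;                                           // opaque placeholder

    struct PointPair { Vec3 p, q, nq; float weight; };     // source point, match, normal

    // Hypothetical stage functions, declared only to show the structure:
    std::vector<Vec3> selectPoints(const Mesh& src);                   // stage 1
    bool  matchPoint(const Vec3& p, const Mesh& dst, PointPair* out);  // stage 2
    float weightPair(const PointPair& pr);                             // stage 3
    void  rejectPairs(std::vector<PointPair>* pairs);                  // stage 4
    RigidXform minimizeMetric(const std::vector<PointPair>& pairs,     // stages 5 and 6
                              const RigidXform& init);

    // One iteration of the classic select-match-minimize loop.
    RigidXform icpIteration(const Mesh& src, const Mesh& dst, const RigidXform& cur) {
        std::vector<PointPair> pairs;
        for (const Vec3& p : selectPoints(src)) {   // 1. selection
            PointPair pr;
            if (!matchPoint(p, dst, &pr)) continue; // 2. matching
            pr.weight = weightPair(pr);             // 3. weighting
            pairs.push_back(pr);
        }
        rejectPairs(&pairs);                        // 4. rejection
        return minimizeMetric(pairs, cur);          // 5.-6. error metric + minimization
    }
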
In this paper, we will look at variants in each of these six cat-
egories, and examine their effects on the performance of ICP. Al-
though our main focus is on the speed of convergence, we also
consider the accuracy of the final answer and the ability of ICP to
reach the correct solution given “difficult” geometry. Our compar-
isons suggest a combination of ICP variants that is able to align a
pair of meshes in a few tens of milliseconds, significantly faster
than most commonly-used ICP systems. The availability of such
a real-time ICP algorithm may enable significant new applications
in model-based tracking and 3D scanning.
In this paper, we first present the methodology used for com-
paring ICP variants, and introduce a number of test scenes used
throughout the paper. Next, we summarize several ICP variants in
each of the above six categories, and compare their convergence
performance. As part of the comparison, we introduce the con-
cept of normal-space-directed sampling, and show that it improves
convergence in scenes involving sparse, small-scale surface fea-
tures. Finally, we examine a combination of variants optimized
for high speed.
2 Comparison Methodology
Our goal is to compare the convergence characteristics of several
ICP variants. In order to limit the scope of the problem, and avoid
a combinatorial explosion in the number of possibilities, we adopt
the methodology of choosing a baseline combination of variants,
and examining performance as individual ICP stages are varied.
The algorithm we will select as our baseline is essentially that of
[Pulli 99], incorporating the following features:
• Random sampling of points on both meshes.
• Matching each selected point to the closest sample in the other mesh that has a normal within 45 degrees of the source normal.
• Uniform (constant) weighting of point pairs.
• Rejection of pairs containing edge vertices, as well as a percentage of pairs with the largest point-to-point distances.
• Point-to-plane error metric.
• The classic “select-match-minimize” iteration, rather than some other search for the alignment transform.
We pick this algorithm because it has received extensive use in
a production environment [Levoy 00], and has been found to be
robust for scanned data containing many kinds of surface features.
In addition, to ensure fair comparisons among variants, we
make the following assumptions:
• The number of source points selected is always 2,000. Since the meshes we will consider have 100,000 samples, this corresponds to a sampling rate of 1% per mesh if source points are selected from both meshes, or 2% if points are selected from only one mesh.
• All meshes we use are simple perspective range images, as opposed to general irregular meshes, since this enables comparisons between “closest point” and “projected point” variants (see Section 3.2).
• Surface normals are computed simply based on the four nearest neighbors in the range grid (a code sketch follows below).

(a) Wave (b) Fractal landscape (c) Incised plane
Figure 1: Test scenes used throughout this paper.
• Only geometry is used for alignment, not color or intensity.
With the exception of the last one, we expect that changing any
of these implementation choices would affect the quantitative, but
not the qualitative, performance of our tests. Although we will
not compare variants that use color or intensity, it is clearly ad-
vantageous to use such data when available, since it can provide
necessary constraints in areas where there are few geometric fea-
tures.
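
As an aside, the four-nearest-neighbor normal computation mentioned in the assumptions above amounts to a cross product of central differences on the range grid. A minimal sketch (our own code, handling interior cells only and ignoring dropouts):

    #include <cmath>

    struct Vec3 { float x, y, z; };

    static Vec3 sub(const Vec3& a, const Vec3& b) {
        return {a.x - b.x, a.y - b.y, a.z - b.z};
    }
    static Vec3 cross(const Vec3& a, const Vec3& b) {
        return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
    }
    static Vec3 normalize(const Vec3& v) {
        float len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
        return {v.x / len, v.y / len, v.z / len};
    }

    // Normal at grid cell (i, j) from its four neighbors in the range image.
    // 'grid' is row-major with 'w' columns; interior cells only.
    Vec3 gridNormal(const Vec3* grid, int w, int i, int j) {
        Vec3 du = sub(grid[i * w + (j + 1)], grid[i * w + (j - 1)]); // left-right
        Vec3 dv = sub(grid[(i + 1) * w + j], grid[(i - 1) * w + j]); // up-down
        return normalize(cross(du, dv));
    }
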
2.1 Test Scenes
We use three synthetically-generated scenes to evaluate variants.
The “wave” scene (Figure 1a) is an easy case for most ICP vari-
ants, since it contains relatively smooth coarse-scale geometry.
The two meshes have independently-added Gaussian noise, out-
liers, and dropouts. The “fractal landscape” test scene (Figure 1b)
has features at all levels of detail. The “incised plane” scene (Fig-
ure 1c) consists of two planes with Gaussian noise and grooves
in the shape of an “X.” This is a difficult scene for ICP, and
most variants do not converge to the correct alignment, even given
the small relative rotation in this starting position. Note that the
three test scenes consist of low-frequency, all-frequency, and high-
frequency features, respectively. Though these scenes certainly
do not cover all possible classes of scanned objects, they are
representative of surfaces encountered in many classes of scan-
ning applications. For example, the Digital Michelangelo Project
[Levoy 00] involved scanning surfaces containing low-frequency
features (e.g., smooth statues), fractal-like features (e.g., unfin-
ished statues with visible chisel marks), and incisions (e.g., frag-
ments of the Forma Urbis Romæ).
The motivation for using synthetic data for our comparisons is
so that we know the correct transform exactly, and can evaluate
the performance of ICP algorithms relative to this correct align-
ment. The metric we will use throughout this paper is root-mean-
square point-to-point distance for the actual corresponding points
in the two meshes. Using such a “ground truth” error metric al-
lows for more objective comparisons of the performance of ICP
variants than using the error metrics computed by the algorithms
themselves.
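
Concretely, since the scenes are synthetic, each source point's true counterpart is known in advance, and the reported error is just the RMS distance over those true pairs. A minimal sketch under our own naming:

    #include <cmath>
    #include <cstddef>
    #include <vector>

    struct Vec3 { float x, y, z; };

    // RMS point-to-point distance over known ground-truth correspondences:
    // a[i] (with the current alignment applied) corresponds to b[i] by construction.
    float groundTruthRms(const std::vector<Vec3>& a, const std::vector<Vec3>& b) {
        double sum = 0.0;
        for (std::size_t i = 0; i < a.size(); ++i) {
            double dx = a[i].x - b[i].x;
            double dy = a[i].y - b[i].y;
            double dz = a[i].z - b[i].z;
            sum += dx * dx + dy * dy + dz * dz;
        }
        return float(std::sqrt(sum / double(a.size())));
    }
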
We only present the results of one run for each tested variant.
Although a single run clearly can not be taken as representing
the performance of an algorithm in all situations, we have tried
to show typical results that capture the significant differences in
performance on various kinds of scenes. Any cases in which the
presented results are not typical are noted in the text.
All reported running times are for a C++ implementation run-
ning on a 550 MHz Pentium III Xeon processor.
3 Comparisons of ICP Variants
We now examine ICP variants for each of the stages listed in Sec-
tion 1. For each stage, we summarize the variants in the literature
and compare their performance on our test scenes.
3.1 Selection of Points
We begin by examining the effect of the selection of point pairs
on the convergence of ICP. The following strategies have been
proposed:
• Always using all available points [Besl 92].
• Uniform subsampling of the available points [Turk 94].
• Random sampling (with a different sample of points at each iteration) [Masuda 96].
• Selection of points with high intensity gradient, in variants that use per-sample color or intensity to aid in alignment [Weik 97].
Each of the preceding schemes may select points on only one
mesh, or select source points from both meshes [Godin 94].
In addition to these, we introduce a new sampling strategy:
choosing points such that the distribution of normals among se-
lected points is as large as possible. The motivation for this strat-
egy is the observation that for certain kinds of scenes (such as
our “incised plane” data set) small features of the model are vi-
tal to determining the correct alignment. A strategy such as ran-
dom sampling will often select only a few samples in these fea-
tures, which leads to an inability to determine certain compo-
nents of the correct rigid-body transformation. Thus, one way
to improve the chances that enough constraints are present to
determine all the components of the transformation is to bucket
the points according to the position of the normals in angular
space, then sample as uniformly as possible across the buckets.
Normal-space sampling is therefore a very simple example of
using surface features for alignment; it has lower computational
cost, but lower robustness, than traditional feature-based methods
[Faugeras 86, Stein 92, Johnson 97a].
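One simple way to realize this bucketing, sketched below under our own naming (not the paper's code), is to quantize each unit normal's spherical angles into a fixed grid of bins and then draw points round-robin across the bins. Note that equal-angle bins are not equal-area on the sphere; a more careful implementation would use an area-uniform subdivision:

    #include <algorithm>
    #include <cmath>
    #include <cstddef>
    #include <vector>

    struct Vec3 { float x, y, z; };                  // assumed unit-length normals

    static const float kPi = 3.14159265358979f;

    // Bucket index from the normal's spherical angles (B x B grid).
    static int bucketOf(const Vec3& n, int B) {
        float theta = std::acos(std::max(-1.0f, std::min(1.0f, n.z)));  // [0, pi]
        float phi   = std::atan2(n.y, n.x) + kPi;                       // [0, 2 pi]
        int bt = std::min(B - 1, int(theta / kPi * B));
        int bp = std::min(B - 1, int(phi / (2.0f * kPi) * B));
        return bt * B + bp;
    }

    // Select up to 'count' point indices, as uniformly as possible across
    // normal-space buckets (round-robin over the buckets).
    std::vector<std::size_t> normalSpaceSample(const std::vector<Vec3>& normals,
                                               std::size_t count, int B = 8) {
        std::vector<std::vector<std::size_t>> buckets(B * B);
        for (std::size_t i = 0; i < normals.size(); ++i)
            buckets[bucketOf(normals[i], B)].push_back(i);

        std::vector<std::size_t> out;
        for (std::size_t round = 0; out.size() < count; ++round) {
            bool tookAny = false;
            for (const auto& b : buckets) {
                if (round < b.size()) {
                    out.push_back(b[round]);
                    tookAny = true;
                    if (out.size() == count) break;
                }
            }
            if (!tookAny) break;                     // every bucket exhausted
        }
        return out;
    }
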
Let us compare the performance of uniform subsampling, ran-
dom sampling, and normal-space sampling on the “wave” scene
(Figure 2). As we can see, the convergence performance is sim-
ilar. This indicates that for a scene with a good distribution of
normals the exact sampling strategy is not critical. The results for
the “incised plane” scene look different, however (Figure 3). Only
the normal-space sampling is able to converge for this data set.
The reason is that samples not in the grooves are only help-
ful in determining three of the six components of the rigid-body
transformation (one translation and two rotations). The other three
components (two translations and one rotation, within the plane)

Figure 2: Comparison of convergence rates for uniform, random, and normal-space sampling for the “wave” meshes. (Plot: RMS alignment error vs. iteration.)
Figure 3: Comparison of convergence rates for uniform, random, and normal-space sampling for the “incised plane” meshes. (Plot: RMS alignment error vs. iteration.) Note that, on the lower curve, the ground truth error increases briefly in the early iterations. This illustrates the difference between the ground truth error and the algorithm’s estimate of its own error.
Figure 4: Corresponding point pairs selected by the (a) “random sampling” and (b) “normal-space sampling” strategies for an incised mesh. Using random sampling, the sparse features may be overwhelmed by presence of noise or distortion, causing the ICP algorithm to not converge to a correct alignment (c). The normal-space sampling strategy ensures that enough samples are placed in the feature to bring the surfaces into alignment (d). “Closest compatible point” matching (see Section 3.2) was used for this example. The meshes in (c) and (d) are scans of fragment 165d of the Forma Urbis Romæ.
Figure 5: Comparison of convergence rates for single-source-mesh and both-source-mesh sampling strategies for the “wave” meshes. (Plot: RMS alignment error vs. iteration.)
Figure 6: Comparison of convergence rates for single-source-mesh and both-source-mesh sampling strategies for the “wave” meshes, using normal shooting as the matching algorithm. (Plot: RMS alignment error vs. iteration.)
are determined entirely by samples within the incisions. The ran-
dom and uniform sampling strategies only place a few samples in
the grooves (Figure 4a). This, together with the fact that noise and
distortion on the rest of the plane overwhelms the effect of those
pairs that are sampled from the grooves, accounts for the inability
of uniform and random sampling to converge to the correct align-
ment. Conversely, normal-space sampling selects a larger number
of samples in the grooves (Figure 4b).
Sampling Direction: We now look at the relative advantages of
choosing source points from both meshes, versus choosing points
from only one mesh. For the “wave” test scene and the base-
line algorithm, the difference is minimal (Figure 5). However,
this is partly due to the fact that we used the closest compatible
point matching algorithm (see Section 3.2), which is symmetric
with respect to the two meshes. If we use a more “asymmetric”
matching algorithm, such as projection or normal shooting (see
Section 3.2), we see that sampling from both meshes appears to
give slightly better results (Figure 6), especially during the early
stages of the iteration when the two meshes are still far apart. In
addition, we expect that sampling from both meshes would also
improve results when the overlap of the meshes is small, or when
the meshes contain many holes.
3.2 Matching Points
The next stage of ICP that we will examine is correspondence
finding. Algorithms have been proposed that, for each sample
point selected:
• Find the closest point in the other mesh [Besl 92]. This computation may be accelerated using a k-d tree and/or closest-point caching [Simon 96].
• Find the intersection of the ray originating at the source point in the direction of the source point’s normal with the destination surface [Chen 91]. We will refer to this as “normal shooting.”
• Project the source point onto the destination mesh, from the point of view of the destination mesh’s range camera [Blais 95, Neugebauer 97]. This has also been called “reverse calibration.”
• Project the source point onto the destination mesh, then perform a search in the destination range image. The search might use a metric based on point-to-point distance [Benjemaa 97], point-to-ray distance [Dorai 98], or compatibility of intensity [Weik 97] or color [Pulli 97].
• Any of the above methods, restricted to only matching points compatible with the source point according to a given metric. Compatibility metrics based on color [Godin 94] and angle between normals [Pulli 99] have been explored.
Since we are not analyzing variants that use color, the particu-
lar variants we will compare are: closest point, closest compat-
ible point (normals within 45 degrees), normal shooting, normal
shooting to a compatible point (normals within 45 degrees), pro-
jection, and projection followed by search. The first four of these
algorithms are accelerated using a k-d tree. For the last algorithm,
the search is actually implemented as a steepest-descent neighbor-
to-neighbor walk in the destination mesh that attempts to find the
closest point. We chose this variation because it works nearly as
well as projection followed by exhaustive search in some window,
but has lower running time.
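
To make the constant-time nature of the projection variant concrete, here is a minimal sketch of “reverse calibration” for a simple pinhole range camera. The camera model and data layout are our own simplifying assumptions, not the paper's implementation:

    #include <cmath>

    struct Vec3 { float x, y, z; };

    // Pinhole model of the destination range camera (assumed parameterization).
    struct RangeCamera {
        float fx, fy, cx, cy;   // focal lengths and principal point, in pixels
        int   w, h;             // range image dimensions
    };

    // Project a source point (already transformed into the destination camera's
    // coordinate frame) and read off the range sample stored at that pixel.
    // 'grid' is the destination range image, row-major; dropouts marked z == 0.
    bool projectMatch(const Vec3& p, const RangeCamera& cam, const Vec3* grid,
                      Vec3* match) {
        if (p.z <= 0) return false;                         // behind the camera
        int u = int(std::floor(cam.fx * p.x / p.z + cam.cx));
        int v = int(std::floor(cam.fy * p.y / p.z + cam.cy));
        if (u < 0 || u >= cam.w || v < 0 || v >= cam.h) return false;
        Vec3 q = grid[v * cam.w + u];
        if (q.z == 0) return false;                         // hole in the range image
        *match = q;                                         // constant-time correspondence
        return true;
    }
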
We first look at performance for the “fractal” scene (Figure 7).
For this scene, normal shooting appears to produce the best re-
sults, followed by the projection algorithms. The closest-point
algorithms, in contrast, perform relatively poorly. We hypothesize
that the reason for this is that the closest-point algorithms are more
sensitive to noise and tend to generate larger numbers of incorrect
pairings than the other algorithms (Figure 8).
The situation in the “incised plane” scene, however, is different
(Figure 9). Here, the closest-point algorithms were the only ones
that converged to the correct solution. Thus, we conclude that
although the closest-point algorithms might not have the fastest
convergence rate for “easy” scenes, they are the most robust for
“difficult” geometry.
Though so far we have been looking at error as a function of the
number of iterations, it is also instructive to look at error as a func-
tion of running time. Because the matching stage of ICP is usually
the one that takes the longest, applications that require ICP to run
quickly (and that do not need to deal with the geometrically “dif-
ficult” cases) must choose the matching algorithm with the fastest
performance. Let us therefore compare error as a function of time
for these algorithms for the “fractal” scene (Figure 10). We see
that although the projection algorithm does not offer the best con-
vergence per iteration, each iteration is faster than an iteration of
closest point finding or normal shooting because it is performed in
constant time, rather than involving a closest-point search (which,
even when accelerated by a k-d tree, takes O(log n) time). As a re-
sult, the projection-based algorithm has a significantly faster rate
of convergence vs. time. Note that this graph does not include the
time to compute the k-d trees used by all but the projection algo-
rithms. Including the precomputation time (approximately 0.64
seconds for these meshes) would produce even more favorable re-
sults for the projection algorithm.
Figure 7: Comparison of convergence rates for the “fractal” meshes, for a variety of matching algorithms. (Plot: RMS alignment error vs. iteration; curves for closest point, closest compatible point, normal shoot, normal shoot compatible, project, and project-and-walk.)
Figure 8: (a) In the presence of noise and outliers, the closest-point matching algorithm potentially generates large numbers of incorrect pairings when the meshes are still relatively far from each other, slowing the rate of convergence. (b) The “projection” matching strategy is less sensitive to the presence of noise.
Figure 9: Comparison of convergence rates for the “incised plane” meshes, for a variety of matching algorithms. Normal-space-directed sampling was used for these measurements. (Plot: RMS alignment error vs. iteration.)
Figure 10: Comparison of convergence rate vs. time for the “fractal” meshes, for a variety of matching algorithms (cf. Figure 7). Note that these times do not include precomputation (in particular, computing the k-d trees used by the first four algorithms takes 0.64 seconds). (Plot: RMS alignment error vs. time in seconds.)

Figure 11: Comparison of convergence rates for the “wave” meshes, for several choices of weighting functions (constant, linear with distance, uncertainty, compatibility of normals). In order to increase the differences among the variants we have doubled the amount of noise and outliers in the mesh. (Plot: RMS alignment error vs. iteration.)
Figure 12: Comparison of convergence rates for the “incised plane” meshes, for several choices of weighting functions. Normal-space-directed sampling was used for these measurements. (Plot: RMS alignment error vs. iteration.)
3.3 Weighting of Pairs
We now examine the effect of assigning different weights to the
corresponding point pairs found by the previous two steps. We
consider four different algorithms for assigning these weights:
• Constant weight.
• Assigning lower weights to pairs with greater point-to-point distances. This is similar in intent to dropping pairs with point-to-point distance greater than a threshold (see Section 3.4), but avoids the discontinuity of the latter approach. Following [Godin 94], we use

    Weight = 1 − Dist(p1, p2) / Dist_max

• Weighting based on compatibility of normals:

    Weight = n1 · n2

Weighting on compatibility of colors has also been used [Godin 94], though we do not consider it here.
• Weighting based on the expected effect of scanner noise on the uncertainty in the error metric. For the point-to-plane error metric (see Section 3.5), this depends on both uncertainty in the position of range points and uncertainty in surface normals. As shown in the Appendix, the result for a typical laser range scanner is that the uncertainty is lower, hence higher weight should be assigned, for surfaces tilted away from the range camera.
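
In code, the first three weighting options above reduce to one-liners; the uncertainty-based weight depends on the scanner model derived in the paper's Appendix and is omitted here. A sketch under our own naming:

    #include <algorithm>

    struct Vec3 { float x, y, z; };

    static float dot(const Vec3& a, const Vec3& b) {
        return a.x * b.x + a.y * b.y + a.z * b.z;
    }

    // Option 1: uniform weighting.
    float constantWeight() { return 1.0f; }

    // Option 2: lower weight for more distant pairs [Godin 94].
    // 'dist' is the pair's point-to-point distance, 'distMax' the largest
    // pair distance in the current iteration.
    float distanceWeight(float dist, float distMax) {
        return 1.0f - dist / distMax;
    }

    // Option 3: compatibility of (unit) normals: Weight = n1 . n2.
    // Clamping negative values to zero is our own guard, not in the paper.
    float normalCompatibilityWeight(const Vec3& n1, const Vec3& n2) {
        return std::max(0.0f, dot(n1, n2));
    }
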
We first look at a version of the “wave” scene (Figure 11). Ex-
tra noise has been added in order to amplify the differences among
the variants. We see that even with the addition of extra noise,
all of the weighting strategies have similar performance, with
the “uncertainty” and “compatibility of normals” options having
marginally better performance than the others. For the “incised
plane” scene (Figure 12), the results are similar, though there is a
larger difference in performance. However, we must be cautious
when interpreting this result, since the uncertainty-based weight-
ing assigns higher weights to points on the model that have nor-
mals pointing away from the range scanner. For this scene, there-
fore, the uncertainty weighting assigns higher weight to points
within the incisions, which improves the convergence rate. We
conclude that, in general, the effect of weighting on convergence
rate will be small and highly data-dependent, and that the choice
of a weighting function should be based on other factors, such
as the accuracy of the final result; we expect to explore this in a
future paper.
3.4 Rejecting Pairs
Closely related to assigning weights to corresponding pairs is re-
jecting certain pairs entirely. The purpose of this is usually to
eliminate outliers, which may have a large effect when perform-
ing least-squares minimization. The following rejection strategies
have been proposed:
• Rejection of corresponding points more than a given (user-specified) distance apart.
• Rejection of the worst n% of pairs based on some metric, usually point-to-point distance. As suggested by [Pulli 99], we reject 10% of pairs.
• Rejection of pairs whose point-to-point distance is larger than some multiple of the standard deviation of distances. Following [Masuda 96], we reject pairs with distances more than 2.5 times the standard deviation (both distance-based rules are sketched in code below).
• Rejection of pairs that are not consistent with neighboring pairs, assuming surfaces move rigidly [Dorai 98]. This scheme classifies two correspondences (p1, q1) and (p2, q2) as inconsistent iff

    | Dist(p1, p2) − Dist(q1, q2) |

is greater than some threshold. Following [Dorai 98], we use

    0.1 · max( Dist(p1, p2), Dist(q1, q2) )

as the threshold. The algorithm then rejects those correspondences that are inconsistent with most others. Note that the algorithm as originally presented has running time O(n²) at each iteration of ICP. In order to reduce running time, we have chosen to only compare each correspondence to 10 others, and reject it if it is incompatible with more than 5.
• Rejection of pairs containing points on mesh boundaries [Turk 94].
The latter strategy, of excluding pairs that include points on
mesh boundaries, is especially useful for avoiding erroneous pair-
ings (that cause a systematic bias in the estimated transform) in
cases when the overlap between scans is not complete (Figure 13).
Since its cost is usually low and in most applications its use has
few drawbacks, we always recommend using this strategy, and in
fact we use it in all the comparisons in this paper.
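
The two distance-based rules above are easy to implement; the sketch below (hypothetical types and names, not the authors' code) shows worst-fraction rejection and rejection beyond a multiple of the standard deviation of pair distances:

    #include <algorithm>
    #include <cmath>
    #include <cstddef>
    #include <vector>

    struct PointPair { /* matched points, normals, ... */ float dist; };

    // Reject the worst 'frac' of pairs (e.g. frac = 0.10) by distance [Pulli 99].
    void rejectWorstFraction(std::vector<PointPair>* pairs, float frac) {
        if (pairs->empty()) return;
        std::size_t keep = std::size_t(pairs->size() * (1.0f - frac));
        std::nth_element(pairs->begin(), pairs->begin() + keep, pairs->end(),
                         [](const PointPair& a, const PointPair& b) {
                             return a.dist < b.dist;
                         });
        pairs->resize(keep);             // keeps the 'keep' smallest distances
    }

    // Reject pairs with distance more than k standard deviations (e.g. k = 2.5)
    // of the distance distribution [Masuda 96].
    void rejectBeyondSigma(std::vector<PointPair>* pairs, float k) {
        if (pairs->empty()) return;
        double sum = 0.0, sumSq = 0.0;
        for (const PointPair& p : *pairs) {
            sum += p.dist;
            sumSq += double(p.dist) * p.dist;
        }
        double n = double(pairs->size());
        double mean = sum / n;
        double sigma = std::sqrt(std::max(0.0, sumSq / n - mean * mean));
        double threshold = k * sigma;
        pairs->erase(std::remove_if(pairs->begin(), pairs->end(),
                                    [&](const PointPair& p) {
                                        return p.dist > threshold;
                                    }),
                     pairs->end());
    }
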
Figure 14 compares the performance of no rejection, worst-
10% rejection, pair-compatibility rejection, and 2.5σ rejection on
the “wave” scene with extra noise and outliers. We see that re-
jection of outliers does not help with initial convergence. In fact,
the algorithm that rejected pairs most aggressively (worst-10% re-
jection) tended to converge more slowly when the meshes were

Citations
Proceedings ArticleDOI
12 May 2009
TL;DR: This paper modifies their mathematical expressions and performs a rigorous analysis of their robustness and complexity for the problem of 3D registration of overlapping point cloud views, and proposes an algorithm for the online computation of FPFH features for realtime applications.
Abstract: In our recent work [1], [2], we proposed Point Feature Histograms (PFH) as robust multi-dimensional features which describe the local geometry around a point p for 3D point cloud datasets. In this paper, we modify their mathematical expressions and perform a rigorous analysis on their robustness and complexity for the problem of 3D registration for overlapping point cloud views. More concretely, we present several optimizations that reduce their computation times drastically by either caching previously computed values or by revising their theoretical formulations. The latter results in a new type of local features, called Fast Point Feature Histograms (FPFH), which retain most of the discriminative power of the PFH. Moreover, we propose an algorithm for the online computation of FPFH features for realtime applications. To validate our results we demonstrate their efficiency for 3D registration and propose a new sample consensus based method for bringing two datasets into the convergence basin of a local non-linear optimizer: SAC-IA (SAmple Consensus Initial Alignment).

3,138 citations

Journal ArticleDOI
TL;DR: A probabilistic method, called the Coherent Point Drift (CPD) algorithm, is introduced for both rigid and nonrigid point set registration and a fast algorithm is introduced that reduces the method computation complexity to linear.
Abstract: Point set registration is a key component in many computer vision tasks. The goal of point set registration is to assign correspondences between two sets of points and to recover the transformation that maps one point set to the other. Multiple factors, including an unknown nonrigid spatial transformation, large dimensionality of point set, noise, and outliers, make the point set registration a challenging problem. We introduce a probabilistic method, called the Coherent Point Drift (CPD) algorithm, for both rigid and nonrigid point set registration. We consider the alignment of two point sets as a probability density estimation problem. We fit the Gaussian mixture model (GMM) centroids (representing the first point set) to the data (the second point set) by maximizing the likelihood. We force the GMM centroids to move coherently as a group to preserve the topological structure of the point sets. In the rigid case, we impose the coherence constraint by reparameterization of GMM centroid locations with rigid parameters and derive a closed form solution of the maximization step of the EM algorithm in arbitrary dimensions. In the nonrigid case, we impose the coherence constraint by regularizing the displacement field and using the variational calculus to derive the optimal transformation. We also introduce a fast algorithm that reduces the method computation complexity to linear. We test the CPD algorithm for both rigid and nonrigid transformations in the presence of noise, outliers, and missing points, where CPD shows accurate results and outperforms current state-of-the-art methods.

2,429 citations


Cites background from "Efficient variants of the ICP algor..."

  • ...Here, we briefly overview the rigid and non-rigid point set registration methods and state our contributions....

    [...]

Proceedings ArticleDOI
16 Oct 2011
TL;DR: Novel extensions to the core GPU pipeline demonstrate object segmentation and user interaction directly in front of the sensor, without degrading camera tracking or reconstruction, to enable real-time multi-touch interactions anywhere.
Abstract: KinectFusion enables a user holding and moving a standard Kinect camera to rapidly create detailed 3D reconstructions of an indoor scene. Only the depth data from Kinect is used to track the 3D pose of the sensor and reconstruct, geometrically precise, 3D models of the physical scene in real-time. The capabilities of KinectFusion, as well as the novel GPU-based pipeline are described in full. Uses of the core system for low-cost handheld scanning, and geometry-aware augmented reality and physics-based interactions are shown. Novel extensions to the core GPU pipeline demonstrate object segmentation and user interaction directly in front of the sensor, without degrading camera tracking or reconstruction. These extensions are used to enable real-time multi-touch interactions anywhere, allowing any planar or non-planar reconstructed physical surface to be appropriated for touch.

2,373 citations


Cites background or methods from "Efficient variants of the ICP algor..."

  • ...In our system we use projective data association [24] to find these correspondences....

    [...]

  • ...Our approach for real-time camera tracking and surface reconstruction is based on two well-studied algorithms [1, 5, 24], which have been designed from the ground-up for parallel execution on the GPU....

    [...]

  • ...ICP is a popular and well-studied algorithm for 3D shape alignment (see [24] for a detailed study)....

    [...]

Proceedings ArticleDOI
01 Jan 2008
TL;DR: The architecture of MeshLab, an open source, extensible, mesh processing system that has been developed at the Visual Computing Lab of the ISTI-CNR with the help of tens of students, is described.
Abstract: The paper presents MeshLab, an open source, extensible, mesh processing system that has been developed at the Visual Computing Lab of the ISTI-CNR with the helps of tens of students. We will describe the MeshLab architecture, its main features and design objectives discussing what strategies have been used to support its development. Various examples of the practical uses of MeshLab in research and professional frameworks are reported to show the various capabilities of the presented system.

1,896 citations

Proceedings ArticleDOI
12 Jul 2014
TL;DR: The method achieves both low-drift and low-computational complexity without the need for high accuracy ranging or inertial measurements and can achieve accuracy at the level of state of the art offline batch methods.
Abstract: We propose a real-time method for odometry and mapping using range measurements from a 2-axis lidar moving in 6-DOF. The problem is hard because the range measurements are received at different times, and errors in motion estimation can cause mis-registration of the resulting point cloud. To date, coherent 3D maps can be built by off-line batch methods, often using loop closure to correct for drift over time. Our method achieves both low-drift and low-computational complexity without the need for high accuracy ranging or inertial measurements. The key idea in obtaining this level of performance is the division of the complex problem of simultaneous localization and mapping, which seeks to optimize a large number of variables simultaneously, by two algorithms. One algorithm performs odometry at a high frequency but low fidelity to estimate velocity of the lidar. Another algorithm runs at a frequency of an order of magnitude lower for fine matching and registration of the point cloud. Combination of the two algorithms allows the method to map in real-time. The method has been evaluated by a large set of experiments as well as on the KITTI odometry benchmark. The results indicate that the method can achieve accuracy at the level of state of the art offline batch methods.

1,879 citations

References
Journal ArticleDOI
Paul J. Besl1, H.D. McKay1
TL;DR: In this paper, the authors describe a general-purpose representation-independent method for the accurate and computationally efficient registration of 3D shapes including free-form curves and surfaces, based on the iterative closest point (ICP) algorithm, which requires only a procedure to find the closest point on a geometric entity to a given point.
Abstract: The authors describe a general-purpose, representation-independent method for the accurate and computationally efficient registration of 3-D shapes including free-form curves and surfaces. The method handles the full six degrees of freedom and is based on the iterative closest point (ICP) algorithm, which requires only a procedure to find the closest point on a geometric entity to a given point. The ICP algorithm always converges monotonically to the nearest local minimum of a mean-square distance metric, and the rate of convergence is rapid during the first few iterations. Therefore, given an adequate set of initial rotations and translations for a particular class of objects with a certain level of 'shape complexity', one can globally minimize the mean-square distance metric over all six degrees of freedom by testing each initial registration. One important application of this method is to register sensed data from unfixtured rigid objects with an ideal geometric model, prior to shape inspection. Experimental results show the capabilities of the registration algorithm on point sets, curves, and surfaces.

17,598 citations

Journal ArticleDOI
TL;DR: A closed-form solution to the least-squares problem for three or more points is presented, simplified by use of unit quaternions to represent rotation.
Abstract: Finding the relationship between two coordinate systems using pairs of measurements of the coordinates of a number of points in both systems is a classic photogrammetric task. It finds applications in stereophotogrammetry and in robotics. I present here a closed-form solution to the least-squares problem for three or more points. Currently various empirical, graphical, and numerical iterative methods are in use. Derivation of the solution is simplified by use of unit quaternions to represent rotation. I emphasize a symmetry property that a solution to this problem ought to possess. The best translational offset is the difference between the centroid of the coordinates in one system and the rotated and scaled centroid of the coordinates in the other system. The best scale is equal to the ratio of the root-mean-square deviations of the coordinates in the two systems from their respective centroids. These exact results are to be preferred to approximate methods based on measurements of a few selected points. The unit quaternion representing the best rotation is the eigenvector associated with the most positive eigenvalue of a symmetric 4 × 4 matrix. The elements of this matrix are combinations of sums of products of corresponding coordinates of the points.

4,522 citations


"Efficient variants of the ICP algor..." refers background in this paper

  • ...Solution methods based on singular value decomposition [Arun 87], quaternions [Horn 87], orthonormal matrices [Horn 88], and dual quaternions [Walker 91] have been proposed; Eggert et al. have evaluated the numerical accuracy and stability of each of these [Eggert 97], concluding that the differences among…

    [...]

Journal ArticleDOI
TL;DR: An algorithm for finding the least-squares solution of R and T, which is based on the singular value decomposition (SVD) of a 3 × 3 matrix, is presented.
Abstract: Two point sets {pi} and {p'i}; i = 1, 2,..., N are related by p'i = Rpi + T + Ni, where R is a rotation matrix, T a translation vector, and Ni a noise vector. Given {pi} and {p'i}, we present an algorithm for finding the least-squares solution of R and T, which is based on the singular value decomposition (SVD) of a 3 × 3 matrix. This new algorithm is compared to two earlier algorithms with respect to computer time requirements.

3,862 citations

Proceedings ArticleDOI
09 Apr 1991
TL;DR: The authors propose an approach that works on range data directly and registers successive views with enough overlapping area to get an accurate transformation between views, performed by minimizing a functional that does not require point-to-point matches.
Abstract: The problem of creating a complete model of a physical object is studied. Although this may be possible using intensity images, the authors use range images which directly provide access to three-dimensional information. The first problem that needs to be solved is to find the transformation between the different views. Previous approaches have either assumed this transformation to be known (which is extremely difficult for a complete model) or computed it with feature matching (which is not accurate enough for integration). The authors propose an approach that works on range data directly and registers successive views with enough overlapping area to get an accurate transformation between views. This is performed by minimizing a functional that does not require point-to-point matches. Details are given of the registration method and modeling procedure, and they are illustrated on range images of complex objects.

2,157 citations

Proceedings ArticleDOI
01 Jul 2000
TL;DR: A hardware and software system for digitizing the shape and color of large fragile objects under non-laboratory conditions and the largest single dataset is of the David - 2 billion polygons and 7,000 color images.
Abstract: We describe a hardware and software system for digitizing the shape and color of large fragile objects under non-laboratory conditions. Our system employs laser triangulation rangefinders, laser time-of-flight rangefinders, digital still cameras, and a suite of software for acquiring, aligning, merging, and viewing scanned data. As a demonstration of this system, we digitized 10 statues by Michelangelo, including the well-known figure of David, two building interiors, and all 1,163 extant fragments of the Forma Urbis Romae, a giant marble map of ancient Rome. Our largest single dataset is of the David - 2 billion polygons and 7,000 color images. In this paper, we discuss the challenges we faced in building this system, the solutions we employed, and the lessons we learned. We focus in particular on the unusual design of our laser triangulation scanner and on the algorithms and software we developed for handling very large scanned models.

1,675 citations

Frequently Asked Questions (19)
Q1. What contributions have the authors mentioned in the paper "Efficient variants of the ICP algorithm"?

In order to improve convergence for nearly-flat meshes with small features, such as inscribed surfaces, the authors introduce a new variant based on uniform sampling of the space of normals. The authors demonstrate an implementation that is able to align two range images in a few tens of milliseconds, assuming a good initial guess. This capability has potential application to real-time 3D model acquisition and model-based tracking. 

Generating the initial alignment may be done by a variety of methods, such as tracking scanner position, identification and indexing of surface features [Faugeras 86, Stein 92], “spin-image” surface signatures [Johnson 97a], computing principal axes of scans [Dorai 97], exhaustive search for corresponding points [Chen 98, Chen 99], or user input. 

The motivation for using synthetic data for their comparisons is so that the authors know the correct transform exactly, and can evaluate the performance of ICP algorithms relative to this correct alignment. 

For this scene, therefore, the uncertainty weighting assigns higher weight to points within the incisions, which improves the convergence rate. 

Though so far the authors have been looking at error as a function of the number of iterations, it is also instructive to look at error as a function of running time. 

Using such a “ground truth” error metric allows for more objective comparisons of the performance of ICP variants than using the error metrics computed by the algorithms themselves. 

Their comparisons suggest a combination of ICP variants that is able to align a pair of meshes in a few tens of milliseconds, significantly faster than most commonly-used ICP systems. 

Allowing the user to be involved in the scanning process in this way is a powerful alternative to solving the computationally difficult “next best view” problem [Maver 93], at least for small, handheld objects. 

Because the matching stage of ICP is usually the one that takes the longest, applications that require ICP to run quickly (and that do not need to deal with the geometrically “difficult” cases) must choose the matching algorithm with the fastest performance. 

Repeatedly generating a set of corresponding points using the current transformation, and finding a new transformation that minimizes the error metric [Chen 91]. 

The authors have presented an optimized ICP algorithm that uses a constant-time variant for finding point pairs, resulting in a method that takes only a few tens of milliseconds to align two meshes.

As shown in the Appendix, the result for a typical laser range scanner is that the uncertainty is lower, hence higher weight should be assigned, for surfaces tilted away from the range camera. 

Although the authors will not compare variants that use color or intensity, it is clearly advantageous to use such data when available, since it can provide necessary constraints in areas where there are few geometric features. 

Normal-space sampling is therefore a very simple example of using surface features for alignment; it has lower computational cost, but lower robustness, than traditional feature-based methods [Faugeras 86, Stein 92, Johnson 97a]. 

If the authors use a more “asymmetric” matching algorithm, such as projection or normal shooting (see Section 3.2), the authors see that sampling from both meshes appears to give slightly better results (Figure 6), especially during the early stages of the iteration when the two meshes are still far apart. 

the authors must be cautious when interpreting this result, since the uncertainty-based weighting assigns higher weights to points on the model that have normals pointing away from the range scanner. 

This, together with the fact that noise and distortion on the rest of the plane overwhelms the effect of those pairs that are sampled from the grooves, accounts for the inability of uniform and random sampling to converge to the correct alignment.

The ability to have ICP execute in real time (e.g., at video rates) would permit significant new applications in computer vision and graphics. 

In addition, the authors expect that sampling from both meshes would also improve results when the overlap of the meshes is small, or when the meshes contain many holes.