Joint Self-Localization and Tracking of Generic Objects in 3D Range Data

Frank Moosmann¹ and Christoph Stiller¹
Abstract—Both the estimation of the trajectory of a sensor and the detection and tracking of moving objects are essential tasks for autonomous robots. This work proposes a new algorithm that treats both problems jointly. The sole input is a sequence of dense 3D measurements as returned by multi-layer laser scanners or time-of-flight cameras. A major characteristic of the proposed approach is its applicability to any type of environment, since specific object models are not used at any algorithm stage. More specifically, precise localization in non-flat environments is possible, as well as the detection and tracking of e.g. trams or recumbent bicycles. Moreover, 3D shape estimation of moving objects is inherent to the proposed method. A thorough evaluation is conducted on a vehicular platform with a mounted Velodyne HDL-64E laser scanner.
I. INTRODUCTION

Two main tasks can be identified for the perception system of a robot: precise self-localization, often performed simultaneously with mapping (SLAM), and the detection and tracking of moving objects (DATMO). While most methods from the literature treat the two tasks as independent, a joint estimation scheme is introduced in this contribution.
A. Self-Localization

The problem of localization is usually understood as the estimation of the robot's pose, i.e. position and orientation. The frame of reference thereby varies. Some approaches seek a global estimate using GPS or global landmarks. Others refer to the relative motion of the robot, specifying the pose w.r.t. the starting point; the latter is the goal of this work.
The most widespread algorithms for range sensors follow the principle of simultaneous localization and mapping (SLAM) [23]. Though the last decade showed a trend towards probabilistic techniques, the computational complexity with 3D data in outdoor environments notably shifts the used method types in favor of scan-matching [18], [11], [3], [16].

Most SLAM methods only estimate the motion of the vehicle w.r.t. a static scene and usually average out objects with different motion. For low outlier ratios these registration methods provide good results. A high portion of moving objects, however, might cause these methods to fail. Only few SLAM methods try to simultaneously detect and track moving objects [26], [25]. Unfortunately, their computational efficiency and robustness in the 3D real world has not yet been shown.
¹ Both authors are with the Institute of Measurement and Control, Karlsruhe Institute of Technology, 76131 Karlsruhe, Germany. frank.moosmann at kit.edu
Figure 1. Result of the proposed method: The mapped static environment
colored by altitude (left) and tracked moving objects highlighted with a unique
coloring in the sensor data (right).
B. Multi-Target Tracking

The problem of multi-target tracking is usually understood as the task of detecting a set of objects in the environment and characterizing them by their position, orientation, extent, and velocity. Existing solutions frequently decompose the problem into two independent stages. The first stage detects objects independently for each point in time. State-of-the-art methods mostly train classifiers for the detection of specific object classes like cars or humans [19]. Only few methods employ generic segmentation methods to detect any kind of object that sticks out well from the background [22]. The second stage associates the detections over time in order to obtain continuous tracks, i.e. estimates of the objects' intrinsic state like e.g. position and velocity. Possible generic solutions for this stage are given in [1]. When using dense data, the association of measurements can be ambiguous, especially when several detections per object exist. To overcome ambiguities, solutions like fuzzy segmentation [21], segment matching [8], or appearance learning [10] have been proposed.

This two-stage approach has been applied to various kinds of sensors, from 2D laser scanners [17] over 3D laser scanners [12], [19] up to time-of-flight cameras [9]. Its major drawback is the dependence on a reliable and repeatable object detector. To the best of our knowledge, no approach exists that can robustly track arbitrary objects.
A completely different methodology is track-before-detect [7]. Sensor data is quantized, e.g. at fixed image columns [20] or at fixed intervals in the horizontal plane [4]. Although results seem very promising, finding a good grouping of the tracked partitions, which corresponds to the detection, is still an open issue.

PointCloud
Prediction
Prediction
Registration&Update
Merging
Tracklets Tracks
PointCloud/
RangeImage
Object
Hypotheses
Pre-processing
&Features
Object
Detection
t 1
t
t + 1
t1
t1
T
t1
t2
T
t1
tm
T
t1
T
t
t1
e
T
t
t2
e
T
t
tm
e
T
t
e
T
t
t1
T
t
t2
T
t
tm
T
t
T
t
t
T
t
t1
T
t
t2
T
t
T
Figure 2. Overview of the proposed method.
an open issue.
One step further is the idea to optimize the partitioning of the data (which can here be regarded as object detection) and the motion estimation together. However, the proposed solutions [13], [24] are computationally too complex to be applied in real-time on ordinary computers within the next years.

Hence, all successful object tracking methods seem to be either 2D or model based, which requires manual model construction and model selection through classification.

This work proposes a novel idea for the joint solution of both problems. The combination of a dynamic data partitioning with track-before-detect techniques allows tracking arbitrary objects. By treating the static scene as an object, mapping is applied to both moving objects and the static scene in a unique way. Experiments conducted in a vehicular environment show the applicability to 3D environments, with both tracking and self-localization performed with full 6 degrees of freedom.
II. PROPOSED METHOD

Throughout this work, a left superscript, as in ᵗx, denotes the current time index, and a left subscript, as in ₜx, the measurement time. For clarity, these are only specified where necessary. All computations are made w.r.t. the sensor coordinate system. No fixed world coordinate frame is used.
A. Overview

Input to the algorithm at each time t is a set of range measurements represented as 3D points ₜP = {(x, y, z)ᵀ}. This point cloud is preprocessed, features are calculated, and object hypotheses are generated. Each object hypothesis ₜS is turned into a tracklet ₜTᵗ; hence, the set of tracklets ₜ𝒯ᵗ = {ₜTᵗ} is created. The only exception is the initialization in the very first frame: object detection is skipped and one single (static-scene) track is created from all measurements. The track(let)s are predicted and updated across m frames and finally merged with the existing tracks. Note that the registration step uses the unsegmented point cloud as reference, which is in contrast to most existing tracking methods. Output of the algorithm is a set of tracks which includes the track of the static scene. Hence, the sensor motion w.r.t. a fixed world coordinate frame can be deduced as the inverse static-scene motion.

Figure 3. Each segment, indicated by a unique color, is turned into an object hypothesis and verified by tracking across m frames.
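The per-frame cycle of Fig. 2 can be summarized in a short skeleton. The following Python sketch is illustrative only, not the authors' implementation; all step functions stand in for Secs. II-B to II-E and are passed in as callables.

```python
# Minimal sketch of one algorithm cycle (Fig. 2). The callables
# preprocess / detect / predict_update / merge represent Secs. II-B to II-E.
def run_cycle(cloud, tracklets, tracks, t, m,
              preprocess, detect, predict_update, merge):
    feats = preprocess(cloud)                  # smoothing, normals, flatness
    if not tracks:                             # very first frame: detection skipped,
        tracks.append(("static-scene", t))     # one track from all measurements
        return
    for seg in detect(cloud, feats):           # generic object hypotheses
        tracklets.append({"born": t, "segment": seg})
    for tr in tracklets + tracks:              # note: registration uses the full,
        predict_update(tr, cloud)              # unsegmented cloud as reference
    for tr in [x for x in tracklets if t - x["born"] >= m]:
        tracklets.remove(tr)
        merge(tr, tracks)                      # track management, Sec. II-E
```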
B. Pre-processing and Features

The input point cloud is smoothed and two features are calculated for each point p_i ∈ P: a normal vector n_i = (n_x, n_y, n_z)ᵀ with ‖n_i‖ = 1, representing a local surface plane, and a so-called flatness value f_i ∈ [0, 1] which characterizes how appropriate the approximation by a surface plane is. The exact calculations are taken from [16], where the normal vectors N = {n_i} are denoted as N and the flatness values F = {f_i} are denoted as C.
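The exact computation is given in [16]; purely as an illustration, a common PCA-based approximation of normals and flatness could look as follows. The neighborhood size k and the flatness definition below are assumptions, not the paper's.

```python
import numpy as np
from scipy.spatial import cKDTree

def normals_and_flatness(points, k=10):
    """points: (N, 3) array. Returns unit normals (N, 3) and flatness (N,) in [0, 1]."""
    _, idx = cKDTree(points).query(points, k=k)
    normals = np.empty_like(points)
    flatness = np.empty(len(points))
    for i, nbrs in enumerate(idx):
        w, v = np.linalg.eigh(np.cov(points[nbrs].T))  # eigenvalues ascending
        normals[i] = v[:, 0]                 # normal = direction of least variance
        flatness[i] = 1.0 - w[0] / (w.sum() + 1e-12)   # close to 1 for planar patches
    return normals, flatness
```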
C. Object Detection

The aim of this work is to track any kind of object that is moving. As a consequence, object-class-specific detectors cannot be used. Better suited are segmentation methods that split the set of input points P, represented by the set of indices S = {i}, into segments S_g ⊆ S with ∪_g S_g = S and ∀g, h, g ≠ h: S_g ∩ S_h = ∅, where each segment corresponds to one object hypothesis. Any meaningful segmentation method can be employed within the proposed tracking framework; here the so-called local convexity criterion is used, which was introduced in [15] and improved in [14], also see Fig. 3.
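For illustration, a strongly simplified variant of such a segmentation can be written as region growing on a neighborhood graph, linking points whose local surfaces are approximately convex w.r.t. each other. The actual criterion of [15], [14] is more elaborate; k and eps below are assumed parameters.

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

def segment_locally_convex(points, normals, k=8, eps=0.02):
    """Returns labels (N,) so that S_g = {i : labels[i] == g} partitions the cloud."""
    _, idx = cKDTree(points).query(points, k=k + 1)
    rows, cols = [], []
    for i, nbrs in enumerate(idx):
        for j in nbrs[1:]:                    # skip the query point itself
            d = points[j] - points[i]
            # link i-j if each point lies at or below the other's tangent
            # plane (up to eps), i.e. the surface is locally convex
            if normals[i] @ d <= eps and normals[j] @ (-d) <= eps:
                rows.append(i)
                cols.append(j)
    n = len(points)
    graph = csr_matrix((np.ones(len(rows)), (rows, cols)), shape=(n, n))
    _, labels = connected_components(graph, directed=False)
    return labels
```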
D. Tracking

A tracklet T_g is created from each object hypothesis S_g with a minimum size and can be regarded as an object hypothesis in the time domain. A local object coordinate system O_g is introduced, as depicted in Fig. 4. It is specified by a pose vector

ρ_g = (φ, θ, ψ, x, y, z)ᵀ    (1)

e
x
e
y
e
z
e
x
e
y
e
z
ρ
Figure 4. The pose ρ of the state vector defines the position and the
orientation of a track coordinate system (top) w.r.t. the scanner coordinate
system (bottom). The track appearance is stored as point cloud (violet) with
normal vectors and atness values (both not shown) relative to the track
coordinate system.
which defines its orientation and position w.r.t. the sensor coordinate system S. The pose and its derivative constitute the state of the tracklet:

x_g = (ρ_gᵀ, ρ̇_gᵀ)ᵀ = (φ, θ, ψ, x, y, z, φ̇, θ̇, ψ̇, ẋ, ẏ, ż)ᵀ    (2)
The 3D points P_g ⊆ P, the normals N_g, and the flatness values F_g constitute the appearance of the tracklet. They are stored relative to the object coordinate system O_g, see Fig. 4. In total, a tracklet is defined by its state and appearance:

T_g = (x_g, P_g^{O_g}, N_g^{O_g}, F_g)    (3)
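In code, a track(let) according to Eq. (3) might be represented as follows. This is a sketch; the field names and the covariance field are illustrative additions, not from the paper.

```python
from dataclasses import dataclass, field
import numpy as np

@dataclass
class Tracklet:
    """Eq. (3): state x_g plus appearance stored in the object frame O_g."""
    x: np.ndarray                               # 12-D state of Eq. (2)
    P_app: np.ndarray                           # (N, 3) appearance points in O_g
    N_app: np.ndarray                           # (N, 3) normal vectors in O_g
    F_app: np.ndarray                           # (N,) flatness values
    cov: np.ndarray = field(default_factory=lambda: np.eye(12))  # state covariance
```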
It is worth noting that our method for track estimation thus includes the 3D reconstruction of the shape of moving objects.

In the first m frames the appearance of a tracklet is kept constant. On the contrary, the state is re-estimated for each new incoming frame within the Prediction and Update steps of Fig. 2. This makes the appearance move along with the coordinate frame defined by the state. A Kalman filter with constant velocity model¹ is employed upon the state vector, which can express any rigid motion. Prediction corresponds exactly to the prediction step of the Kalman filter. Registration and update is performed as in [14] by aligning the track's appearance point cloud with the full input point cloud by means of the ICP algorithm. The predicted pose thereby serves as the initial pose of this iterative algorithm. In case the average flatness value of the tracklet exceeds some threshold, the point-to-plane ICP [6] is used, otherwise the point-to-point variant [2]. The measurement covariance for the Kalman filter update is calculated with the method of [5]. One special treatment is made for the static scene track: instead of registering the track appearance against the input data, the input data is registered against the track appearance as in [16]. This makes the approach faster and more robust and allows for sensor motion compensation.

Note that up to this point, no associations are made yet between the tracklets, since registration is performed with the full input data. Relations are established only in the track management stage described next.
¹ More specific and possibly non-linear models could of course be used for specific object classes to extend and hence improve the method.
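A minimal sketch of the prediction step and the ICP-variant choice described above; the process noise Q, the time step dt, and the flatness threshold are assumptions, as the paper does not specify them.

```python
import numpy as np

def predict_cv(x, cov, Q, dt):
    """Kalman prediction with a constant-velocity model on the 12-D state:
    the pose advances by dt times its derivative."""
    F = np.eye(12)
    F[:6, 6:] = dt * np.eye(6)
    return F @ x, F @ cov @ F.T + Q

def choose_icp_variant(flatness, threshold=0.7):
    """Point-to-plane ICP [6] for predominantly flat tracklets,
    the point-to-point variant [2] otherwise."""
    return "point-to-plane" if flatness.mean() > threshold else "point-to-point"
```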
Figure 5. Input points as virtual range image, colored by distance.
Figure 6. Associations between new tracklets ₜ𝒯ᵗ (upper image) and tracks ᵗ𝒯 (lower image) are established by overlaying their projections and counting the number of pixels they overlap. Shown are the association strengths for moving objects (in the lower figure); the edges that are not labeled are associations with the static track (gray).
E. Track Management

This section essentially describes the Merging step in Fig. 2, which also handles the transition of tracklets to tracks. Both describe moving objects by their state and appearance. The difference is conceptual only: tracklets are track hypotheses that, after successful verification, can become tracks. As a consequence, tracks are predicted and updated exactly like tracklets.

The merge step takes as input the current set of tracks ᵗ𝒯 and the set of tracklets ₜ₋ₘ𝒯ᵗ that was (independently) registered across m frames, and produces an updated set of tracks ᵗ𝒯. First, any track that moved out of the field of view is removed from ᵗ𝒯. Then, each tracklet is compared with the existing tracks and one of three actions is taken:

1) The tracklet is kept and added to the set of tracks if the tracklet was successfully registered over the last m frames and if it represents an object with a motion different to all existing tracks.
2) The tracklet is merged with the track in case a track on the same object already exists.
3) The tracklet is discarded if none of the above two cases is true.
In all three cases, tracklets are inherently associated and compared with existing tracks. Fig. 6 illustrates the efficient method used to determine these associations: the appearances of all existing tracks and tracklets are projected to two virtual range images, and the number of overlapping pixels determines the association strength a_gh between tracklet ₜ₋ₘTᵗ_g and track ᵗT_h.
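These association strengths can be computed cheaply from two label images, e.g. as below. This is a sketch of the projection-overlay idea of Fig. 6; the generation of the label images, i.e. projecting each appearance into a virtual range image, is omitted.

```python
import numpy as np

def association_strengths(tracklet_img, track_img, n_tracklets, n_tracks):
    """Label images hold the tracklet/track id projecting to each pixel, -1 if empty.
    Returns a with a[g, h] = number of pixels where tracklet g overlaps track h."""
    valid = (tracklet_img >= 0) & (track_img >= 0)
    a = np.zeros((n_tracklets, n_tracks), dtype=int)
    np.add.at(a, (tracklet_img[valid], track_img[valid]), 1)  # scatter-add counts
    return a
```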
To decide upon the three cases, the tracklet ₜ₋ₘTᵗ_g is characterized by a feature vector (see Sec. IV). Several characteristics are therefore calculated. Among others is the motion histogram m = (m_1, m_2, m_3, m_4)ᵀ. For the tracklet that moved from ₜ₋ₘρ_g to ᵗρ_g, it summarizes how many appearance points moved perpendicular to their normal vector (m_1), aslant to it (m_2 and m_3), and along the normal vector (m_4). This effectively characterizes how reliable motion estimation is, since motion perpendicular to the normal vector is, generally, unreliable.
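As an illustration, such a histogram could be computed from the angle between each point's displacement and its normal. The displacement model (per-point displacement vectors) and the bin edges below are assumptions; the paper does not state them.

```python
import numpy as np

def motion_histogram(normals, displacements):
    """normals, displacements: (N, 3). Returns m = (m1, m2, m3, m4): counts of
    points moving perpendicular to their normal (m1), aslant (m2, m3),
    or along it (m4)."""
    d = displacements / (np.linalg.norm(displacements, axis=1, keepdims=True) + 1e-12)
    cos_ang = np.abs(np.einsum("ij,ij->i", normals, d))   # |cos| of angle to normal
    m, _ = np.histogram(cos_ang, bins=[0.0, 0.25, 0.5, 0.75, 1.0 + 1e-9])
    return m
```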
Furthermore, the tracklet is compared to each associated track ᵗT_h with association strength a_gh > 0. Therefore, the motion of the associated track ᵗT_h within the last m frames is applied to the tracklet ₜ₋ₘTᵗ_g:

ᵗρ″_{g,h} = ₜ₋ₘρ_g + (ᵗρ_h − ₜ₋ₘρ_h)    (4)
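Treating poses as plain 6-vectors, as the additive form of Eq. (4) suggests, the transfer reads:

```python
import numpy as np

def transfer_pose(rho_g_prev, rho_h_now, rho_h_prev):
    """Eq. (4): apply the motion of associated track h over the last m frames
    to the tracklet's old pose (all arguments are 6-D pose vectors)."""
    return np.asarray(rho_g_prev) + (np.asarray(rho_h_now) - np.asarray(rho_h_prev))
```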
In case both the tracklet and the track referred to the same object and tracking was successful, ᵗρ″_{g,h} should be very similar to ᵗρ_g. The ICP energy e_g(ᵗρ″_{g,h}) is calculated using both the Euclidean point-to-point distance [2] and the projective point-to-plane distance [6]. These errors are denoted e_{g,h,2} and e_{g,h,P} in the following, as opposed to e_{g,2} and e_{g,P}, the errors for the original pose ᵗρ_g. Based on these errors, the associated tracks causing minimum error can be determined, as well as the track with maximum association strength:

h_2 = argmin_{h: a_{g,h} > 0} {e_{g,h,2}}    (5)
h_P = argmin_{h: a_{g,h} > 0} {e_{g,h,P}}    (6)
h_a = argmax_h {a_{g,h}}    (7)
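Given the association strengths and the two error types per candidate, the selections of Eqs. (5)-(7) are straightforward; the dict-based data layout in this sketch is an assumption.

```python
def select_candidates(a_g, e2_g, eP_g):
    """a_g, e2_g, eP_g: dicts keyed by track id h for one tracklet g.
    Returns (h_2, h_P, h_a) of Eqs. (5)-(7)."""
    assoc = [h for h, a in a_g.items() if a > 0]
    h_2 = min(assoc, key=lambda h: e2_g[h])    # Eq. (5): min point-to-point error
    h_P = min(assoc, key=lambda h: eP_g[h])    # Eq. (6): min point-to-plane error
    h_a = max(a_g, key=a_g.get)                # Eq. (7): max association strength
    return h_2, h_P, h_a
```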
Note that h_2, h_P, and h_a are not necessarily different. The features are gathered within a 52-dimensional feature vector f_g, detailed in the appendix. A multi-class support vector machine (SVM) with RBF kernel is used to classify the feature vector in order to decide upon the three cases.
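The paper does not state the SVM implementation or hyper-parameters. A hedged sketch using scikit-learn, with class weights mimicking the "decision variants" discussed in Sec. III, could look like this:

```python
from sklearn.svm import SVC

KEEP, MERGE, IGNORE = 0, 1, 2

def fit_track_decision_svm(F_train, y_train, ignore_weight=2.0):
    """F_train: (n, 52) feature vectors f_g; y_train: labels in {KEEP, MERGE, IGNORE}.
    ignore_weight > 1 pulls the classifier towards discarding tracklets
    (higher precision, lower recall), akin to decision variant A."""
    clf = SVC(kernel="rbf",
              class_weight={KEEP: 1.0, MERGE: 1.0, IGNORE: ignore_weight})
    return clf.fit(F_train, y_train)
```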
In case a tracklet is kept as a new track, the tracklet's appearance is removed from all associated tracks and the tracklet is added to the set of tracks. This implicitly handles track splits.
In case a tracklet is to be merged with an existing track, the corresponding track still has to be determined. This is performed by calculating a score s_gh for each associated track h and choosing the track with the highest score. The score is calculated as a linear combination of a second feature vector:

s_gh = (1, f_ghᵀ) · w    (8)
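Eq. (8) is a plain linear score over the feature vector with a leading bias term:

```python
import numpy as np

def merge_score(f_gh, w):
    """Eq. (8): s_gh = (1, f_gh^T) . w, i.e. a bias plus a weighted feature sum."""
    return np.concatenate(([1.0], np.asarray(f_gh))) @ np.asarray(w)

# the merge partner is the associated track with the highest score, e.g.:
# h_best = max(associated, key=lambda h: merge_score(f[h], w))
```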
The feature vector f_gh is similar to f_g and is detailed in the appendix. The parameter vector w is determined by optimization on a labeled training set. When merging, the state of the associated track remains unchanged. Only the appearance of the tracklet is added to the track. The algorithm for accumulating the appearance is taken from [16].
Table I
CLASSIFICATION RESULTS ON A LABELED DATA SET FOR TWO DIFFERENT PARAMETER SETTINGS OF THE CLASSIFIER.

                    Decision variant A          Decision variant B
                 Keep    Merge   Ignore      Keep    Merge   Ignore
Keep               23       14       89       103       13       10
Merge               0    12208      612       196    12516      108
Ignore              1      208     3614       200      567     3056
Accuracy               94.49%                      93.48%
There, flat areas are contracted to yield sharper surface representations. This so-called moving object mapping (MOM) not only makes the results nicer, it also improves the registration result.
Compared to [16], one further step is added to process the appearances. This is particularly relevant for non-rigid objects in order to avoid tracking inaccuracies. In the projection step illustrated in Fig. 6, each appearance point that yields a closer range value than the range value at the corresponding pixel of the current sensor data (see Fig. 5) is removed from the track. As a consequence, the appearance can adapt to non-rigid objects like pedestrians.
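The pruning rule can be phrased per pixel: a stored appearance point is dropped when it would occlude the current measurement. A sketch, assuming per-pixel range values are already available from the projection of Fig. 6; the tolerance margin is an assumed parameter.

```python
import numpy as np

def appearance_keep_mask(app_range, sensor_range, margin=0.1):
    """app_range: ranges of projected appearance points; sensor_range: currently
    measured ranges at the same pixels (Fig. 5). Returns a boolean mask that
    drops points lying in front of the measurement by more than `margin` meters."""
    return app_range >= sensor_range - margin
```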
III. RESULTS

The proposed algorithm is evaluated on data captured with a Velodyne HDL-64E laser scanner. The sensor, a 64-beam laser scanner, is mounted on top of a car and yields a 360° view of the environment, as illustrated in Fig. 9. We set m = 3 throughout all experiments.

The first stage of evaluation concerns the localization precision. Since in static scenes the proposed algorithm for localization equals the algorithm presented in [16], the results are transferable. Two scenarios were evaluated that both represent loops in a non-flat urban environment. These loops can be used to evaluate drift, i.e. the localization imprecision that increases with traveled distance. On average, a position error of 2.66 m after a 1 km drive was determined. This value can be regarded as very low and is about an order of magnitude lower than for common camera-based techniques. More details and discussions are given in [16].
The evaluation of object detection and tracking proceeds in several stages. First, the classifier for track management is evaluated. This classifier decides for each tracklet, i.e. track hypothesis, whether it is to be merged with an existing track, kept as a new track, or ignored. Four-fold cross-validation is applied on a dataset that was set up and labeled manually. The classification accuracy reaches the values listed in Table I. The two decision variants correspond to different weightings of the classes during training. With this weighting, the SVM can be pulled towards favoring certain decisions. Variant A favors the ignorance of tracklets, which yields a higher precision but lower recall of the tracker. On the contrary, variant B yields a lower precision but higher recall. Many alternative variants exist, most of them with a classification accuracy between 90% and 95%. For further experiments, variant A is selected.

In order to evaluate the quality of tracking, an experiment was conducted in real traffic using a second car, denoted as target car.

Figure 8. Moving Object Mapping: Appearance of a car accumulated over time (from left to right). Initial points are depicted with double size.
Figure 7. Tracking quality assessed by using a second car, denoted target car. [Plot: speed error / (m/s) and speed / (m/s) over time / s.] Shown are the speed profiles of both cars (true target speed, true sensor car speed) and the speed error as the difference between the estimated speed (by the tracker) and the true target speed (measured by DGPS/IMU) for different tracking strategies (with MOM, first appearance, replace appearance). Missing values indicate a temporary failure of the tracking method.
Table II
TRACKING STATISTICS FOR THE SPEED COMPARISON EXPERIMENT OF FIG. 7, GENERATED WITHOUT THE OUTER 10% QUANTILES.

                        nb. of          speed error in m/s
                        tracks    median     mean    std-deviation
with MOM                   2      -0.84     -0.96        ±1.16
first appearance           4      -0.82     -1.04        ±1.29
replace appearance        12     -10.62     -7.69       ±10.27
This target car starts in front of the sensor car, accelerates, and gets overtaken by the sensor car after 26 s. The speed profiles (measured by DGPS/IMU) as well as the speed errors are depicted in Fig. 7; some characteristic values are listed in Table II. Evident is the advantage of MOM over using only the appearance of one frame. The speed error is within an acceptable range and the track gets lost only once. Especially during the overtaking maneuver, the car is continuously tracked because the appearance smoothly adapts to the new viewpoints. This adaptation is well illustrated in Fig. 8. Constantly using the first appearance leads to three track losses and a slightly higher speed error. Replacing the appearance each frame leads to the worst results: as this technique causes the track to drift, speed errors are high and the track gets lost 11 times.

Additional experiments were conducted around intersections with many moving objects. A video and the data are available at www.mrt.kit.edu/z/publ/download/velodynetracking/.
Figure 10. Track lengths on the sequence illustrated in Fig. 9 (total 50 s). [Histogram: count over track-length / s.]
Various types were successfully tracked: pedestrians (with rolling case), cyclists, cars, vans, trucks, trams. Fig. 9 shows some tracking results at a big intersection in the city of Karlsruhe. Most moving objects are detected immediately, some slowly moving pedestrians with a short delay. Most tracks are stable, i.e. tracking is successful until the object moves out of view. This is shown by Fig. 10, which lists the distribution of track lengths across the sequence. Cars that move in parallel to the sensor car are tracked for the whole time of movement, i.e. 30 seconds. Most other objects are tracked for several seconds, even in areas where the objects are partly occluded.
IV. CONCLUSIONS

A novel approach was presented for self-localization and mapping combined with moving object tracking in dense range data. Tracking and mapping were applied to both object hypotheses and the static scene identically. Thus, 3D shape estimation of moving objects is inherent to the proposed method. A classification-based track management was introduced for track verification, merging, and splitting. The applicability of the method was shown for a vehicular platform in a crowded city environment. But these are not the limits of the approach: since object models were kept generic and tracking is performed in full 3D, the approach is applicable to other sensors and in other application areas, too.
APPENDIX

Let log_p(x) := log(1 + max{0, x}).
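In code, this is a clipped log1p:

```python
import math

def log_p(x):
    """log_p(x) := log(1 + max{0, x})."""
    return math.log1p(max(0.0, x))
```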
The 52-dimensional feature vector f_g is composed as follows: f[1] ∈ {0, 1} is 1 if the last measurement was successful and 0 otherwise. f[2] ∈ {0, 1, 2, 3} is

References

P. J. Besl and N. D. McKay, "A method for registration of 3-D shapes," IEEE Trans. Pattern Analysis and Machine Intelligence, 1992.
Y. Chen and G. Medioni, "Object modeling by registration of multiple range images," Proc. IEEE Int. Conf. Robotics and Automation (ICRA), 1991.
C.-C. Wang, C. Thorpe, S. Thrun, M. Hebert, and H. Durrant-Whyte, "Simultaneous localization, mapping and moving object tracking," Int. J. Robotics Research, 2007.