3D perception and planning for self-driving and cooperative automobiles

doi:10.1109/SSD.2012.6198130


Christoph Stiller and Julius Ziegler

Abstract—This presentation focuses on key technologies for automobiles that perceive an a priori unknown environment and automatically navigate through everyday traffic. Methods for 3D machine perception based on lidar and video sensors are outlined. Beyond classical metrology, the recognition and basic understanding of situations must be accomplished for automated trajectory planning in urban traffic. We discuss how to represent and acquire metric, symbolic and conceptual knowledge from video and lidar data of a vehicle. A hardware and software architecture tailored to this knowledge structure for an autonomous vehicle is proposed. Emphasis is placed on methods for situation recognition employing geometrical and topological reasoning and Markov Logic Networks. A quality measure for trajectories is imposed that considers safety, efficiency, and comfort. We adopt a flat input parameterization to plan trajectories that optimize the imposed quality measure. Results from the autonomous vehicle AnnieWAY, which recently won the Grand Cooperative Driving Challenge, are shown in real-world urban and platooning scenarios.

I. INTRODUCTION

Autonomous vehicles that perceive their environment, communicate with each other, understand the current traffic situation, and plan and conduct appropriate driving trajectories, either by themselves or cooperatively with others, are an intense field of international research. This contribution outlines the concept and architecture of the 'Cognitive Automobile AnnieWAY', which has successfully participated in international competitions such as the 2005 Grand Challenge and the 2007 Urban Challenge, and recently won the 2011 Grand Cooperative Driving Challenge [1], [2], [3], [4]. The vehicle constitutes an experimental basis for automated machine behaviour [5], [6]. Within a few years, large improvements in traffic safety are expected from such technologies [7].

A major goal of the scientific research is to advance knowledge acquisition and representation as a basis for automated decisions. As illustrated in Figure 1, driving, whether by a human or by a cognitive machine, involves knowledge representation in various forms. Metric knowledge, such as the lane geometry and the position or velocity of other traffic participants, is required to keep the vehicle in its lane at a safe distance from others. Symbolic knowledge, e.g. classifying lanes as either 'vehicle lane forward', 'vehicle lane rearward', 'bicycle lane', 'walkway', etc., is needed to conform with basic rules. Finally, conceptual knowledge, e.g. specifying a relationship between other traffic participants, makes it possible to anticipate the expected evolution of the scene and thus to drive with foresight.

C. Stiller and J. Ziegler are with Institut für Mess- und Regelungstechnik, KIT - Karlsruher Institut für Technologie, 76131 Karlsruhe, Germany, stiller, ziegler@kit.edu

Fig. 1: Metric (yellow), symbolic (orange), and conceptual (red) knowledge for cognitive automobiles. Labels in the figure: 25 m, 30 km/h; vehicle lane forward, bicycle lane; P1 follows P3.

II. ANNIEWAY SYSTEM OVERVIEW

A. AnnieWAY Hardware Architecture

Embodiment is widely considered a crucial element in cognitive systems research. To assess and validate theoretical findings, we have adopted the unified hardware and software framework of the Karlsruhe-Munich collaborative research center 'cognitive automobiles' [8], [9]. Based on the architecture depicted in Figure 2, some ten experimental cognitive automobiles have been set up to date [10], [6], [11]. To ensure real-time capabilities, vehicle control is performed on a dedicated dSpace AutoBox, which directly communicates with the actuators over the vehicle CAN. All other perception and planning modules as well as sensor data acquisition run on a single multicore multiprocessor computer system, which delivers sufficient computing power to host all processes while providing low latency and high bandwidth for inter-process communication.

Fig. 2: Hardware setup for the cooperative cognitive automobile AnnieWAY. Components labelled in the figure: GPS antennas, 3D lidar, 2D lidar, stereo vision, radar, IMU, main computer, control computer, power supply, e-throttle, e-brakes, e-steering, V2V communication.

Stiller, Christoph and Ziegler, Julius: 3D Perception and Planning for Self-Driving and Cooperative Automobiles. In: Proc. 9th

IEEE Int. Multi-Conf. Systems, Signals and Devices. Chemnitz, Germany, March 2012, pp. 1–7

B. AnnieWAY Software Architecture

The hardware is complemented by a real-time capable software architecture, as depicted in Figure 3. The framework has been proposed, implemented, and made publicly available by [8], [12]. Its central element is a real-time database for information exchange. The various driving and perception tasks run in separate processes that communicate via the database and share a centralized view of all available information at all times. The framework supports parallel operation of processes at variable update rates and ensures hard real-time performance where needed.
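The idea of modules exchanging information through a central, time-referenced store can be illustrated with a minimal sketch. This is not the framework of [8], [12]: the class `KnowledgeBase`, its methods, and the keys used below are hypothetical, and a lock-protected in-process dictionary stands in for the real-time database shared between separate processes.

```python
import threading
import time


class KnowledgeBase:
    """Toy time-referenced key-value store shared by several modules.

    Hypothetical illustration only; it offers no real-time guarantees.
    """

    def __init__(self):
        self._lock = threading.Lock()
        self._entries = {}  # key -> (timestamp, value)

    def write(self, key, value, stamp=None):
        """Store a value together with the time it refers to."""
        stamp = time.monotonic() if stamp is None else stamp
        with self._lock:
            self._entries[key] = (stamp, value)

    def read(self, key, max_age=None, now=None):
        """Return the latest value, or None if missing or older than max_age."""
        now = time.monotonic() if now is None else now
        with self._lock:
            entry = self._entries.get(key)
        if entry is None:
            return None
        stamp, value = entry
        if max_age is not None and now - stamp > max_age:
            return None
        return value


kb = KnowledgeBase()
# A perception module publishes its results with explicit time stamps.
kb.write("lane_geometry", {"width_m": 3.5}, stamp=10.0)
kb.write("dynamic_objects", [{"id": "P1", "speed_mps": 8.3}], stamp=10.05)
# A planning module reads them later and rejects outdated entries.
fresh = kb.read("dynamic_objects", max_age=0.1, now=10.1)
stale = kb.read("lane_geometry", max_age=0.05, now=10.1)
```

Time-stamping every entry lets consumers that run at different update rates decide for themselves whether a piece of information is still recent enough to act on.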

Fig. 3: Software setup for the cooperative cognitive automobile AnnieWAY. Blocks labelled in the figure: sensor interface (Velodyne lidar, 8 Mb/s; stereo vision, 18 Mb/s; GPS/INS; Sick 1D lidar), perception (static 3D map, dynamic objects, lane geometry), communication (cooperative perception, cooperative behaviour), decision and planning (situation assessment, behaviour generation, on-road and off-road trajectory planning), control (throttle/brakes, steering, gear shift, turn indicators, etc.), global services (system watchdog), and the real-time knowledge basis (transparent access, read/write < 10 µs, time referenced, low delay, hard real time).

III. SITUATION RECOGNITION

A. Simple geometric and topological reasoning

In this section, we assume that a representation of the road network is available. This representation has to contain the geometry of single lanes as well as a topology, i.e. their interconnectedness within the network. Formally, this representation is a special geometric graph, i.e. a graph whose edges describe a distinctive road geometry, expressed by a planar curve. Such a representation was available during the Urban Challenge in the form of a so-called road network definition file (RNDF). As has been shown in [13], such a representation can also be derived from vision cues using formal logic reasoning. Figure 4 shows an example of such a graph. The depicted situation is that of a one-way road forming a T-junction with a road that allows two-way traffic.

Fig. 4: Geometric graph for road representation and situation recognition.

Other road users are embedded into the graph using purely geometric reasoning. They are assigned to the edge in the graph that best explains their position and heading. A simple, orientation-aware point-to-curve distance function can be used for this task. Figure 4 depicts three vehicles and their association to edges in the graph.

The graph provides a rich description that readily allows the roles of and relations among other road users to be determined. From Figure 4, for example, the relations “A follows B” and “B must yield to C” can be derived ad hoc.
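The assignment of road users to graph edges can be sketched as follows. This is an illustration under stated assumptions, not the distance function used on AnnieWAY: edges are approximated by polylines, the cost is the Euclidean distance plus a heading penalty, and the weight `heading_weight`, the edge names, and the example poses are all hypothetical.

```python
import math


def pose_to_edge_cost(px, py, heading, polyline, heading_weight=5.0):
    """Orientation-aware distance from a pose to a polyline edge.

    heading_weight (metres per radian) is an arbitrary illustrative choice.
    """
    best = math.inf
    for (x0, y0), (x1, y1) in zip(polyline, polyline[1:]):
        dx, dy = x1 - x0, y1 - y0
        seg_len2 = dx * dx + dy * dy
        if seg_len2 == 0.0:
            continue
        # Project the point onto the segment and clamp to its end points.
        u = max(0.0, min(1.0, ((px - x0) * dx + (py - y0) * dy) / seg_len2))
        dist = math.hypot(px - (x0 + u * dx), py - (y0 + u * dy))
        # Smallest absolute angle between vehicle heading and segment direction.
        diff = heading - math.atan2(dy, dx)
        diff = abs(math.atan2(math.sin(diff), math.cos(diff)))
        best = min(best, dist + heading_weight * diff)
    return best


def assign_edge(pose, edges):
    """Return the name of the edge that best explains position and heading."""
    px, py, heading = pose
    return min(edges, key=lambda name: pose_to_edge_cost(px, py, heading, edges[name]))


# A two-way main road with a one-way side road joining from the south.
edges = {
    "main_east": [(0.0, 0.0), (50.0, 0.0)],
    "main_west": [(50.0, 3.5), (0.0, 3.5)],
    "side_north": [(25.0, -30.0), (25.0, 0.0)],
}
vehicle_a = (10.0, 0.4, 0.05)            # x, y, heading in rad
vehicle_c = (30.0, 3.0, math.pi - 0.02)  # driving west
assigned_a = assign_edge(vehicle_a, edges)
assigned_c = assign_edge(vehicle_c, edges)
```

Including the heading in the cost keeps a vehicle that is close to the centre line of a two-way road from being attached to the lane of the opposite driving direction.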

B. Markov Logic Networks

Markov Logic Networks (MLNs) refer to a class of probabilistic logical models combining first-order predicate logic with Markov random fields [14]. An MLN is defined through a set of formulas $\{F_1, \ldots, F_n\}$ in first-order predicate logic on a random field with random variables $X = (X_1, \ldots, X_q)$ and a set of scalar weights $\{w_1, \ldots, w_n\}$ such that one weight is attributed to each formula.

The joint distribution of the random field is then defined by a Gibbs distribution

$$P(X = x) = \frac{1}{Z} \exp\left( \sum_{k=1}^{n} w_k F_k(x) \right), \qquad (1)$$

where $x = (x_1, \ldots, x_q)$ denotes a realization of the random field $X$, and $Z$ is a normalizing constant. The logical formulas $F_k$ are instantiated by the realizations $x$, rendering each formula either true or false. Typically, each formula depends only on a small subset of the variables in $x$, which forms a clique of the Gibbs distribution.

Table I shows a simple example of an MLN with two generic formulas. The first formula is applied to each vehicle $O_i$, while the second formula is applied to each pair of vehicle and lane $(O_i, R_j)$ detected in the scene. For a specific scene with, e.g., two vehicles {O1, O2} and one lane {R1}, one is left with the Markov random field shown in the graph of Fig. 5. This simple example supports the classification of cars through context information [15]. The formulas of an MLN can thus be considered as probabilistic rules, with the weights quantifying our degree of belief in these rules. The Gibbs distribution (1) models world configurations as more probable the more they conform with rules that possess large weights.

  i | w_i | F_i
  1 | 1.4 | ∀o: hasDirection(o, Same) ⇒ car(o)
  2 | 0.6 | ∀o ∀r: on(o, r) ∧ road(r) ∧ hasSpeed(o, Low) ⇒ car(o)

TABLE I: Formulas and weights specifying an MLN
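To make the effect of the weights concrete, the following sketch grounds the two formulas of Table I for a scene with two vehicles and one lane and evaluates the Gibbs distribution (1) by brute-force enumeration. It assumes the usual MLN semantics in which every satisfied ground formula contributes its weight to the exponent; the evidence values and the function names are hypothetical and chosen only for illustration.

```python
import itertools
import math

WEIGHTS = (1.4, 0.6)  # w_1 and w_2 from Table I


def implies(a, b):
    return (not a) or b


def world_score(world, objects, lanes):
    """Sum of the weights of all satisfied ground formulas in one world."""
    score = 0.0
    for o in objects:
        # Grounding of formula 1 for object o.
        score += WEIGHTS[0] * implies(world["hasDirection", o], world["car", o])
        for r in lanes:
            # Grounding of formula 2 for the pair (o, r).
            antecedent = world["on", o, r] and world["road", r] and world["hasSpeed", o]
            score += WEIGHTS[1] * implies(antecedent, world["car", o])
    return score


def car_posterior(evidence, objects, lanes):
    """P(car(o) | evidence) for every object, enumerating all car assignments."""
    totals = {o: 0.0 for o in objects}
    z = 0.0  # normalizing constant over the unobserved atoms
    for values in itertools.product([False, True], repeat=len(objects)):
        world = dict(evidence)
        world.update({("car", o): v for o, v in zip(objects, values)})
        p = math.exp(world_score(world, objects, lanes))
        z += p
        for o, v in zip(objects, values):
            if v:
                totals[o] += p
    return {o: totals[o] / z for o in objects}


objects, lanes = ["O1", "O2"], ["R1"]
# Assumed evidence: O1 moves in the same direction at low speed, O2 does neither.
evidence = {
    ("hasDirection", "O1"): True, ("hasDirection", "O2"): False,
    ("on", "O1", "R1"): True, ("on", "O2", "R1"): True,
    ("road", "R1"): True,
    ("hasSpeed", "O1"): True, ("hasSpeed", "O2"): False,
}
posterior = car_posterior(evidence, objects, lanes)
```

With this evidence both rules fire for O1, so its probability of being a car rises to exp(2)/(1 + exp(2)), roughly 0.88, whereas neither rule says anything about O2 and its probability stays at 0.5.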

Fig. 6: AnnieWAY’s hierarchical state automaton.

Fig. 5: Graphical representation of the Markov Logic Network defined through the generic formulas and weights from Table I and a scene with two vehicles {O1, O2} and one lane {R1}. Nodes labelled in the graph: hasDirection(O1, Same), hasDirection(O2, Same), on(O1, R1), on(O2, R1), car(O1), car(O2), road(R1), hasSpeed(O1, Low).

IV. BEHAVIOUR GENERATION

Building on the information provided by the situation recognition module, the behavioural layer decides which actions need to be carried out in the current situation. Actions are communicated downstream to the trajectory generation stage in the form of centre and boundary lines for the driving corridor, or as hard constraints that are imposed on the generated trajectories (such as forcing a stop at a stop line, or obeying a speed limit). Some simple tasks, like flashing an indicator, are passed on to the vehicle hardware directly. All these actions are generated using a state automaton that is organised in a hierarchical fashion. The possibility of describing state automata hierarchically was first described by David Harel in [16] (Harel state charts). Figure 6 shows the state automaton that was used on board AnnieWAY during the Urban Challenge. Descriptors of states and events are prefixed by St... and Ev..., respectively. Substates are, for the most part, displayed in short form, e.g. the state StDrive contains the substates StOnLane, StFollow, StChangeLane, etc. The principle of hierarchical organisation is illustrated by an exemplary "zoom" into the state StIntersection, which shows the detailed structure of the relations between StIntersection's substates. For a detailed treatment of state charts and their graphical notation cf. [16]. For a more detailed discussion of the specific use on board AnnieWAY, we refer interested readers to [17].
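The benefit of the hierarchical organisation can be illustrated with a small sketch: an event that is not handled by the active substate is passed up to its parent state, so a transition defined once on StDrive applies to all of its substates. The state names StDrive, StOnLane, StFollow and StIntersection are taken from the text; the substate StStop, the event names, and the dispatch logic are hypothetical simplifications and do not reproduce the automaton of Figure 6.

```python
class State:
    """A node in a hierarchical state automaton (heavily simplified)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.transitions = {}  # event name -> target State

    def on(self, event, target):
        self.transitions[event] = target

    def path(self):
        """Names from the root state down to this state."""
        return ([] if self.parent is None else self.parent.path()) + [self.name]


def dispatch(state, event):
    """Look for a handler in the active state, then in its ancestors."""
    node = state
    while node is not None:
        if event in node.transitions:
            return node.transitions[event]
        node = node.parent
    return state  # event not handled anywhere: stay in the current state


st_drive = State("StDrive")
st_on_lane = State("StOnLane", parent=st_drive)
st_follow = State("StFollow", parent=st_drive)
st_intersection = State("StIntersection")
st_stop = State("StStop", parent=st_intersection)  # hypothetical substate

st_on_lane.on("EvLeadVehicle", st_follow)          # hypothetical event
st_drive.on("EvIntersectionAhead", st_stop)        # inherited by all substates

state = st_on_lane
state = dispatch(state, "EvLeadVehicle")           # StOnLane -> StFollow
state = dispatch(state, "EvIntersectionAhead")     # handled by parent StDrive
```

Without the hierarchy, the transition on EvIntersectionAhead would have to be repeated for every substate of StDrive.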

V. TRAJECTORY PLANNING

After the situation has been recognized and an appropriate behaviour has been identified, a specific trajectory is planned. The planning concept described in the sequel belongs to the class of state lattice planners, which has been adapted for on-road driving in the presence of moving obstacles. A more complete description of the methodology can be found in [18].

A. Spatiotemporal state lattices

Static state lattices result from appropriate sampling of the continuous configuration space and are known as efficient representations for path planning in static environments [19], [20]. Spatiotemporal state lattices augment the configuration space of a standard state lattice with time, forming a single manifold, which is then discretized. To illustrate this concept, we first consider the simple case of a one-dimensional spatial configuration space.

Consider a vehicle travelling with varying velocity in $\mathbb{R}$. Its state is described by its distance from the origin, $l$, and time $t$. In the spirit of the state lattice approach, we constrain the state space to an equidistantly sampled subset of $\mathbb{R}^2$ with sampling intervals $\Delta l$ and $\Delta t$.

Figure 7 depicts a spatiotemporal state lattice over the described workspace. The figure sketches state transitions for piecewise constant, positive velocities and for $C^2$-continuous paths achieved by quintic polynomials, respectively. Quintic polynomials are attractive for planning dynamic driving manoeuvres because they minimize the integral of squared jerk [21] and allow for fast computation of their coefficients for given boundary conditions. Closed-form expressions exist for the integral of squared jerk and for the maximum speed and acceleration along the trajectory [22]. Quintic splines have been used for automotive motion planning before [23], albeit only to describe kinematic paths without time parametrization.

Fig. 7: A spatiotemporal state lattice over a one-dimensional workspace. The lower left shaded area depicts a control set for paths in $C^0$, while the upper right one depicts one designed for higher-order continuity, consisting of quintic polynomials.
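As a minimal sketch of how a lattice edge could be generated, the following code computes the coefficients of a quintic polynomial from position, velocity and acceleration at both ends of a time interval, and evaluates the integral of squared jerk in closed form. The function names are hypothetical, and the formulas are the standard closed-form solution of the six boundary conditions rather than code from the paper.

```python
import math


def quintic_coefficients(p0, v0, a0, p1, v1, a1, T):
    """Coefficients c0..c5 of p(t) = sum(c_k * t**k) that match position,
    velocity and acceleration at t = 0 and at t = T."""
    c0, c1, c2 = p0, v0, a0 / 2.0
    c3 = (20.0 * (p1 - p0) - (8.0 * v1 + 12.0 * v0) * T
          - (3.0 * a0 - a1) * T**2) / (2.0 * T**3)
    c4 = (30.0 * (p0 - p1) + (14.0 * v1 + 16.0 * v0) * T
          + (3.0 * a0 - 2.0 * a1) * T**2) / (2.0 * T**4)
    c5 = (12.0 * (p1 - p0) - 6.0 * (v1 + v0) * T
          - (a0 - a1) * T**2) / (2.0 * T**5)
    return (c0, c1, c2, c3, c4, c5)


def evaluate(c, t):
    """Position, velocity and acceleration of the polynomial at time t."""
    p = sum(ck * t**k for k, ck in enumerate(c))
    v = sum(k * ck * t**(k - 1) for k, ck in enumerate(c) if k >= 1)
    a = sum(k * (k - 1) * ck * t**(k - 2) for k, ck in enumerate(c) if k >= 2)
    return p, v, a


def squared_jerk_integral(c, T):
    """Closed-form integral of jerk(t)**2 over [0, T]."""
    _, _, _, c3, c4, c5 = c
    return (36.0 * c3**2 * T + 144.0 * c3 * c4 * T**2
            + (192.0 * c4**2 + 240.0 * c3 * c5) * T**3
            + 720.0 * c4 * c5 * T**4 + 720.0 * c5**2 * T**5)


# Rest-to-rest motion over one unit of distance in one unit of time.
coeffs = quintic_coefficients(0.0, 0.0, 0.0, 1.0, 0.0, 0.0, T=1.0)
end_state = evaluate(coeffs, 1.0)
cost = squared_jerk_integral(coeffs, 1.0)

# An edge between two moving lattice states: 12 m in 2 s, from 5 m/s to 7 m/s.
edge = quintic_coefficients(0.0, 5.0, 0.0, 12.0, 7.0, 0.0, T=2.0)
edge_end = evaluate(edge, 2.0)
```

Because the coefficients and the cost are available in closed form, no numerical optimization is needed per edge, which is what makes precomputing a large control set practical.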

B. Motion planning using spatiotemporal state lattices

To account for moving obstacles, their future positions are predicted. Obstacles can then readily be transferred to the space-time manifold, as shown in Figure 8. The shaded area is occupied by a small object that moves with velocity $\frac{1}{2}\frac{\Delta l}{\Delta t}$. A trajectory is found within the spatiotemporal lattice that does not collide with the obstacle.

To deal with obstacles efficiently, we create a mapping between a discrete space-time obstacle map and the set of all edges in the graph. This can be done in the offline graph generation phase. Then, edges blocked by obstacles can be invalidated quickly by a single run over the obstacle map. This method scales well with the number of obstacles, maintaining an almost constant overall processing time.
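The offline mapping from obstacle-map cells to edges and the online invalidation step can be sketched as follows. This is an illustration under simplifying assumptions: edges are straight lines between lattice vertices instead of quintic polynomials, cells are found by coarse sampling and rounding, and all names are hypothetical.

```python
from collections import defaultdict


def build_cell_to_edges(edges, samples_per_edge=3):
    """Offline step: map each space-time cell (l_index, t_index) to the
    identifiers of all edges that pass through it."""
    cell_to_edges = defaultdict(set)
    for edge_id, ((l0, t0), (l1, t1)) in enumerate(edges):
        for k in range(samples_per_edge + 1):
            u = k / samples_per_edge
            cell = (round(l0 + u * (l1 - l0)), round(t0 + u * (t1 - t0)))
            cell_to_edges[cell].add(edge_id)
    return cell_to_edges


def blocked_edges(cell_to_edges, occupied_cells):
    """Online step: a single pass over the occupied cells of the obstacle map."""
    blocked = set()
    for cell in occupied_cells:
        blocked |= cell_to_edges.get(cell, set())
    return blocked


# Three edges leaving the vertex (l=0, t=0): slow, fast, and standing still.
edges = [((0, 0), (1, 1)), ((0, 0), (3, 1)), ((0, 0), (0, 1))]
cell_to_edges = build_cell_to_edges(edges)

# Predicted obstacle occupancy in the space-time obstacle map.
blocked = blocked_edges(cell_to_edges, occupied_cells={(1, 1)})
```

The online cost depends on the number of occupied cells rather than on the number of obstacle-edge pairs, which is consistent with the almost constant processing time described above.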

Edge costs consider the integral of the squared jerk of their

geometric representations, as opposed to simply considering

arc length. This improves safety, controllability and driving

comfort.

Graph-based motion planning algorithms usually employ shortest-path algorithms that maintain the visited vertices in a partially ordered data structure. Algorithms belonging to this class include A* search as well as Stentz' D* [24] and focussed D* [25]. Spatiotemporal lattices, however, are directed acyclic graphs (DAGs), because every edge points forward in time. Hence, sorting vertices by time yields a topological ordering in advance, and vertices can simply be processed in this order. The resulting algorithm is linear in the number of vertices n, as opposed to Dijkstra's general scheme, which is in O(n log n).
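The relaxation in time order can be sketched as below. The graph, the edge costs, and the function name are hypothetical; the sort is included only to keep the example self-contained, since in a lattice the vertices can be stored in time order from the start.

```python
import math


def shortest_path_dag(vertices, edges, source, blocked=frozenset()):
    """Single-source shortest paths on a spatiotemporal lattice.

    vertices: iterable of (l_index, t_index) tuples.
    edges: dict mapping (u, v) -> cost, where v is strictly later than u.
    blocked: set of (u, v) edges invalidated by obstacles.
    """
    order = sorted(vertices, key=lambda v: v[1])  # time order is topological
    outgoing = {v: [] for v in order}
    for (u, v), cost in edges.items():
        if (u, v) not in blocked:
            outgoing[u].append((v, cost))
    dist = {v: math.inf for v in order}
    pred = {}
    dist[source] = 0.0
    for u in order:  # every vertex is relaxed exactly once, no priority queue
        if dist[u] == math.inf:
            continue
        for v, cost in outgoing[u]:
            if dist[u] + cost < dist[v]:
                dist[v] = dist[u] + cost
                pred[v] = u
    return dist, pred


vertices = [(l, t) for t in range(3) for l in range(3)]
edges = {
    ((0, 0), (1, 1)): 1.0,
    ((0, 0), (0, 1)): 0.5,
    ((1, 1), (2, 2)): 1.0,
    ((0, 1), (2, 2)): 4.0,
    ((0, 1), (1, 2)): 1.0,
}
dist, pred = shortest_path_dag(vertices, edges, source=(0, 0))
dist_blocked, _ = shortest_path_dag(
    vertices, edges, source=(0, 0), blocked={((0, 0), (1, 1))}
)
```

When the cheapest edge is blocked by an obstacle, the planner falls back to the more expensive route through (0, 1) without any change to the algorithm.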

Fig. 8: Planning with a moving obstacle in the space-time

manifold. The shaded area is covered by a moving object. A

trajectory is shown that is composed of elements of the control

set. Shortest paths can be found by relaxing vertices from left

to right.

Fig. 9: Reparametrisation of the Cartesian plane. The dotted

line indicates the original run of the road, (X, Y ). The grey

structure illustrates the discrete reparametrization in l and r.

C. Lane-adapted reparametrization

The principle of spatiotemporal state lattices developed in the preceding sections generalizes naturally to two dimensions. Doing this naïvely, however, produces dimensionality problems due to the required dense sampling of the state space. Note that, in comparison with [20], the dimensionality of the sampling space for the state lattice rises from 3 (2D position and orientation; in [19], curvature is considered additionally) to 7 (2D position, 2D velocity, 2D acceleration and time), due to moving from a kinematic to a higher-order dynamic model and the incorporation of time. With rising dimensionality, coverage of the configuration space requires an exponentially growing number of samples. Hence, an efficient way of sampling the configuration space is needed that is adapted to the special case of navigating on a road whose run is known a priori, e.g. from digital map data.

Given a continuous, piecewise twice differentiable, arc length $s$ parametrized representation $(X(s), Y(s))$ of the course of the road, we define the following reparametrization $(l, r)$ of the 2D workspace, where $(x, y)$ denote Cartesian coordinates, $l(t)$ is the distance travelled along the road, and $r(t)$ is the lateral offset relative to the road centre:

$$x(t) = X(l) - r\,Y'(l) \qquad (2)$$
$$y(t) = Y(l) + r\,X'(l). \qquad (3)$$

Fig. 10: State transitions on the transformed grid. The succes-

sors of one vertex are shown in black.

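This is a base change towards a local orthogonal coordinate system that has its abscissa aligned with the road for any $l$. It defines a two-dimensional manifold as depicted in Figure 9. As described earlier, differential boundary conditions of up to second order are required for edge generation. We therefore need to transform them through equations (2) and (3): Given $\dot{l}$, $\dot{r}$, $\ddot{l}$ and $\ddot{r}$, by application of the chain rule we obtain

$$\dot{x} = \dot{l}\,X'(l) - \dot{r}\,Y'(l) - r\,\dot{l}\,Y''(l) \qquad (4)$$
$$\dot{y} = \dot{l}\,Y'(l) + \dot{r}\,X'(l) + r\,\dot{l}\,X''(l) \qquad (5)$$

and

$$\ddot{x} = \ddot{l}\,X' + \dot{l}^2 X'' - \ddot{r}\,Y' - (2\dot{r}\dot{l} + r\ddot{l})\,Y'' - r\,\dot{l}^2\,Y''' \qquad (6)$$
$$\ddot{y} = \ddot{l}\,Y' + \dot{l}^2 Y'' + \ddot{r}\,X' + (2\dot{r}\dot{l} + r\ddot{l})\,X'' + r\,\dot{l}^2\,X'''. \qquad (7)$$

Here primes denote derivatives of $X$ and $Y$ with respect to the arc length, all evaluated at $l$. Equations (6) and (7) follow from differentiating (4) and (5) once more with respect to time; they involve third derivatives of the road course, which must therefore exist wherever the transform is evaluated.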

We now restrict the parameters $l$, $r$, $\dot{l}$, $\dot{r}$, $\ddot{l}$ and $\ddot{r}$ to a discrete, grid-like set (the vertices of the search graph) and transform them through equations (2)-(7). The resulting $x$, $y$, $\dot{x}$, $\dot{y}$, $\ddot{x}$ and $\ddot{y}$, together with discrete values for time $t$, are used as boundary values to calculate quintic polynomial trajectories as described in Section V-A. To ensure dynamic and kinematic feasibility, an edge is only added to the graph if velocity, acceleration and jerk stay within bounds defined in advance. To further reduce the number of vertices, some ad hoc reductions can be applied to the sets of discrete parameters: $r$ is constrained to an interval so as to restrict all vertices of the lattice to be within the bounds of the road. We set $\dot{r} = 0$ and constrain $\dot{l}$ to be positive, since we wish the vehicle to make progress along the road, while crosswise motion is to be avoided. The second derivatives $\ddot{l}$ and $\ddot{r}$ of the untransformed coordinates are set to zero at the grid points. Figure 10 gives an impression of the graph we used for our experiments by displaying the successor edges of a single vertex. The outdegree of vertices is approximately 200.
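The lane-adapted reparametrization can be checked numerically on a road for which everything is known analytically. The sketch below assumes a circular road of radius `R` (a hypothetical test case, not from the paper), maps a lattice vertex $(l, r, \dot{l}, \dot{r})$ to Cartesian position via equations (2) and (3), and obtains the velocity by differentiating them with respect to time using the chain rule.

```python
import math

R = 50.0  # radius of the assumed circular road in metres


def road(l):
    """Arc-length parametrized circle through the origin with centre (0, R).

    Returns X, Y and their first and second derivatives at arc length l.
    """
    th = l / R
    return (R * math.sin(th), R * (1.0 - math.cos(th)),  # X, Y
            math.cos(th), math.sin(th),                  # X', Y'
            -math.sin(th) / R, math.cos(th) / R)         # X'', Y''


def to_cartesian(l, r, l_dot, r_dot):
    """Map (l, r) and its time derivatives to Cartesian position and velocity."""
    X, Y, Xp, Yp, Xpp, Ypp = road(l)
    x = X - r * Yp                                       # equation (2)
    y = Y + r * Xp                                       # equation (3)
    # Time derivatives of (2) and (3) by the chain rule.
    x_dot = l_dot * Xp - r_dot * Yp - r * l_dot * Ypp
    y_dot = l_dot * Yp + r_dot * Xp + r * l_dot * Xpp
    return x, y, x_dot, y_dot


# A lattice vertex 20 m along the road, 2 m towards the centre of the curve,
# with 10 m/s progress along the road and no lateral motion.
x, y, x_dot, y_dot = to_cartesian(l=20.0, r=2.0, l_dot=10.0, r_dot=0.0)
radius_driven = math.hypot(x, y - R)
speed = math.hypot(x_dot, y_dot)
```

On the inside of the curve the vehicle drives on a circle of radius R - r, so its Cartesian speed is smaller than the rate of progress along the road centre by the factor (1 - r/R).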

VI. EXPERIMENTAL RESULTS

Figure 11 shows an exemplary result of the proposed trajectory planning method. The scenario selected for this example is that of merging into moving traffic at a T-junction. As can be seen, the proposed method yields a trajectory that is smooth in the sense of minimum mean squared jerk and safe in the sense of entering the gap at a safe distance from all other vehicles and at the velocity of the gap itself. The proposed method inherently selects the optimum gap to cut into. The planner is also able to find trajectories under more complex traffic conditions.

Figure 12 shows AnnieWAY performing in the Urban Challenge in 2007, where AnnieWAY was one of only two non-US vehicles that successfully entered the finals and was among the few vehicles that drove collision-free through a mock-up suburban environment. The graph-based representation of the road network can be seen. In the top right of each frame, the sequence of states that the hierarchical state automaton traverses is displayed.

The Grand Cooperative Driving Challenge 2011 (GCDC) was the first international competition to implement cooperative driving in a realistic, heterogeneous scenario. It was organized by the Netherlands Organisation for Applied Scientific Research (TNO) on a highway near Helmond. Vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I) communication in the ITS band at 5.8 GHz following the IEEE 802.11p standard formed the basis for cooperation between vehicles [26]. The information broadcast by each vehicle included vehicle length and width, latitude and longitude of position, heading and yaw rate, and velocity and acceleration. The main challenge for participating teams was to develop longitudinal control strategies that allowed autonomous and cooperative vehicle platooning manoeuvres without knowing the algorithms and technical equipment of the other vehicles in the platoon. Control strategies had to cope with standard manoeuvres, such as platoon merging, as well as with unexpected behaviour of other vehicles, such as varying data quality or sudden failure of communication. Vehicles were repeatedly and randomly teamed up for heats in two platoons. Both platoons were led by the same vehicle, which imposed challenging braking and acceleration manoeuvres on the competitors. Figure 13 shows one of these heats of the GCDC, illustrating the large variety of vehicles and technical solutions in the competition. Our vehicle AnnieWAY is the silver vehicle directly in front of the truck. AnnieWAY won the Grand Cooperative Driving Challenge 2011, closely followed by the team from Halmstad. In contrast to previous challenges, the GCDC not only assessed the individual driving of each vehicle but also considered its impact on other traffic participants, i.e. the criteria included platoon velocity, length, and damping of acceleration/deceleration cycles [27]. For more details we refer interested readers to the IEEE Trans. ITS special issue on the GCDC, including team AnnieWAY's contribution [4].

VII. CONCLUSIONS

AnnieWAY is an experimental autonomous vehicle that

perceives a priori unknown environments, recognizes situa-

tions, plans appropriate trajectories, and controls its actuators

to follow these. It acquires metric, symbolic and conceptual


References

Statecharts: A visual formalism for complex systems

Markov logic networks

IEEE 802.11p: Towards an International Standard for Wireless Access in Vehicular Environments

Optimal and efficient path planning for partially-known environments

The focussed D* algorithm for real-time replanning
