Visual Navigation of a Mobile Robot with Laser-based Collision Avoidance
Andrea Cherubini∗1,2 and François Chaumette†1
1 INRIA Rennes - Bretagne Atlantique, IRISA, Campus de Beaulieu 35042, Rennes, France
2 LIRMM - Université de Montpellier 2 CNRS, 161 Rue Ada, 34392 Montpellier, France
August 19, 2012
Abstract
In this paper, we propose and validate a framework for visual navigation with collision avoidance for a wheeled mobile robot. Visual navigation consists of following a path, represented as an ordered set of key images, which have been acquired by an on-board camera in a teaching phase. While following such a path, the robot is able to avoid obstacles which were not present during teaching, and which are sensed by an on-board range scanner. Our control scheme guarantees that obstacle avoidance and navigation are achieved simultaneously: in the presence of obstacles, the camera pan angle is actuated to maintain scene visibility while the robot circumnavigates the obstacle. The risk of collision and the consequent avoidance behaviour are determined using a tentacle-based approach. The framework can also deal with unavoidable obstacles, which make the robot decelerate and eventually stop. Simulated and real experiments show that, with our method, the vehicle can navigate along a visual path while avoiding collisions.
Keywords: Vision-based Navigation, Visual Servoing, Collision Avoidance, Integration of vision with
other sensors.
∗Andrea.Cherubini@lirmm.fr
†Francois.Chaumette@inria.fr
1 Introduction

A great amount of robotics research focuses on vehicle guidance, with the ultimate goal of automatically reproducing the tasks usually performed by human drivers [Buehler et al. 2008, Zhang et al. 2008, Nunes et al. 2009, Broggi et al. 2010]. In many recent works, information from visual sensors is used for localization [Guerrero et al. 2008, Scaramuzza and Siegwart 2008] or for navigation [Bonin-Font et al. 2008, López-Nicolás et al. 2010]. In the case of autonomous navigation, an important task is obstacle avoidance, which consists of either generating a collision-free trajectory to the goal [Minguez et al. 2008], or decelerating to prevent collision when bypassing is impossible [Wada et al. 2009]. Most obstacle avoidance techniques, particularly those that use motion planning [Latombe 1991], rely on the knowledge of a global and accurate map of the environment and obstacles.
Instead of utilizing such a global model of the environment, which would violate the perception-to-action paradigm [Sciavicco and Siciliano 2000], we propose a framework for obstacle avoidance with simultaneous execution of a visual servoing task [Chaumette and Hutchinson 2006]. Visual servoing is a well-known method that uses vision data directly in the control loop, and it has been applied to mobile robots in many works [Mariottini et al. 2007, Becerra et al. 2011, López-Nicolás and Sagüés 2011, Allibert et al. 2008]. For example, in [Mariottini et al. 2007] the epipolar geometry is exploited to drive a nonholonomic robot to a desired configuration. A similar approach is presented in [Becerra et al. 2011], where the singularities are dealt with more efficiently. The same authors achieve vision-based pose stabilization using a state observer in [López-Nicolás and Sagüés 2011]. Trajectory tracking is tackled in [Allibert et al. 2008] by integrating differential flatness and predictive control.
The visual task that we focus on is appearance-based navigation, which has been the target of our research in [Šegvić et al. 2008, Cherubini et al. 2009, Diosi et al. 2011]. In the framework that we have developed in the past¹, the path is a topological graph, represented by a database of ordered key images. In contrast with other similar approaches, such as [Booij et al. 2007], our graph does not contain forks. Furthermore, as opposed to [Royer et al. 2007, Goedemé et al. 2007, Zhang and Kleeman 2009, Fontanelli et al. 2009, Courbon et al. 2009], we do not use the robot pose for navigating along the path. Instead, our task is purely image-based (as in [Becerra et al. 2010]), and it is divided into a series of subtasks, each consisting of driving the robot towards the next key image in the database. More importantly, to our knowledge, appearance-based navigation frameworks have never been extended to take obstacles into account.
¹See: www.irisa.fr/lagadic/demo/demo-cycab-vis-navigation/vis-navigation.
Obstacle avoidance has been integrated in many model-based navigation schemes. In [Yan et al. 2003], a range finder and monocular vision enable navigation in an office environment. The desired trajectory is deformed to avoid sensed obstacles in [Lamiraux et al. 2004]. The authors of [Ohya et al. 2008] use a model-based vision system with retroactive position correction. Simultaneous obstacle avoidance and path following are presented in [Lapierre et al. 2007], where the geometry of the path (a curve on the ground) is perfectly known. In [Lee et al. 2010], obstacles are circumnavigated while following a path; the radius of the obstacles (assumed cylindrical) is known a priori. In practice, all these methods rely either on a 3D model of the environment (including, for example, walls and doors) or on the path geometry. In contrast, we propose a navigation scheme which requires neither an environment model nor an obstacle model.
One of the most common techniques for model-free obstacle avoidance is the potential field method, originally introduced in [Khatib 1985]. The gap between global path planning and real-time sensor-based control has been closed with the elastic band [Quinlan and Khatib 1993], a deformable collision-free path, whose initial shape is generated by a planner and then deformed in real time according to the sensed data. Similarly, in [Bonnafous et al. 2001, Von Hundelshausen et al. 2008], a set of trajectories (arcs of circles, or "tentacles") is evaluated for navigating. However, in [Bonnafous et al. 2001], a sophisticated probabilistic elevation map is used, and the selection of the optimal tentacle is based on its risk and interest, which both require accurate pose estimation. Similarly, in [Von Hundelshausen et al. 2008], the trajectory computation relies on GPS way points, and hence, once more, on the robot pose.
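To fix ideas, the tentacle principle can be sketched minimally as follows: each candidate curvature defines an arc of circle in the robot frame, and an arc is rejected if any scanned obstacle point lies within a clearance radius of it. This is an illustrative sketch only; all names, parameters, and the selection rule are our own assumptions, not the formulation of any of the cited works.

```python
import math

def arc_points(curvature: float, length: float = 3.0, n: int = 30):
    """Sample points along an arc of the given curvature, starting at the
    origin and heading along +x (the robot frame)."""
    pts = []
    for i in range(1, n + 1):
        s = length * i / n  # arc length travelled so far
        if abs(curvature) < 1e-9:
            pts.append((s, 0.0))  # straight tentacle
        else:
            r = 1.0 / curvature   # turning radius (signed)
            pts.append((r * math.sin(s / r), r * (1.0 - math.cos(s / r))))
    return pts

def tentacle_is_safe(curvature, obstacles, clearance=0.5):
    """A tentacle is safe if no obstacle point (from the range scanner)
    is closer than `clearance` to any sampled point of its arc."""
    return all(math.hypot(px - ox, py - oy) > clearance
               for (px, py) in arc_points(curvature)
               for (ox, oy) in obstacles)

def best_tentacle(desired, candidates, obstacles):
    """Among the safe candidate curvatures, pick the one closest to the
    desired curvature; return None if every tentacle collides."""
    safe = [c for c in candidates if tentacle_is_safe(c, obstacles)]
    return min(safe, key=lambda c: abs(c - desired)) if safe else None
```

For instance, with a single obstacle point 1.5 m straight ahead, the straight tentacle (zero curvature) is rejected, while sufficiently curved tentacles on either side remain safe.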
Here, we focus on the following problem: a wheeled vehicle, equipped with an actuated pinhole camera and a forward-looking range scanner, must follow a visual path represented by key images, without colliding with ground obstacles. The camera detects the features required for navigating, while the scanner senses the obstacles (in contrast with other works, such as [Kato et al. 2002], only one sensor is used to detect the obstacles). In this sense, our work is similar to [Folio and Cadenat 2006], where redundancy enables reactive obstacle avoidance, without requiring any 3D model. A robot is redundant when it has more DOFs than those required for the primary task; then, a secondary task can also be executed. In [Folio and Cadenat 2006], the two tasks are, respectively, visual servoing and obstacle avoidance. However, there are various differences from that work. First, we show that the redundancy approach is not necessary, since we design the two tasks so that they are independent. Second, we can guarantee asymptotic stability of the visual task at all times, in the presence of non-occluding obstacles. Moreover, our controller is compact, and the transition between safe and unsafe contexts is performed only for obstacle avoidance, whereas in [Folio and Cadenat 2006] three controllers are needed, and the transitions are more complex. This compactness leads to smoothness of the robot behaviour. Finally, [Folio and Cadenat 2006] considers a positioning task in indoor environments, whereas we aim at continuous navigation along long outdoor paths.
Let us summarize the other major contributions of our work. An important contribution is that our approach is purely appearance-based, hence simple and flexible: the only information required is the database of key images, and no model of the environment or obstacles is necessary. Hence, there is no need for sensor data fusion or planning, which can be computationally costly and require precise calibration of the camera/scanner pair. We guarantee that the robot will never collide in the case of static, detectable obstacles (in the worst case, it will simply stop). We also prove that our control law is always well-defined, and that it does not present any local minima. To our knowledge, this is the first time that obstacle avoidance and visual navigation, merged directly at the control level (without the need for sophisticated planning), are validated in real outdoor urban experiments.
The framework that we present here is inspired by the one designed and validated in our previous work [Cherubini and Chaumette 2011]. However, many modifications have been applied, in order to adapt that controller to the real world. First, for obstacle avoidance, we have replaced classical potential fields with a new tentacle-based technique inspired by [Bonnafous et al. 2001] and [Von Hundelshausen et al. 2008], which is well suited to appearance-based tasks, such as visual navigation. In contrast with those works, our approach does not require the robot pose, and it exploits the robot's geometric and kinematic characteristics (this aspect will be detailed later in the paper). A detailed comparison between the potential field and tentacle techniques is given in [Cherubini et al. 2012]. In that work, we showed that with tentacles, smoother control inputs are generated, higher velocities can be applied, and only dangerous obstacles are taken into account. In summary, the new approach is more robust and efficient than its predecessor. A second modification with respect to [Cherubini and Chaumette 2011] concerns the design of the translational velocity, which has been changed to improve visual tracking and avoid undesired deceleration in the presence of non-dangerous obstacles. Finally, the present article reports experiments which, for the first time in the field of visual navigation with obstacle avoidance, have been carried out in real-life, unpredictable urban environments.
The article is organized as follows. In Section 2, the characteristics of our problem (visual path following
Figure 1: General definitions. Left: top view of the robot (rectangle), equipped with an actuated camera (triangle); the robot and camera frames (respectively, F_R and F_C) are shown. Right: database of the key images, with current and next key images emphasized; the image frame F_I is also shown, as well as the visual features (circles) and their centroid (cross).
with simultaneous obstacle avoidance) are presented. The control law is presented in full detail in Section 3, and a short discussion is carried out in Section 4. Simulated and real experimental results are presented respectively in Sections 5 and 6, and summarized in the conclusion.
2 Problem Definition

2.1 General Definitions
The reader is referred to Fig. 1. We define the robot frame F_R (R, X, Y) (R is the robot center of rotation), image frame F_I (O, x, y) (O is the image center), and camera frame F_C (C, X_c, Y_c, Z_c) (C is the optical center). The robot control inputs are:

u = (v, ω, ϕ̇).

These are, respectively, the translational and angular velocities of the vehicle, and the camera pan angular velocity. We use the normalized perspective camera model:

x = X_c / Z_c ,   y = Y_c / Z_c .

We assume that the camera pan angle is bounded: |ϕ| ≤ π/2, and that C belongs to the camera pan rotation axis, and to the robot sagittal plane (i.e., the plane orthogonal to the ground through X). We also assume that the path can be followed with continuous v(t) > 0. This ensures safety, since only obstacles in front of the robot can be detected by our range scanner.
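The normalized perspective model above can be illustrated with a minimal sketch: a 3-D point expressed in the camera frame F_C maps to normalized image coordinates (x, y) by division by its depth Z_c. The function name and the numerical values are illustrative, not from the paper.

```python
def project(Xc: float, Yc: float, Zc: float) -> tuple[float, float]:
    """Normalized perspective camera model: x = Xc/Zc, y = Yc/Zc."""
    if Zc <= 0.0:
        # Points behind (or on) the image plane have no valid projection.
        raise ValueError("point must lie in front of the camera (Zc > 0)")
    return Xc / Zc, Yc / Zc

# Example: a point 2 m ahead of the camera, 0.5 m along Xc and 0.25 m along Yc
x, y = project(0.5, 0.25, 2.0)  # -> (0.25, 0.125)
```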