Open Access · Proceedings Article DOI

Combining central and peripheral vision for reactive robot navigation

Antonis A. Argyros, Fredrik Bergholm
Vol. 2, pp. 2646-2651
TLDR
A new method for vision-based, reactive robot navigation that enables a robot to move in the middle of the free space by exploiting both central and peripheral vision and is computationally efficient.
Abstract
In this paper we present a new method for vision-based, reactive robot navigation that enables a robot to move in the middle of the free space by exploiting both central and peripheral vision. The robot employs a forward-looking camera for central vision and two side-looking cameras for sensing the periphery of its visual field. The developed method combines the information acquired by this trinocular vision system and produces low-level motor commands that keep the robot in the middle of the free space. The approach follows the purposive vision paradigm in the sense that vision is not studied in isolation but in the context of the behaviors in which the system is engaged, as well as the environment and the robot's motor capabilities. It is demonstrated that by taking into account these issues, vision processing can be drastically simplified while still giving rise to quite complex behaviors. The proposed method does not make strict assumptions about the environment, requires very low-level information to be extracted from the images, produces a robust robot behavior and is computationally efficient. Results obtained both by simulations and from a prototype on-line implementation demonstrate the effectiveness of the method.



Combining Central and Peripheral Vision for Reactive Robot Navigation*

Antonis A. Argyros
Computer Vision and Robotics Lab., ICS-FORTH, Heraklion, Crete, Greece
argyros@ics.forth.gr

Fredrik Bergholm
Computational Vision and Active Perception Lab., NADA/KTH, Stockholm, Sweden
fredrikb@nada.kth.se

*This research has been carried out during the first author's 1996-97 appointment to CVAP/NADA/KTH and was funded under the VIRGO research network (EC Contract No. ERBFMRX-CT96-0049) of the TMR Programme.
Abstract

In this paper, we present a new method for vision-based, reactive robot navigation that enables a robot to move in the middle of the free space by exploiting both central and peripheral vision. The robot employs a forward-looking camera for central vision and two side-looking cameras for sensing the periphery of its visual field. The developed method combines the information acquired by this trinocular vision system and produces low-level motor commands that keep the robot in the middle of the free space. The approach follows the purposive vision paradigm in the sense that vision is not studied in isolation but in the context of the behaviors in which the system is engaged, as well as the environment and the robot's motor capabilities. It is demonstrated that by taking into account these issues, vision processing can be drastically simplified while still giving rise to quite complex behaviors. The proposed method does not make strict assumptions about the environment, requires very low-level information to be extracted from the images, produces a robust robot behavior and is computationally efficient. Results obtained both by simulations and from a prototype on-line implementation demonstrate the effectiveness of the method.
1 Introduction
The term navigation refers to the capability of a system to move autonomously in its environment by using its own sensors. The more specific term visual navigation is used for the process of motion control based on the analysis of data gathered by visual sensors. The topic of visual navigation is of particular importance mainly because of the rich perceptual input provided by vision. Moreover, navigation that is based on other types of sensors, in contrast to vision, often requires modification of the environment (e.g., insertion of emitters), which imposes constraints on the application of such methods in unknown environments.
The problem of visual navigation has traditionally been treated without taking very much into account the environment of the robot, its body and the characteristics of the desired behavior. Typically, monocular or stereoscopic visual systems are assumed and the effort is focused on constructing a general representation of the environment that may thereafter support the solution of any vision-related problem. During the last decade, a new vision paradigm has attracted the interest of the computational vision research community. According to this paradigm, called active and purposive vision [1], vision is more readily understood in the context of the behaviors in which the system is engaged. Consequently, vision attempts to explore the aspects of the world that are important for the system at a given point in time, instead of aiming at a general representation of the environment which, besides being extremely difficult to extract, is probably not needed either. The interest in purposive vision is largely motivated by the fact that all biological vision systems are highly active and purposive [2]. The purposiveness of visual processes enables the formulation and the solution of simpler problems that have a relatively small number of possible solutions and can be treated in a qualitative manner [3].
In this paper, we describe a new method for visual robot navigation based on the principles of purposive vision. By employing a forward-looking camera for central vision and two side-looking cameras for sensing the periphery of the visual field, reactive robot navigation has been achieved. The developed method combines the information acquired by this trinocular vision system to produce low-level motor commands that keep the robot in the middle of the free space.

Figure 1. Top-down view of the robot geometry. The placement of the two peripheral cameras is also shown.
The aim of this work is to investigate how the design of a visual system can assist robots with specific bodies and motor capabilities in exhibiting particular behaviors. It is demonstrated that considering the behavior and the motor capabilities of a robot when designing its visual system leads to theoretically simpler and computationally more efficient solutions.
The rest of the paper is organized as follows. Section 2 describes the requirements that the target behavior poses to the design of the robot's visual system. Section 3 presents issues related to the motion information that can be computed by each camera, as well as how this information is processed and used to drive the robot. Section 4 presents results from simulations of the method, as well as implementation issues and results obtained by an on-line implementation on a real robotic platform. Finally, in section 5, the main conclusions of this work are summarized and future research plans are described.
2 The behavior, the environment and the body
The study of the behaviors that should be exhibited by an observer, its environment, as well as the specifics of the observer, provides valuable hints on how the sensors should be placed in order to facilitate the implementation of a particular behavior. In this work we assume a robot that can translate in the forward direction and rotate (pan) around its vertical axis (Fig. 1). We aim at developing a vision-based reactive navigation capability that enables the robot to navigate in flat-floor indoor environments (long corridors, narrow passages, rooms), avoiding collisions with walls and obstacles. The term reactive is used to express the lack of a particular destination that could be set by using maps of the environment, landmark recognition, etc.
Figure 2. The KTH head with two extra cameras mounted on it for implementing peripheral vision.
Free space is defined based on the motor capabilities of the robot: the robot moves on a plane and, therefore, all 3D structures that do not belong to this plane can be potentially harmful if the robot crashes into them. Since the robot is about to "live" in indoor environments, it is expected to be able to handle situations where long corridors and narrow passages are encountered. It can be shown that difficulties arise when only central vision is used (i.e., a camera or a fixating stereo configuration in the direction of translation). Consider for example a camera with a field of view of 30 degrees that is placed in a 2-meter wide corridor with its optical axis parallel to the walls. The camera can only see walls that are approximately 4 meters ahead, and it is therefore quite difficult to maneuver accurately. On the other hand, the use of cameras with a wide field of view [4] gives rise to depth-dependent geometric distortions that are difficult to correct. In order to implement this behavior, it appears quite natural to exploit the information provided by peripheral vision, i.e., visual information at large angles with respect to the direction of forward translational motion (see for example the configuration in Fig. 2).
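As a quick check of the numbers in this example, the depth at which the walls first enter a 30-degree field of view in a 2-meter wide corridor follows from simple trigonometry; the short sketch below (plain Python, our own illustration, not code from the paper) reproduces the roughly 4-meter figure quoted above.

```python
import math

# Corridor example from the text: a forward-looking camera with a
# 30-degree field of view, centred in a 2-metre wide corridor with its
# optical axis parallel to the walls.
fov_deg = 30.0
corridor_width = 2.0  # metres

half_fov = math.radians(fov_deg / 2.0)
lateral_offset = corridor_width / 2.0  # distance from the optical axis to each wall

# A wall first enters the image at the edge of the field of view,
# i.e. at depth d such that tan(half_fov) = lateral_offset / d.
d = lateral_offset / math.tan(half_fov)
print(f"walls become visible ~{d:.2f} m ahead")  # ~3.73 m, i.e. roughly 4 m
```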
By using such a camera configuration, the robot is able to perceive walls and obstacles that are immediately close to it. Moreover, the target behavior may be implemented by indirectly comparing crude structure information acquired by the left and right peripheral cameras, instead of computing precise structure information. This approach is motivated by experiments that study the behavior of honeybees [5]. In these experiments, bees were trained to navigate along corridors towards a source of food. The bees were observed to navigate in the middle of the corridor. The eyes of the bees point laterally (at about 180 degrees). The behavior is based [5] on velocity information computed at the left and right eyes of the bee. In simple terms, if a non-rotating (no panning/tilting) bee is in the center of the corridor, it perceives the world as "leaving" its optical field with the same velocity in both eyes, while if the bee is closer to one of the sides of the corridor, it perceives that side as moving faster. For a non-rotating observer, the difference in the observed velocities depends only on depth.
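The bee analogy can be made concrete with a back-of-the-envelope computation: for a purely translating observer, the image speed seen by a lateral camera is roughly inversely proportional to the distance of the wall on that side, so the left/right difference vanishes exactly when the two distances are equal. The toy numbers below are illustrative only and are not taken from the paper.

```python
# Toy illustration of the flow-balancing idea for a purely translating
# observer with two lateral (sideways-looking) cameras.  For lateral
# viewing, the translational image speed is approximately V / Z: forward
# speed divided by the distance to the wall on that side.
V = 0.5  # forward speed in m/s (illustrative value)

def lateral_flow(distance_to_wall):
    """Approximate image speed (rad/s) seen by a sideways-looking camera."""
    return V / distance_to_wall

for z_left, z_right in [(1.0, 1.0), (0.5, 1.5), (1.5, 0.5)]:
    imbalance = lateral_flow(z_left) - lateral_flow(z_right)
    print(f"left {z_left} m, right {z_right} m -> flow imbalance {imbalance:+.2f}")
# The imbalance is zero only when the two distances are equal, and its
# sign tells which wall is closer -- the cue the bees (and the robot) exploit.
```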

Figure 3. (a), (b) If a robot with lateral peripheral cameras is in the middle of the free space, it perceives equal distances from left and right walls independently of pose. (c), (d) If the peripheral cameras are slanted, the distances at left and right are equal only if the robot's pose is parallel to the walls.
Therefore, balancing the flow balances the distances to the left and the right of the observer. Santos-Victor et al. [6] proposed the divergent stereo approach in order to exploit this finding in robots. They exploit visual information that is captured by two cameras with optical axes of opposite orientation that are mounted perpendicularly to the direction of forward translation. Our research differs from the approach in [6] in several ways. First, the peripheral cameras are not placed in opposite directions, because decisions on forward motion should not be influenced by "past" structure information; proximity calculations based on data at a 90-degree angle to the motion direction are largely obsolete for most reactive schemes and navigation situations. Second, it turns out that control is facilitated when the cameras are slanted. See for example Figs. 3(a) and 3(b), where the cameras are placed laterally on the robot body. The robot perceives equal distances at its left and right side, independently of pose. However, the situation is different in cases 3(c) and 3(d), where the cameras are slanted towards the direction of translation. If the robot is in the middle of the free space, it perceives equal distances from the walls only if its pose is parallel to the walls. Therefore, in this case, flow balancing also fixes the pose of the robot. Last, but not least, we study the effects of the observer's rotational motion on the flow computed by the two peripheral cameras. A moving robot is not only translating but also rotating, and this rotation affects the computed flow. We also show how central vision (i.e., visual information acquired in the direction of translation) can be used along with peripheral vision in order to simplify the problems to be solved.
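The geometric argument of Fig. 3 can be checked in a few lines of code: with slanted peripheral cameras the distances measured along the two viewing directions are equal only when the robot's heading is parallel to the walls, whereas with purely lateral cameras they are equal for any heading as long as the robot is centered. The sketch below uses a simple 2D ray-wall intersection; the slant angle and corridor width are illustrative values of our own choosing, not taken from the paper.

```python
import math

WALL = 1.0  # corridor half-width in metres (illustrative)

def side_distances(heading_deg, slant_deg):
    """Distances to the walls y = +WALL and y = -WALL, measured along the
    viewing directions of the left and right peripheral cameras of a robot
    centred in the corridor.  slant_deg = 0 means purely lateral cameras;
    larger values slant the cameras towards the direction of translation."""
    off = math.radians(90.0 - slant_deg)         # angle between heading and camera axis
    heading = math.radians(heading_deg)
    d_left = WALL / math.sin(heading + off)      # ray towards the left wall
    d_right = WALL / math.sin(-(heading - off))  # ray towards the right wall
    return d_left, d_right

for slant in (0.0, 30.0):                        # lateral vs slanted cameras
    for heading in (0.0, 10.0):                  # robot centred, two different poses
        l, r = side_distances(heading, slant)
        print(f"slant {slant:4.1f} deg, heading {heading:4.1f} deg: "
              f"left {l:.3f} m, right {r:.3f} m")
# With slant 0 the two distances stay equal for both headings; with the
# cameras slanted they are equal only when the heading is parallel to the walls.
```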
3 Method description
Consider an arbitrary 3D reference coordinate system (RCS). Consider also a 3D camera coordinate system (CCS) that is positioned at the optical center (nodal point) of a pinhole camera. Assume that the center of the RCS remains fixed at coordinates (X_s, Y_s, Z_s) with respect to the CCS. If the RCS moves with 3D translational velocity (U, V, W) and 3D rotational velocity (α, β, γ), the equations relating the 2D velocity (u, v) of an image point p(x, y) to the 3D motion parameters of the projected 3D point P(X, Y, Z) are [7]:
$$u = \frac{-Uf + xW - \alpha x Y_s + \beta\,(x X_s + Z_s f) - \gamma Y_s f}{Z} + \alpha\,\frac{xy}{f} - \beta\left(\frac{x^2}{f} + f\right) + \gamma y$$

$$v = \frac{-Vf + yW - \alpha\,(y Y_s + Z_s f) + \beta\, y X_s + \gamma X_s f}{Z} + \alpha\left(\frac{y^2}{f} + f\right) - \beta\,\frac{xy}{f} - \gamma x \qquad (1)$$

where f denotes the focal length of the camera.
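For concreteness, the reconstructed Eq. (1) can be transcribed directly into code. The function below is our own transcription (it inherits the sign conventions of the reconstruction above, and all identifiers are ours, not the paper's).

```python
def image_velocity(x, y, Z, f, trans, rot, rcs_offset):
    """Image velocity (u, v) of a point at image position (x, y) and depth Z,
    for a pinhole camera with focal length f, when the reference coordinate
    system (whose origin sits at rcs_offset = (Xs, Ys, Zs) in camera
    coordinates) moves with translational velocity trans = (U, V, W) and
    rotational velocity rot = (alpha, beta, gamma).  Direct transcription of
    Eq. (1) as reconstructed above."""
    U, V, W = trans
    alpha, beta, gamma = rot
    Xs, Ys, Zs = rcs_offset

    u = (-U * f + x * W - alpha * x * Ys + beta * (x * Xs + Zs * f) - gamma * Ys * f) / Z \
        + alpha * x * y / f - beta * (x * x / f + f) + gamma * y
    v = (-V * f + y * W - alpha * (y * Ys + Zs * f) + beta * y * Xs + gamma * Xs * f) / Z \
        + alpha * (y * y / f + f) - beta * x * y / f - gamma * x
    return u, v
```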
The projection of the optical flow (u, v) along the intensity gradient direction (i.e., the direction perpendicular to the edge at that point) is also known as normal flow. The normal flow is less informative than optical flow but can be computed robustly and efficiently from image sequences by just using differentiation techniques. Moreover, in contrast to the computation of optical flow, no environmental assumptions such as smoothness are required for normal flow computation. For the above reasons, the proposed method for reactive robot navigation relies on the computation of the normal flow field. Let (n_x, n_y) be the unit vector in the gradient direction. The magnitude u_M of the normal flow vector is given by:

$$u_M = n_x u + n_y v \qquad (2)$$
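Because only image derivatives are needed, normal flow is cheap to obtain. The sketch below shows the standard derivative-based recipe implied by the text (spatiotemporal derivatives plus the brightness-constancy constraint); it is our own illustration, not code from the paper.

```python
import numpy as np

def normal_flow(prev, curr, eps=1e-3):
    """Normal flow between two consecutive grey-level images (float arrays).
    Returns (magnitude, nx, ny): the flow component along the intensity
    gradient and the unit gradient direction at each pixel.  Only image
    differentiation is used; no smoothness assumption is needed."""
    Ix = np.gradient(curr, axis=1)          # spatial derivatives
    Iy = np.gradient(curr, axis=0)
    It = curr - prev                        # temporal derivative
    grad = np.sqrt(Ix ** 2 + Iy ** 2)
    valid = grad > eps                      # only strong-gradient pixels are reliable
    mag = np.where(valid, -It / np.maximum(grad, eps), 0.0)  # u_M = -I_t / |grad I|
    nx = np.where(valid, Ix / np.maximum(grad, eps), 0.0)
    ny = np.where(valid, Iy / np.maximum(grad, eps), 0.0)
    return mag, nx, ny
```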
By substituting Eqs. (1) into Eq. (2) we obtain:

$$u_M = \frac{n_x(-Uf + xW) + n_y(-Vf + yW) - \alpha\left[(n_x x + n_y y)Y_s + n_y Z_s f\right] + \beta\left[(n_x x + n_y y)X_s + n_x Z_s f\right] + \gamma\,(n_y X_s - n_x Y_s) f}{Z}$$
$$\qquad + \ \alpha\left(\frac{(n_x x + n_y y)\,y}{f} + n_y f\right) - \beta\left(\frac{(n_x x + n_y y)\,x}{f} + n_x f\right) + \gamma\,(y\,n_x - x\,n_y) \qquad (3)$$
In our robot setup (see Fig. 1), we set the RCS on the robot's body so that the Y axis coincides with the robot's rotational axis and the Z axis is parallel to the robot's translational motion. As has already been discussed, we assume that the robot is capable of translating with velocity S and rotating (panning) with velocity β. We set the nodal point of the right peripheral camera so that the center of the RCS is at coordinates (-X_s, 0, -Z_s). Similarly, we set the nodal points of the left peripheral camera and of the central camera so that the center of the RCS is at coordinates (X_s, 0, -Z_s) and (0, 0, 0), respectively. The translational velocity S of the robot produces translational velocities (-U, 0, W), (U, 0, W) and (0, 0, S) for the right, left and central cameras, respectively. For all three cameras, the rotational velocity of the robot produces a rotational velocity (0, β, 0) at the camera coordinate system.

By taking into account the above considerations for each of the left (L), right (R), and central (C) cameras, Eq. (3) gives:
$$u^L_M = \frac{n_x(-Uf + xW) + n_y\,yW + \beta\left[(n_x x + n_y y)X_s - n_x Z_s f\right]}{Z_L} - \beta\left(\frac{(n_x x + n_y y)\,x}{f} + n_x f\right) \qquad (4)$$

$$u^R_M = \frac{n_x(Uf + xW) + n_y\,yW - \beta\left[(n_x x + n_y y)X_s + n_x Z_s f\right]}{Z_R} - \beta\left(\frac{(n_x x + n_y y)\,x}{f} + n_x f\right) \qquad (5)$$

$$u^C_M = \frac{(n_x x + n_y y)\,S}{Z_C} - \beta\left(\frac{(n_x x + n_y y)\,x}{f} + n_x f\right) \qquad (6)$$
where Z_L, Z_R and Z_C represent the depth of the 3D points perceived by the left, right and central cameras, respectively.
By selecting normal flow vectors for which it holds that x n_x + y n_y = 0¹, we obtain:

$$-\frac{u^L_M}{n_x f} = \frac{U + \beta Z_s}{Z_L} + \beta, \qquad \frac{u^R_M}{n_x f} = \frac{U - \beta Z_s}{Z_R} - \beta \qquad (7)$$

and

$$\frac{u^C_M}{n_x f} = -\beta \qquad (8)$$

and by selecting normal flow vectors for which it holds that (n_x, n_y) = (0, 1) (the vertical normal flows), we obtain:

$$\frac{u^L_M}{y} + \beta\,\frac{x}{f} = \frac{W + \beta X_s}{Z_L}, \qquad \frac{u^R_M}{y} + \beta\,\frac{x}{f} = \frac{W - \beta X_s}{Z_R} \qquad (9)$$
The left sides of Eqs. (7), (8) and (9) can be computed from images because they employ normal flow values, point coordinates and gradient directions. The right sides of Eqs. (7) and (9) employ functions of depth and can take an even simpler form by noting that the function acquired by the central camera (Eq. (8)) gives the rotation, so that Eqs. (7) and (9) can be derotated.
¹The selected normal flow vectors are those that are tangent to circles centered at the image origin.
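A quick numerical check of the role of this selection is possible with the reconstructed equations: for the central camera, normal-flow directions tangent to circles centered at the image origin are unaffected by the forward translation and measure the panning rotation alone (Eq. (8)). The sketch below uses illustrative numbers of our own choosing and assumes the reconstructed Eq. (1); it is not code from the paper.

```python
import math

f = 500.0     # focal length in pixels (illustrative)
S = 0.4       # forward translational velocity (illustrative)
beta = 0.05   # panning velocity (illustrative)

def central_flow(x, y, Z):
    """Image velocity for the central camera: translation (0, 0, S),
    rotation (0, beta, 0), RCS origin at the nodal point (cf. Eq. (1))."""
    u = x * S / Z - beta * (x * x / f + f)
    v = y * S / Z - beta * x * y / f
    return u, v

x, y = 120.0, 80.0
# Unit normal direction tangent to the circle through (x, y): x*nx + y*ny = 0.
r = math.hypot(x, y)
nx, ny = -y / r, x / r

for Z in (1.0, 3.0, 10.0):      # very different depths
    u, v = central_flow(x, y, Z)
    u_M = nx * u + ny * v       # normal flow along the selected direction
    print(f"Z = {Z:5.1f}:  u_M = {u_M:+.3f},  -beta*nx*f = {-beta * nx * f:+.3f}")
# u_M is the same for every depth and equals -beta*nx*f: along these
# directions central vision measures the rotation directly.
```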
Thus, central vision can be used to derotate the flow fields produced at the peripheral cameras. Having exploited this observation, the derotated peripheral measurements can be combined into a single expression, Eq. (12), which can be rewritten in a simpler form, Eq. (13). In Eq. (13), F is a quantity that can be directly computed from functions of normal flow that have been extracted from the central and peripheral cameras. C is an unknown constant (of known sign) that depends on the characteristics of the body of the observer as well as on its constant translational velocity. The function F is equal to zero when the left and right cameras are at equal distances from world points, and takes positive or negative values depending on whether the right camera is farther from or closer to obstacles than the left camera. Therefore, the computable quantity F can be used to control the rotational velocity by keeping F as close to zero as possible, achieving in this way the desired behavior. Note that C is equal to zero if the nodal points of the left, right and central cameras are collinear, because in this case U X_s = W Z_s. In our robot setup we avoid this special case.
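Although Eqs. (10)-(13) are not reproduced here, the control structure they describe can be sketched: the central camera supplies the rotation estimate, the peripheral measurements are derotated with it, and the steering command is chosen so as to drive the left/right imbalance F towards zero. The sketch below is our own simplified illustration of that loop (see the comments for the simplification), not the paper's implementation; all names and gains are ours.

```python
def one_control_step(uM_left, uM_right, uM_central, nx, f, gain=1.0, limit=0.3):
    """One reactive control step from normal-flow samples taken along image
    directions with x*nx + y*ny = 0 in the left, right and central cameras.
    Simplified illustration: the paper's quantity F additionally combines
    the vertical normal flows (Eq. (9)) so that the terms induced by the
    nodal-point offset cancel; here only the dominant 1/Z dependence of
    each side is kept."""
    # Rotation (panning velocity) measured by central vision, cf. Eq. (8).
    beta = -uM_central / (nx * f)
    # Derotated peripheral measurements: proximity-like functions that grow
    # as the corresponding side gets closer.
    left = -uM_left / (nx * f) - beta
    right = uM_right / (nx * f) + beta
    F = right - left          # ~ 0 when the two sides are balanced
    omega = -gain * F         # turn away from the closer side (sign by convention)
    return max(-limit, min(limit, omega))
```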
4 Implementation issues - Experimental results
An experimental evaluation of the proposed method has been based both on simulation results and on results obtained by an on-line implementation of the method on a real robotic platform. Simulations have been based on the KHEPERA simulator [8], which has been modified to simulate the central and peripheral cameras of the robot. The aim of the simulation experiments was to test the control law used to drive the robot. Thus, the function F of Eq. (13) has been simulated and the robot was set to navigate in various environments. Several experiments were conducted. Figure 4 shows a sample run. Thin dark lines represent the walls of the corridor-like environment. The thick dark line is the trace of the robot. It can be observed that the robot started at the bottom-right end of the environment and, after reaching the end of the corridor, started moving backwards. Moreover, the robot moves in a smooth path among the various obstacles of the environment.

Figure 4. A run of the simulated robot.
One of the most interesting results of the simulations was the difference in the behavior of the simulated robot depending on whether the peripheral cameras were laterally placed or slanted with respect to the direction of translation. It turns out that slanted cameras result in smooth robot paths, while the laterally placed ones produce snake-like robot motion patterns.
The simulation experiments are, of course, not adequate for testing the performance of the method when real vision processes are employed. For this reason, an on-line implementation of the method has been realized. The platform used was a LABMATE ROBUTER on which "Charlie", the KTH active vision head, has been mounted. Two extra cameras were mounted on "Charlie", implementing peripheral vision (Fig. 2). Only one of the central cameras was used to implement central vision. Note that the central camera is not placed at coordinates (0, 0, 0) with respect to the RCS, as has been theoretically assumed. The process of selecting normal flow vectors at specific directions leads to cancelling of motion components; it turns out that Eq. (8) still holds if X_s ≠ 0.
In our implementation, a SUN Ultra Sparc was responsible for peripheral vision processing and a PENTIUM processor running LINUX was responsible for central vision processing. The distributed processing as well as the inter-process communication was based on the TCX communications library [9]. Various navigation scenarios have been tested, in which the robot successfully managed to perform maneuvers in narrow passages. In Figs. 5 and 6, we present snapshots from two different navigation sessions.
In our present algorithm, we did not allow for individual motions of the cameras (eye movements). This is roughly tantamount to the assumption of an approximately known FOE while exhibiting this behavior. However, we did not calibrate the head so that the central camera pointed exactly in the forward translation direction. In fact, we noticed that in many of the successful navigation experiments the optical axis of the central camera was 5-10 degrees off the direction of forward translation.
Figure 5. Snapshots of a navigation session (left to right, top to bottom).

Citations

TOURBOT - Interactive Museum Tele-presence Through Robotic Avatars Project Presentation and Prospects

TL;DR: TOURBOT as mentioned in this paper is an interactive tour-guide robot able to provide individual access to museums' exhibits and cultural heritage over the Internet, where the robot operates as the user's avatar in the museum (i.e. as a remote “representative” of the museum).
Proceedings ArticleDOI

Motion Estimation for Obstacle Detection and Avoidance Using a Single Camera for UAVs/Robots

TL;DR: The Motion Estimation technique has been proposed to solve for obstacle detection problem for collision avoidance, using a single camera, and not only overcomes various constraints of other approaches, but also retains most of their merits.
Journal ArticleDOI

Optical flow‐based obstacle avoidance of a fixed‐wing MAV

TL;DR: In this paper, the optical flow information from two cameras mounted on the aircraft is used to detect obstacles and make a rapid turn manoeuvre, and the proposed navigation and control strategy gives satisfactory results in different flight environments such as corridors with parallel and non-parallel walls and in L junctions.
Proceedings ArticleDOI

Visual-based fuzzy navigation system for mobile robot: Wall and corridor follower

TL;DR: A visual-based fuzzy navigation system that enables a mobile robot to move through a corridor or follow a wall is presented, together with a Mamdani-type fuzzy logic controller that shows good performance in navigating the robot.

Enhancing Museum Visitor Access Through Robotic Avatars Connected to the Web

TL;DR: The TOURBOT project is presented, which emphasizes the development of alternative ways for interactive museum tele-presence, essentially through the use of robotic "avatars", and presents a high degree of novelty as well as a number of technical and conceptual issues and challenges.
References

Intelligence without reason.

TL;DR: It is claimed that the state of computer architecture has been a strong influence on models of thought in Artificial Intelligence over the last thirty years.
Book ChapterDOI

Intelligence without reason

TL;DR: In this article, the authors make the converse claim that the state of computer architecture has been a strong influence on our models of thought, and argue that the non-Von Neumann computational models they use share many characteristics with biological computation.
Journal ArticleDOI

The interpretation of a moving retinal image

TL;DR: It is shown that from a monocular view of a rigid, textured, curved surface it is possible, in principle, to determine the gradient of the surface at any point, and the motion of the eye relative to it, from the velocity field of the changing retinal image, and its first and second spatial derivatives.
Book

Active vision

Proceedings ArticleDOI

Purposive and qualitative active vision

TL;DR: The traditional view of the problem of computer vision as a recovery problem is questioned, and the paradigm of purposive-qualitative vision is offered as an alternative and the design of the Medusa of CVL is described.