IEEE TRANSACTIONS ON ROBOTICS, VOL. 25, NO. 3, JUNE 2009
Qualitative Vision-Based Path Following
Zhichao Chen and Stanley T. Birchfield, Senior Member, IEEE
Abstract—We present a simple approach for vision-based path
following for a mobile robot. Based upon a novel concept called
the funnel lane, the coordinates of feature points during the
replay phase are compared with those obtained during the teach-
ing phase in order to determine the turning direction. Increased
robustness is achieved by coupling the feature coordinates with
odometry information. The system requires a single off-the-shelf,
forward-looking camera with no calibration (either external or
internal, including lens distortion). Implicit calibration of the
system is needed only in the form of a single controller gain.
The algorithm is qualitative in nature, requiring no map of
the environment, no image Jacobian, no homography, no fun-
damental matrix, and no assumption about a flat ground plane.
Experimental results demonstrate the capability of real-time
autonomous navigation in both indoor and outdoor environments,
on flat, slanted, and rough terrain with dynamic occluding objects
for distances of hundreds of meters. We also demonstrate that
the same approach works with wide-angle and omnidirectional
cameras with only slight modification.
Index Terms—feature tracking, mobile robot navigation,
vision-based navigation, control
I. INTRODUCTION
Route-based knowledge, in which the spatial layout of
an environment is recorded from the perspective of a
ground-level observer, is an important component of human
and animal navigation systems [31]. In this representation,
navigating from one location to another involves comparing
current visual inputs with a sequence of views captured along
the path in a previous instance. Applications that would benefit
from such a path-following capability include courier and
delivery robots [4], robotic tour guides [32], or reconnaissance
robots following a scout [7].
One approach to path following is visual servoing, in which
the robot is controlled to align the current image with a
reference image, both taken by an onboard camera [14]. Such
an approach generally employs a Jacobian to relate the coordi-
nates of world points to their projected image coordinates [5],
a homography or fundamental matrix to relate the coordinates
between images [29], [20], [27], [36], or bundle adjustment
to minimize the reprojection error over multiple image frames
[28]. As a result, the camera usually must be calibrated [5],
[27], [28], [36], and even uncalibrated systems require lens
distortion to be removed. Alternative vision-based algorithms
make strong assumptions about the environment or the sensor,
such as a flat ground plane [5], [20], [12], [29], a man-made
environment in which vertical straight lines are present [16],
[12], [29], [34], or an omnidirectional camera [10], [35], [18],
[34].
To overcome these limitations, we consider the problem
from a novel viewpoint in which there is no equation re-
lating image coordinates to world coordinates. Such a direct
approach is motivated by the observation that the problem
is vastly overdetermined, with tens of thousands of image
pixels available to determine a single turning command output.
We present a simple algorithm that uses a single, off-the-
shelf camera attached to the front of the robot. The technique
follows the teach-replay approach [5] in which the robot is
manually led through the path once during a teaching phase
and then follows the path autonomously during the replay
phase. Without any camera calibration (even calibration for
lens distortion), the robot is able to follow the path by making
only qualitative comparisons between the feature coordinates
in the two phases. All that is needed is a single controller
gain parameter to convert pixel coordinates to turning angles.
We demonstrate the technique on several indoor and outdoor
experiments, showing its robustness with respect to slanted
surfaces, changing lighting conditions, and dynamic occluding
objects. This paper extends the applicability and improves
upon the robustness of our earlier work [6] by incorporating
odometry information and correcting for camera roll. We
also demonstrate the ability of the technique to work with
wide-angle and omnidirectional cameras, with only slight
modification in the latter case to ignore the bottom half of
the image which views the scene behind the robot.
The proposed approach falls within the category of mapless
algorithms [8]. As such, it is closely related to the view-
sequenced route representation (VSRR) of Matsumoto et al.
[21], [22], [15] in which the turning angle is computed by
cross-correlating images acquired during the replay phase with
those captured during training. However, VSRR requires large
amounts of memory to store the views and is sensitive to
occlusions by dynamic objects. Along with a homography-
based extension using vertical lines [29], it has only been
demonstrated for short sequences on flat terrains. An alternate
mapless approach is to learn the mapping from images to
turning commands based on their classification [37], [1]. While
this method can successfully follow a specific pattern such
as a road or hallway, it will have difficulty generalizing to
environments in which the images cannot be categorized into
a small number of classes known at training time. Another
approach that has received considerable attention [10], [38],
[39], [33], [17], [35], [13] is to store an example image with
each specific location of interest. At run time, the image
database is searched to find the image that most closely
resembles the current one (or, alternatively, the current image
is projected onto a manifold learned from the database [25],
[18]). Such approaches require extensive training and have
difficulty providing sufficient spatial resolution to determine
actual turning commands in large environments. Similarly,
sensory-motor learning has been used to map visual
inputs to turning commands, but the resulting algorithms
have been too computationally demanding for real-time per-
formance [11]. Other researchers have developed mapless

algorithms for low-level functionality like corridor following
or obstacle avoidance [26], [30], [2], [19], [23], [24], but these
techniques are not applicable to following a specific arbitrary
path.
II. QUALITATIVE MAPPING FROM FEATURE COORDINATES
TO TURNING DIRECTION
Consider a mobile robot equipped with a camera whose
optical axis is parallel to the heading direction of the robot.
Suppose we wish to move the robot from location C =
(x_C, y_C, θ_C) to a previously encountered location D =
(x_D, y_D, θ_D), where (x_i, y_i) and θ_i are the position and
orientation, respectively, in the xy plane, i ∈ {C, D}. The
robot has access to a current image I_C, taken at C, and a
destination image I_D, taken previously at the destination D.
We start with a simple observation. Suppose the robot
views a fixed landmark in both images yielding image feature
coordinates of u_C and u_D, as shown in Figure 1. The features
are computed with respect to a coordinate system centered
at the principal point (the intersection of the optical axis and
the image plane), so that positive coordinates are on the right
side of the image while negative coordinates are on the left
side. If the robot moves toward the destination in a straight
line with the same heading direction as that of the destination
(i.e., θ_C = θ_D), then the point u_C will move away from
the principal point toward u_D, reaching u_D when the robot
reaches D. This observation is made more precise in the
following theorem.
Theorem 1: Let a mobile robot move in a straight line
toward location D on a flat surface. Let u_j be the horizontal
image coordinate, relative to the principal point, of a monotonic
projection at location j of a fixed landmark. For any location C
along the line such that θ_C = θ_D, |u_C| < |u_D| and
sign(u_C) = sign(u_D).
The theorem can be easily proved by geometry. Note that
the image projection function is only required to be monotonic
(i.e., perspective projection is not necessary), so the result
applies equally to a camera with radial lens distortion. The pri-
mary assumption is that the optical axis of the camera passes
through the axis of rotation of the robot. Other assumptions
include the alignment of the optical axis with the robot heading
direction, zero roll and tilt angles of the camera with respect
to the robot, and a flat ground plane. In practice, misalignment
is not an issue because the camera alignment can be learned
automatically by estimating the focus of expansion as the robot
drives forward. Similarly, rough terrain is easily handled by
measuring image rotation to compensate for a non-zero roll
angle of the robot and by recognizing that a non-zero tilt angle
has a negligible effect on the horizontal feature coordinates.
A. The funnel lane
Fig. 1. The robot is at C moving toward the destination D with the same
heading direction. The open circle coincides with both the camera focal point
and the robot position, the arrow indicates the heading direction, π is the
image plane, and φ is the angle between the optical axis and the projection
ray from the landmark.
According to the preceding theorem, if the robot is on the
path toward the destination with the same heading direction,
then two constraints are satisfied. Conversely, as shown in
Figure 2, if the constraints are satisfied then the robot lies
within a trapezoidal region (assuming perspective projection)
for any given relative robot angle α = θ_C − θ_D. For α = 0,
the sides of the trapezoid are defined by two lines passing
through the landmark, one through D and another that is
parallel to the destination direction. These lines are rotated
about the landmark by α if the relative angle is nonzero.
We call the trapezoidal region the funnel lane associated with
the landmark, destination, and relative angle. The terminology
arises from the analogy of pouring liquid into a funnel: The
liquid moves in a straight line until it hits the sides of the
funnel, which cause it to bounce back and forth until it
eventually reaches the spout. In a similar manner, the sides
of the trapezoid act as bumpers, guiding the robot toward the
goal. The notion of the funnel and the funnel lane are captured
in the following definitions.
Definition 1: The funnel of a fixed landmark λ and a robot
location D is the set of locations F_{λ,D} such that, for each
C ∈ F_{λ,D}, the two funnel constraints are satisfied:

    |u_C| < |u_D|              (Constraint 1)
    sign(u_C) = sign(u_D)      (Constraint 2)

where u_C and u_D are the coordinates of the image projection
of λ at the locations C and D, respectively.
Definition 2: The funnel lane of a fixed landmark λ, a robot
location D, and a relative angle α is the set of locations
F_{λ,D,α} ⊂ F_{λ,D} such that θ_C − θ_D = α for each C ∈ F_{λ,D,α}.
Multiple features yield multiple funnel lanes, the intersec-
tion of which is the set of locations for which both constraints
are satisfied for all the features. This intersection, which we
call the combined funnel lane, is depicted in Figure 2. Notice
the importance of having features on both sides of the image
in order to narrowly constrain the path of the robot, thus
achieving more robust and accurate results. Features can be at
any depth, and there need not be any relationship between the
depths of the various features as long as they remain visible.
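To make the constraint test concrete, the following minimal sketch (Python rather
than the authors' Visual C++ implementation; the function name and example values
are ours) checks whether a single feature satisfies both funnel constraints:

```python
def satisfies_funnel_constraints(u_c: float, u_d: float) -> bool:
    """Return True if the current horizontal feature coordinate u_c
    (measured from the principal point) lies in the funnel of a landmark
    whose teaching (destination) coordinate is u_d:
      Constraint 1: |u_c| < |u_d|
      Constraint 2: sign(u_c) = sign(u_d)
    """
    same_sign = (u_c >= 0) == (u_d >= 0)      # Constraint 2 (zero treated as positive)
    return abs(u_c) < abs(u_d) and same_sign  # Constraint 1 and Constraint 2

# Example: a feature seen at +12 px that was at +40 px in the teaching image
# satisfies both constraints; one that has crossed to -5 px violates Constraint 2.
assert satisfies_funnel_constraints(12.0, 40.0)
assert not satisfies_funnel_constraints(-5.0, 40.0)
```

The combined funnel lane is simply the set of robot locations for which this test
passes for every tracked feature.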
B. Qualitative control algorithm
The funnel constraints lead to a simple control algorithm,
illustrated in Figure 3. The robot continually moves forward,
turning to the right whenever Constraint 1 is violated and to
the left whenever Constraint 2 is violated, given a feature on
the right side of the image (u_D > 0). If the feature is on the
left side (u_D < 0), then the directions are reversed.
Fig. 2. TOP: The funnel lane created by the two constraints, shown when the
robot is facing the correct direction (left) and when it has turned by an angle
α (right). BOTTOM: The combined funnel lane created by multiple feature
points, shown when the robot is facing the correct direction (left) and when
it has turned by an angle α (right).
For each feature i, a desired heading is obtained by

    θ_d^(i) =  γ min{u_C, φ(u_C, u_D)}   if u_C > 0 and u_C > u_D
               γ max{u_C, φ(u_C, u_D)}   if u_C < 0 and u_C < u_D
               0                          otherwise

where φ(u_C, u_D) = (1/2)(u_C − u_D) is the signed distance to
the line u_C = u_D. Here we approximate the conversion of
pixels to radians with a constant gain γ.
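For illustration, the per-feature rule above can be written directly in code. This
is a sketch of the stated formula, not the authors' implementation, and the default
gain value is an arbitrary placeholder for γ:

```python
def per_feature_heading(u_c: float, u_d: float, gamma: float = 0.01) -> float:
    """Desired heading (radians) contributed by one feature.

    A correction is issued only when a funnel constraint is violated;
    phi = 0.5 * (u_c - u_d) is the signed distance to the line u_c = u_d.
    """
    phi = 0.5 * (u_c - u_d)
    if u_c > 0 and u_c > u_d:       # feature has drifted too far to the right
        return gamma * min(u_c, phi)
    if u_c < 0 and u_c < u_d:       # feature has drifted too far to the left
        return gamma * max(u_c, phi)
    return 0.0                       # both funnel constraints hold
```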
At any given time, the desired heading of the robot is given
by

    θ_d = η (1/N) ∑_{i=1}^{N} θ_d^(i) + (1 − η) θ_o,    (1)

where N is the total number of feature points, θ_o is the desired
heading obtained by sampling a third-order polynomial that is
fit to the initial and destination odometry measurements of
the segment in the teaching phase, and the factor 0 ≤ η ≤ 1
determines the relative importance of visual measurements
versus odometry measurements. We set η = 0.5 in our system.
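Equation (1) then reduces to a weighted average; a minimal sketch, assuming the
per-feature headings and the odometry-derived heading θ_o have already been
computed (the empty-feature fallback is our assumption, not stated in the paper):

```python
def desired_heading(per_feature: list[float], theta_o: float, eta: float = 0.5) -> float:
    """Blend the mean per-feature heading with the odometry heading theta_o (Eq. 1)."""
    if not per_feature:                 # no tracked features: fall back to odometry
        return theta_o
    visual = sum(per_feature) / len(per_feature)
    return eta * visual + (1.0 - eta) * theta_o
```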
III. TEACH-AND-REPLAY NAVIGATION
The navigation system involves two phases. In the teaching
phase, an operator manually moves the robot along a desired
path to gather training data. The path is divided into a number
of non-overlapping segments. Within each segment, feature
points are automatically detected in the first image and tracked
in subsequent images. When the percentage of features that
have been successfully tracked falls below 50% of the original
features in the segment, a new segment is declared. For each
feature that is successfully tracked throughout a segment, its
graylevel intensity pattern and x-coordinate in the first and last
images of the segment are stored in a database for use in the
replay phase. We also store the length of each segment and the
change of heading direction of the robot in each segment by
odometry, which are used in determining the desired heading
and the segment transitions.
Fig. 3. Qualitative control decision space. The horizontal coordinates of the
feature point in the current and destination images (u_C and u_D, respectively)
are compared to determine whether to turn the robot to the right, to the left,
or not at all. LEFT: Top-down view of decision space. RIGHT: 3D view of
decision space, showing the desired angle θ_d^(i) versus u_C and u_D.
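To illustrate the bookkeeping described above, the sketch below outlines a
teaching-phase segmentation loop. The Segment container and the input formats are
illustrative assumptions (the stored graylevel patches are omitted); only the 50%
survival threshold and the stored quantities come from the text:

```python
from dataclasses import dataclass, field

@dataclass
class Segment:
    """Quantities stored per teaching segment (graylevel patches omitted)."""
    first_x: dict                                  # feature id -> x in first image
    last_x: dict = field(default_factory=dict)     # feature id -> x in milestone image
    length: float = 0.0                            # segment length from odometry
    heading_change: float = 0.0                    # heading change from odometry

def teach(frames, odometry, survival_threshold=0.5):
    """Split a teaching run into segments.

    `frames` is a sequence of dicts {feature_id: x_coordinate} of tracked features,
    `odometry` a matching sequence of cumulative (distance, heading) readings;
    both formats are illustrative. A new segment is declared when fewer than
    `survival_threshold` of the segment's original features remain tracked.
    """
    segments, seg = [], None
    for feats, (dist, heading) in zip(frames, odometry):
        if seg is None:
            seg = Segment(first_x=dict(feats))
            start_dist, start_heading = dist, heading
            continue
        surviving = {i: x for i, x in feats.items() if i in seg.first_x}
        if len(surviving) < survival_threshold * len(seg.first_x):
            segments.append(seg)                   # milestone = previous frame
            seg = Segment(first_x=dict(feats))
            start_dist, start_heading = dist, heading
        else:
            seg.last_x = surviving
            seg.length = dist - start_dist
            seg.heading_change = heading - start_heading
    if seg is not None:
        segments.append(seg)
    return segments
```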
In the replay phase, the robot automatically proceeds se-
quentially through the segments starting from approximately
the same initial location as that of the teaching phase. At
the beginning of each segment, correspondence is established
between feature points in the current image and those of
the first teaching image of the segment. Then, as the feature
points are tracked in the incoming images, their coordinates
are compared with those of the milestone image (i.e., the last
teaching image of the segment) in order to determine the
turning direction for the robot. Prior to comparison, feature
coordinates are warped to compensate for a non-zero roll angle
about the optical axis by applying the RANSAC algorithm
[9] to pairs of random features. This compensation removes
the undesirable in-plane image rotation that occurs due to
unpaved, rough terrain. Note that this is the only place in the
algorithm where the y-coordinates of the features are used.
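One plausible realization of this roll compensation is sketched below: random pairs
of corresponding features vote for an in-plane rotation, and the angle with the
largest consensus is used to de-rotate the current coordinates. The iteration count,
inlier tolerance, and voting scheme are our assumptions, not the authors' exact
RANSAC procedure:

```python
import math
import random

def _wrap(angle: float) -> float:
    """Wrap an angle to (-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))

def estimate_roll(curr_pts, teach_pts, iters=100, inlier_tol=math.radians(1.0)):
    """Estimate the in-plane (roll) rotation between corresponding feature sets.

    curr_pts and teach_pts are lists of (x, y) coordinates relative to the
    principal point, in corresponding order.
    """
    n = len(curr_pts)
    if n < 2:
        return 0.0
    best_angle, best_votes = 0.0, -1
    for _ in range(iters):
        i, j = random.sample(range(n), 2)
        a_curr = math.atan2(curr_pts[j][1] - curr_pts[i][1], curr_pts[j][0] - curr_pts[i][0])
        a_teach = math.atan2(teach_pts[j][1] - teach_pts[i][1], teach_pts[j][0] - teach_pts[i][0])
        candidate = _wrap(a_curr - a_teach)        # roll hypothesis from this pair
        votes = 0
        for _ in range(20):                        # score the hypothesis on other pairs
            p, q = random.sample(range(n), 2)
            ac = math.atan2(curr_pts[q][1] - curr_pts[p][1], curr_pts[q][0] - curr_pts[p][0])
            at = math.atan2(teach_pts[q][1] - teach_pts[p][1], teach_pts[q][0] - teach_pts[p][0])
            if abs(_wrap(ac - at - candidate)) < inlier_tol:
                votes += 1
        if votes > best_votes:
            best_angle, best_votes = candidate, votes
    return best_angle

def derotate(pts, angle):
    """Rotate feature coordinates by -angle to remove the estimated roll."""
    c, s = math.cos(-angle), math.sin(-angle)
    return [(c * x - s * y, s * x + c * y) for x, y in pts]
```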
A crucial component of the technique is determining when
to transition to a new segment. To solve this problem, we
continually monitor the probability that the robot at time t is
at the end of the current segment:
    δ(t) = exp(−ε_f²(t)/(2σ_f²)) · exp(−ε_d²(t)/(2σ_d²)) · exp(−ε_h²(t)/(2σ_h²)),    (2)

where the three factors correspond to the feature, distance, and heading terms,
respectively, assuming that the feature, distance, and heading measurements
are independent. In this equation ε_f(t) is the mean squared error of the
feature coordinates between the current and milestone images; ε_d(t) is the
difference between the distance traveled in the current segment and the
corresponding segment in the teaching phase, calculated by odometry; and
ε_h(t) is the difference between the current heading and the heading at
the end of the teaching segment. These errors are normalized by values
computed automatically by the system: σ_f is the mean squared error of the
feature points at the beginning of the segment; σ_d is the length of the
segment calculated by odometry in the teaching phase; and σ_h is the maximum
variation in heading encountered during the teaching segment.
Fig. 4. The teaching and replay paths of the robot in an indoor environment
(left), and an outdoor environment (right).
Two values are actually computed: δ(t) using the current
milestone image and δ′(t) using the previous milestone
image. If δ(t−1) − δ(t) > τ and δ′(t−1) − δ′(t) > τ, where
τ = 0.05, then the system advances to using the next milestone
image. The rationale is that both δ(t) and δ′(t) increase as
the robot approaches the end of the segment, then decrease
afterward. Therefore, when both values have decreased by a
significant amount, the end has been reached. We have found
that using both values yields improved results compared with
using a single value. To reduce the effects of noise, both
signals are first smoothed by a low-pass nonlinear filter.
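A compact sketch of this transition test, assuming the error terms ε and
normalizers σ of Eq. (2) are computed elsewhere and that both δ signals have
already been smoothed (the paper does not specify the exact filter):

```python
import math

def milestone_score(err_f, err_d, err_h, sigma_f, sigma_d, sigma_h):
    """Eq. (2): product of Gaussian-shaped scores for the feature, distance,
    and heading errors, which are assumed independent."""
    return (math.exp(-err_f ** 2 / (2.0 * sigma_f ** 2)) *
            math.exp(-err_d ** 2 / (2.0 * sigma_d ** 2)) *
            math.exp(-err_h ** 2 / (2.0 * sigma_h ** 2)))

def should_advance(delta_prev, delta_now, delta2_prev, delta2_now, tau=0.05):
    """Advance to the next milestone when both delta (current milestone) and
    delta' (previous milestone) have dropped by more than tau."""
    return (delta_prev - delta_now > tau) and (delta2_prev - delta2_now > tau)
```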
IV. EXPERIMENTAL RESULTS
The qualitative algorithm was implemented in Visual C++
on a Dell Inspiron 700m laptop (1.6 GHz) controlling an
ActivMedia Pioneer P3-AT mobile robot with an inexpensive
Logitech QuickCam Pro 4000 webcam mounted on the front.
The 320 × 240 images were acquired at 30 Hz and processed
by the KLT algorithm with the default 7 × 7 feature window
size [3]. In all experiments a maximum of 60 features were
detected and tracked throughout each segment. On average
85% of the features survive the initial correspondence in the
first image of the segment during replay.
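For reference, a comparable detection-and-tracking front end can be assembled with
OpenCV's pyramidal Lucas-Kanade tracker. The paper used the KLT implementation of
[3] rather than OpenCV, so the calls below are an analogous sketch; parameters other
than the 60-feature limit and the 7 × 7 window are chosen arbitrarily:

```python
import cv2

def detect_features(gray, max_features=60):
    """Detect up to 60 corner features in the first image of a segment."""
    return cv2.goodFeaturesToTrack(gray, maxCorners=max_features,
                                   qualityLevel=0.01, minDistance=10)

def track_features(prev_gray, gray, prev_pts):
    """Track features into the next frame with a 7x7 window."""
    next_pts, status, _err = cv2.calcOpticalFlowPyrLK(
        prev_gray, gray, prev_pts, None, winSize=(7, 7), maxLevel=3)
    good = status.reshape(-1) == 1
    return prev_pts[good], next_pts[good]
```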
The algorithm was tested in a number of indoor and outdoor
environments. (Videos of the results can be found in the multimedia attachment
or at http://www.ces.clemson.edu/~stb/research/mobile_robot.)
Figure 4 shows two typical runs in which the
robot successfully navigated between chairs and desks along
a 10 m path in our laboratory, as well as along a 380 m
loop trajectory in a parking lot of our university campus. The
driving speed of the robot was 100 mm/s and the turning speed
was 4 degrees per second during both the teaching and replay
phases of the indoor experiments. Outdoors, the additional
maneuvering room enabled the driving and turning speeds to
be increased to 750 mm/s (the maximum driving speed of the
robot) and 6 degrees per second, respectively. The error was
less than 1 m for two-thirds of the sequence and remained
below 2.5 m for the entire sequence.
Figure 5 shows sample images from two experiments
demonstrating the robustness of the algorithm. In the first, the
robot navigated a slanted ramp in a 40 m run, thus verifying
that the algorithm does not require a flat ground plane. In
the second, the robot navigated a narrow road for 80 m
while a pedestrian walked by the robot and later a van drove
by it. Because the milestone images change frequently, the
algorithm quickly recovered from the loss of features due to
the occlusion caused by the dynamic objects.
Fig. 5. Sample image frames showing the robot traveling down and up a
ramp and past dynamic objects. The circles indicate the features.
Fig. 6. The approach successfully following a path using a wide-angle camera
(left) and an omnidirectional camera (right).
Similarly, Figure 6 shows the results of the approach using
cameras with severe lens distortion. In one experiment we
used a wide-angle camera with a 3.5 mm focal length and
110-degree field of view. The other experiment utilized an
omnidirectional camera with a 360-degree field of view. For
both experiments we used the same parameters as the previous
experiments. The only change made to the code was to
discard the bottom half of the omnidirectional donut image.
This step was necessary because features behind the robot
(whether viewed by an omnidirectional or standard camera)
move in a way that violates the fundamental assumptions of
our approach. In contrast, features in front of the camera obey
the funnel constraints sufficiently to be of use in keeping
the robot on the path, despite their moving in curved image
paths due to the severe lens and catadioptric distortion. The
average error of the two experiments was 0.04 m and 0.04 m,
respectively, while the maximum error was 0.13 m and 0.09 m.
To further illustrate the lack of calibration, we conducted
an outdoor experiment in which the robot navigated the same
50 m path twice. In the first run the robot used the Logitech
Quickcam Pro 4000 camera, while in the second run it used an
Imaging Source DFK21F04 Firewire camera with an 8.0 mm
F1.2 lens. The same camera was used for both teaching
and replay. As shown in Figure 7, the algorithm was able
to successfully follow the path using either camera, without
changing any parameters between runs.
Three additional experiments are shown in Figure 8. In
the first, a scout robot was sent along an outdoor path.
Another robot, which received the transmitted path informa-
tion, was then able to follow the same path as the scout.
This demonstrates a natural application to swarm robotics,
where calibrating dozens or hundreds of cameras would be
prohibitive, especially if recalibration is needed whenever the
lenses are refocused or the cameras adjusted. The second
experiment shows the robot following a path along rough
terrain, in which roll and tilt angles up to 5 degrees were
encountered. The roll angle compensation described earlier
was sufficient to enable the robot to remain on the path. In
the third, a path with several sharp turns is demonstrated. This
ability is achieved by setting the replay driving speed to be
that of the teaching driving speed, which is decreased during
a turn.
Fig. 7. Teaching and replay paths for the robot using two different
uncalibrated cameras, with the same system parameters. LEFT: Logitech
QuickCam Pro 4000 USB webcam. RIGHT: Imaging Source DFK 21F04
Firewire camera.
Fig. 8. LEFT: The robot followed a path taken earlier by a scout robot.
MIDDLE: A path on rough terrain. RIGHT: A path with sharp turns.
Additionally, the algorithm was tested in various scenarios
to quantitatively measure its accuracy and repeatability. Table I
displays the results of the algorithm compared with those of
the earlier version [6] which did not use odometry, relied
upon a bang-bang control scheme, and did not compensate
for the camera roll angle. The algorithms were tested in three
environments: a 15 m path in an indoor laboratory environment
with rich texture for feature tracking, a 60 m trajectory in an
outdoor paved parking lot, and a 40 m path along unpaved
terrain. In each case, we conducted ten trials and recorded
the final 2D location of the robot for each trial: {x_i}_{i=1}^{n},
where x_i ∈ R² and n = 10. Accuracy was measured as the
RMS Euclidean distance to the final ground-truth location,
√((1/n) ∑_{i=1}^{n} ||x_i − x_gt||²). Repeatability was measured as
the standard deviation of the final locations,
√((1/n) ∑_{i=1}^{n} ||x_i − μ||²), where μ = (1/n) ∑_{i=1}^{n} x_i.
While the earlier algorithm works well when the ground is paved
and the scenery is rich in texture, the improved algorithm is more
robust, achieving maximum errors of only 0.23 m, 1.20 m, and
1.76 m, respectively, compared with 0.45 m, 1.20 m, and 5.68 m
for the earlier algorithm.
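These two statistics can be computed directly from the recorded final positions;
a minimal sketch using NumPy (function name ours):

```python
import numpy as np

def accuracy_and_repeatability(finals, ground_truth):
    """finals: (n, 2) final robot positions over n trials;
    ground_truth: (2,) final ground-truth location.
    Returns (RMS distance to ground truth, standard deviation about the mean)."""
    finals = np.asarray(finals, dtype=float)
    gt = np.asarray(ground_truth, dtype=float)
    accuracy = np.sqrt(np.mean(np.sum((finals - gt) ** 2, axis=1)))
    mu = finals.mean(axis=0)
    repeatability = np.sqrt(np.mean(np.sum((finals - mu) ** 2, axis=1)))
    return accuracy, repeatability
```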
TABLE I
COMPARISON OF THE ACCURACY AND REPEATABILITY OF THE ALGORITHM WITH AN
EARLIER VERSION, IN THREE DIFFERENT SCENARIOS.

Algorithm                 | indoor acc./rep. (m) | outdoor paved ground acc./rep. (m) | outdoor rough terrain acc./rep. (m)
vision only [6]           | 0.30 / 0.18          | 0.77 / 0.74                        | 3.87 / 1.85
combination (this paper)  | 0.14 / 0.08          | 0.60 / 0.55                        | 1.47 / 0.66

V. DISCUSSION
Because our system does not explicitly model the geometric
world, its geometric accuracy is limited. Therefore, when
compared with map-based approaches using calibrated cameras
[28], the errors exhibited by the simple control scheme of
our algorithm are rather large. Nevertheless, the remarkable
flexibility and versatility of the system offer some important
advantages over more precise techniques. With our approach,
one can literally take an off-the-shelf camera, attach it to the
robot, align it approximately in the forward direction, and start
the system.
The algorithm is not perfect, and there are scenarios in
which it will fail. For example, occasionally the algorithm does
not properly transition to the next milestone image, in which
case the overlap between the current and milestone image can
decrease to the point that an insufficient number of features
are matched. Also, untextured scenes containing distant trees,
bushes, or undecorated indoor hallways sometimes prevent the
KLT algorithm from successfully tracking enough features
to accurately compute the heading direction. While only a
handful of features are necessary for the algorithm to succeed,
it is important that features exist on both sides of the image,
and that some number of features remain visible throughout
the milestone.
Another source of error is due to distant features. Although
features near the center of the image produce a narrow funnel
lane even when they are far from the camera, distant features
near the side of the image produce much larger funnel lanes
which are less useful for navigation. Moreover, image parallax
is inversely proportional to the distance to a feature. As a
result, distant features are primarily useful for correcting the
rotation of the robot and are quite incapable of informing
the robot about minor translation errors. This problem is
compounded by the inherent ambiguity between rotation and
translation in the funnel lane itself. Even though this ambiguity
has little effect when the robot is near the path, it hinders
the ability of the visual information to correctly determine
the correct amount of rotation when the robot has deviated
significantly. Odometry helps to overcome this limitation, and
we have conducted experiments in which the robot consistently
returns to the path after deviating by several meters. However,
much larger deviations either initially or during replay cannot
be handled by our present system. At any rate, it should be
noted that odometry drift is not an issue because we only
store odometry values local to the segment, not in a global
coordinate frame.
VI. CONCLUSION
In this paper we have presented a novel approach to the
problem of vision-based mobile robot path following using a
single off-the-shelf camera. The robot navigates by performing
a qualitative comparison of the coordinates of feature points
between the teaching and replay phases, based on the novel
concept of the funnel lane. The algorithm does not make use
of the traditional concepts of Jacobians, homographies,
fundamental matrices, or the focus of expansion, and it does
not require any camera calibration, including lens calibration.
Future work should be aimed at incorporating higher-level
scene knowledge to enable obstacle avoidance and terrain
characterization, as well as connecting multiple teaching paths
in a graph-based framework to enable autonomous navigation
between arbitrary points.
