
UC Berkeley
UC Berkeley Previously Published Works

Title: Robust lane detection and tracking in challenging scenarios
Permalink: https://escholarship.org/uc/item/50n0c8cg
Journal: IEEE Transactions on Intelligent Transportation Systems, 9(1)
ISSN: 1524-9050
Author: Kim, ZuWhan
Publication Date: 2008-03-01
Peer reviewed

eScholarship.org, powered by the California Digital Library, University of California

16 IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, VOL. 9, NO. 1, MARCH 2008
Robust Lane Detection and Tracking
in Challenging Scenarios
ZuWhan Kim, Member, IEEE
Abstract—A lane-detection system is an important component of many intelligent transportation systems. We present a robust lane-detection-and-tracking algorithm to deal with challenging scenarios such as lane curvature, worn lane markings, lane changes, and emerging, ending, merging, and splitting lanes. We first present a comparative study to find a good real-time lane-marking classifier. Once detection is done, the lane markings are grouped into lane-boundary hypotheses. We group left and right lane boundaries separately to effectively handle merging and splitting lanes. A fast and robust algorithm, based on random-sample consensus and particle filtering, is proposed to generate a large number of hypotheses in real time. The generated hypotheses are evaluated and grouped based on a probabilistic framework. The suggested framework effectively combines a likelihood-based object-recognition algorithm with a Markov-style process (tracking) and can also be applied to general part-based object-tracking problems. An experimental result on local streets and highways shows that the suggested algorithm is very reliable.
Index Terms—Collision warning, computer vision, lane detection, part-based object tracking.
I. INTRODUCTION

DETECTING and localizing lanes from a road image is an important component of many intelligent-transportation-system applications. There has been active research on lane detection [1]–[9], and a wide variety of algorithms of various representations (including fixed-width line pairs, spline ribbon, and deformable-template model), detection and tracking techniques (from Hough transform to probabilistic fitting and Kalman filtering), and modalities (stereo or monocular) have been proposed.
Owing to real-time constraints and the slow processor speeds of earlier systems, lane markings have been detected based only on simple gradient changes, and much of the older work has presented results on straight roads and/or highways with clear lane markings or with an absence of obstacles on the road.
Many commercial lane-detection systems are available and show good performance in many challenging road and illumination conditions. However, to deliver robust results, they provide only lane positions, not lane-curvature information.
Although lane positions are sufficient for some applications, such as lane-departure warning, there are other applications which require lane-curvature information.

Manuscript received December 27, 2006; revised April 19, 2007, July 14, 2007, and July 31, 2007. The Associate Editor for this paper was U. Nunes. The author is with California Partners for Advanced Transit and Highways, University of California, Berkeley, Richmond, CA 94804-4698 USA (e-mail: zuwhan@berkeley.edu). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TITS.2007.908582

Fig. 1. False-alarm scenario of a collision-warning system. Without knowing the lane curvature, the system will generate a false alarm for the postbox.
For example, a collision-warning system can generate false
alarms when the lane curvature is not known. An example sce-
nario is shown in Fig. 1. Without knowing the road curvature,
the system cannot distinguish objects on the sidewalk (e.g., the
postbox) from the objects on the road, and it may generate a
false alarm. As an alternative to a vision-based approach, one may use a global-positioning system (GPS) with a geographic-information system (GIS). However, GPS has limited spatial and temporal resolution, and detailed information is often missing or not updated frequently in a GIS. For example,
it is important to detect the road curvature at an off-ramp because it can otherwise generate a false collision warning, but most GPS-based systems struggle even to discriminate whether the vehicle has entered an off-ramp.
Recent efforts deal with curved roads [5], [7]–[9], and robust
detection results on challenging images, such as distracting
shadows or a leading vehicle, have been reported. Some of them
work in real time, and some do not.
We present a real-time lane-detection-and-tracking system
which is distinguished from the previous ones in the follow-
ing ways.
1) It uses a more sophisticated lane-marking-detection algorithm (than gradient- or intensity-bump-based detection) to deal with challenging situations, such as worn lane markings and distracting objects/markings, for example, at an intersection and on a road surface.
2) It detects the left- and right-lane boundaries separately,
whereas most of the previous work uses a fixed-width
lane model. As a result, it can handle challenging sce-
narios such as merging or splitting lanes and on- and off-
ramps effectively.
3) It combines lane detection and tracking into a single
probabilistic framework that can effectively deal with

lane changes, emerging, ending, merging, or splitting lanes. Much previous work has focused on lane tracking and usually uses a time-consuming detection algorithm to initialize the tracking. We introduce a fast and robust lane-detection algorithm that can be applied in every frame in real time.

Fig. 2. Flow diagram of the algorithm.
Fig. 3. Example image and a rectified image.
Our algorithm follows the “hypothesize and verify” par-
adigm. In the “hypothesize” step, lower level features are
grouped into many higher level feature hypotheses, and they
are filtered in the “verify” step to reduce the complexity of the
higher level grouping. Fig. 2 shows the flow diagram. First, the
image is rectified, assuming that the ground is flat.¹ An example image and the rectified image are shown in Fig. 3. Possible lane-marking pixels are detected in the rectified image. Then, the
detected lane-marking pixels are grouped into lane-boundary
hypotheses. A lane-boundary hypothesis is represented by a
constrained cubic-spline curve. A combined approach of a
particle-filtering technique (for tracking) and a RANdom SAm-
ple Consensus (RANSAC) algorithm (for detection) is introduced to robustly find lane-boundary hypotheses in real time.
Finally, a probabilistic-grouping algorithm is applied to group
lane-boundary hypotheses into left- and right-lane boundaries.
Note that we generate left- and right-lane-boundary hypotheses
separately (unlike much of the previous work which has a lane
model of uniform width) to deal with various scenarios such as
on/off-ramps or an emerging lane.
In Section II, a comparative study of both classification performance and computation time on various lane-marking-classification methods is presented. In Section III, we present our approach to hypothesize lane boundaries. The probabilistic-grouping algorithm is proposed in Section IV. Experimental results are presented in Section V, and we present the summary and future work in Section VI.

¹However, a nonflat case is also addressed at a later stage (lane-boundary grouping).

Fig. 4. Example road images.
II. LANE-MARKING DETECTION
Sample road images are shown in Fig. 4. Many of the previous algorithms simply look for "horizontal intensity bumps" to detect lane markings. This shows reasonably good performance in many cases, but it cannot distinguish false intensity bumps caused by leading vehicles and road markings/textures
from weak lane markings. For example, worn yellow markings
often have similar grayscale intensity to the road pixels. In
addition, we sometimes need to deal with poor image quality, for example, when we need to postprocess MPEG data.
To deal with such problems, we apply machine learning. We
applied various classifiers to the lane-marking-detection task
and present a comparative analysis. Since the size of the lane
marking changes dramatically with respect to its distance from
the car, we need to normalize them to apply a standard classifier.
Therefore, we first rectify the original image, as shown in Fig. 3.
When we assume that the ground is flat (for this stage only), we
can apply a plane homography to find an image rectification.
A point (x, y) on the rectified image corresponds to the point
(u, v) in the original image, where
    [λx, λy, λ]ᵀ = H [u, v, 1]ᵀ
and H is a homography matrix. A homography matrix can
easily be obtained by applying a simple external camera cal-
ibration with four reference points. Details on the plane ho-
mography can be found in many computer-vision textbooks, for
example [10].
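As an illustration of the four-point calibration just described, the homography can be estimated with a direct linear transform. This is a minimal sketch under our own assumptions (function names and the h₃₃ = 1 normalization are ours, not the paper's implementation):

```python
import numpy as np

def homography_from_points(src, dst):
    """Estimate the 3x3 homography H mapping src -> dst from four
    point correspondences (direct linear transform, h33 fixed to 1).
    src holds (u, v) points in the original image, dst the
    corresponding (x, y) points in the rectified image."""
    A, b = [], []
    for (u, v), (x, y) in zip(src, dst):
        # each correspondence contributes two linear equations in the
        # eight unknown entries of H
        A.append([u, v, 1, 0, 0, 0, -x * u, -x * v])
        b.append(x)
        A.append([0, 0, 0, u, v, 1, -y * u, -y * v])
        b.append(y)
    h = np.linalg.solve(np.array(A, float), np.array(b, float))
    return np.append(h, 1.0).reshape(3, 3)

def apply_homography(H, u, v):
    """Map an original-image point (u, v) through H and dehomogenize."""
    lx, ly, l = H @ np.array([u, v, 1.0])
    return lx / l, ly / l
```

With four non-degenerate reference points the 8 × 8 system has a unique solution, which is why a simple external calibration suffices.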
When a plane homography is given, image rectification is
done in the following manner: For each pixel (x, y) of the
rectified image, its correspondence (u, v) on the original image
is obtained. Since u and v are not integer numbers in most cases,
the pixel value of the rectified image is calculated by linearly

Fig. 5. Example image patches of lane markings and nonmarkings.
interpolating the intensity values of the four neighboring pixels
(by flooring and ceiling the u and v) in the original image.
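The rectification loop described above can be sketched as follows. This is a hypothetical minimal implementation for a grayscale image; `H_inv` maps rectified coordinates back to original coordinates, and the explicit Python loop is for clarity only (a real-time system would vectorize or precompute the lookup):

```python
import numpy as np

def rectify(image, H_inv, out_h, out_w):
    """Build a rectified image: for each output pixel (x, y), find its
    source point (u, v) via H_inv and bilinearly interpolate the four
    neighboring pixel values in the original image."""
    out = np.zeros((out_h, out_w), dtype=float)
    h, w = image.shape
    for y in range(out_h):
        for x in range(out_w):
            lu, lv, l = H_inv @ (x, y, 1.0)
            u, v = lu / l, lv / l
            u0, v0 = int(np.floor(u)), int(np.floor(v))  # "floor"
            u1, v1 = u0 + 1, v0 + 1                      # "ceil"
            if u0 < 0 or v0 < 0 or u1 >= w or v1 >= h:
                continue  # source falls outside the original image
            fu, fv = u - u0, v - v0
            # note: row index is v, column index is u
            out[y, x] = ((1 - fu) * (1 - fv) * image[v0, u0]
                         + fu * (1 - fv) * image[v0, u1]
                         + (1 - fu) * fv * image[v1, u0]
                         + fu * fv * image[v1, u1])
    return out
```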
Once we have a rectified image, a lane-marking classifier
is applied on a small image patch around each and every
pixel. A typical width of a lane marking in the rectified images is about three pixels. Therefore, raw pixel values of a 9 × 3 window were used as inputs (27 features in total for a grayscale image and 81 RGB values for a color image).
To find a suitable classification algorithm, we tested various
classifiers.
Applying a stereo algorithm [3], [4] can further improve
the lane-marking-detection performance, but we focus on a
monocular image in this paper.
For learning, we have gathered image patches of 421 lane
markings and 11 124 nonmarkings. Fig. 5 shows example
image patches. We observe a variety in colors, textures, and
width. We compared the classification performances and the
computation requirements of various classifiers on the data set.
The following classifiers were considered.
1) Intensity-Bump Detection: Intensity-bump detection is
the most popular method in the lane-detection litera-
ture. It is the simplest and fastest detection method,
and it can also be applied to nonrectified images. We
use an implementation by Ieng et al. [6]. We applied
various values for the gradient threshold (s₀) to control the tradeoffs between the detection rates and the false-alarm rates.
2) Artificial Neural Networks (ANNs): We tested two-
layer neural networks with various numbers of hid-
den nodes. Training an ANN (a back-propagation al-
gorithm was applied in our experiment) requires sig-
nificant computation, but the actual classification time
is relatively small. When there are n features (inputs)
and m hidden nodes, it requires nm multiplications,
nm + m additions, and m sigmoid-function calculations
to classify a hypothesis (n = 27 or 81 and m = 7 in our
examples).
3) Naive Bayesian Classifiers (NBC): NBCs show good
classification performances, in spite of their unrealistic
conditional independence assumption. We compare the
discrete and the unimodal Gaussian representations of
the conditional probability. For both representations, the learning time is linear in the number of examples (fastest). A discrete NBC requires very little computation for classification. The Gaussian representation requires computation of the exponential function n times.
However, we can avoid calling the exponential func-
tion by using a logarithm of the probability instead of
the actual one. In fact, for both representations, it is necessary to use a logarithm to minimize the numerical errors, particularly when the number of features is large. For the discrete NBC, we can precalculate the logarithms of all the probability table entries to save computation.

Fig. 6. Classification performance of the classifiers.
4) Support Vector Machine (SVM): During the last
decade, SVMs have rapidly gained popularity. They pro-
vide a good framework for incorporating kernel methods.
We tested the second-order polynomial kernel, which
requires the smallest computation. Learning requires sig-
nificant computation, but it is bounded in polynomial
time. The classification involves a large number of mul-
tiplications: O(mn), where m is the number of support
vectors. The number of support vectors is at least n +1,
and it can be much greater when the data is not clearly
separable (in the transformed feature space) or when a
small tuning parameter is given. For training, we used the
implementation by Collobert et al. [11] (SVMTorch) with
the tuning parameter of 100.
Details on most of the above classifiers can be found in the
machine-learning literature, for example, in [12].
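The log-domain trick mentioned for the Gaussian NBC above can be sketched as follows. This is a minimal two-class illustration (class and variable names are ours, not the paper's code); scoring stays entirely in log space, so no exponential is evaluated at classification time:

```python
import numpy as np

class GaussianNB:
    """Gaussian naive Bayes scored in the log domain: the per-feature
    Gaussian likelihoods are summed as logs, which avoids both the
    exponential function and numerical underflow for many features."""

    def fit(self, X, y):
        X, y = np.asarray(X, float), np.asarray(y)
        self.classes = np.unique(y)
        self.log_prior, self.mean, self.var = [], [], []
        for c in self.classes:
            Xc = X[y == c]
            self.log_prior.append(np.log(len(Xc) / len(X)))
            self.mean.append(Xc.mean(axis=0))
            self.var.append(Xc.var(axis=0) + 1e-6)  # avoid zero variance
        return self

    def log_score(self, x, i):
        # log prior + sum over (conditionally independent) features of
        # log N(x_j; mean_j, var_j)
        m, v = self.mean[i], self.var[i]
        return self.log_prior[i] - 0.5 * np.sum(
            np.log(2 * np.pi * v) + (x - m) ** 2 / v)

    def predict(self, x):
        x = np.asarray(x, float)
        scores = [self.log_score(x, i) for i in range(len(self.classes))]
        return self.classes[int(np.argmax(scores))]
```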
Fig. 6 shows the classification performances of the presented
classifiers. We followed the evaluation scheme presented in
[13]. We repeated stratified fivefold cross-validation ten times
and showed the receiver operating-characteristic (ROC) curves
with the confidence intervals. For all the classifiers, we obtained
the ROC curves by changing only the threshold values (no
relearning with different parameters).
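Sweeping a single decision threshold over the classifier scores, as described above, is enough to trace out an ROC curve. A minimal sketch (names are ours; confidence-interval estimation from the repeated cross-validation is omitted):

```python
def roc_points(scores, labels):
    """Compute (false-alarm rate, detection rate) pairs by sweeping a
    decision threshold over classifier scores; labels are 1 for
    lane-marking examples and 0 for non-markings."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = []
    for thresh in sorted(set(scores), reverse=True):
        tp = sum(1 for s, l in zip(scores, labels) if s >= thresh and l == 1)
        fp = sum(1 for s, l in zip(scores, labels) if s >= thresh and l == 0)
        points.append((fp / neg, tp / pos))
    return points
```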
For all the classifiers, we applied various parameters and
chose the best ones. For ANN, we compared the ones with
seven, 10, and 15 hidden nodes, but we present the result of the
one with seven hidden nodes because it is the fastest, whereas
the performances among them are not significantly different.
For the discrete naive Bayesian network, we used seven-level
discretization. The SVM was learned with the tuning parameter
of 100.0.
We observe that all the classifiers show better performance than the intensity-bump detector. In fact, intensity-bump detectors introduce too many false alarms, given an acceptable detection rate. Therefore, applying any of the above classifiers will deliver much better lane-detection performance. The SVM shows far better performance than any other classifier, and then, the ANN follows.

TABLE I. COMPUTATION TIME OF THE CLASSIFIERS

Fig. 7. Classification performance with neural networks when directly using color pixels (81 features), gray-level pixels with 5:4:1 weights, and gray-level pixels with equal weights. Using a gray image with 5:4:1 weights gives competitive performance to that of using a color image.
We also compared the classification computation for the
classifiers. We have applied the classifiers on images of 70 × 250 pixels and summarized the computing time in Table I.
The algorithms ran on an Intel Core 1.83-GHz processor. For
fair comparison, all the classification algorithms were imple-
mented in C++ inline functions and optimized to bring maxi-
mum performance.
Unfortunately, the SVM was not fast enough for real-time
classification, and we chose to use the ANN. To further reduce
the computation time, we applied a cascade classification: First,
a simple gradient detector and an intensity-bump detector with
loose (low) threshold values are successively applied to quickly
filter out nonlane markings, and then, the ANN classifier is
applied to the remaining samples (much smaller in number).
As shown in Table I, it significantly reduces the classifica-
tion time.
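The cascade described above can be sketched as follows. The three stage functions are placeholders for the paper's gradient detector, intensity-bump detector, and ANN; thresholds and names are our assumptions:

```python
def cascade_classify(patch, gradient_score, bump_score, ann_score,
                     grad_thresh=0.1, bump_thresh=0.1, ann_thresh=0.5):
    """Cascade classification: cheap stages with loose (low) thresholds
    run first and reject most non-markings, so the expensive ANN only
    ever sees the surviving patches."""
    if gradient_score(patch) < grad_thresh:
        return False  # rejected by the cheapest stage
    if bump_score(patch) < bump_thresh:
        return False  # rejected by the intensity-bump stage
    return ann_score(patch) >= ann_thresh  # final, expensive decision
```

The saving comes from the loose early thresholds: they keep the detection rate high while still discarding the bulk of candidate pixels before the ANN runs.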
We used gray-level lane-marking images for the above
classification result and the computation-time analysis. The
gray-level images were generated by weighted-summing RGB
values (0.5 for red, 0.4 for green, and 0.1 for blue) to
better detect worn yellow lane markings. Applying such
weights outperformed the equal-weight conversion, as shown
in Fig. 7. We have tested various different weight combi-
nations, and the proposed weights showed the best performance. One may apply the classifier directly to the color pixels (81 features in total), but this introduces too much computation in image rectification and classification, whereas it does not improve the performance significantly, as also shown in Fig. 7.
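The 5:4:1 conversion described above amounts to a one-line weighted sum; the sample pixel values in the check below are illustrative, not from the paper:

```python
import numpy as np

def to_gray_541(rgb):
    """Convert RGB to gray with 0.5/0.4/0.1 weights: the low blue
    weight keeps worn yellow markings brighter relative to the road
    than an equal-weight conversion would."""
    return 0.5 * rgb[..., 0] + 0.4 * rgb[..., 1] + 0.1 * rgb[..., 2]
```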
Fig. 8. (a) Detected lane-marking pixels. (b) Smoothed lane-marking score.
(c) Line-segment grouping. (d) Selected hypotheses from the particle-
filtering/RANSAC algorithm.
III. LANE-BOUNDARY-HYPOTHESES GENERATION WITH
PARTICLE FILTERING AND RANSAC
Once possible lane-marking pixels are detected [an example
is shown in Fig. 8(a)], they are grouped into uniform cubic-
spline curves of two to four control points. Splines are smooth
piecewise polynomial functions, and they are widely used in
representing curves. Various spline representations have been
proposed, and we use a cubic spline among them. In a cubic-
spline representation, a point p on a curve between the ith and
(i + 1)th control point is represented as

    p = (xᵢ(t), yᵢ(t))

where

    xᵢ(t) = aᵢ + bᵢt + cᵢt² + dᵢt³
    yᵢ(t) = eᵢ + fᵢt + gᵢt² + hᵢt³

and the parameters aᵢ, ..., hᵢ are uniquely determined by the control points so that the curve is smooth. (xᵢ(0), yᵢ(0)) is the ith control point, (xᵢ(1), yᵢ(1)) is the (i + 1)th control point, and 0 ≤ t ≤ 1.
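The paper does not spell out the constraint used to derive aᵢ, ..., hᵢ from the control points, so purely as an illustration of an interpolating cubic with the same endpoint property (the segment runs from one control point at t = 0 to the next at t = 1), here is a Catmull-Rom segment:

```python
def catmull_rom(p0, p1, p2, p3, t):
    """Evaluate one segment of a Catmull-Rom spline: the segment runs
    from p1 (t = 0) to p2 (t = 1), while p0 and p3 shape the tangents
    so that adjacent segments join smoothly."""
    def coord(a, b, c, d):
        return 0.5 * ((2 * b) + (-a + c) * t
                      + (2 * a - 5 * b + 4 * c - d) * t ** 2
                      + (-a + 3 * b - 3 * c + d) * t ** 3)
    return tuple(coord(a, b, c, d) for a, b, c, d in zip(p0, p1, p2, p3))
```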
A cubic-spline curve enables fast fitting, because the control points are actually on the curve. We use this property to apply a RANSAC algorithm [14]. A RANSAC algorithm is a robust model-fitting method that repeatedly fits candidate models to random minimal subsets of the data and keeps the candidate with the most support from the remaining data.
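The RANSAC idea can be sketched on the simplest case, fitting a straight line to noisy 2-D points; the spline case replaces the two-point line model with a minimal set of control points. Function and parameter names here are ours:

```python
import random

def ransac_line(points, trials=200, tol=1.0, seed=0):
    """Fit a line by random sample consensus: repeatedly pick two
    points, form the line through them, and keep the line supported
    by the most inliers (points within tol of the line)."""
    rng = random.Random(seed)
    best_line, best_inliers = None, []
    for _ in range(trials):
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        # line through the pair as a*x + b*y + c = 0
        a, b, c = y2 - y1, x1 - x2, x2 * y1 - x1 * y2
        norm = (a * a + b * b) ** 0.5
        if norm == 0:
            continue  # degenerate sample
        inliers = [(x, y) for x, y in points
                   if abs(a * x + b * y + c) / norm <= tol]
        if len(inliers) > len(best_inliers):
            best_line = (a / norm, b / norm, c / norm)
            best_inliers = inliers
    return best_line, best_inliers
```

Because each candidate is built from a minimal sample, the fit is unaffected by outliers as long as enough trials hit an all-inlier sample.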

References
- M. A. Fischler and R. C. Bolles, "Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography."
- R. Hartley and A. Zisserman, Multiple View Geometry in Computer Vision.
- A. Doucet, N. de Freitas, and N. Gordon, Eds., Sequential Monte Carlo Methods in Practice.
Frequently Asked Questions

Q1. What contributions have the authors mentioned in the paper "Robust lane detection and tracking in challenging scenarios"?

The authors present a robust lane-detection-and-tracking algorithm to deal with challenging scenarios such as lane curvature, worn lane markings, lane changes, and emerging, ending, merging, and splitting lanes. The authors first present a comparative study to find a good real-time lane-marking classifier. The suggested framework effectively combines a likelihood-based object-recognition algorithm with a Markov-style process (tracking) and can also be applied to general part-based object-tracking problems. An experimental result on local streets and highways shows that the suggested algorithm is very reliable.

Further excerpts from the paper:

- Future work will integrate it with a vision-based obstacle-detection algorithm, for example [20], for a collision-warning system.
- When the detection performance is good enough, it is good to give a reasonably large weight to detection, because redundant detection compensates for tracking failures.
- Due to the vehicle's vibration, including pitch change, the motion of the lane boundaries in world (vehicle) coordinates is not smooth enough to be properly modeled by a Kalman filter.
- The authors chose a particle-filtering algorithm over the Kalman filter to prevent the result from being biased too much toward the predicted vehicle motion and to give more weight to the image evidence.
- For the particle filtering, the vehicle's motion (rotation and translation) was modeled by Gaussian distributions for simplicity, but the scoring function is carefully designed to prevent the result from being dictated by this model.
- In their implementation, up to five hypotheses per lane boundary (left/right) are selected, including the ones from the particle-filtering process.
- An approximate arc of three control points is generated from a random set of two line segments, and a more complicated hypothesis of four control points is generated from a random set of three line segments.
- Whereas a single line segment is sufficient to make a straight-line hypothesis, the authors also use a pair of line segments for robust fitting.
- The second control point is also examined to see if its position is too low, because if it keeps going down, it will eventually collide with the first control point.
- Whether curbs can be detected or not depends on the application; detecting a single lane boundary is sufficient in many applications, including the ones for collision warning.