
GOLD: a parallel real-time stereo vision system for generic obstacle and lane detection

01 Jan 1998-IEEE Transactions on Image Processing (IEEE Trans Image Process)-Vol. 7, Iss: 1, pp 62-81
TL;DR: The generic obstacle and lane detection system (GOLD), a stereo vision-based hardware and software architecture for use on moving vehicles to increase road safety, detects both generic obstacles and the lane position in a structured environment at a rate of 10 Hz.
Abstract: This paper describes the generic obstacle and lane detection system (GOLD), a stereo vision-based hardware and software architecture to be used on moving vehicles to increase road safety. Based on full-custom massively parallel hardware, it detects both generic obstacles (without constraints on symmetry or shape) and the lane position in a structured environment (with painted lane markings) at a rate of 10 Hz. Thanks to a geometrical transform supported by a specific hardware module, the perspective effect is removed from both left and right stereo images; the left is used to detect lane markings with a series of morphological filters, while both remapped stereo images are used for the detection of free space in front of the vehicle. The output of the processing is displayed on both an on-board monitor and a control panel to give visual feedback to the driver. The system was tested on the mobile laboratory (MOB-LAB) experimental land vehicle, which was driven for more than 3000 km along extra-urban roads and freeways at speeds up to 80 km/h, and demonstrated its robustness with respect to shadows and changing illumination conditions, different road textures, and vehicle movement.

Summary (3 min read)

A. Lane Detection

  • Lane detection is performed assuming that a road marking in the $z = 0$ plane of the $W$ space (i.e., in the remapped image) is represented by a quasi-vertical bright line of constant width surrounded by a darker region (the road).
  • Thus, the pixels belonging to a road marking have a brightness value higher than their left and right neighbors at a given horizontal distance.
  • The result of the geodesic dilation is the product between the control image and the maximum value computed among all the pixels belonging to the neighborhood described by the structuring element.
  • Subsequently, all pairs with are considered: the image is scanned line by line from its bottom (where it is more probable to be able to detect the center of the road) to its top, and the longest chain of road centers is built, exploiting the image vertical correlation.
  • Fig. 12(a) shows the final result starting from the binary image shown in Fig. 9(e).
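
The brightness criterion and the geodesic dilation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the filter chain of the paper: the function names `marking_candidates` and `geodesic_dilation`, the horizontal distance `m`, the square structuring element of radius `radius`, and the use of nested lists as images are all choices made for this example.

```python
def marking_candidates(image, m):
    """Flag pixels brighter than both horizontal neighbours at distance m.

    image: list of rows of grey levels (remapped, top-view image).
    m: assumed horizontal distance in pixels (hypothetical parameter).
    Returns a binary image (rows of 0/1) of the same size.
    """
    out = []
    for row in image:
        flags = [0] * len(row)
        for x in range(m, len(row) - m):
            # A marking pixel is brighter than the road on both sides.
            if row[x] > row[x - m] and row[x] > row[x + m]:
                flags[x] = 1
        out.append(flags)
    return out


def geodesic_dilation(marker, control, radius=1):
    """One step of binary geodesic dilation.

    Each output pixel is the control pixel multiplied by the maximum of the
    marker over a square neighbourhood (the assumed structuring element).
    """
    rows, cols = len(marker), len(marker[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            best = 0
            for dr in range(-radius, radius + 1):
                for dc in range(-radius, radius + 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        best = max(best, marker[rr][cc])
            out[r][c] = control[r][c] * best
    return out


# A dark road (grey level 10) with a two-pixel-wide bright stripe (200).
road = [[10, 10, 10, 200, 200, 10, 10, 10] for _ in range(3)]
mask = marking_candidates(road, m=2)

# The marker grows only where the control image is set.
marker = [[1, 0, 0, 0, 0]]
control = [[1, 1, 1, 0, 1]]
once = geodesic_dilation(marker, control)
twice = geodesic_dilation(once, control)
```

In the second example the marker spreads along the connected run of control pixels and never reaches the last pixel, because the gap in the control image blocks it.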

B. Obstacle Detection

  • As shown in Section III, the stereo IPM technique can produce a difference image in which ideal square obstacles are transformed into two triangles.
  • The focus of the polar histogram is placed midway between the projections of the two cameras onto the road plane; in this case, the polar histogram presents an appreciable peak corresponding to each triangle.
  • Since the presence of an obstacle produces two disjoint triangles (corresponding to its edges) in the difference image, obstacle detection is reduced to the search for pairs of adjacent peaks; the position of a peak, in fact, determines the angle of view under which the obstacle edge is seen (see Fig. 17).
  • According to the notations of Fig. 19, a ratio between two areas is defined; if this ratio is greater than a threshold, two adjacent peaks are considered as generated by the same obstacle, and thus joined (see Fig. 20).
  • Fig. 22 shows the results obtained in a number of different situations.
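
The polar histogram described above can be illustrated with a simplified sketch. It assumes a binary difference image stored as nested lists and bins nonzero pixels by their angle as seen from the focus; the function name `polar_histogram`, the sector count `bins`, and the restriction to the half plane in front of the focus are assumptions of this example, and the peak-joining step of the paper is not reproduced.

```python
import math


def polar_histogram(diff, focus, bins=36):
    """Count nonzero pixels of a binary difference image per angular sector.

    diff: list of rows of 0/1 values.
    focus: (row, col) of the histogram focus; it may lie outside the image.
    bins: number of equal sectors covering the angles [0, pi).
    """
    focus_row, focus_col = focus
    hist = [0] * bins
    for r, row in enumerate(diff):
        for c, value in enumerate(row):
            if not value:
                continue
            # Rows grow downward, so focus_row - r is the distance ahead.
            angle = math.atan2(focus_row - r, c - focus_col)
            if 0.0 <= angle < math.pi:
                hist[int(angle / math.pi * bins)] += 1
    return hist


# Two straight streaks of nonzero pixels radiating from the focus,
# standing in for the two edges of an obstacle.
diff = [
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 1, 1, 0],
]
hist = polar_histogram(diff, focus=(4, 2), bins=9)
```

Each streak falls into a single sector, so the histogram shows one peak per obstacle edge; the position of a peak gives the angle under which that edge is seen.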

A. Removing the Perspective Effect

  • The procedure for removing the perspective effect resamples the incoming image, remapping each pixel to a different position and producing a new two-dimensional (2-D) array of pixels.
  • The resulting image represents a top view of the road region in front of the vehicle, as if it were observed from a significant height.
  • $W = \{(x, y, z)\} \in E^3$, representing the 3-D world space (world coordinates), where the real world is defined.

III. STEREO INVERSE PERSPECTIVE MAPPING

  • A 3-D description of the world using a single 2-D image is impossible without a priori knowledge, due to the depth loss during acquisition; for many years stereo vision has been investigated as an answer to this problem.
  • The intrinsic complexity of the determination of homologous points can be reduced with the introduction of some domain-specific constraints, such as the assumption of a flat road in front of the cameras.
  • The set of world points whose projections in the image spaces of the left and right cameras coincide is called the horopter and represents the zero-disparity surface of the stereo system [11].
  • This concept is extremely useful when the horopter coincides with a model of the road surface, since any deviation from this model can be easily detected.
  • The flat road hypothesis can be verified by computing the difference between the two remapped images: a generic obstacle (anything rising out of the road) is detected if the difference image presents sufficiently large clusters of nonzero pixels having a specific shape.

A. Camera Calibration

  • From the above description, it can be seen that the calibration of the vision system plays a basic role.
  • Recalling the definitions and notations given in Section II-A1, the calibration parameters can be divided into the following two categories.
  • Extrinsic parameters (view point and viewing direction), which can be determined by measurements and possibly tuned.
  • After the independent calibration of both cameras, a fine tuning of the and parameters is obtained applying the stereo IPM algorithm iteratively, and minimizing the disparities between the two remapped images of a flat road acquired with the vehicle standing still.
  • The acquisition parameters of the camera installed onto MOB-LAB are shown in Table I. Fig. 8(a) shows the horizontal calibration of the left camera installed onto MOB-LAB, while Fig. 8(b) shows the remapped image with an aspect ratio of one.

IV. DRIVING ASSISTANCE FUNCTIONS

  • In the following section the lane detection and obstacle detection functionalities are discussed.
  • Both of them are divided into a low-level phase that can be efficiently expressed with a SIMD computational paradigm and a serial high- and medium-level phase.

V. THE COMPUTING ARCHITECTURE

  • Due to the specific field of application, the response time of the system is a major critical point, since it affects directly the maximum speed allowed for the vehicle; the choice of the computing architecture is, thus, a key design issue [10].
  • These systems, using slower device speeds, provide an effective mechanism to trade power consumption for silicon area, while maintaining the computational power unchanged.
  • 4) Since the number of processing units must be high, if the system has size constraints the PE’s must be extremely simple, performing only simple basic operations.
  • The result of the graphical operator is then either stored in a destination bit-plane or used as the first operand of the following logical operation.

VI. PERFORMANCE ANALYSIS

  • Since the GOLD system is composed of two independent computational engines (the PAPRICA system, running the low-level processing, and its host computer, running the medium-level processing), it can work in a pipelined fashion.
  • As shown in Fig. 24, the lane detection and obstacle detection tasks are divided into the following categories.
  • 1) Data Acquisition and Output: A pair of grey-level stereo images of size 512 × 256 pixels is acquired simultaneously and written directly into PAPRICA image memory.
  • At the same time, the results of previous computations are displayed on an on-board monitor to generate visual feedback for the driver.
  • This phase, again managed by PAPRICA system, takes 25 ms; the result is then transferred (in 3 ms) to the host computer.

VII. DISCUSSION

  • A system (hardware and software) for lane and obstacle detection has been presented, satisfying the hard real-time constraints imposed by the automotive field.
  • The farther the obstacle, the smaller the portion of triangles detectable in the difference image, and thus the lower the amplitude of peaks in the polar histogram; nevertheless, for sufficiently high obstacles (e.g., vehicles about 50 m from the cameras), the main problem is not the detection of peaks, but their joining, as shown in Fig. 28(a)–(c).
  • Considering an operational vehicle speed of 100 km/h and the MOB-LAB calibration setup, the vertical shift between two subsequent remapped images corresponding to two frames acquired with a temporal shift of 100 ms is only 7 pixels.
  • This high correlation allows the results of the processing to be averaged over time, thus reducing the problems of the incomplete detection of obstacles explained above.
  • An extension to the GOLD system that is able to exploit temporal correlations and to perform a deeper data fusion between the two functionalities of lane detection and obstacle detection is currently under test [2] on ARGO.
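
The 7-pixel figure quoted above can be put in perspective with a short calculation. The distance travelled between two frames follows directly from the stated speed and frame interval; the resulting road length per pixel row is only inferred here from those two numbers and is not stated in the source.

```latex
\begin{aligned}
d &= v \,\Delta t
   = \frac{100\,000\ \text{m}}{3600\ \text{s}} \times 0.1\ \text{s}
   \approx 2.78\ \text{m}, \\
\text{road length per pixel row} &\approx \frac{2.78\ \text{m}}{7\ \text{pixels}}
   \approx 0.40\ \text{m/pixel}.
\end{aligned}
```

A shift of 7 rows between consecutive remapped images is small compared with the image height, which is why consecutive frames are highly correlated.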

ACKNOWLEDGMENT

  • The authors express their gratitude to E. Dickmanns for his outstanding suggestions, to F. Gregoretti, L. Reyneri, C. Sansoé, and R. Passerone of the Polytechnic Institute of Torino, for the enthusiastic joint development of the PAPRICA system, and to G. Quaglia and all the friends from IEN Galileo Ferraris, Torino, for their help during the tests on MOB-LAB.
  • The authors also acknowledge the significant contribution of all the students who were involved in this project, in particular, A. Fascioli.
  • Finally, the authors are also indebted to G. Conte and G. Adorni for their support in this research.

62 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 7, NO. 1, JANUARY 1998
GOLD: A Parallel Real-Time Stereo Vision
System for Generic Obstacle and Lane Detection
Massimo Bertozzi, Student Member, IEEE, and Alberto Broggi, Associate Member, IEEE
I. INTRODUCTION
The main issues addressed in this work are lane detection
and obstacle detection, both implemented using
only visual data acquired from standard cameras installed on
a mobile vehicle.
A. Lane Detection
Road following, namely the closing of the control loop that
enables a vehicle to drive within a given portion of the road,
has been differently approached and implemented in research
prototype vehicles. Most of the systems developed worldwide
are based on lane detection: first, the relative position of the
vehicle with respect to the lane is computed, and then actuators
are driven to keep the vehicle in a safe position. Others [15],
[28], [38] are not based on the preliminary detection of the
road position, but, as in the case of ALVINN [43], [44], derive
the commands to issue to the actuators (steering wheel angles)
directly from visual patterns detected in the incoming images.
In any case, the knowledge of the lane position can be of use
for other purposes, such as the determination of the regions of
interest for other driving assistance functions.
Manuscript received April 5, 1996; revised March 24, 1997. This work
was supported in part by the Italian National Research Council under the
framework of the Progetto Finalizzato Trasporti 2. The associate editor
coordinating the review of this manuscript and approving it for publication
was Prof. Jeffrey J. Rodriguez.
The authors are with the Department of Information Technology, Uni-
versity of Parma, I-43100 Parma, Italy (e-mail: bertozzi@CE.UniPR.IT;
broggi@CE.UniPR.IT).
Publisher Item Identifier S 1057-7149(98)00313-3.
The main problems that must be faced in the detection
of road boundaries or lane markings are: 1) the presence of
shadows, producing artifacts onto the road surface, and thus
altering its texture, and 2) the presence of other vehicles on the
path, partly occluding the visibility of the road. Although some
systems have been designed to work on nonstructured roads
(without painted lane markings) [28] or on unstructured terrain
[39], [52], generally lane detection relies on the presence of
painted road markings on the road surface. Therefore, since
lane detection is generally based on the localization of a
specific pattern (the lane markings) in the acquired image,
it can be performed with the analysis of a single still image.
In addition, some assumptions can aid the detection algorithm
and/or speed-up the processing. They range from the analysis
of specific regions of interest in the image (in which, due to
both physical and continuity constraints, it is more probable
to find the lane markings) [18] to the assumption of a fixed-
width lane (thus dealing with only parallel lane markings), to
the assumption of a precise road geometry (such as a clothoid)
[18], [33], [58], to the assumption of a flat road (the one
considered in this work).
The techniques implemented in the previously mentioned
systems range from the determination of the characteristics
of painted lane markings [30], possibly aided by color
information [19], to the use of deformable templates (such as
LOIS [31], DBS [7], or ARCADE [29]), to an edge-based
recognition using a morphological paradigm [3], [5], [59], to
a model-based approach (as implemented in VaMoRs [26]
or SCARF [17]). A model-based analysis of road markings
has also been used to perform the analysis of intersections
in city traffic images [21], [32]; nevertheless, as discussed in
[46], the use of a model-based search approach has several
drawbacks, such as the problem of using and maintaining an
appropriate geometrical road model, the difficulty in detecting
and matching complex road features, and the complexity of
the computations involved.
Moreover, some systems (such as [46]) work in the velocity
domain instead of the image domain, thus using optical-
flow techniques in order to minimize the horizontal relative
movement of the lane markings with respect to the vehicle.
Unfortunately, such a solution requires both the preliminary
detection of lane markings and the following computation of
the optical flow field.
B. Obstacle Detection
The techniques used in the detection of obstacles may vary
according to the definition of “obstacle.” If “obstacle” means
a vehicle, then the detection is based on a search for specific
patterns, possibly supported by other features, such as shape
[56], symmetry [61], or the use of a bounding box [1]. Also,
in this case, the processing can be based on the analysis of a
single still image.
Conversely, if an obstacle is taken to be any object that can
obstruct the vehicle’s driving path, or anything rising
significantly above the road surface, obstacle detection is generally
reduced to the detection of free space instead of the
recognition of specific patterns. In this case, different tech-
niques can be used, such as 1) the analysis of the optical
flow field, and 2) the processing of stereo images; both
of these require two or more images, thus leading to a
higher computational complexity, which is further increased
by the necessity to handle noise caused by vehicle movements.
Obstacle detection using the optical flow approach [13], [20] is
generally divided into two steps: first, ego-motion is computed
from the analysis of optical flow [25] or obtained from
odometry [35]; then obstacles are detected by the analysis
of the differences between the expected and the real velocity
field.
On the other hand, the main problem of stereo vision
techniques is the detection of correspondences between two
stereo images (or three images, in case of trinocular vision
[49]). The advantage of the analysis of stereo images instead
of a monocular sequence of images is the possibility to
detect directly the presence of obstacles, which, in case of
an optical flow-based approach, is indirectly derived from the
analysis of the velocity field. Moreover, in the limiting case
where both vehicle and obstacles have low or zero speed,
the optical flow approach fails, while stereo analysis can still
detect obstacles. Furthermore, to decrease the intrinsic complexity of
stereo vision, some domain-specific constraints are generally
adopted.
As in [33], the Generic Obstacle and Lane Detection
(GOLD) system addresses both lane detection and obstacle
detection at the same time: lane detection is based on a
pattern-matching technique that relies on the presence of road
markings, while the localization of obstacles in front of the
vehicle is performed by the processing of pairs of stereo
images: in order to be fast and robust with respect to camera
calibration and vehicle movements, the detection of a generic
obstacle is reduced to the determination of the free-space
in front of the vehicle without any three-dimensional (3-D)
world reconstruction.
Both functionalities share the same underlying approach
(image warping), which is based on the assumption of a flat
road. Such a technique has been successfully used for the
computation of the optical flow field [36], for the detection
of obstacles in a structured environment [34], [60], or in the
automotive field [37], [42], [45] (using standard cameras) or
[50], [57] (using linear cameras). It is based on a transform
that, given a model of the road in front of the vehicle (e.g.
flat road), remaps the right image onto the left; any disparity
is caused by a deviation from the road model, thus detecting
possible obstacles.
Contrary to other works [33], [37], [42], GOLD performs
two warpings instead of one, remapping both images into
a different domain (road domain), in which the subsequent
processing is greatly simplified. Hence, the reprojection
[33], [58] of the results in the road domain is no longer required.
Moreover, since both GOLD functionalities are based on the
processing of images remapped into the same domain, the
fusion of the results of the two independent processing steps is
straightforward.
Fig. 1. (a) MOB-LAB land vehicle. (b) Control panel used as output to
display the processing results. (c) ARGO autonomous passenger car.
Fig. 2. (a) The width of road markings changes according to their position
within the image. (b) Due to the perspective effect, different pixels represent
different portions of the road.
The GOLD system has been tested on the mobile laboratory
(MOB-LAB) experimental land vehicle, integrating the results
of the Italian Research Units involved in the PROMETHEUS
project. MOB-LAB [see Fig. 1(a)] is equipped with four
cameras, two of which are used for this experiment, several
computers, monitors, and a control-panel [see Fig. 1(b)] to
give a visual feedback and warnings to the driver. The GOLD
system is now being ported to ARGO [2] [see Fig. 1(c)], a
Lancia Thema passenger car with automatic steering capabil-
ities.
This work is organized as follows: Section II presents
the basics of the underlying approach used to remove the
perspective effect from a monocular image, while Section III
describes its application to the processing of stereo images.
Section IV describes the lane detection and obstacle detec-
tion functionalities; Section V presents the computing engine
that has been developed as a support to the GOLD system;
Section VI presents the analysis of the time performance of
the current implementation; finally, Section VII ends the paper
with a discussion about the problems of the system, their
possible solutions, and future developments.
II. INVERSE PERSPECTIVE MAPPING
Due to its intrinsic nature, low-level image processing
is efficiently performed on single instruction multiple data
(SIMD) systems by means of a massively parallel computational
paradigm. However, this approach is meaningful only in
the case of generic filterings (such as noise reduction, edge
detection, and image enhancement), which consider the image
as a mere collection of pixels, independent of their semantic
content.
On the other hand, the implementation of more sophisticated
filters requires some semantic knowledge. As an example, let
us consider the specific problem of road markings detection in
an image acquired from a vehicle. Due to the perspective effect
introduced by the acquisition conditions, the width of the road
markings changes according to their distance from the camera [see
Fig. 2(a)]. Therefore, the correct detection of road markings
should be based on matching against patterns of different
sizes, according to the specific position within the image.
Unfortunately, this differentiated low-level processing cannot
be efficiently performed on SIMD massively parallel systems,
which by definition perform the same processing on each pixel
of the image.
Fig. 3. Relationship between the two coordinate systems.
The perspective effect associates different meanings to
different image pixels, depending on their position in the
image [see Fig. 2(b)]. Conversely, after the removal of the
perspective effect, each pixel represents the same portion of the
road,¹ allowing a homogeneous distribution of the information
among all image pixels. To remove the perspective effect, it is
necessary to know the specific acquisition conditions (camera
position, orientation, optics, etc.) and the scene represented in
the image (the road, which is now assumed to be flat). This
constitutes the a priori knowledge.
Now, recalling the example of road markings detection, the
size and shape of the matching template can be independent
of the pixel position. Therefore, road markings detection can
be conveniently divided into two steps: the first, exploiting the
a priori knowledge, is a transform that generates an image in
a new domain where the detection of the features of interest
is greatly simplified; the second, exploiting the sensorial
data, consists of a mere low-level morphological processing.
The removal of the perspective effect makes it possible to detect
road markings through an extremely simple and fast morphological
processing that can be efficiently implemented on massively
parallel SIMD architectures.
¹ A pixel in the lower part of the image of Fig. 2(a) represents a few cm²
of the road, while a pixel in the middle of the same image represents a few
tens of cm², or even more.
Fig. 4. (a) The xy plane in the W space and (b) the zη plane.
Fig. 5. (a) Original and remapped images. (b) In grey, the visible portion of the road.
A. Removing the Perspective Effect
The procedure for removing the perspective effect
resamples the incoming image, remapping each pixel to
a different position and producing a new two-dimensional (2-D)
array of pixels. The resulting image represents a top view
of the road region in front of the vehicle, as if it were observed
from a significant height.
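Two Euclidean spaces are defined, as follows.
  • $W = \{(x, y, z)\} \in E^3$, representing the 3-D world space
    (world-coordinate), where the real world is defined.
  • $I = \{(u, v)\} \in E^2$, representing the 2-D image space
    (screen-coordinate), where the 3-D scene is projected.
The image acquired by the camera belongs to the $I$ space,
while the remapped image is defined as the $z = 0$ plane of
the $W$ space (according to the assumption of a flat road). The
remapping process projects the acquired image onto the $z = 0$
plane of the 3-D world space $W$. Fig. 3 shows the relationships
between the two spaces $W$ and $I$.
1) $I \to W$ Mapping: In order to generate a 2-D view of a
3-D scene, the following parameters must be known [41].
1) Viewpoint: camera position is $C = (l, d, h) \in W$.
2) Viewing Direction: optical axis $\hat{o}$ is determined by the
following angles:
  $\bar{\gamma}$: the angle formed by the projection (defined by versor
  $\hat{\eta}$) of the optical axis $\hat{o}$ on the plane $z = 0$ and the
  $x$ axis [as shown in Fig. 4(a)];
  $\bar{\theta}$: the angle formed by the optical axis $\hat{o}$ and versor
  $\hat{\eta}$ [as shown in Fig. 4(b)].
3) Aperture: camera angular aperture is $2\alpha$.
4) Resolution: camera resolution is $n \times n$.
After simple manipulations [6], the final mapping $f: I \to W$
as a function of $u$ and $v$ is given by
$$
\begin{aligned}
x(u, v) &= h \cot\!\left[(\bar{\theta} - \alpha) + u \frac{2\alpha}{n - 1}\right] \cos\!\left[(\bar{\gamma} - \alpha) + v \frac{2\alpha}{n - 1}\right] + l \\
y(u, v) &= h \cot\!\left[(\bar{\theta} - \alpha) + u \frac{2\alpha}{n - 1}\right] \sin\!\left[(\bar{\gamma} - \alpha) + v \frac{2\alpha}{n - 1}\right] + d \\
z(u, v) &= 0
\end{aligned}
\qquad (1)
$$
with $u, v = 0, 1, \ldots, n - 1$. Given the coordinates $(u, v)$ of
a generic point $Q$ in the $I$ space, (1) returns the coordinates
$(x, y, 0)$ of the corresponding point $P$ in the $W$ space (see
Fig. 3).
2) $W \to I$ Mapping: The inverse transform $g: W \to I$
(the dual mapping) is given as follows [6]:
$$
u(x, y, 0) = \frac{\arctan\!\left[\dfrac{h \sin \gamma(x, y, 0)}{y - d}\right] - (\bar{\theta} - \alpha)}{\dfrac{2\alpha}{n - 1}}
$$
and
$$
v(x, y, 0) = \frac{\arctan\!\left[\dfrac{y - d}{x - l}\right] - (\bar{\gamma} - \alpha)}{\dfrac{2\alpha}{n - 1}}
\qquad (2)
$$
where $\gamma(x, y, 0) = \arctan[(y - d)/(x - l)]$.
The remapping process defined by (2) removes the perspective
effect and recovers the texture of the $z = 0$ plane of the
$W$ space. It is implemented by scanning the array of pixels of
coordinates $(x, y, 0) \in W$, which form the remapped image,
in order to associate to each of them the corresponding value
assumed by the point of coordinates $(u(x, y, 0), v(x, y, 0)) \in I$.
Fig. 6. Horopter surface corresponding to different angles between the
optical axes of two stereo cameras.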
As an example, Fig. 5(a) shows the original and remapped
images: it is clearly visible that in this case the road markings
width is almost invariant within the whole image. The reso-
lution of the remapped image has been chosen as a trade-off
between information loss and processing time; the remapped
image shown in Fig. 5(a) has been obtained without preserving
the original aspect-ratio. Note that the lower portion of the
remapped image is undefined: this is due to the specific camera
position and orientation [see Fig. 5(b)].
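As an illustration of the two mappings and of the scanning procedure, the sketch below implements one flat-road formulation in which each pixel row and column is associated with a viewing angle that varies linearly across the aperture. The class `Camera`, the function names, the field names (`l`, `d`, `h`, `theta`, `gamma`, `alpha`, `n`), the nearest-pixel lookup, and the numeric values are assumptions made for this example; the authoritative expressions are (1) and (2).

```python
import math
from dataclasses import dataclass


@dataclass
class Camera:
    l: float      # camera x position (m)
    d: float      # camera y position (m)
    h: float      # camera height above the road plane z = 0 (m)
    theta: float  # angle between the optical axis and the road plane (rad)
    gamma: float  # heading of the optical axis with respect to the x axis (rad)
    alpha: float  # half of the angular aperture (rad)
    n: int        # resolution: the image is n x n pixels


def image_to_road(u, v, cam):
    """Map pixel (u, v) to the road point (x, y) on the plane z = 0."""
    step = 2.0 * cam.alpha / (cam.n - 1)
    tilt = (cam.theta - cam.alpha) + u * step   # must stay positive
    pan = (cam.gamma - cam.alpha) + v * step
    ground = cam.h / math.tan(tilt)             # distance along the road
    return ground * math.cos(pan) + cam.l, ground * math.sin(pan) + cam.d


def road_to_image(x, y, cam):
    """Inverse mapping: road point (x, y, 0) to fractional pixel (u, v)."""
    step = 2.0 * cam.alpha / (cam.n - 1)
    pan = math.atan2(y - cam.d, x - cam.l)
    tilt = math.atan2(cam.h, math.hypot(x - cam.l, y - cam.d))
    return ((tilt - (cam.theta - cam.alpha)) / step,
            (pan - (cam.gamma - cam.alpha)) / step)


def remap(image, cam, xs, ys):
    """Top view: for each road point fetch the nearest source pixel."""
    top = []
    for x in xs:
        row = []
        for y in ys:
            u, v = road_to_image(x, y, cam)
            ui, vi = round(u), round(v)
            inside = 0 <= ui < cam.n and 0 <= vi < cam.n
            row.append(image[ui][vi] if inside else None)  # None = undefined
        top.append(row)
    return top


cam = Camera(l=0.0, d=0.0, h=1.5, theta=0.3, gamma=0.0, alpha=0.2, n=16)
x, y = image_to_road(10, 5, cam)
u, v = road_to_image(x, y, cam)               # round trip back to (10, 5)
image = [[r for _ in range(16)] for r in range(16)]
top = remap(image, cam, xs=[x, 0.1], ys=[y])  # second point is out of view
```

Road points that fall outside the field of view have no source pixel, which is how an undefined region such as the lower portion of the remapped image arises.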
III. STEREO INVERSE PERSPECTIVE MAPPING
A 3-D description of the world using a single 2-D image
is impossible without a priori knowledge, due to the depth
loss during acquisition; for many years stereo vision has
been investigated as an answer to this problem. Generally,
traditional techniques for the processing of pairs of stereo
images are divided into the following four main steps:
1) calibration of the two cameras;
2) localization of a feature in an image;
3) identification and localization of the same feature in the
other image;
4) reconstruction of the 3-D scene.
Whenever the mapping between points corresponding to the
same feature (homologous points) can be determined, the prob-
lem of 3-D reconstruction can be solved using triangulations.
The intrinsic complexity of the determination of homologous
points can be reduced with the introduction of some domain-
specific constraints, such as the assumption of a flat road in
front of the cameras.
The set of world points whose projections in the image spaces
of the left and right cameras coincide is called the horopter
and represents the zero-disparity surface of the stereo
system [11]. This means that the
two stereo views of an object whose shape and displacement
matches the horopter are identical. This concept is extremely
useful when the horopter coincides with a model of the road
surface, since any deviation from this model can be easily
detected. The horopter is a spherical surface: the smaller the
difference between the orientations of the two cameras (camera
vergence), the larger its radius [22]. Assuming a small camera
vergence, as generally happens in the automotive field, the
horopter can be considered planar. As shown in Fig. 6, the
horopter can be moved acting on camera vergence parameters.
Unfortunately, the horopter cannot be overlapped with the
$z = 0$ plane (representing the flat road model) using only
camera vergence; for this purpose, electronic vergence, such
as inverse perspective mapping (IPM), is required.
In this way the search for homologous points is reduced to a
simple verification (check) of the shape of the horopter: in fact
under the flat road hypothesis, the IPM algorithm can be used
to produce an image representing the road as seen from the top.
Using the IPM algorithm with appropriate parameters on stereo
images, different patches of the road surface can be obtained.
Moreover, knowledge of the parameters of the whole vision
system allows the two road patches to be brought into correspondence.
This means that, under the flat road hypothesis, pairs of
pixels having the same image coordinates in the two remapped
images are homologous points and represent the same points
in the road plane.
The flat road hypothesis can be verified by computing the difference
between the two remapped images: a generic obstacle
(anything rising out of the road) is detected if the difference
image presents sufficiently large clusters of nonzero pixels
having a specific shape. Due to the different position of the
two cameras, the difference image can be computed only for
the overlapping area of the two road patches.
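The verification step can be sketched as follows, assuming the two remapped images are nested lists of grey levels in which `None` marks cells outside a camera's road patch. The function names, the grey-level `threshold`, and the pixel-count test are assumptions of this example; in particular, counting nonzero pixels is a crude stand-in for the shape-based cluster analysis mentioned above.

```python
def difference_image(left_top, right_top, threshold):
    """Binary difference of two remapped (top-view) images.

    Cells undefined (None) in either view lie outside the overlapping
    area of the two road patches and are set to 0.
    """
    diff = []
    for left_row, right_row in zip(left_top, right_top):
        out = []
        for a, b in zip(left_row, right_row):
            if a is None or b is None:
                out.append(0)
            else:
                out.append(1 if abs(a - b) > threshold else 0)
        diff.append(out)
    return diff


def flat_road_violated(diff, min_pixels):
    """Reject the flat-road hypothesis if enough pixels disagree."""
    return sum(sum(row) for row in diff) >= min_pixels


# On a flat road both top views agree; a raised object is remapped to
# different positions in the two views and leaves nonzero pixels.
left = [[50, 50, 50, None],
        [50, 200, 50, 50]]
right = [[52, 50, 50, 50],
         [50, 50, 200, 50]]
diff = difference_image(left, right, threshold=20)
obstacle = flat_road_violated(diff, min_pixels=2)
```

The threshold absorbs small grey-level differences between the two cameras, so only cells where the views genuinely disagree survive in the difference image.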
In addition, it can easily be shown that the IPM algorithm
maps straight lines perpendicular to the road plane into straight
lines passing through the projection of the
camera onto the $z = 0$ plane (see Fig. 4): using formula
(1), a vertical straight line is represented by the set of pixels

Citations
More filters
Journal ArticleDOI
TL;DR: A review of recent vision-based on-road vehicle detection systems where the camera is mounted on the vehicle rather than being fixed such as in traffic/driveway monitoring systems is presented.
Abstract: Developing on-board automotive driver assistance systems aiming to alert drivers about driving environments, and possible collision with other vehicles has attracted a lot of attention lately. In these systems, robust and reliable vehicle detection is a critical step. This paper presents a review of recent vision-based on-road vehicle detection systems. Our focus is on systems where the camera is mounted on the vehicle rather than being fixed such as in traffic/driveway monitoring systems. First, we discuss the problem of on-road vehicle detection using optical sensors followed by a brief review of intelligent vehicle research worldwide. Then, we discuss active and passive sensors to set the stage for vision-based vehicle detection. Methods aiming to quickly hypothesize the location of vehicles in an image as well as to verify the hypothesized locations are reviewed next. Integrating detection with tracking is also reviewed to illustrate the benefits of exploiting temporal continuity for vehicle detection. Finally, we present a critical overview of the methods discussed, we assess their potential for future deployment, and we present directions for future research.

1,181 citations


Cites background from "GOLD: a parallel real-time stereo v..."

  • ...Goerick et al. [43] andNoli et al. [72] usedthe (LOC) method (see Section 5.1.5) to extract edge information....

    [...]

  • ...A brief review of active and passive sensors is presented in Section 3....

    [...]

  • ...Detailed reviews of Hypothesis Generation (HG) and Hypothesis Verification (HV) methods are presented in Sections 5 and 6 while exploring temporal continuity by integrating detection with tracking is discussed in Section 7....

    [...]

  • ...We have described in Section 5 a multiresolution scheme addressing these issues....

    [...]

  • ...Several national and international projects have been launched over the past several years to investigate new technologies for improving safety and accident prevention (see Section 2)....

    [...]

Journal ArticleDOI
TL;DR: A comparison of a wide variety of methods, pointing out the similarities and differences between methods as well as when and where various methods are most useful, is presented.
Abstract: Driver-assistance systems that monitor driver intent, warn drivers of lane departures, or assist in vehicle guidance are all being actively considered. It is therefore important to take a critical look at key aspects of these systems, one of which is lane-position tracking. These driver-assistance objectives motivate the development of the novel "video-based lane estimation and tracking" (VioLET) system. The system is designed using steerable filters for robust and accurate lane-marking detection. Steerable filters provide an efficient method for detecting circular-reflector markings, solid-line markings, and segmented-line markings under varying lighting and road conditions. They help in providing robustness to complex shadowing, lighting changes from overpasses and tunnels, and road-surface variations. They are efficient for lane-marking extraction because by computing only three separable convolutions, we can extract a wide variety of lane markings. Curvature detection is made more robust by incorporating both visual cues (lane markings and lane texture) and vehicle-state information. The experiment design and evaluation of the VioLET system are shown using multiple quantitative metrics over a wide variety of test conditions on a large test path using a unique instrumented vehicle. A justification for the choice of metrics based on a previous study with human-factors applications as well as extensive ground-truth testing from different times of day, road conditions, weather, and driving scenarios is also presented. In order to design the VioLET system, an up-to-date and comprehensive analysis of the current state of the art in lane-detection research was first performed. In doing so, a comparison of a wide variety of methods, pointing out the similarities and differences between methods as well as when and where various methods are most useful, is presented.

1,056 citations


Cites background or methods from "GOLD: a parallel real-time stereo v..."

  • ...[20] M. Bertozzi and A. Broggi, “GOLD: A parallel real-time stereo vision system for generic obstacle and lane detection,” IEEE Trans....

    [...]

  • ...Bertozzi and Broggi [20] assumed that the road markings form...

    [...]

  • ...The generic obstacle and lane detection (GOLD) system [20] combined lane-position tracking...

    [...]

  • ...[15] M. Bertozzi, A. Broggi, M. Cellario, A. Fascioli, P. Lombardi, and M. Porta, “Artificial vision in road vehicles,” Proc....

    [...]

  • ...Bertozzi and Broggi [20] assumed that the road markings form parallel lines in an inverse-perspective-warped image....

    [...]

Journal ArticleDOI
TL;DR: A robust algorithm, called CHEVP, is presented for providing a good initial position for the B-Snake model, and a Minimum Mean Square Error (MMSE) method is proposed to determine the control points of the B-Snake model from the overall image forces on both sides of the lane.

812 citations


Cites methods from "GOLD: a parallel real-time stereo v..."

  • ...The feature-based technique localizes the lanes in the road images by combining the low-level features, such as painted lines [5–10] or lane edges [1,2], etc....

    [...]

Patent
26 Oct 2015
TL;DR: In this article, a forward-facing vision system for a vehicle includes a forward-facing camera disposed in a windshield electronics module attached at a windshield of the vehicle and viewing through the windshield.
Abstract: A forward-facing vision system for a vehicle includes a forward-facing camera disposed in a windshield electronics module attached at a windshield of the vehicle and viewing through the windshield. A control includes a processor that, responsive to processing of captured image data, detects taillights of leading vehicles during nighttime conditions and, responsive to processing of captured image data, detects lane markers on a road being traveled by the vehicle. The control, responsive to lane marker detection and a determination that the vehicle is drifting out of a traffic lane, may control a steering system of the vehicle to mitigate such drifting, with the steering system manually controllable by a driver of the vehicle irrespective of control by the control. The processor, based at least in part on detection of lane markers via processing of captured image data, determines curvature of the road being traveled by the vehicle.

615 citations

Journal ArticleDOI
TL;DR: This paper presents an overview of image processing and analysis tools used in traffic applications and relates these tools with complete systems developed for specific traffic applications, and categorizes processing methods based on the intrinsic organization of their input data and the domain of processing.

606 citations


Cites background or methods from "GOLD: a parallel real-time stereo v..."

  • ...Using this approach, the localization of the lane and the detection of generic obstacles on the road can be performed without any 3D world reconstruction [4]....

    [...]

  • ...GOLD system [4] for ARGO and MOB-LAB vehicles (Prometheus project) † Spatial-domain processing for alf and object detection † Temporal projection of lane locations † Feature-driven approach † Autonomous vehicle guidance † Temporal estimation of vehicle’s state variables † Edge detection constrained on lane width in each stereo image for alf † Moving camera...

    [...]

  • ...Morphological edge-detection schemes have been extensively applied, since they exhibit superior performance [4,18,50]....

    [...]

  • ...Thus, an important task of video systems is to remove the inherent perspective effect from acquired images [3,4]....

    [...]

  • ...Furthermore, the inverse perspective mapping can be used to simplify the process of lane detection, similar to the process of object detection considered in Section 3 [4]....

    [...]

References
Book
11 Feb 1984
TL;DR: This invaluable reference helps readers assess and simplify problems and their essential requirements and complexities, giving them all the necessary data and methodology to master current theoretical developments and applications, as well as create new ones.
Abstract: Image Processing and Mathematical Morphology-Frank Y. Shih 2009-03-23 In the development of digital multimedia, the importance and impact of image processing and mathematical morphology are well documented in areas ranging from automated vision detection and inspection to object recognition, image analysis and pattern recognition. Those working in these ever-evolving fields require a solid grasp of basic fundamentals, theory, and related applications—and few books can provide the unique tools for learning contained in this text. Image Processing and Mathematical Morphology: Fundamentals and Applications is a comprehensive, wide-ranging overview of morphological mechanisms and techniques and their relation to image processing. More than merely a tutorial on vital technical information, the book places this knowledge into a theoretical framework. This helps readers analyze key principles and architectures and then use the author’s novel ideas on implementation of advanced algorithms to formulate a practical and detailed plan to develop and foster their own ideas. The book: Presents the history and state-of-the-art techniques related to image morphological processing, with numerous practical examples Gives readers a clear tutorial on complex technology and other tools that rely on their intuition for a clear understanding of the subject Includes an updated bibliography and useful graphs and illustrations Examines several new algorithms in great detail so that readers can adapt them to derive their own solution approaches This invaluable reference helps readers assess and simplify problems and their essential requirements and complexities, giving them all the necessary data and methodology to master current theoretical developments and applications, as well as create new ones.

9,566 citations


"GOLD: a parallel real-time stereo v..." refers background or methods in this paper

  • ...The enhancement of the filtered image is performed through a few iterations of a geodesic morphological dilation [53] with the following binary structuring element:...

    [...]

  • ...otherwise (7) is the control image [53]. The result of the geodesic dilation is the product between the control image and the maximum value computed among all the pixels belonging to the neighborhood described by the structuring element....

    [...]

  • ...Graphical operators derive from mathematical morphology [27], [53], a bit-map approach to image processing based on set theory....

    [...]

Journal ArticleDOI
TL;DR: In this paper, techniques for low power operation are presented which use the lowest possible supply voltage coupled with architectural, logic style, circuit, and technology optimizations to reduce power consumption in CMOS digital circuits while maintaining computational throughput.
Abstract: Motivated by emerging battery-operated applications that demand intensive computation in portable environments, techniques are investigated which reduce power consumption in CMOS digital circuits while maintaining computational throughput. Techniques for low-power operation are shown which use the lowest possible supply voltage coupled with architectural, logic style, circuit, and technology optimizations. An architecturally based scaling strategy is presented which indicates that the optimum voltage is much lower than that determined by other scaling considerations. This optimum is achieved by trading increased silicon area for reduced power consumption.

2,690 citations

Journal ArticleDOI
TL;DR: The tutorial provided in this paper reviews both binary morphology and gray scale morphology, covering the operations of dilation, erosion, opening, and closing and their relations.
Abstract: For the purposes of object or defect identification required in industrial vision applications, the operations of mathematical morphology are more useful than the convolution operations employed in signal processing because the morphological operators relate directly to shape. The tutorial provided in this paper reviews both binary morphology and gray scale morphology, covering the operations of dilation, erosion, opening, and closing and their relations. Examples are given for each morphological concept and explanations are given for many of their interrelationships.

2,676 citations


"GOLD: a parallel real-time stereo v..." refers background in this paper

  • ...Graphical operators derive from mathematical morphology [27], [53], a bit-map approach to image processing based on set theory....

    [...]

  • ...The low-level portion of the processing, detailed in Fig. 14, is thus reduced to the difference between the two remapped images, a threshold, and a morphological opening [27] aimed at the removal of small-sized details in the thresholded image....

    [...]

Journal Article
TL;DR: An architecturally based scaling strategy is presented which indicates that the optimum voltage is much lower than that determined by other scaling considerations, and is achieved by trading increased silicon area for reduced power consumption.
Abstract: Motivated by emerging battery-operated applications that demand intensive computation in portable environments, techniques are investigated which reduce power consumption in CMOS digital circuits while maintaining computational throughput. Techniques for low-power operation are shown which use the lowest possible supply voltage coupled with architectural, logic style, circuit, and technology optimizations. An architecturally based scaling strategy is presented which indicates that the optimum voltage is much lower than that determined by other scaling considerations. This optimum is achieved by trading increased silicon area for reduced power consumption.

2,337 citations


"GOLD: a parallel real-time stereo v..." refers background in this paper

  • ...Thus, for power saving reasons, it is desirable to operate at the lowest possible speed, but, in order to maintain the overall system performance, compensation for these increased delays is required [14], [16]....

    [...]

Book
01 Jan 1973
TL;DR: The principles of interactive computer graphics are discussed in this article, where the authors propose a set of principles for the development of computer graphics systems, including the principles of Interactive Computer Graphics (ICG).

1,243 citations

Frequently Asked Questions (14)
Q1. What are the contributions in "GOLD: a parallel real-time stereo vision system for generic obstacle and lane detection"?

This paper describes the Generic Obstacle and Lane Detection system (GOLD), a stereo vision-based hardware and software architecture to be used on moving vehicles to increase road safety.

The remapping process takes three 50 ns clock cycles per pixel, giving a total of about 3 ms to generate a 128 × 128 remapped image.
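
The timing figure can be checked with a few lines of arithmetic. The sketch below only uses the numbers quoted in the answer (three cycles per pixel, 50 ns per cycle, 128 pixels per side); the function and constant names are illustrative and do not come from the paper.

```python
# Figures quoted in the text: 3 clock cycles per pixel, 50 ns per cycle,
# and a 128 x 128 remapped image. Names are illustrative only.
CYCLES_PER_PIXEL = 3
CLOCK_PERIOD_NS = 50
IMAGE_SIDE = 128


def remap_time_ms(side=IMAGE_SIDE, cycles=CYCLES_PER_PIXEL, period_ns=CLOCK_PERIOD_NS):
    """Return the time needed to remap a side x side image, in milliseconds."""
    pixels = side * side
    total_ns = pixels * cycles * period_ns
    return total_ns / 1e6


print(f"{remap_time_ms():.2f} ms")  # 2.46 ms
```

The exact product is about 2.46 ms, which is consistent with the rounded "about 3 ms" stated above.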

The removal of the perspective effect allows road markings to be detected through extremely simple and fast morphological processing that can be efficiently implemented on massively parallel SIMD architectures.

Since the GOLD system is composed of two independent computational engines (the PAPRICA system, running the low-level processing, and its host computer, running the medium-level processing), it can work in a pipelined fashion.

The choice of depends on the road markings width, on the image acquisition process, and on the parameters used in the remapping phase. 
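
As a rough illustration of why such a parameter depends on marking width, here is a minimal sketch of a bright-line detector for a remapped image. It assumes the parameter is a horizontal pixel distance (called `m` here, a hypothetical name) and that a marking pixel is one that is brighter than both neighbours `m` pixels to its left and right. The function name, the `margin` argument, and the toy image are assumptions made for this example, not the paper's implementation.

```python
def bright_line_mask(image, m, margin=0):
    """Mark pixels brighter (by more than margin) than both horizontal
    neighbours at distance m. `image` is a list of rows of grey levels."""
    height = len(image)
    width = len(image[0])
    mask = [[0] * width for _ in range(height)]
    for y in range(height):
        # Columns closer than m to a border have no neighbour pair; skip them.
        for x in range(m, width - m):
            centre = image[y][x]
            left = image[y][x - m]
            right = image[y][x + m]
            if centre - left > margin and centre - right > margin:
                mask[y][x] = 1
    return mask


# Toy remapped image: a one-pixel-wide vertical bright line on dark road.
road = [
    [10, 10, 10, 200, 10, 10, 10],
    [10, 10, 10, 200, 10, 10, 10],
    [10, 10, 10, 200, 10, 10, 10],
]
mask = bright_line_mask(road, m=2, margin=50)
print(mask[0])  # [0, 0, 0, 1, 0, 0, 0]
```

In this sketch, a distance smaller than the marking width would compare marking pixels with each other and miss the line, which is one way to see why the choice is tied to marking width and remapping parameters.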

The last phase of the whole computational cycle is the displaying of results on the control panel, issuing warnings to the driver. 

The main problems that must be faced in the detection of road boundaries or lane markings are: 1) the presence of shadows, producing artifacts onto the road surface, and thus altering its texture, and 2) the presence of other vehicles on the path, partly occluding the visibility of the road. 

In order to allow a nonfixed road geometry (and also the handling of curves) the histogram is lowpass filtered; finally, its maximum value is determined. 
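
The filter-then-maximum step can be sketched as follows. The excerpt does not say which low-pass filter is used, so a simple moving average is assumed here; `lowpass`, `peak_index`, and the sample histogram are illustrative names and data.

```python
def lowpass(hist, radius=1):
    """Moving-average filter; windows are truncated at the borders."""
    n = len(hist)
    out = []
    for i in range(n):
        lo = max(0, i - radius)
        hi = min(n, i + radius + 1)
        window = hist[lo:hi]
        out.append(sum(window) / len(window))
    return out


def peak_index(hist):
    """Index of the first maximum value in the histogram."""
    return max(range(len(hist)), key=lambda i: hist[i])


# A narrow spike at bin 3 and a broader bump around bin 7.
hist = [0, 1, 0, 9, 0, 0, 4, 5, 4, 0]
smoothed = lowpass(hist)
print(peak_index(hist), peak_index(smoothed))  # 3 7
```

Smoothing spreads votes over neighbouring bins, so a broad cluster of evidence can outweigh an isolated spike. That tolerance to spread is what lets the maximum remain meaningful when the road geometry is not fixed.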

Due to the small distance between and instead of computing two different polar histograms (having focus on and , a single one is considered. 

The horopter cannot be overlapped with the plane (representing the flat road model) using only camera vergence; for this purpose, electronic vergence, such as inverse perspective mapping (IPM), is required.

The power consumption of dynamic systems can be considered proportional to $C f V^2$, where $C$ represents the capacitance of the circuit, $f$ is the clock frequency, and $V$ is the voltage swing.
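
To make the trade-off concrete, the short sketch below assumes the usual CMOS dynamic-power relation, power proportional to capacitance × clock frequency × voltage swing squared, and compares scaled operating points in relative units. The function name and the unit values are illustrative.

```python
def relative_dynamic_power(capacitance, frequency, voltage):
    """Dynamic power up to a constant factor, assuming P ~ C * f * V**2."""
    return capacitance * frequency * voltage ** 2


base = relative_dynamic_power(1.0, 1.0, 1.0)
half_voltage = relative_dynamic_power(1.0, 1.0, 0.5)
half_frequency = relative_dynamic_power(1.0, 0.5, 1.0)

# Halving the voltage swing quarters the power; halving the clock halves it.
print(half_voltage / base, half_frequency / base)  # 0.25 0.5
```

Under this assumption, voltage reduction is the stronger lever, which is why the cited low-power literature pairs it with architectural changes that recover the lost speed.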

Considering an operational vehicle speed of 100 km/h and the MOB-LAB calibration setup, the vertical shift between two subsequent remapped images corresponding to two frames acquired with a temporal shift of 100 ms is only 7 pixels. 
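
The distance travelled between the two frames follows directly from the quoted speed and frame interval. The sketch below computes it and, purely as an inference, the ground distance per pixel that a 7-pixel shift would imply if the shift were due only to forward motion along the vertical axis of the remapped image. That per-pixel figure is not stated in the text, and the function name is illustrative.

```python
def displacement_m(speed_kmh, dt_s):
    """Distance travelled in metres at speed_kmh over dt_s seconds."""
    return speed_kmh / 3.6 * dt_s


d = displacement_m(100, 0.1)
print(f"{d:.2f} m")  # 2.78 m

# Assumption: the 7-pixel shift is caused by forward motion alone.
implied_m_per_pixel = d / 7
print(f"{implied_m_per_pixel:.2f} m per pixel")  # 0.40 m per pixel
```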

The farther the obstacle, the smaller the portion of triangles detectable in the difference image, and thus the lower the amplitude of peaks in the polar histogram; nevertheless, for sufficiently high obstacles (e.g., vehicles about 50 m from the cameras), the main problem is not the detection of peaks, but their joining, as shown in Fig. 28(a)–(c).

As shown in Fig. 24, the whole processing (lane and obstacle detection) requires five time slots (100 ms); the GOLD system therefore works at a rate of 10 Hz.