GOLD: a parallel real-time stereo vision system for generic obstacle and lane detection
Summary (3 min read)
A. Lane Detection
- Lane detection is performed assuming that a road marking in the remapped image is represented by a quasi-vertical bright line of constant width surrounded by a darker region (the road).
- Thus, the pixels belonging to a road marking have a brightness value higher than their left and right neighbors at a given horizontal distance.
- The result of the geodesic dilation is the product between the control image and the maximum value computed among all the pixels belonging to the neighborhood described by the structuring element.
- Subsequently, all pairs with are considered: the image is scanned line by line from its bottom (where it is more probable to be able to detect the center of the road) to its top, and the longest chain of road centers is built, exploiting the image vertical correlation.
- Fig. 12(a) shows the final result starting from the binary image shown in Fig. 9(e).
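The brightness test described above (a marking pixel is brighter than both neighbors at a given horizontal distance) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `marking_response`, the parameter name `m` for the horizontal distance, and the choice to sum the two brightness differences are all assumptions made for the example.

```python
def marking_response(image, m):
    """Highlight pixels brighter than both horizontal neighbors at distance m.

    image: list of rows, each a list of grey levels (remapped, top-view image).
    m: horizontal distance in pixels (hypothetical parameter name).
    Returns an image of the same size; 0 where the test fails.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        row = image[y]
        # Columns closer than m to the border have no neighbor on one side.
        for x in range(m, width - m):
            d_left = row[x] - row[x - m]
            d_right = row[x] - row[x + m]
            # Assumption: the response is the sum of both differences
            # when the pixel is strictly brighter than both neighbors.
            if d_left > 0 and d_right > 0:
                out[y][x] = d_left + d_right
    return out
```

A thin bright line produces a response, while a plain brightness step (for example, the border of a large bright area) does not, because only one of the two differences is positive there.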
B. Obstacle Detection
- As shown in Section III, the stereo IPM technique can produce a difference image in which ideal square obstacles are transformed into two triangles.
- The focus of the polar histogram is placed midway between the projections of the two cameras onto the road plane; in this case the polar histogram presents an appreciable peak corresponding to each triangle.
- Since the presence of an obstacle produces two disjoint triangles (corresponding to its edges) in the difference image, obstacle detection is reduced to the search for pairs of adjacent peaks; the position of a peak, in fact, determines the angle of view under which the obstacle edge is seen (see Fig. 17).
- According to the notation of Fig. 19, a ratio between two areas is defined. If this ratio is greater than a threshold, two adjacent peaks are considered as generated by the same obstacle and are joined (see Fig. 20).
- Fig. 22 shows the results obtained in a number of different situations.
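The polar histogram used above can be sketched as a count of nonzero difference-image pixels per viewing angle from the focus. This is a minimal sketch under stated assumptions: the function name `polar_histogram`, the number of angular bins, and the convention that the focus lies at or below the bottom of the image are illustrative choices, not details taken from the paper.

```python
import math


def polar_histogram(binary, focus, bins=180):
    """Count nonzero pixels per angular bin as seen from the focus.

    binary: list of rows of 0/1 values (thresholded difference image).
    focus: (x, y) position of the focus in image coordinates; it is assumed
           to lie at or below the bottom row, so angles fall within [0, pi].
    bins: number of angular bins covering the half-plane above the focus.
    """
    fx, fy = focus
    hist = [0] * bins
    for y, row in enumerate(binary):
        for x, value in enumerate(row):
            if not value:
                continue
            # Image y grows downward, so fy - y is the height above the focus.
            angle = math.atan2(fy - y, x - fx)
            if angle < 0:
                continue  # pixel lies below the focus; ignored in this sketch
            index = min(int(angle / math.pi * bins), bins - 1)
            hist[index] += 1
    return hist
```

Each triangle in the difference image concentrates its pixels in a narrow range of angles, so it shows up as a peak in the returned list; pairing adjacent peaks is a separate step that is not sketched here.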
A. Removing the Perspective Effect
- The procedure that removes the perspective effect resamples the incoming image, remapping each pixel to a different position and producing a new two-dimensional (2-D) array of pixels.
- The resulting image represents a top view of the road region in front of the vehicle, as if it were observed from a significant height.
- The 3-D world space (world coordinates) is the space in which the real world is defined.
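The resampling step described above can be illustrated with a lookup table that gives, for every pixel of the top-view image, the source pixel to copy from. This is only a sketch: the function name `remap`, the lookup-table representation, and the fill value are assumptions for the example, and the table itself would have to come from the camera calibration, which is not reproduced here.

```python
def remap(image, lookup, fill=0):
    """Build a remapped image by copying pixels through a lookup table.

    image: list of rows of grey levels (acquired image).
    lookup: list of rows; each entry is either (src_x, src_y) into `image`
            or None when the output pixel has no visible source pixel.
    fill: value used where the lookup entry is None (hypothetical choice).
    """
    out = []
    for lookup_row in lookup:
        out_row = []
        for src in lookup_row:
            if src is None:
                out_row.append(fill)
            else:
                src_x, src_y = src
                out_row.append(image[src_y][src_x])
        out.append(out_row)
    return out
```

Because the table depends only on the camera setup, it can be computed once and reused for every frame, which keeps the per-pixel work to a single read and a single write.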
III. STEREO INVERSE PERSPECTIVE MAPPING
- A 3-D description of the world using a single 2-D image is impossible without a priori knowledge, due to the depth loss during acquisition; for many years stereo vision has been investigated as an answer to this problem.
- The intrinsic complexity of the determination of homologous points can be reduced with the introduction of some domain-specific constraints, such as the assumption of a flat road in front of the cameras.
- The set of world points whose projections in the left and right camera images coincide is called the horopter and represents the zero-disparity surface of the stereo system [11].
- This concept is extremely useful when the horopter coincides with a model of the road surface, since any deviation from this model can be easily detected.
- The flat road hypothesis can be verified by computing the difference between the two remapped images: a generic obstacle (anything rising up from the road) is detected if the difference image presents sufficiently large clusters of nonzero pixels having a specific shape.
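The flat-road check above can be sketched as a thresholded difference between the two remapped images followed by a search for large clusters. This is a simplified illustration: the function names, the threshold, and the use of 4-connected cluster size are assumptions, and the shape test mentioned in the text is not modeled.

```python
def difference_mask(left, right, threshold):
    """Mark pixels where the two remapped images differ by more than threshold."""
    return [
        [1 if abs(a - b) > threshold else 0 for a, b in zip(row_l, row_r)]
        for row_l, row_r in zip(left, right)
    ]


def largest_cluster(mask):
    """Return the size of the largest 4-connected cluster of nonzero pixels."""
    height = len(mask)
    width = len(mask[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    best = 0
    for y in range(height):
        for x in range(width):
            if not mask[y][x] or seen[y][x]:
                continue
            # Iterative flood fill from this unseen nonzero pixel.
            stack = [(x, y)]
            seen[y][x] = True
            size = 0
            while stack:
                cx, cy = stack.pop()
                size += 1
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if 0 <= nx < width and 0 <= ny < height:
                        if mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            stack.append((nx, ny))
            best = max(best, size)
    return best
```

With a perfectly flat road and ideal calibration the two remapped images coincide, the mask is empty, and `largest_cluster` returns 0; an obstacle would be suspected only when the largest cluster exceeds some minimum size chosen by the user.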
A. Camera Calibration
- From the above description, it can be seen that the calibration of the vision system plays a basic role.
- Recalling the definitions and notations given in Section II-A1, the calibration parameters can be divided into the following two categories.
- Extrinsic parameters (view point and viewing direction), which can be determined by measurements and possibly tuned.
- After the independent calibration of both cameras, the parameters are fine-tuned by applying the stereo IPM algorithm iteratively and minimizing the disparities between the two remapped images of a flat road acquired with the vehicle standing still.
- The acquisition parameters of the camera installed onto MOB-LAB are shown in Table I. Fig. 8(a) shows the horizontal calibration of the left camera installed onto MOB-LAB, while Fig. 8(b) shows the remapped image with an aspect ratio of one.
IV. DRIVING ASSISTANCE FUNCTIONS
- In the following section the lane detection and obstacle detection functionalities are discussed.
- Both of them are divided into a low-level phase that can be efficiently expressed with a SIMD computational paradigm and a serial high- and medium-level phase.
V. THE COMPUTING ARCHITECTURE
- Due to the specific field of application, the response time of the system is a major critical point, since it affects directly the maximum speed allowed for the vehicle; the choice of the computing architecture is, thus, a key design issue [10].
- These systems, using slower device speeds, provide an effective mechanism to trade power consumption for silicon area, while maintaining the computational power unchanged.
- 4) Since the number of processing units must be high, if the system has size constraints the PE’s must be extremely simple, performing only simple basic operations.
- The result of the graphical operator is then either stored in a destination bit-plane or used as the first operand of the following logical operation.
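The trade between power consumption and silicon area mentioned above can be made concrete with the standard dynamic-power relation. The following is a textbook-style sketch rather than a derivation from the paper: the symbols $N$ (number of parallel units) and $V'$ (reduced voltage swing) are introduced here for illustration only.

```latex
% Dynamic power of a circuit with capacitance C, clock frequency f, voltage swing V:
P \propto C f V^{2}
% Assumed scenario: N parallel units, each clocked at f/N, with a reduced swing V' < V.
% Total capacitance grows to N C, while throughput is unchanged:
P' \propto (N C)\,\frac{f}{N}\,V'^{2} = C f V'^{2}
\qquad\Longrightarrow\qquad
\frac{P'}{P} = \left(\frac{V'}{V}\right)^{2}
```

Under these assumptions, the slower clock permits a lower voltage swing, and the power falls with the square of that swing, at the cost of roughly $N$ times the silicon area.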
VI. PERFORMANCE ANALYSIS
- Since the GOLD system is composed of two independent computational engines (the PAPRICA system, running the low-level processing, and its host computer, running the medium-level processing), it can work in a pipelined fashion.
- As shown in Fig. 24, the lane detection and obstacle detection tasks are divided into the following categories.
- 1) Data Acquisition and Output: A pair of grey-level stereo images of size 512 × 256 pixels is acquired simultaneously and written directly into PAPRICA image memory.
- At the same time, the results of previous computations are displayed on an on-board monitor to give visual feedback to the driver.
- This phase, again managed by the PAPRICA system, takes 25 ms; the result is then transferred (in 3 ms) to the host computer.
VII. DISCUSSION
- A system (hardware and software) for lane and obstacle detection has been presented, satisfying the hard real-time constraints imposed by the automotive field.
- The farther the obstacle, the smaller the portion of triangles detectable in the difference image, and thus the lower the amplitude of peaks in the polar histogram; nevertheless, for sufficiently high obstacles (e.g., vehicles about 50 m from the cameras), the main problem is not the detection of peaks, but their joining, as shown in Fig. 28(a)–(c).
- Considering an operational vehicle speed of 100 km/h and the MOB-LAB calibration setup, the vertical shift between two subsequent remapped images corresponding to two frames acquired with a temporal shift of 100 ms is only 7 pixels.
- This high correlation allows the results of the processing to be averaged over time, thus reducing the problem of incomplete obstacle detection explained above.
- An extension to the GOLD system that is able to exploit temporal correlations and to perform a deeper data fusion between the two functionalities of lane detection and obstacle detection is currently under test [2] on ARGO.
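The 7-pixel shift quoted above can be related to the distance the vehicle travels between two frames. The worked calculation below is a sketch that assumes the whole shift comes from the vehicle's forward motion; the resulting ground resolution per pixel is an inference from the quoted figures, not a value taken from the MOB-LAB calibration.

```latex
\begin{align*}
v &= 100~\text{km/h} = \frac{100\,000~\text{m}}{3600~\text{s}} \approx 27.8~\text{m/s} \\
d &= v\,\Delta t \approx 27.8~\text{m/s} \times 0.1~\text{s} \approx 2.78~\text{m} \\
\frac{d}{7~\text{pixels}} &\approx 0.40~\text{m per pixel}
\end{align*}
```

A shift of this size is small compared with the height of the remapped image, which is why consecutive results remain highly correlated.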
ACKNOWLEDGMENT
- The authors express their gratitude to E. Dickmanns for his outstanding suggestions, to F. Gregoretti, L. Reyneri, C. Sansoé, and R. Passerone of the Polytechnic Institute of Torino, for the enthusiastic joint development of the PAPRICA system, and to G. Quaglia and all the friends from IEN Galileo Ferraris, Torino, for their help during the tests on MOB-LAB.
- The authors also acknowledge the significant contribution of all the students who were involved in this project, in particular, A. Fascioli.
- Finally, the authors are also indebted to G. Conte and G. Adorni for their support in this research.
Citations
1,181 citations
Cites background from "GOLD: a parallel real-time stereo v..."
...Goerick et al. [43] and Noli et al. [72] used the (LOC) method (see Section 5.1.5) to extract edge information....
[...]
...A brief review of active and passive sensors is presented in Section 3....
[...]
...Detailed reviews of Hypothesis Generation (HG) and Hypothesis Verification (HV) methods are presented in Sections 5 and 6 while exploring temporal continuity by integrating detection with tracking is discussed in Section 7....
[...]
...We have described in Section 5 a multiresolution scheme addressing these issues....
[...]
...Several national and international projects have been launched over the past several years to investigate new technologies for improving safety and accident prevention (see Section 2)....
[...]
1,056 citations
Cites background or methods from "GOLD: a parallel real-time stereo v..."
...[20] M. Bertozzi and A. Broggi, “GOLD: A parallel real-time stereo vision system for generic obstacle and lane detection,” IEEE Trans....
[...]
...Bertozzi and Broggi [20] assumed that the road markings form...
[...]
...The generic obstacle and lane detection (GOLD) system [20] combined lane-position tracking...
[...]
...[15] M. Bertozzi, A. Broggi, M. Cellario, A. Fascioli, P. Lombardi, and M. Porta, “Artificial vision in road vehicles,” Proc....
[...]
...Bertozzi and Broggi [20] assumed that the road markings form parallel lines in an inverse-perspective-warped image....
[...]
812 citations
Cites methods from "GOLD: a parallel real-time stereo v..."
...The feature-based technique localizes the lanes in the road images by combining the low-level features, such as painted lines [5–10] or lane edges [1,2], etc....
[...]
615 citations
606 citations
Cites background or methods from "GOLD: a parallel real-time stereo v..."
...Using this approach, the localization of the lane and the detection of generic obstacles on the road can be performed without any 3-D world reconstruction [4]....
[...]
...GOLD system [4] for ARGO and MOB-LAB vehicles (Prometheus project) † Spatial-domain processing for alf and object detection † Temporal projection of lane locations † Feature-driven approach † Autonomous vehicle guidance † Temporal estimation of vehicle’s state variables † Edge detection constrained on lane width in each stereo image for alf † Moving camera...
[...]
...Morphological edge-detection schemes have been extensively applied, since they exhibit superior performance [4,18,50]....
[...]
...Thus, an important task of video systems is to remove the inherent perspective effect from acquired images [3,4]....
[...]
...Furthermore, the inverse perspective mapping can be used to simplify the process of lane detection, similar to the process of object detection considered in Section 3 [4]....
[...]
References
9,566 citations
"GOLD: a parallel real-time stereo v..." refers background or methods in this paper
...The enhancement of the filtered image is performed through a few iterations of a geodesic morphological dilation [53] with the following binary structuring element:...
[...]
...otherwise (7) is the control image [53]. The result of the geodesic dilation is the product between the control image and the maximum value computed among all the pixels belonging to the neighborhood described by the structuring element....
[...]
...Graphical operators derive from mathematical morphology [27], [53], a bit-map approach to image processing based on set theory....
[...]
2,690 citations
2,676 citations
"GOLD: a parallel real-time stereo v..." refers background in this paper
...Graphical operators derive from mathematical morphology [27], [53], a bit-map approach to image processing based on set theory....
[...]
...The low-level portion of the processing, detailed in Fig. 14, is thus reduced to the difference between the two remapped images, a threshold, and a morphological opening [27] aimed at the removal of small-sized details in the thresholded image....
[...]
2,337 citations
"GOLD: a parallel real-time stereo v..." refers background in this paper
...Thus, for power saving reasons, it is desirable to operate at the lowest possible speed, but, in order to maintain the overall system performance, compensation for these increased delays is required [14], [16]....
[...]
1,243 citations
Frequently Asked Questions (14)
Q2. How long does it take to generate a remapped image?
The remapping process takes three 50 ns clock cycles per pixel, giving a total of about 3 ms to generate a 128 × 128 remapped image.
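The figure above can be checked with a short calculation. This is only an order-of-magnitude sketch based on the numbers quoted in the answer; it counts the per-pixel cycles alone, and the summary does not say what accounts for the difference between the result and the quoted 3 ms.

```latex
\begin{align*}
\text{pixels} &= 128 \times 128 = 16\,384 \\
t_{\text{pixel}} &= 3 \times 50~\text{ns} = 150~\text{ns} \\
t_{\text{image}} &= 16\,384 \times 150~\text{ns} \approx 2.46~\text{ms}
\end{align*}
```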
Q3. What is the purpose of the removal of the perspective effect?
The removal of the perspective effect allows road markings to be detected through an extremely simple and fast morphological processing that can be efficiently implemented on massively parallel SIMD architectures.
Q4. How can the GOLD system work in a pipelined fashion?
Since the GOLD system is composed of two independent computational engines (the PAPRICA system, running the low-level processing, and its host computer, running the medium-level processing), it can work in a pipelined fashion.
Q5. What is the choice of for the remapping phase?
The choice of depends on the road markings width, on the image acquisition process, and on the parameters used in the remapping phase.
Q6. What is the last phase of the whole computational cycle?
The last phase of the whole computational cycle is the displaying of results on the control panel, issuing warnings to the driver.
Q7. What is the main problem in the detection of road boundaries?
The main problems that must be faced in the detection of road boundaries or lane markings are: 1) the presence of shadows, producing artifacts onto the road surface, and thus altering its texture, and 2) the presence of other vehicles on the path, partly occluding the visibility of the road.
Q8. What is the simplest way to determine the maximum value of the histogram?
In order to allow a nonfixed road geometry (and also the handling of curves) the histogram is lowpass filtered; finally, its maximum value is determined.
Q9. Why is the polar histogram used for the detection of triangles so small?
Due to the small distance between the two foci, instead of computing two different polar histograms (one focused on each), a single one is considered.
Q10. What is the way to verify the shape of the horopter?
The horopter cannot be overlapped with the plane (representing the flat road model) using only camera vergence; for this purpose, electronic vergence, such as inverse perspective mapping (IPM), is required.
Q11. What is the power consumption of dynamic systems proportional to?
The power consumption of dynamic systems can be considered proportional to $C f V^2$, where $C$ represents the capacitance of the circuit, $f$ is the clock frequency, and $V$ is the voltage swing.
Q12. How many pixels is the vertical shift between two subsequent remapped images?
Considering an operational vehicle speed of 100 km/h and the MOB-LAB calibration setup, the vertical shift between two subsequent remapped images corresponding to two frames acquired with a temporal shift of 100 ms is only 7 pixels.
Q13. What is the main problem in the detection of obstacles?
The farther the obstacle, the smaller the portion of triangles detectable in the difference image, and thus the lower the amplitude of peaks in the polar histogram; nevertheless, for sufficiently high obstacles (e.g., vehicles about 50 m from the cameras), the main problem is not the detection of peaks, but their joining, as shown in Fig. 28(a)–(c).
Q14. How many time slots does the GOLD system require?
As shown in Fig. 24, the whole processing (lane and obstacle detection) requires five time slots (100 ms); the GOLD system works at a rate of 10 Hz.