Journal ArticleDOI

Looking at Vehicles on the Road: A Survey of Vision-Based Vehicle Detection, Tracking, and Behavior Analysis

TL;DR: This paper provides a review of the literature in on-road vision-based vehicle detection, tracking, and behavior understanding, and discusses the nascent branch of intelligent vehicles research concerned with utilizing spatiotemporal measurements, trajectories, and various features to characterize on-road behavior.
Abstract: This paper provides a review of the literature in on-road vision-based vehicle detection, tracking, and behavior understanding. Over the past decade, vision-based surround perception has progressed from its infancy into maturity. We provide a survey of recent works in the literature, placing vision-based vehicle detection in the context of sensor-based on-road surround analysis. We detail advances in vehicle detection, discussing monocular, stereo vision, and active sensor-vision fusion for on-road vehicle detection. We discuss vision-based vehicle tracking in the monocular and stereo-vision domains, analyzing filtering, estimation, and dynamical models. We discuss the nascent branch of intelligent vehicles research concerned with utilizing spatiotemporal measurements, trajectories, and various features to characterize on-road behavior. We provide a discussion on the state of the art, detail common performance metrics and benchmarks, and provide perspective on future research directions in the field.
Citations
Journal ArticleDOI
TL;DR: A comprehensive review of the state-of-the-art AV perception technology available today, which highlights future research areas and draws conclusions about the most effective methods for AV perception and its effect on localization and mapping.
Abstract: Perception system design is a vital step in the development of an autonomous vehicle (AV). With the vast selection of available off-the-shelf schemes and seemingly endless options of sensor systems implemented in research and commercial vehicles, it can be difficult to identify the optimal system for one’s AV application. This article presents a comprehensive review of the state-of-the-art AV perception technology available today. It provides up-to-date information about the advantages, disadvantages, limits, and ideal applications of specific AV sensors; the most prevalent sensors in current research and commercial AVs; autonomous features currently on the market; and localization and mapping methods currently implemented in AV research. This information is useful for newcomers to the AV field to gain a greater understanding of the current AV solution landscape and to guide experienced researchers towards research areas requiring further development. Furthermore, this paper highlights future research areas and draws conclusions about the most effective methods for AV perception and its effect on localization and mapping. Topics discussed in the Perception and Automotive Sensors section focus on the sensors themselves, whereas topics discussed in the Localization and Mapping section focus on how the vehicle perceives where it is on the road, providing context for the use of the automotive sensors. By improving on current state-of-the-art perception systems, AVs will become more robust, reliable, safe, and accessible, ultimately providing greater efficiency, mobility, and safety benefits to the public.

486 citations

Journal ArticleDOI
TL;DR: Experimental results demonstrate that the proposed approaches not only speed up the training and incremental learning processes of AdaBoost, but also yield better or competitive vehicle classification accuracies compared with several state-of-the-art methods, showing their potential for real-time applications.

460 citations

Journal ArticleDOI
17 Feb 2017
TL;DR: In this paper, the authors provide a general overview of the recent developments in the realm of autonomous vehicle software systems, and discuss the fundamental components of the software, as well as recent developments of each area.
Abstract: Autonomous vehicles are expected to play a key role in the future of urban transportation systems, as they offer potential for additional safety, increased productivity, greater accessibility, better road efficiency, and positive impact on the environment. Research in autonomous systems has seen dramatic advances in recent years, due to the increases in available computing power and reduced cost in sensing and computing technologies, resulting in maturing technological readiness level of fully autonomous vehicles. The objective of this paper is to provide a general overview of the recent developments in the realm of autonomous vehicle software systems. Fundamental components of autonomous vehicle software are reviewed, and recent developments in each area are discussed.

434 citations


Cites background from "Looking at Vehicles on the Road: A ..."

  • ...For more information on conventional hand-crafted feature/cue based approaches, interested readers may refer to the following survey papers: [68,69] for lane line marking detection, [70] for road surface detection, [71,72] for vehicle detection and [73] for pedestrian detection....

    [...]

Journal ArticleDOI
TL;DR: This paper presents an overview of 3D object detection methods and prevalently used sensors and datasets in AVs, and discusses and categorizes the recent works based on sensors modalities into monocular, point cloud-based, and fusion methods.
Abstract: An autonomous vehicle (AV) requires an accurate perception of its surrounding environment to operate reliably. The perception system of an AV, which normally employs machine learning (e.g., deep learning), transforms sensory data into semantic information that enables autonomous driving. Object detection is a fundamental function of this perception system, which has been tackled by several works, most of them using 2D detection methods. However, the 2D methods do not provide depth information, which is required for driving tasks, such as path planning, collision avoidance, and so on. Alternatively, the 3D object detection methods introduce a third dimension that reveals more detailed object’s size and location information. Nonetheless, the detection accuracy of such methods needs to be improved. To the best of our knowledge, this is the first survey on 3D object detection methods used for autonomous driving applications. This paper presents an overview of 3D object detection methods and prevalently used sensors and datasets in AVs. It then discusses and categorizes the recent works based on sensors modalities into monocular, point cloud-based, and fusion methods. We then summarize the results of the surveyed works and identify the research gaps and future research directions.

403 citations


Cites background from "Looking at Vehicles on the Road: A ..."

  • ...Such configuration uses matching algorithms to find correspondences in both images and calculate the depth of each point relative to the camera, demanding more processing power [18]....

    [...]

Proceedings ArticleDOI
26 Jun 2018
TL;DR: In this article, an LSTM model for interaction aware motion prediction of surrounding vehicles on freeways is presented, which assigns confidence values to maneuvers being performed by vehicles and outputs a multi-modal distribution over future motion based on them.
Abstract: To safely and efficiently navigate through complex traffic scenarios, autonomous vehicles need to have the ability to predict the future motion of surrounding vehicles. Multiple interacting agents, the multi-modal nature of driver behavior, and the inherent uncertainty involved in the task make motion prediction of surrounding vehicles a challenging problem. In this paper, we present an LSTM model for interaction aware motion prediction of surrounding vehicles on freeways. Our model assigns confidence values to maneuvers being performed by vehicles and outputs a multi-modal distribution over future motion based on them. We compare our approach with the prior art for vehicle motion prediction on the publicly available NGSIM US-101 and I-80 datasets. Our results show an improvement in terms of RMS values of prediction error. We also present an ablative analysis of the components of our proposed model and analyze the predictions made by the model in complex traffic scenarios.

364 citations

References
Journal ArticleDOI
TL;DR: High generalization ability of support-vector networks utilizing polynomial input transformations is demonstrated and the performance of the support-vector network is compared to various classical learning algorithms that all took part in a benchmark study of Optical Character Recognition.
Abstract: The support-vector network is a new learning machine for two-group classification problems. The machine conceptually implements the following idea: input vectors are non-linearly mapped to a very high-dimension feature space. In this feature space a linear decision surface is constructed. Special properties of the decision surface ensures high generalization ability of the learning machine. The idea behind the support-vector network was previously implemented for the restricted case where the training data can be separated without errors. We here extend this result to non-separable training data. High generalization ability of support-vector networks utilizing polynomial input transformations is demonstrated. We also compare the performance of the support-vector network to various classical learning algorithms that all took part in a benchmark study of Optical Character Recognition.

37,861 citations


"Looking at Vehicles on the Road: A ..." refers background in this paper

  • ...Discriminative classifiers, which learn a decision boundary between two classes, have been more widely used in vehicle detection....

    [...]

Proceedings ArticleDOI
20 Jun 2005
TL;DR: It is shown experimentally that grids of histograms of oriented gradient (HOG) descriptors significantly outperform existing feature sets for human detection, and the influence of each stage of the computation on performance is studied.
Abstract: We study the question of feature sets for robust visual object recognition; adopting linear SVM based human detection as a test case. After reviewing existing edge and gradient based descriptors, we show experimentally that grids of histograms of oriented gradient (HOG) descriptors significantly outperform existing feature sets for human detection. We study the influence of each stage of the computation on performance, concluding that fine-scale gradients, fine orientation binning, relatively coarse spatial binning, and high-quality local contrast normalization in overlapping descriptor blocks are all important for good results. The new approach gives near-perfect separation on the original MIT pedestrian database, so we introduce a more challenging dataset containing over 1800 annotated human images with a large range of pose variations and backgrounds.

31,952 citations


"Looking at Vehicles on the Road: A ..." refers methods in this paper

  • ...The tracking problem is formulated as a maximum a posteriori inference problem over a random Markov field....

    [...]

  • ...Symmetry and edges were also used in [32] and [33], with longitudinal distance and time to collision (TTC) estimated using assumptions on the 3-...

    [...]

Journal ArticleDOI
TL;DR: New results are derived on the minimum number of landmarks needed to obtain a solution, and algorithms are presented for computing these minimum-landmark solutions in closed form that provide the basis for an automatic system that can solve the Location Determination Problem under difficult viewing.
Abstract: A new paradigm, Random Sample Consensus (RANSAC), for fitting a model to experimental data is introduced. RANSAC is capable of interpreting/smoothing data containing a significant percentage of gross errors, and is thus ideally suited for applications in automated image analysis where interpretation is based on the data provided by error-prone feature detectors. A major portion of this paper describes the application of RANSAC to the Location Determination Problem (LDP): Given an image depicting a set of landmarks with known locations, determine that point in space from which the image was obtained. In response to a RANSAC requirement, new results are derived on the minimum number of landmarks needed to obtain a solution, and algorithms are presented for computing these minimum-landmark solutions in closed form. These results provide the basis for an automatic system that can solve the LDP under difficult viewing
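The sample-and-score loop behind RANSAC can be sketched on a toy problem. The example below is a minimal illustration only, assuming a 2-D line model y = a*x + b rather than the Location Determination Problem treated in the paper; the function names, the tolerance `tol`, and the iteration count `n_iters` are hypothetical choices, and the final re-fit on the consensus set is omitted for brevity.

```python
import random


def fit_line(p, q):
    """Fit y = a*x + b through two points; return None for a vertical pair."""
    (x1, y1), (x2, y2) = p, q
    if x1 == x2:
        return None
    a = (y2 - y1) / (x2 - x1)
    return a, y1 - a * x1


def ransac_line(points, n_iters=200, tol=0.5, seed=0):
    """Return (model, inliers) for the line supported by the most points.

    n_iters, tol, and seed are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(n_iters):
        # 1. Draw a minimal sample (two points define a line).
        model = fit_line(*rng.sample(points, 2))
        if model is None:
            continue
        a, b = model
        # 2. Collect the consensus set: points within tol of the candidate line.
        inliers = [(x, y) for x, y in points if abs(y - (a * x + b)) <= tol]
        # 3. Keep the candidate with the largest consensus set.
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers


# Ten points on y = 2x + 1 plus three gross outliers.
points = [(x, 2 * x + 1) for x in range(10)] + [(3, 40), (7, -25), (1, 90)]
model, inliers = ransac_line(points)
```

Because each candidate is fitted from a minimal sample and scored only by its consensus set, the gross outliers never influence the winning model, which is the property the abstract highlights for error-prone feature detectors.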

23,396 citations


"Looking at Vehicles on the Road: A ..." refers background or methods in this paper

  • ...In [3], Kalman filtering is used to estimate the vehicles’ yaw rate, as well as position and velocity....

    [...]

  • ...D ranging and for estimating the ground plane....

    [...]

  • ...Candidate vehicles’ locations were predicted using Kalman filtering in the image plane....

    [...]
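The Kalman filtering mentioned in the excerpts above can be illustrated with a one-dimensional constant-velocity model. This is a minimal sketch under stated assumptions, not the formulation used in any of the cited works: the state is [position, velocity], only position is measured, and the noise parameters `q` and `r` as well as the function name are hypothetical.

```python
def kalman_cv_step(x, P, z, dt=1.0, q=1e-2, r=1.0):
    """One predict/update cycle of a constant-velocity Kalman filter.

    x: [position, velocity], P: 2x2 covariance (list of lists),
    z: measured position. q and r are assumed noise variances.
    """
    # Predict: x' = F x and P' = F P F^T + q*I, with F = [[1, dt], [0, 1]].
    pos = x[0] + dt * x[1]
    vel = x[1]
    p00 = P[0][0] + dt * (P[1][0] + P[0][1]) + dt * dt * P[1][1] + q
    p01 = P[0][1] + dt * P[1][1]
    p10 = P[1][0] + dt * P[1][1]
    p11 = P[1][1] + q

    # Update with measurement matrix H = [1, 0].
    s = p00 + r                  # innovation variance
    k0, k1 = p00 / s, p10 / s    # Kalman gain
    y = z - pos                  # innovation (measurement residual)
    x_new = [pos + k0 * y, vel + k1 * y]
    # P = (I - K H) P'
    P_new = [[(1 - k0) * p00, (1 - k0) * p01],
             [p10 - k1 * p00, p11 - k1 * p01]]
    return x_new, P_new


# Track a target moving at 2 units per frame, starting from an uninformed state.
x, P = [0.0, 0.0], [[100.0, 0.0], [0.0, 100.0]]
for t in range(1, 51):
    x, P = kalman_cv_step(x, P, z=5.0 + 2.0 * t)
```

In an image-plane tracker the same predict step would supply the expected location of a candidate vehicle in the next frame, and the update step would correct it with the new detection.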

Proceedings ArticleDOI
01 Dec 2001
TL;DR: This paper describes a machine learning approach for visual object detection that processes images extremely rapidly while achieving high detection rates, and introduces a new image representation called the "integral image", which allows the features used by the detector to be computed very quickly.
Abstract: This paper describes a machine learning approach for visual object detection which is capable of processing images extremely rapidly and achieving high detection rates. This work is distinguished by three key contributions. The first is the introduction of a new image representation called the "integral image" which allows the features used by our detector to be computed very quickly. The second is a learning algorithm, based on AdaBoost, which selects a small number of critical visual features from a larger set and yields extremely efficient classifiers. The third contribution is a method for combining increasingly more complex classifiers in a "cascade" which allows background regions of the image to be quickly discarded while spending more computation on promising object-like regions. The cascade can be viewed as an object specific focus-of-attention mechanism which unlike previous approaches provides statistical guarantees that discarded regions are unlikely to contain the object of interest. In the domain of face detection the system yields detection rates comparable to the best previous systems. Used in real-time applications, the detector runs at 15 frames per second without resorting to image differencing or skin color detection.
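The "integral image" named in the abstract can be sketched as a summed-area table, from which the sum over any rectangle follows from four lookups. The code below is an illustrative sketch, not the authors' implementation; the function names and the padded (H+1) x (W+1) layout are assumptions made for clarity.

```python
def integral_image(img):
    """Return an (H+1) x (W+1) summed-area table with a zero first row and column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            # Sum of everything above this row plus the running sum of this row.
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box_sum(ii, top, left, bottom, right):
    """Sum of pixels in rows [top, bottom) and columns [left, right)."""
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_feature(ii, top, left, height, width):
    """Horizontal two-rectangle Haar-like feature: left half minus right half."""
    half = width // 2
    left_sum = box_sum(ii, top, left, top + height, left + half)
    right_sum = box_sum(ii, top, left + half, top + height, left + 2 * half)
    return left_sum - right_sum


img = [[1, 2, 3, 4],
       [5, 6, 7, 8],
       [9, 10, 11, 12]]
ii = integral_image(img)
print(box_sum(ii, 1, 1, 3, 3))           # 6 + 7 + 10 + 11 = 34
print(two_rect_feature(ii, 0, 0, 3, 4))  # 33 - 45 = -12
```

Once the table is built in a single pass, each rectangle sum costs the same regardless of rectangle size, which is why Haar-like features can be evaluated quickly at many positions and scales.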

18,620 citations


"Looking at Vehicles on the Road: A ..." refers background or methods in this paper

  • ...Haar-like features have been extensively used to detect the rear faces of preceding vehicles, using a forward-facing camera [37], [41]–[49]....

    [...]

  • ...Symmetry and edges were also used in [32] and [33], with longitudinal distance and time to collision (TTC) estimated using assumptions on the 3-...

    [...]

  • ...HOG features are descriptive image features, exhibiting good detection performance in a variety of computer vision tasks, including vehicle detection, but they are generally slow to compute....

    [...]