Are we ready for autonomous driving? The KITTI vision benchmark suite

doi:10.1109/CVPR.2012.6248074

Proceedings Article

- pp 3354-3361

TL;DR

The autonomous driving platform is used to develop novel challenging benchmarks for the tasks of stereo, optical flow, visual odometry/SLAM and 3D object detection, revealing that methods ranking high on established datasets such as Middlebury perform below average when being moved outside the laboratory to the real world.

Abstract:

Today, visual recognition systems are still rarely employed in robotics applications. Perhaps one of the main reasons for this is the lack of demanding benchmarks that mimic such scenarios. In this paper, we take advantage of our autonomous driving platform to develop novel challenging benchmarks for the tasks of stereo, optical flow, visual odometry/SLAM and 3D object detection. Our recording platform is equipped with four high resolution video cameras, a Velodyne laser scanner and a state-of-the-art localization system. Our benchmarks comprise 389 stereo and optical flow image pairs, stereo visual odometry sequences of 39.2 km length, and more than 200k 3D object annotations captured in cluttered scenarios (up to 15 cars and 30 pedestrians are visible per image). Results from state-of-the-art algorithms reveal that methods ranking high on established datasets such as Middlebury perform below average when being moved outside the laboratory to the real world. Our goal is to reduce this bias by providing challenging benchmarks with novel difficulties to the computer vision community. Our benchmarks are available online at: www.cvlibs.net/datasets/kitti

Citations

Proceedings Article

Going deeper with convolutions

Christian Szegedy, +8 more

TL;DR: Inception as mentioned in this paper is a deep convolutional neural network architecture that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14).

Journal Article

SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation

Vijay Badrinarayanan, +2 more

- 01 Dec 2017 -

IEEE Transactions on Pattern Analysis an...

TL;DR: Quantitative assessments show that SegNet provides good performance with competitive inference time and most efficient inference memory-wise as compared to other architectures, including FCN and DeconvNet.

Journal Article

Vision meets robotics: The KITTI dataset

Andreas Geiger, +3 more

- 01 Sep 2013 -

The International Journal of Robotics Re...

TL;DR: A novel dataset captured from a VW station wagon for use in mobile robotics and autonomous driving research, using a variety of sensor modalities such as high-resolution color and grayscale stereo cameras and a high-precision GPS/IMU inertial navigation system.

Proceedings Article

A benchmark for the evaluation of RGB-D SLAM systems

Jürgen Sturm, +4 more

TL;DR: A large set of image sequences from a Microsoft Kinect with highly accurate and time-synchronized ground truth camera poses from a motion capture system is recorded for the evaluation of RGB-D SLAM systems.

The PASCAL Visual Object Classes Challenge

Jianguo Zhang

References

Journal Article

LIBSVM: A library for support vector machines

Chih-Chung Chang, +1 more

- 06 May 2011 -

ACM Transactions on Intelligent Systems ...

TL;DR: Issues such as solving SVM optimization problems, theoretical convergence, multiclass classification, probability estimates, and parameter selection are discussed in detail.

Proceedings Article

Histograms of oriented gradients for human detection

Navneet Dalal, +1 more

TL;DR: It is shown experimentally that grids of histograms of oriented gradient (HOG) descriptors significantly outperform existing feature sets for human detection, and the influence of each stage of the computation on performance is studied.

Book

Gaussian Processes for Machine Learning

Carl Edward Rasmussen, +1 more

TL;DR: The treatment is comprehensive and self-contained, targeted at researchers and students in machine learning and applied statistics, and deals with the supervised learning problem for both regression and classification.

Journal Article

Object Detection with Discriminatively Trained Part-Based Models

Pedro F. Felzenszwalb, +3 more

- 01 Sep 2010 -

IEEE Transactions on Pattern Analysis an...

TL;DR: An object detection system based on mixtures of multiscale deformable part models that is able to represent highly variable object classes and achieves state-of-the-art results in the PASCAL object detection challenges is described.

Journal Article

A taxonomy and evaluation of dense two-frame stereo correspondence algorithms

Daniel Scharstein, +2 more

- 09 Dec 2001 -

International Journal of Computer Vision

TL;DR: This paper has designed a stand-alone, flexible C++ implementation that enables the evaluation of individual components and that can easily be extended to include new algorithms.

Related Papers (5)

Deep Residual Learning for Image Recognition

SSD: Single Shot MultiBox Detector

Microsoft COCO: Common Objects in Context

Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation

Fully convolutional networks for semantic segmentation