BoxCars: 3D Boxes as CNN Input for Improved Fine-Grained Vehicle Recognition

doi:10.1109/CVPR.2016.328

Proceedings Article

Jakub Sochor, +2 more

- pp 3006-3015

TLDR

This work shows that extracting additional data from the video stream and feeding it into a deep convolutional neural network considerably boosts recognition performance, which in turn can improve traffic surveillance systems.

Abstract:

We address the problem of fine-grained vehicle make and model recognition and verification. Our contribution is to show that extracting additional data from the video stream, besides the vehicle image itself, and feeding it into a deep convolutional neural network boosts recognition performance considerably. This additional information includes the 3D vehicle bounding box, which is used for "unpacking" the vehicle image, its rasterized low-resolution shape, and information about the 3D vehicle orientation. Experiments show that adding such information decreases classification error by 26% (accuracy improves from 0.772 to 0.832) and raises verification average precision to 208% of the baseline value (0.378 to 0.785) compared to a baseline pure CNN without any input modifications. The pure baseline CNN also outperforms the recent state-of-the-art solution by 0.081. We provide an annotated dataset, "BoxCars", of surveillance vehicle images augmented with various automatically extracted auxiliary information. Our approach and the dataset can considerably improve the performance of traffic surveillance systems.
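
As a quick sanity check of the percentages quoted in the abstract, the following minimal Python sketch recomputes them from the reported accuracy and average precision values. The helper names are illustrative and do not come from the paper. The 26% figure is a relative reduction of the error rate (1 minus accuracy). The 208% figure matches the ratio of the two average precision values, which corresponds to a relative increase of roughly 108%.

```python
def relative_error_reduction(acc_before, acc_after):
    """Relative drop in error rate, where error = 1 - accuracy."""
    err_before = 1.0 - acc_before
    err_after = 1.0 - acc_after
    return (err_before - err_after) / err_before


def relative_increase(before, after):
    """Relative gain of `after` over `before`."""
    return (after - before) / before


# Values as reported in the abstract.
err_cut = relative_error_reduction(0.772, 0.832)   # classification accuracy
ap_ratio = 0.785 / 0.378                           # verification AP, new / baseline
ap_gain = relative_increase(0.378, 0.785)          # verification AP, relative gain

print(f"classification error reduction: {err_cut:.1%}")
print(f"verification AP ratio: {ap_ratio:.2f}x")
print(f"verification AP relative increase: {ap_gain:.1%}")
```

Reporting both the ratio and the relative increase avoids the ambiguity between "improved by" and "improved to" when comparing the two average precision values.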

Citations

Proceedings Article

Learning a Discriminative Filter Bank Within a CNN for Fine-Grained Recognition

Yaming Wang, +2 more

TL;DR: The authors proposed a bank of convolutional filters to capture class-specific discriminative patches without extra part or bounding box annotations, which achieves state-of-the-art performance on three publicly available fine-grained recognition datasets (CUB-200-2011, Stanford Cars and FGVC-Aircraft).

Proceedings Article

Orientation Invariant Feature Embedding and Spatial Temporal Regularization for Vehicle Re-identification

Zhongdao Wang, +9 more

TL;DR: Both the orientation invariant feature embedding and the spatio-temporal regularization achieve considerable improvements in the vehicle Re-identification problem.

Proceedings Article

Learning Deep Neural Networks for Vehicle Re-ID with Visual-spatio-Temporal Path Proposals

Yantao Shen, +4 more

TL;DR: In this article, a Siamese-CNN+Path-LSTM model was proposed to incorporate complex spatio-temporal information for regularizing the re-ID results.

Book Chapter

Hierarchical Bilinear Pooling for Fine-Grained Visual Recognition

Chaojian Yu, +4 more

TL;DR: A cross-layer bilinear pooling approach is proposed to capture inter-layer part feature relations, which results in superior performance compared with other bilinear pooling-based approaches.

Journal Article

Deep Attention-Based Spatially Recursive Networks for Fine-Grained Visual Recognition

Lin Wu, +3 more

- 01 May 2019 -

IEEE Transactions on Systems, Man, and C...

TL;DR: A deep attention-based spatially recursive model is proposed that learns to attend to critical object parts and encode them into spatially expressive representations; it is end-to-end trainable and serves as both part detector and feature extractor.

References

Proceedings Article

ImageNet Classification with Deep Convolutional Neural Networks

Alex Krizhevsky, +2 more

TL;DR: The authors train a deep convolutional neural network that consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax.

Proceedings Article

Going deeper with convolutions

Christian Szegedy, +8 more

TL;DR: Inception is a deep convolutional neural network architecture that set a new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14).

Journal Article

ImageNet Large Scale Visual Recognition Challenge

Olga Russakovsky, +11 more

- 01 Dec 2015 -

International Journal of Computer Vision

TL;DR: The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) is a benchmark in object category classification and detection on hundreds of object categories and millions of images. It has been run annually from 2010 to present, attracting participation from more than fifty institutions.

Journal Article

The Pascal Visual Object Classes (VOC) Challenge

Mark Everingham, +4 more

- 01 Jun 2010 -

International Journal of Computer Vision

TL;DR: The paper reviews the state of the art in evaluated methods for both classification and detection, and analyses whether the methods are statistically different, what they are learning from the images, and what the methods find easy or confuse.

Proceedings ArticleDOI

Are we ready for autonomous driving? The KITTI vision benchmark suite

Andreas Geiger, +2 more

TL;DR: The authors use their autonomous driving platform to develop novel, challenging benchmarks for stereo, optical flow, visual odometry/SLAM and 3D object detection. The results reveal that methods ranking high on established datasets such as Middlebury perform below average when moved outside the laboratory to the real world.

Related Papers (5)

Deep Residual Learning for Image Recognition

3D Object Representations for Fine-Grained Categorization

Going deeper with convolutions

ImageNet Classification with Deep Convolutional Neural Networks

ImageNet: A large-scale hierarchical image database