PointPillars: Fast Encoders for Object Detection from Point Clouds

Open Access, Posted Content


- 14 Dec 2018 -

TLDR

PointPillars uses PointNets to learn a representation of point clouds organized in vertical columns (pillars); the encoded features can be used with any standard 2D convolutional detection architecture.

Abstract:

Object detection in point clouds is an important aspect of many robotics applications such as autonomous driving. In this paper we consider the problem of encoding a point cloud into a format appropriate for a downstream detection pipeline. Recent literature suggests two types of encoders: fixed encoders tend to be fast but sacrifice accuracy, while encoders that are learned from data are more accurate but slower. In this work we propose PointPillars, a novel encoder which utilizes PointNets to learn a representation of point clouds organized in vertical columns (pillars). While the encoded features can be used with any standard 2D convolutional detection architecture, we further propose a lean downstream network. Extensive experimentation shows that PointPillars outperforms previous encoders with respect to both speed and accuracy by a large margin. Despite only using lidar, our full detection pipeline significantly outperforms the state of the art, even among fusion methods, with respect to both the 3D and bird's eye view KITTI benchmarks. This detection performance is achieved while running at 62 Hz: a 2-4 fold runtime improvement. A faster version of our method matches the state of the art at 105 Hz. These benchmarks suggest that PointPillars is an appropriate encoding for object detection in point clouds.
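The pillar idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`pillarize`, `encode_pillars`), the grid parameters, and the single linear layer standing in for a PointNet are all assumptions made for illustration, and details of the real encoder are not reproduced. The sketch bins points into vertical columns on an x-y grid, applies a shared per-point linear layer with ReLU, and max-pools within each pillar, producing a 2D "image" that a standard 2D convolutional detector could consume.

```python
import numpy as np

def pillarize(points, x_range, y_range, pillar_size):
    """Assign each point (x, y, z, intensity) to a vertical column on an x-y grid.

    Hypothetical helper: returns the in-range points, their grid indices,
    and the grid shape (nx, ny).
    """
    nx = int(round((x_range[1] - x_range[0]) / pillar_size))
    ny = int(round((y_range[1] - y_range[0]) / pillar_size))
    ix = np.floor((points[:, 0] - x_range[0]) / pillar_size).astype(int)
    iy = np.floor((points[:, 1] - y_range[0]) / pillar_size).astype(int)
    # Drop points that fall outside the grid.
    keep = (ix >= 0) & (ix < nx) & (iy >= 0) & (iy < ny)
    return points[keep], ix[keep], iy[keep], (nx, ny)

def encode_pillars(points, ix, iy, grid, weight, bias):
    """Simplified PointNet stand-in: shared linear layer + ReLU, then max per pillar."""
    nx, ny = grid
    feats = np.maximum(points @ weight + bias, 0.0)   # (N, C), non-negative
    image = np.zeros((nx, ny, weight.shape[1]))       # empty pillars stay zero
    # Unbuffered element-wise max, so several points in one pillar are pooled.
    np.maximum.at(image, (ix, iy), feats)
    return image
```

Because the pooled output is a dense (nx, ny, C) array, the z dimension is absorbed into the learned features and no 3D convolution is needed downstream, which is consistent with the speed argument the abstract makes.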

Citations


Posted Content

nuScenes: A multimodal dataset for autonomous driving

Holger Caesar, +9 more

- 26 Mar 2019 -

arXiv: Learning

TL;DR: nuScenes is the first dataset to carry the full autonomous vehicle sensor suite: 6 cameras, 5 radars and 1 lidar, all with full 360 degree field of view.


Journal ArticleDOI

Deep Learning for 3D Point Clouds: A Survey

Yulan Guo, +5 more

- 01 Dec 2021 -

IEEE Transactions on Pattern Analysis an...

TL;DR: This paper presents a comprehensive review of recent progress in deep learning methods for point clouds, covering three major tasks, including 3D shape classification, 3D object detection and tracking, and 3D point cloud segmentation.


Posted Content

Scalability in Perception for Autonomous Driving: Waymo Open Dataset

Pei Sun, +24 more

- 10 Dec 2019 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: This work introduces a new large scale, high quality, diverse dataset, consisting of well synchronized and calibrated high quality LiDAR and camera data captured across a range of urban and suburban geographies, and studies the effects of dataset size and generalization across geographies on 3D detection methods.


Proceedings ArticleDOI

RandLA-Net: Efficient Semantic Segmentation of Large-Scale Point Clouds

Qingyong Hu, +7 more

TL;DR: This paper introduces RandLA-Net, an efficient and lightweight neural architecture to directly infer per-point semantics for large-scale point clouds, and introduces a novel local feature aggregation module to progressively increase the receptive field for each 3D point, thereby effectively preserving geometric details.


Proceedings ArticleDOI

Argoverse: 3D Tracking and Forecasting With Rich Maps

Ming-Fang Chang, +10 more

TL;DR: Argoverse includes sensor data collected by a fleet of autonomous vehicles in Pittsburgh and Miami as well as 3D tracking annotations, 300k extracted interesting vehicle trajectories, and rich semantic maps, which contain rich geometric and semantic metadata which are not currently available in any public dataset.


References


Proceedings Article

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

Sergey Ioffe, +1 more

TL;DR: Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.


Book ChapterDOI

Microsoft COCO: Common Objects in Context

Tsung-Yi Lin, +7 more

TL;DR: This work presents a new dataset that aims to advance the state of the art in object recognition by placing it in the context of the broader question of scene understanding, gathering images of complex everyday scenes containing common objects in their natural context.


Posted Content

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

Shaoqing Ren, +3 more

- 04 Jun 2015 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: Faster R-CNN introduces a Region Proposal Network (RPN) that generates high-quality region proposals, which are used by Fast R-CNN for detection.


Posted Content

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

Sergey Ioffe, +1 more

- 11 Feb 2015 -

arXiv: Learning

TL;DR: Batch Normalization normalizes layer inputs for each training mini-batch to reduce the internal covariate shift in deep neural networks, and achieves state-of-the-art performance on ImageNet.


Journal ArticleDOI

The Pascal Visual Object Classes (VOC) Challenge

Mark Everingham, +4 more

- 01 Jun 2010 -

International Journal of Computer Vision

TL;DR: This paper reviews the state of the art in evaluated methods for both classification and detection, and analyses whether the methods are statistically different, what they are learning from the images, and what they find easy or confuse.


Related Papers (5)

PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation

PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space

VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection

Multi-view 3D Object Detection Network for Autonomous Driving

Frustum PointNets for 3D Object Detection from RGB-D Data