Open Access, Posted Content
PointPillars: Fast Encoders for Object Detection from Point Clouds
TL;DR: PointPillars uses PointNets to learn a representation of point clouds organized in vertical columns (pillars); the encoded features can be used with any standard 2D convolutional detection architecture.
Abstract:
Object detection in point clouds is an important aspect of many robotics applications such as autonomous driving. In this paper we consider the problem of encoding a point cloud into a format appropriate for a downstream detection pipeline. Recent literature suggests two types of encoders; fixed encoders tend to be fast but sacrifice accuracy, while encoders that are learned from data are more accurate, but slower. In this work we propose PointPillars, a novel encoder which utilizes PointNets to learn a representation of point clouds organized in vertical columns (pillars). While the encoded features can be used with any standard 2D convolutional detection architecture, we further propose a lean downstream network. Extensive experimentation shows that PointPillars outperforms previous encoders with respect to both speed and accuracy by a large margin. Despite only using lidar, our full detection pipeline significantly outperforms the state of the art, even among fusion methods, with respect to both the 3D and bird's eye view KITTI benchmarks. This detection performance is achieved while running at 62 Hz: a 2-4 fold runtime improvement. A faster version of our method matches the state of the art at 105 Hz. These benchmarks suggest that PointPillars is an appropriate encoding for object detection in point clouds.
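The pillar organization described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the 0.5 m cell size, and the hand-written per-point features are all assumptions made for illustration, and the learned PointNet is replaced by a fixed feature map followed by a max over the points in each pillar.

```python
from collections import defaultdict
from math import floor


def group_into_pillars(points, cell_size=0.5):
    """Bucket (x, y, z) points by the x-y grid cell they fall in.

    The z axis is not discretized, so each cell is a vertical column (pillar).
    cell_size is an assumed value, not one taken from the abstract.
    """
    pillars = defaultdict(list)
    for x, y, z in points:
        cell = (floor(x / cell_size), floor(y / cell_size))
        pillars[cell].append((x, y, z))
    return dict(pillars)


def encode_pillar(pts):
    """Stand-in for the learned PointNet (assumption): fixed per-point
    features (offsets from the pillar mean, plus raw height), then a
    channel-wise max over all points in the pillar."""
    n = len(pts)
    cx = sum(p[0] for p in pts) / n
    cy = sum(p[1] for p in pts) / n
    cz = sum(p[2] for p in pts) / n
    feats = [(x - cx, y - cy, z - cz, z) for x, y, z in pts]
    return tuple(max(channel) for channel in zip(*feats))


def pseudo_image(points, cell_size=0.5):
    """Map each occupied grid cell to one feature vector, giving a sparse
    2D feature map that a 2D convolutional detector could consume."""
    pillars = group_into_pillars(points, cell_size)
    return {cell: encode_pillar(pts) for cell, pts in pillars.items()}


if __name__ == "__main__":
    cloud = [(0.1, 0.1, 0.0), (0.4, 0.2, 1.0), (1.2, 0.1, 0.5), (-0.1, 0.1, 2.0)]
    for cell, feature in sorted(pseudo_image(cloud).items()):
        print(cell, feature)
```

Because every occupied cell yields exactly one fixed-length vector, the output has the layout of a 2D image, which is why a standard 2D convolutional detection architecture can be applied downstream.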
Citations
Posted Content
nuScenes: A multimodal dataset for autonomous driving
Holger Caesar, Varun Bankiti, Alex H. Lang, Sourabh Vora, Venice Erin Liong, Qiang Xu, Anush Krishnan, Yu Pan, Giancarlo Baldan, Oscar Beijbom
TL;DR: nuScenes is the first dataset to carry the full autonomous vehicle sensor suite: 6 cameras, 5 radars and 1 lidar, all with full 360 degree field of view.
Journal Article
Deep Learning for 3D Point Clouds: A Survey
TL;DR: This paper presents a comprehensive review of recent progress in deep learning methods for point clouds, covering three major tasks, including 3D shape classification, 3D object detection and tracking, and 3D point cloud segmentation.
Posted Content
Scalability in Perception for Autonomous Driving: Waymo Open Dataset
Pei Sun, Henrik Kretzschmar, Xerxes Dotiwalla, Aurelien Chouard, Vijaysai Patnaik, Paul Tsui, James Guo, Yin Zhou, Yuning Chai, Benjamin Caine, Vijay K. Vasudevan, Wei Han, Jiquan Ngiam, Hang Zhao, Aleksei Timofeev, Scott Ettinger, Maxim Krivokon, Amy Gao, Aditya Joshi, Sheng Zhao, Shuyang Cheng, Yu Zhang, Jonathon Shlens, Zhifeng Chen, Dragomir Anguelov
TL;DR: This work introduces a new large scale, high quality, diverse dataset, consisting of well synchronized and calibrated high quality LiDAR and camera data captured across a range of urban and suburban geographies, and studies the effects of dataset size and generalization across geographies on 3D detection methods.
Proceedings Article
RandLA-Net: Efficient Semantic Segmentation of Large-Scale Point Clouds
Qingyong Hu, Bo Yang, Linhai Xie, Stefano Rosa, Yulan Guo, Zhihua Wang, Niki Trigoni, Andrew Markham
TL;DR: This paper introduces RandLA-Net, an efficient and lightweight neural architecture to directly infer per-point semantics for large-scale point clouds, and introduces a novel local feature aggregation module to progressively increase the receptive field for each 3D point, thereby effectively preserving geometric details.
Proceedings Article
Argoverse: 3D Tracking and Forecasting With Rich Maps
Ming-Fang Chang, Deva Ramanan, James Hays, John Lambert, Patsorn Sangkloy, Jasvinder A. Singh, Slawomir Bak, Andrew Hartnett, De Wang, Peter W. Carr, Simon Lucey
TL;DR: Argoverse includes sensor data collected by a fleet of autonomous vehicles in Pittsburgh and Miami, 3D tracking annotations, 300k extracted interesting vehicle trajectories, and rich semantic maps whose geometric and semantic metadata are not currently available in any public dataset.
References
Proceedings Article
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Sergey Ioffe, Christian Szegedy
TL;DR: Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.
Book Chapter
Microsoft COCO: Common Objects in Context
Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár, C. Lawrence Zitnick
TL;DR: This paper presents a new dataset that aims to advance the state of the art in object recognition by placing it in the context of the broader question of scene understanding, gathering images of complex everyday scenes that contain common objects in their natural context.
Posted Content
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
TL;DR: Faster R-CNN introduces a Region Proposal Network (RPN) to generate high-quality region proposals, which are used by Fast R-CNN for detection.
Posted Content
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Sergey Ioffe, Christian Szegedy
TL;DR: Batch Normalization normalizes layer inputs for each training mini-batch to reduce internal covariate shift in deep neural networks, and achieves state-of-the-art performance on ImageNet.
Journal Article
The Pascal Visual Object Classes (VOC) Challenge
TL;DR: This paper reviews the state of the art in evaluated methods for both classification and detection, and analyzes whether the methods are statistically different, what they are learning from the images, and what the methods find easy or confuse.