Open Access · Proceedings ArticleDOI

Frustum ConvNet: Sliding Frustums to Aggregate Local Point-Wise Features for Amodal 3D Object Detection

TLDR
A novel method termed Frustum ConvNet (F-ConvNet), which aggregates point-wise features as frustum-level feature vectors and arrays these feature vectors as a feature map for use by its subsequent fully convolutional network (FCN) component.
Abstract
In this work, we propose a novel method termed Frustum ConvNet (F-ConvNet) for amodal 3D object detection from point clouds. Given 2D region proposals in an RGB image, our method first generates a sequence of frustums for each region proposal, and uses the obtained frustums to group local points. F-ConvNet aggregates point-wise features as frustum-level feature vectors, and arrays these feature vectors as a feature map for use of its subsequent component of fully convolutional network (FCN), which spatially fuses frustum-level features and supports an end-to-end and continuous estimation of oriented boxes in the 3D space. We also propose component variants of F-ConvNet, including an FCN variant that extracts multi-resolution frustum features, and a refined use of F-ConvNet over a reduced 3D space. Careful ablation studies verify the efficacy of these component variants. F-ConvNet assumes no prior knowledge of the working 3D environment and is thus dataset-agnostic. We present experiments on both the indoor SUN-RGBD and outdoor KITTI datasets. F-ConvNet outperforms all existing methods on SUN-RGBD, and at the time of submission it outperforms all published works on the KITTI benchmark. Code has been made available at: https://github.com/zhixinwang/frustum-convnet.
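The abstract's core step — slicing a region proposal's viewing cone into a sequence of frustums, grouping points per frustum, and aggregating point-wise features into an ordered feature map for the FCN — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name, the depth range, and the use of max pooling as the aggregator are assumptions made here for clarity.

```python
import numpy as np

def sliding_frustum_features(points, feats, depth_range=(0.0, 40.0),
                             stride=1.0, frustum_len=2.0):
    """Group points (assumed already inside one 2D proposal's viewing cone)
    into overlapping frustums sliced along the depth axis, and pool the
    point-wise features inside each frustum into one frustum-level vector.

    points : (N, 3) array in camera coordinates, column 2 = depth
    feats  : (N, C) point-wise features (e.g., from a point-set encoder)
    Returns an (L, C) feature map, one row per frustum, whose row order
    follows depth -- suitable input for a subsequent 1-D FCN.
    """
    z0, z1 = depth_range
    starts = np.arange(z0, z1, stride)
    fmap = np.zeros((len(starts), feats.shape[1]), dtype=feats.dtype)
    for i, s in enumerate(starts):
        # Overlapping slab of the cone: [s, s + frustum_len) in depth.
        mask = (points[:, 2] >= s) & (points[:, 2] < s + frustum_len)
        if mask.any():
            fmap[i] = feats[mask].max(axis=0)  # frustum-level feature vector
    return fmap
```

Because the frustums overlap (frustum length exceeds the stride), adjacent rows of the feature map share points, which is what lets the FCN fuse frustum-level features spatially.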


Citations
Journal ArticleDOI

Deep Multi-Modal Object Detection and Semantic Segmentation for Autonomous Driving: Datasets, Methods, and Challenges

TL;DR: In this article, the authors systematically summarize methodologies and discuss challenges for deep multi-modal object detection and semantic segmentation in autonomous driving and provide an overview of on-board sensors on test vehicles, open datasets, and background information for object detection.
Book ChapterDOI

3D-CVF: Generating Joint Camera and LiDAR Features Using Cross-view Spatial Feature Fusion for 3D Object Detection

TL;DR: The authors propose 3D-CVF, which combines camera and LiDAR features using a cross-view spatial feature fusion strategy and achieves state-of-the-art performance on the KITTI benchmark.
Journal ArticleDOI

Deep Learning for Image and Point Cloud Fusion in Autonomous Driving: A Review

TL;DR: A review of recent deep-learning-based data fusion approaches that leverage both image and point cloud data processing, identifying gaps and overlooked challenges between current academic research and real-world applications.
Journal ArticleDOI

Vision-based robotic grasping from object localization, object pose estimation to grasp estimation for parallel grippers: a review

TL;DR: Three key tasks in vision-based robotic grasping are identified — object localization, object pose estimation, and grasp estimation — where grasp estimation covers both 2D planar grasp methods and 6-DoF grasp methods.
Proceedings ArticleDOI

Joint 3D Instance Segmentation and Object Detection for Autonomous Driving

TL;DR: A simple but practical detection framework is proposed to jointly predict 3D bounding boxes and instance segmentation, with a Spatial Embeddings (SEs) strategy that assembles all foreground points into their corresponding object centers.
References
Proceedings ArticleDOI

Deep Residual Learning for Image Recognition

TL;DR: In this article, the authors proposed a residual learning framework to ease the training of networks that are substantially deeper than those used previously; the resulting networks won 1st place in the ILSVRC 2015 classification task.
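The residual idea in that TLDR is that each block learns a residual function F(x) on top of an identity shortcut, so the block outputs relu(F(x) + x). A minimal NumPy sketch (illustrative names and a two-layer fully connected residual branch; the paper uses convolutional branches with batch normalization):

```python
import numpy as np

def residual_block(x, w1, w2):
    """y = relu(F(x) + x), where F is a small learned branch.
    When F is driven toward zero, the block reduces to the identity,
    which is what eases optimization of very deep networks."""
    f = np.maximum(x @ w1, 0.0) @ w2   # residual branch F(x)
    return np.maximum(f + x, 0.0)      # identity shortcut, then ReLU
```

With the branch weights at zero, the block simply passes its input through a ReLU — the degenerate case that makes deeper stacks no harder to fit than shallower ones.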
Book ChapterDOI

SSD: Single Shot MultiBox Detector

TL;DR: The approach, named SSD, discretizes the output space of bounding boxes into a set of default boxes over different aspect ratios and scales per feature map location, which makes SSD easy to train and straightforward to integrate into systems that require a detection component.
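The default-box discretization described in that TLDR — one box per aspect ratio and scale at every feature map location — can be sketched as below. Function name, normalized-coordinate convention, and the particular aspect ratios are assumptions for illustration, not SSD's exact configuration.

```python
import itertools

def default_boxes(fmap_size, scale, aspect_ratios=(1.0, 2.0, 0.5)):
    """Tile one default box per aspect ratio at every cell of a square
    feature map. Boxes are (cx, cy, w, h) with coordinates normalized
    to [0, 1]; width/height preserve the area implied by `scale`."""
    boxes = []
    for i, j in itertools.product(range(fmap_size), repeat=2):
        cx, cy = (j + 0.5) / fmap_size, (i + 0.5) / fmap_size  # cell center
        for ar in aspect_ratios:
            w, h = scale * ar ** 0.5, scale / ar ** 0.5
            boxes.append((cx, cy, w, h))
    return boxes
```

SSD repeats this tiling over several feature maps of decreasing resolution, so small-scale boxes come from early, high-resolution maps and large-scale boxes from later ones.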
Proceedings ArticleDOI

Fast R-CNN

TL;DR: Fast R-CNN, a Fast Region-based Convolutional Network method for object detection, employs several innovations to improve training and testing speed while also increasing detection accuracy, achieving a higher mAP on PASCAL VOC 2012.
Proceedings ArticleDOI

Are we ready for autonomous driving? The KITTI vision benchmark suite

TL;DR: The autonomous driving platform is used to develop novel challenging benchmarks for the tasks of stereo, optical flow, visual odometry/SLAM and 3D object detection, revealing that methods ranking high on established datasets such as Middlebury perform below average when being moved outside the laboratory to the real world.
Proceedings ArticleDOI

PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation

TL;DR: This paper designs a novel type of neural network that directly consumes point clouds, which well respects the permutation invariance of points in the input and provides a unified architecture for applications ranging from object classification, part segmentation, to scene semantic parsing.
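The permutation invariance noted in that TLDR comes from applying the same per-point function to every point and then aggregating with a symmetric operator (max pooling). A toy sketch of this structure, with an assumed single linear-plus-ReLU layer standing in for PointNet's shared MLP:

```python
import numpy as np

def pointnet_global_feature(points, w):
    """Shared per-point transform followed by max pooling.
    Because every point passes through the same weights and the pooling
    is symmetric, the result is invariant to input point order."""
    per_point = np.maximum(points @ w, 0.0)  # same weights for every point
    return per_point.max(axis=0)             # symmetric aggregation
```

Reordering the rows of `points` leaves the output unchanged — the property that lets such networks consume raw, unordered point clouds directly.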