Proceedings ArticleDOI

2D-Driven 3D Object Detection in RGB-D Images

TLDR
The approach makes best use of the 2D information to quickly reduce the search space in 3D, benefiting from state-of-the-art 2D object detection techniques.
Abstract
In this paper, we present a technique that places 3D bounding boxes around objects in an RGB-D scene. Our approach makes best use of the 2D information to quickly reduce the search space in 3D, benefiting from state-of-the-art 2D object detection techniques. We then use the 3D information to orient, place, and score bounding boxes around objects. We independently estimate the orientation for every object, using previous techniques that utilize normal information. Object locations and sizes in 3D are learned using a multilayer perceptron (MLP). In the final step, we refine our detections based on object class relations within a scene. Extensive experiments on the well-known SUN RGB-D dataset [29] show that, compared to state-of-the-art detection methods that operate almost entirely in the sparse 3D domain, our proposed method is much faster (4.1 s per image) at detecting 3D objects in RGB-D images, performs better (3 mAP higher) than the state-of-the-art method that is 4.7 times slower, and performs comparably to the method that is two orders of magnitude slower. This work hints at the idea that 2D-driven object detection in 3D should be further explored, especially in cases where the 3D input is sparse.
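
As a reading aid, the core of this 2D-driven pipeline (lift each 2D detection into a frustum of back-projected depth points, then fit a 3D box) can be sketched in a few lines. The Python/NumPy snippet below is a minimal, hypothetical illustration, not the authors' code: the function names, the pinhole intrinsics, and the crude percentile-based box fit are assumptions that stand in for the paper's normal-based orientation estimation and MLP-based size/location regression.

    # Minimal, illustrative sketch of a 2D-driven 3D detection pipeline
    # (not the paper's implementation). Assumes a pinhole camera with
    # intrinsics fx, fy, cx, cy, depth in metres, and a 2D box (u1, v1, u2, v2).
    import numpy as np

    def lift_box_to_frustum(depth, box2d, fx, fy, cx, cy):
        """Back-project the depth pixels inside a 2D box into 3D points."""
        u1, v1, u2, v2 = box2d
        patch = depth[v1:v2, u1:u2]
        vs, us = np.nonzero(patch > 0)            # keep valid depth only
        z = patch[vs, us]
        x = (us + u1 - cx) * z / fx               # pinhole back-projection
        y = (vs + v1 - cy) * z / fy
        return np.stack([x, y, z], axis=1)        # (N, 3) camera-frame points

    def fit_box(points):
        """Crude 3D box fit from robust extents; the paper instead regresses
        size/centre with an MLP and estimates orientation from normals."""
        lo, hi = np.percentile(points, [5, 95], axis=0)
        return (lo + hi) / 2.0, hi - lo           # centre, size

    if __name__ == "__main__":
        depth = np.full((480, 640), 2.0, dtype=np.float32)   # toy 2 m-deep wall
        box2d = (200, 150, 320, 300)                         # hypothetical 2D detection
        pts = lift_box_to_frustum(depth, box2d, fx=525.0, fy=525.0, cx=319.5, cy=239.5)
        centre, size = fit_box(pts)
        print("centre:", centre, "size:", size)

Restricting the expensive 3D reasoning (orientation, placement, scoring) to these per-detection frustums, rather than searching the whole sparse 3D volume, is what the abstract credits for the low runtime.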

Citations
Proceedings ArticleDOI

Frustum PointNets for 3D Object Detection from RGB-D Data

TL;DR: This work directly operates on raw point clouds by popping up RGBD scans and leverages both mature 2D object detectors and advanced 3D deep learning for object localization, achieving efficiency as well as high recall for even small objects.
Proceedings ArticleDOI

Joint 3D Proposal Generation and Object Detection from View Aggregation

TL;DR: This work presents AVOD, an Aggregate View Object Detection network for autonomous driving scenarios that uses LIDAR point clouds and RGB images to generate features that are shared by two subnetworks: a region proposal network (RPN) and a second stage detector network.
Proceedings ArticleDOI

Deep Hough Voting for 3D Object Detection in Point Clouds

TL;DR: VoteNet is an end-to-end 3D object detection network based on a synergy of deep point set networks and Hough voting; it achieves state-of-the-art performance on two large datasets of real 3D scans.
Proceedings ArticleDOI

PointFusion: Deep Sensor Fusion for 3D Bounding Box Estimation

TL;DR: PointFusion is a generic 3D object detection method that leverages both image and 3D point cloud information; it predicts multiple 3D box hypotheses and their confidences, using the input 3D points as spatial anchors.
Posted Content

Deep Hough Voting for 3D Object Detection in Point Clouds

TL;DR: This work proposes VoteNet, an end-to-end 3D object detection network based on a synergy of deep point set networks and Hough voting; it achieves state-of-the-art 3D detection on two large datasets of real 3D scans, ScanNet and SUN RGB-D, with a simple design, compact model size, and high efficiency. (A minimal sketch of the voting idea follows this list.)
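
Both VoteNet entries above rest on the same mechanism: every sampled 3D point predicts an offset towards the centre of the object it belongs to, and the resulting votes are clustered into detection proposals. The NumPy snippet below is a minimal, hypothetical sketch of that voting-and-clustering step, not VoteNet itself: the greedy radius clustering and the hand-constructed ideal offsets are assumptions, since in the real network the offsets come from a learned point-set backbone.

    # Minimal, illustrative sketch of Hough voting for 3D object centres.
    import numpy as np

    def vote_for_centres(seeds, offsets, radius=0.3, min_votes=5):
        """Cluster votes (seed + predicted offset) with a greedy radius search."""
        votes = seeds + offsets                    # (N, 3) votes in 3D space
        remaining = np.ones(len(votes), dtype=bool)
        centres = []
        while remaining.any():
            idx = int(np.argmax(remaining))        # pick an unassigned vote
            members = remaining & (np.linalg.norm(votes - votes[idx], axis=1) < radius)
            if members.sum() >= min_votes:         # enough votes -> object proposal
                centres.append(votes[members].mean(axis=0))
            remaining &= ~members
        return np.array(centres)

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        true_centres = np.array([[1.0, 0.0, 2.0], [-1.0, 0.5, 3.0]])
        seeds = np.repeat(true_centres, 50, axis=0) + rng.normal(0.0, 0.3, (100, 3))
        offsets = np.repeat(true_centres, 50, axis=0) - seeds   # ideal offsets
        print(vote_for_centres(seeds, offsets))    # recovers the two true centres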
References
Proceedings ArticleDOI

Deep Residual Learning for Image Recognition

TL;DR: Proposes a residual learning framework that eases the training of networks substantially deeper than those used previously; the resulting models won 1st place in the ILSVRC 2015 classification task.
Journal ArticleDOI

ImageNet Large Scale Visual Recognition Challenge

TL;DR: The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) is a benchmark for object category classification and detection on hundreds of object categories and millions of images; it has been run annually since 2010, attracting participation from more than fifty institutions.
Book ChapterDOI

Microsoft COCO: Common Objects in Context

TL;DR: A new dataset that aims to advance the state of the art in object recognition by placing it in the broader context of scene understanding, gathering images of complex everyday scenes containing common objects in their natural context.
Proceedings ArticleDOI

You Only Look Once: Unified, Real-Time Object Detection

TL;DR: Compared to state-of-the-art detection systems, YOLO makes more localization errors but is less likely to predict false positives on background, and outperforms other detection methods, including DPM and R-CNN, when generalizing from natural images to other domains like artwork.
Journal ArticleDOI

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

TL;DR: This work introduces a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, enabling nearly cost-free region proposals; RPN and Fast R-CNN are further merged into a single network by sharing their convolutional features.