SSD-6D: Making RGB-Based 3D Detection and 6D Pose Estimation Great Again

doi:10.1109/ICCV.2017.169

Open Access, Proceedings Article

Wadim Kehl, +4 more

pp. 1530-1538

TL;DR:

The paper presents a novel method for detecting 3D model instances and estimating their 6D poses from RGB data in a single shot. The method competes with or surpasses state-of-the-art methods that leverage RGBD data on multiple challenging datasets.

Abstract:

We present a novel method for detecting 3D model instances and estimating their 6D poses from RGB data in a single shot. To this end, we extend the popular SSD paradigm to cover the full 6D pose space and train on synthetic model data only. Our approach competes with or surpasses current state-of-the-art methods that leverage RGBD data on multiple challenging datasets. Furthermore, our method produces these results at around 10 Hz, which is many times faster than the related methods. For the sake of reproducibility, we make our trained networks and detection code publicly available.
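For readers unfamiliar with the term, a "6D pose" combines three rotational and three translational degrees of freedom. The snippet below is a minimal sketch of that idea only, not the paper's network or released code: the function names `pose_matrix` and `transform_points` are hypothetical, and the axis-angle parameterization is an assumption chosen for illustration.

```python
import numpy as np

def pose_matrix(rotvec, translation):
    """Build a 4x4 rigid transform from an axis-angle vector (3 DoF)
    and a translation vector (3 DoF). Hypothetical helper for illustration."""
    rotvec = np.asarray(rotvec, dtype=float)
    theta = np.linalg.norm(rotvec)
    if theta < 1e-12:
        R = np.eye(3)  # no rotation
    else:
        k = rotvec / theta  # unit rotation axis
        K = np.array([[0.0, -k[2], k[1]],
                      [k[2], 0.0, -k[0]],
                      [-k[1], k[0], 0.0]])
        # Rodrigues' rotation formula
        R = np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = np.asarray(translation, dtype=float)
    return T

def transform_points(T, points):
    """Apply the rigid transform T to an (N, 3) array of model points."""
    points = np.asarray(points, dtype=float)
    return points @ T[:3, :3].T + T[:3, 3]
```

Applying `transform_points(pose_matrix(r, t), model_points)` places a 3D model in the camera frame under the pose `(r, t)`, which is the quantity a 6D pose estimator has to recover for each detected instance.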

Citations

Proceedings Article

DenseFusion: 6D Object Pose Estimation by Iterative Dense Fusion

Chen Wang, +6 more

TL;DR: DenseFusion is a heterogeneous architecture that processes the two complementary data sources (RGB and depth) individually and uses a novel dense fusion network to extract pixel-wise dense feature embeddings, from which the pose is estimated.

Proceedings Article

Real-Time Seamless Single Shot 6D Object Pose Prediction

Bugra Tekin, +2 more

TL;DR: A single-shot approach for simultaneously detecting an object in an RGB image and predicting its 6D pose without requiring multiple stages or having to examine multiple hypotheses is proposed, which substantially outperforms other recent CNN-based approaches when they are all used without postprocessing.

Book Chapter

Implicit 3D Orientation Learning for 6D Object Detection from RGB Images

Martin Sundermeyer, +4 more

TL;DR: This work proposes a real-time RGB-based pipeline for object detection and 6D pose estimation based on a variant of the Denoising Autoencoder trained on simulated views of a 3D model using Domain Randomization.

Proceedings Article

PointNetLK: Robust & Efficient Point Cloud Registration Using PointNet

Yasuhiro Aoki, +3 more

TL;DR: PointNetLK unrolls PointNet and the Lucas & Kanade (LK) algorithm into a single trainable recurrent deep neural network for point cloud registration.

Proceedings Article

PVNet: Pixel-Wise Voting Network for 6DoF Pose Estimation

Sida Peng, +4 more

TL;DR: A Pixel-wise Voting Network (PVNet) is introduced to regress pixel-wise vectors pointing to the keypoints and use these vectors to vote for keypoint locations, which creates a flexible representation for localizing occluded or truncated keypoints.

References

Journal Article

ImageNet Large Scale Visual Recognition Challenge

Olga Russakovsky, +11 more

- 01 Dec 2015 -

International Journal of Computer Vision

TL;DR: The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) is a benchmark in object category classification and detection covering hundreds of object categories and millions of images. It has been run annually from 2010 to present, attracting participation from more than fifty institutions.

Proceedings Article

You Only Look Once: Unified, Real-Time Object Detection

Joseph Redmon, +3 more

TL;DR: Compared to state-of-the-art detection systems, YOLO makes more localization errors but is less likely to predict false positives on background, and outperforms other detection methods, including DPM and R-CNN, when generalizing from natural images to other domains like artwork.

Book Chapter

SSD: Single Shot MultiBox Detector

Wei Liu, +6 more

TL;DR: The approach, named SSD, discretizes the output space of bounding boxes into a set of default boxes over different aspect ratios and scales per feature map location, which makes SSD easy to train and straightforward to integrate into systems that require a detection component.

Journal Article

A method for registration of 3-D shapes

Paul J. Besl, +1 more

- 01 Feb 1992 -

IEEE Transactions on Pattern Analysis an...

TL;DR: In this paper, the authors describe a general-purpose representation-independent method for the accurate and computationally efficient registration of 3D shapes including free-form curves and surfaces, based on the iterative closest point (ICP) algorithm, which requires only a procedure to find the closest point on a geometric entity to a given point.

Proceedings Article

Feature Pyramid Networks for Object Detection

Tsung-Yi Lin, +5 more

TL;DR: This paper exploits the inherent multi-scale, pyramidal hierarchy of deep convolutional networks to construct feature pyramids with marginal extra cost and achieves state-of-the-art single-model results on the COCO detection benchmark without bells and whistles.
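The SSD entry in the reference list above describes discretizing the bounding-box output space into default boxes over several aspect ratios and scales per feature map location. A minimal sketch of that discretization follows, assuming a square feature map, normalized image coordinates, and a single scale per map. The function name `default_boxes` and its parameters are hypothetical and do not come from any released SSD or SSD-6D code.

```python
import itertools
import math

def default_boxes(fmap_size, scale, aspect_ratios):
    """Return (cx, cy, w, h) default boxes in normalized [0, 1] coordinates.

    One box per aspect ratio is placed at the center of every cell of a
    fmap_size x fmap_size feature map. Illustrative sketch only.
    """
    boxes = []
    for i, j in itertools.product(range(fmap_size), repeat=2):
        cx = (j + 0.5) / fmap_size  # cell center, x
        cy = (i + 0.5) / fmap_size  # cell center, y
        for ar in aspect_ratios:
            w = scale * math.sqrt(ar)  # wider for ar > 1
            h = scale / math.sqrt(ar)  # taller for ar < 1
            boxes.append((cx, cy, w, h))
    return boxes
```

A detector of this family then predicts, for each default box, class scores and offsets relative to the box. Per its abstract, SSD-6D extends this paradigm to cover the 6D pose space as well.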

Related Papers (5)

PoseCNN: A Convolutional Neural Network for 6D Object Pose Estimation in Cluttered Scenes

Model Based Training, Detection and Pose Estimation of Texture-Less 3D Objects in Heavily Cluttered Scenes

Learning 6D Object Pose Estimation Using 3D Object Coordinates

Deep Residual Learning for Image Recognition

SSD: Single Shot MultiBox Detector