GDR-Net: Geometry-Guided Direct Regression Network for Monocular 6D Object Pose Estimation

doi:10.1109/CVPR46437.2021.01634

Open Access, Proceedings Article

Gu Wang, +3 more

pp. 16611-16621

TLDR

GDR-Net is a geometry-guided direct regression network that learns the 6D pose end-to-end from dense correspondence-based intermediate geometric representations.

Abstract:

6D pose estimation from a single RGB image is a fundamental task in computer vision. The current top-performing deep learning-based methods rely on an indirect strategy, i.e., first establishing 2D-3D correspondences between coordinates in the image plane and in the object coordinate system, and then applying a variant of the PnP/RANSAC algorithm. However, this two-stage pipeline is not end-to-end trainable and is therefore hard to employ in the many tasks that require differentiable poses. On the other hand, methods based on direct regression are currently inferior to geometry-based methods. In this work, we perform an in-depth investigation of both direct and indirect methods, and propose a simple yet effective Geometry-guided Direct Regression Network (GDR-Net) to learn the 6D pose in an end-to-end manner from dense correspondence-based intermediate geometric representations. Extensive experiments show that our approach remarkably outperforms state-of-the-art methods on the LM, LM-O and YCB-V datasets. Code is available at https://git.io/GDR-Net.
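The indirect strategy described in the abstract (establish 2D-3D correspondences, then solve for the pose) can be illustrated with a small numerical sketch. The code below is not GDR-Net and not the authors' implementation: it swaps the PnP/RANSAC variant for a plain Direct Linear Transform (DLT) solver without outlier rejection, and it runs on synthetic, noise-free correspondences. The function names (`project`, `pnp_dlt`), the intrinsics `K`, and the test pose are all assumptions made for illustration.

```python
import numpy as np


def project(points_3d, R, t, K):
    """Project object-frame 3D points to pixels with a pinhole camera."""
    cam = points_3d @ R.T + t            # object frame -> camera frame
    uvw = cam @ K.T                      # apply intrinsics
    return uvw[:, :2] / uvw[:, 2:3]      # perspective division


def pnp_dlt(points_3d, points_2d, K):
    """Recover (R, t) from 2D-3D correspondences with a plain DLT.

    Illustrative stand-in for a PnP solver: needs at least 6 non-coplanar
    points and has no outlier handling (no RANSAC).
    """
    n = len(points_3d)
    ones = np.ones((n, 1))
    # Remove the intrinsics so that the unknown matrix is P ~ [R | t].
    xy = (np.linalg.inv(K) @ np.hstack([points_2d, ones]).T).T
    X = np.hstack([points_3d, ones])     # homogeneous 3D points, shape (n, 4)

    # Each correspondence gives two linear equations in the 12 entries of P:
    #   x * (p3 . X) - (p1 . X) = 0   and   y * (p3 . X) - (p2 . X) = 0
    A = np.zeros((2 * n, 12))
    A[0::2, 0:4] = -X
    A[0::2, 8:12] = xy[:, 0:1] * X
    A[1::2, 4:8] = -X
    A[1::2, 8:12] = xy[:, 1:2] * X

    # The solution is the right singular vector of the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    P = Vt[-1].reshape(3, 4)

    # P is only defined up to scale and sign; pick the sign with det > 0.
    if np.linalg.det(P[:, :3]) < 0:
        P = -P

    # Project the 3x3 block onto the nearest rotation and undo the scale.
    U, S, Vt_r = np.linalg.svd(P[:, :3])
    R = U @ Vt_r
    t = P[:, 3] / S.mean()
    return R, t


# Synthetic example (all values below are assumed, not taken from the paper).
rng = np.random.default_rng(0)
points_3d = rng.uniform(-0.1, 0.1, size=(12, 3))

a, b = 0.3, -0.5
Rz = np.array([[np.cos(a), -np.sin(a), 0.0],
               [np.sin(a), np.cos(a), 0.0],
               [0.0, 0.0, 1.0]])
Rx = np.array([[1.0, 0.0, 0.0],
               [0.0, np.cos(b), -np.sin(b)],
               [0.0, np.sin(b), np.cos(b)]])
R_true = Rz @ Rx
t_true = np.array([0.05, -0.02, 1.0])
K = np.array([[600.0, 0.0, 320.0],
              [0.0, 600.0, 240.0],
              [0.0, 0.0, 1.0]])

points_2d = project(points_3d, R_true, t_true, K)
R_est, t_est = pnp_dlt(points_3d, points_2d, K)
print("rotation recovered:", np.allclose(R_est, R_true, atol=1e-6))
print("translation recovered:", np.allclose(t_est, t_true, atol=1e-6))
```

In a real pipeline the correspondences would be predicted by a network and would contain noise and outliers, which is why a RANSAC-style solver is typically used. That solve sits outside the network as a separate stage, which is the obstacle to end-to-end training that the abstract points to.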

Citations

Proceedings Article

OnePose: One-Shot Object Pose Estimation without CAD Models

Jiaming Sun, +6 more

TL;DR: A new graph attention network is proposed that directly matches 2D interest points in the query image with the 3D points in the SfM model, yielding efficient and robust pose estimation; the method can stably detect and track 6D poses of everyday household objects in real time.

Proceedings Article

ZebraPose: Coarse to Fine Surface Encoding for 6DoF Object Pose Estimation

Yongzhi Su, +7 more

TL;DR: This work presents a discrete descriptor that represents the object surface densely through hierarchical binary grouping, and proposes a coarse-to-fine training strategy that enables fine-grained correspondence prediction for 6DoF pose estimation.

Proceedings Article

EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for Monocular Object Pose Estimation

Hansheng Chen, +5 more

TL;DR: EPro-PnP is proposed, a probabilistic PnP layer for general end-to-end pose estimation that outputs a distribution of pose on the SE(3) manifold, essentially bringing categorical Softmax to the continuous domain.

Posted Content

SO-Pose: Exploiting Self-Occlusion for Direct 6D Pose Estimation

Yan Di, +5 more

18 Aug 2021

arXiv: Computer Vision and Pattern Recog...

TL;DR: In this paper, a two-layer representation for 3D objects is proposed to enhance the accuracy of end-to-end 6D pose estimation by using self-occlusion.

Proceedings Article

GPV-Pose: Category-level Object Pose Estimation via Geometry-guided Point-wise Voting

Yan Di, +6 more

TL;DR: GPV-Pose is proposed, a novel framework for robust category-level pose estimation that harnesses geometric insights to enhance the learning of category-level pose-sensitive features. It produces superior results to state-of-the-art competitors on common public benchmarks while almost achieving real-time inference speed at 20 FPS.
