GDR-Net: Geometry-Guided Direct Regression Network for Monocular 6D Object Pose Estimation
Gu Wang, Fabian Manhardt, Federico Tombari, Xiangyang Ji
- pp 16611-16621
TL;DR: This paper proposes GDR-Net, a geometry-guided direct regression network that learns the 6D pose end-to-end from dense correspondence-based intermediate geometric representations.
Abstract:
6D pose estimation from a single RGB image is a fundamental task in computer vision. The current top-performing deep learning-based methods rely on an indirect strategy, i.e., first establishing 2D-3D correspondences between coordinates in the image plane and the object coordinate system, and then applying a variant of the PnP/RANSAC algorithm. However, this two-stage pipeline is not end-to-end trainable and is therefore hard to employ in the many tasks that require differentiable poses. On the other hand, methods based on direct regression are currently inferior to geometry-based methods. In this work, we perform an in-depth investigation of both direct and indirect methods, and propose a simple yet effective Geometry-guided Direct Regression Network (GDR-Net) to learn the 6D pose in an end-to-end manner from dense correspondence-based intermediate geometric representations. Extensive experiments show that our approach remarkably outperforms state-of-the-art methods on the LM, LM-O and YCB-V datasets. Code is available at https://git.io/GDR-Net.
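The 2D-3D correspondences used by the indirect strategy can be illustrated with a minimal sketch. This is not the GDR-Net code: the function names, the pinhole intrinsics and the toy model points are all assumptions chosen for illustration. It only shows how a pose (R, t) links 3D object coordinates to 2D pixels, and how the reprojection error that PnP-style solvers typically minimize tells a correct pose apart from a wrong one.

```python
import math

def project(points_3d, R, t, K):
    """Project object-frame 3D points to pixels with pose (R, t).

    K is a hypothetical pinhole intrinsics tuple (fx, fy, cx, cy).
    """
    fx, fy, cx, cy = K
    pixels = []
    for X in points_3d:
        # Camera-frame point: X_cam = R @ X + t
        xc = [sum(R[i][j] * X[j] for j in range(3)) + t[i] for i in range(3)]
        # Perspective division followed by the intrinsics
        pixels.append((fx * xc[0] / xc[2] + cx, fy * xc[1] / xc[2] + cy))
    return pixels

def rot_z(angle):
    """Rotation matrix for a rotation of `angle` radians about the z axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def mean_reprojection_error(points_3d, pixels_2d, R, t, K):
    """Mean pixel distance between observed 2D points and reprojected 3D points."""
    projected = project(points_3d, R, t, K)
    return sum(math.dist(p, q) for p, q in zip(projected, pixels_2d)) / len(pixels_2d)

# Toy setup (all values are assumptions for illustration).
K = (600.0, 600.0, 320.0, 240.0)
model_points = [(0.05, 0.0, 0.0), (0.0, 0.05, 0.0), (0.0, 0.0, 0.05), (-0.05, 0.02, 0.01)]
R_true, t_true = rot_z(math.radians(30)), [0.0, 0.0, 1.0]

# The "observed" pixels play the role of the predicted 2D-3D correspondences.
observed = project(model_points, R_true, t_true, K)

err_true = mean_reprojection_error(model_points, observed, R_true, t_true, K)
err_wrong = mean_reprojection_error(model_points, observed, rot_z(math.radians(40)), t_true, K)
print(err_true, err_wrong)
```

In this toy setting the error is zero at the true pose and grows for the perturbed rotation. A PnP/RANSAC solver searches for the pose that minimizes this kind of error over a set of inlier correspondences, which is the second, non-learned stage the abstract refers to.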
Citations
Proceedings Article
OnePose: One-Shot Object Pose Estimation without CAD Models
TL;DR: A new graph attention network is proposed that directly matches 2D interest points in the query image with the 3D points in the SfM model, resulting in efficient and robust pose estimation; the method can stably detect and track the 6D poses of everyday household objects in real time.
Proceedings Article
ZebraPose: Coarse to Fine Surface Encoding for 6DoF Object Pose Estimation
Yongzhi Su, Mahdi Saleh, Torben Fetzer, Jason Rambach, Nassir Navab, Benjamin Busam, Didier Stricker, Federico Tombari
TL;DR: This work presents a discrete descriptor, which can represent the object surface densely by incorporating a hierarchical binary grouping, and proposes a coarse to fine training strategy, which enables fine-grained correspondence prediction of the 6DoF pose.
Proceedings Article
EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for Monocular Object Pose Estimation
TL;DR: EPro-PnP is proposed, a probabilistic PnP layer for general end-to-end pose estimation that outputs a distribution of poses on the SE(3) manifold, essentially bringing categorical Softmax to the continuous domain.
Posted Content
SO-Pose: Exploiting Self-Occlusion for Direct 6D Pose Estimation
TL;DR: In this paper, a two-layer representation for 3D objects is proposed to enhance the accuracy of end-to-end 6D pose estimation by using self-occlusion.
Proceedings Article
GPV-Pose: Category-level Object Pose Estimation via Geometry-guided Point-wise Voting
TL;DR: GPV-Pose is proposed, a novel framework for robust category-level pose estimation that harnesses geometric insights to enhance the learning of category-level pose-sensitive features. It produces superior results to state-of-the-art competitors on common public benchmarks while running at 20 FPS, close to real-time inference speed.