Open Access Proceedings ArticleDOI

PVN3D: A Deep Point-Wise 3D Keypoints Voting Network for 6DoF Pose Estimation

TLDR
PVN3D proposes a deep Hough voting network to detect the 3D keypoints of objects and then estimates the 6DoF pose parameters with least-squares fitting.
Abstract
In this work, we present a novel data-driven method for robust 6DoF object pose estimation from a single RGBD image. Unlike previous methods that directly regress pose parameters, we tackle this challenging task with a keypoint-based approach. Specifically, we propose a deep Hough voting network to detect 3D keypoints of objects and then estimate the 6D pose parameters with least-squares fitting. Our method is a natural extension of 2D-keypoint approaches that work successfully on RGB-based 6DoF estimation. It allows us to fully utilize the geometric constraints of rigid objects with the extra depth information and is easy for a network to learn and optimize. Extensive experiments were conducted to demonstrate the effectiveness of 3D-keypoint detection in the 6D pose estimation task. Experimental results also show that our method outperforms state-of-the-art methods by large margins on several benchmarks. Code and video are available at https://github.com/ethnhe/PVN3D.git.
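The pipeline described in the abstract ends with two geometric steps that can be written down compactly: per-point keypoint votes are aggregated into keypoint estimates, and the 6DoF pose is then recovered in closed form by least-squares fitting of the detected keypoints to the model keypoints. The sketch below illustrates those two steps under our own assumptions (a plain mean instead of a clustering step for vote aggregation, hypothetical function names); it is not the released implementation, which is available at the repository linked above.

```python
import numpy as np


def aggregate_keypoint_votes(points, offsets):
    """Aggregate per-point votes into keypoint estimates.

    points:  (N, 3) visible scene points of one object instance.
    offsets: (N, K, 3) predicted offset from each point to each of K keypoints
             (the per-point voting output of the network).
    Returns (K, 3) estimated keypoint locations. A plain mean is used here;
    a clustering step (e.g. MeanShift) would be more robust to outlier votes.
    """
    votes = points[:, None, :] + offsets   # (N, K, 3) voted keypoint positions
    return votes.mean(axis=0)              # (K, 3)


def fit_pose_least_squares(model_kps, detected_kps):
    """Closed-form least-squares rigid fit (Kabsch/Umeyama, no scale).

    model_kps:    (K, 3) keypoints in the object's canonical frame.
    detected_kps: (K, 3) keypoints detected in the camera frame.
    Returns R (3, 3) and t (3,) such that detected ~= R @ model + t.
    """
    mu_m = model_kps.mean(axis=0)
    mu_d = detected_kps.mean(axis=0)
    H = (model_kps - mu_m).T @ (detected_kps - mu_d)   # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))             # guard against a reflection
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_d - R @ mu_m
    return R, t
```

This closed-form fit is part of why 3D-3D keypoint correspondences are attractive compared with regressing pose parameters directly: the network only has to predict Euclidean offsets, which the abstract notes are easy to learn and optimize, and the pose then follows from a standard SVD.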



Citations
Proceedings ArticleDOI

FFB6D: A Full Flow Bidirectional Fusion Network for 6D Pose Estimation

TL;DR: FFB6D proposes a full flow bidirectional fusion network that combines appearance and geometry information for representation learning as well as output representation selection, so that each branch can leverage local and global complementary information from the other to obtain better representations.
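As a rough illustration of the fusion idea summarized in that TL;DR, the sketch below exchanges features between an image branch and a point-cloud branch through pixel-point correspondences and fuses them by concatenation. The shapes, index arrays, and function name are our own assumptions for illustration, not FFB6D's implementation.

```python
import numpy as np


def fuse_bidirectional(pix_feat, pt_feat, pix2pt, pt2pix):
    """pix_feat: (H*W, C1) image features, pt_feat: (N, C2) point features.
    pix2pt: (H*W,) index of the nearest 3D point for each pixel.
    pt2pix: (N,)   index of the pixel each point projects to.
    Returns fused features for both branches (concatenation fusion)."""
    pix_fused = np.concatenate([pix_feat, pt_feat[pix2pt]], axis=1)  # image <- geometry
    pt_fused = np.concatenate([pt_feat, pix_feat[pt2pix]], axis=1)   # points <- appearance
    return pix_fused, pt_fused
```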
Journal ArticleDOI

Vision-based robotic grasping from object localization, object pose estimation to grasp estimation for parallel grippers: a review

TL;DR: This review identifies three key tasks in vision-based robotic grasping: object localization, object pose estimation, and grasp estimation; the surveyed grasp-estimation methods include both 2D planar grasp methods and 6DoF grasp methods.
Proceedings ArticleDOI

FS-Net: Fast Shape-based Network for Category-Level 6D Object Pose Estimation with Decoupled Rotation Mechanism

TL;DR: The authors propose a fast shape-based network (FS-Net) with efficient category-level feature extraction for 6D pose estimation from a monocular RGB-D image.
Book ChapterDOI

Cascade Graph Neural Networks for RGB-D Salient Object Detection

TL;DR: Cascade Graph Neural Networks (Cas-GNN) is a unified framework that comprehensively distills and reasons about the mutual benefits of the RGB and depth data sources through a set of cascade graphs, learning powerful representations for RGB-D salient object detection.
Proceedings ArticleDOI

RGB Matters: Learning 7-DoF Grasp Poses on Monocular RGBD Images

TL;DR: RGBD-Grasp decouples grasp detection into two sub-tasks in which RGB and depth information are processed separately, and achieves state-of-the-art results on the GraspNet-1Billion dataset compared with several baselines.
References
Proceedings ArticleDOI

Deep Residual Learning for Image Recognition

TL;DR: The authors propose a residual learning framework that eases the training of networks substantially deeper than those used previously and won 1st place in the ILSVRC 2015 classification task.
Proceedings ArticleDOI

ImageNet: A large-scale hierarchical image database

TL;DR: A new database called “ImageNet” is introduced, a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than the current image datasets.
Proceedings ArticleDOI

Object recognition from local scale-invariant features

TL;DR: Experimental results show that robust object recognition can be achieved in cluttered, partially occluded images with a computation time of under 2 seconds.
Proceedings ArticleDOI

Mask R-CNN

TL;DR: This work presents a conceptually simple, flexible, and general framework for object instance segmentation, which extends Faster R-CNN by adding a branch for predicting an object mask in parallel with the existing branch for bounding box recognition.
Book ChapterDOI

SURF: speeded up robust features

TL;DR: A novel scale- and rotation-invariant interest point detector and descriptor, coined SURF (Speeded Up Robust Features), which approximates or even outperforms previously proposed schemes with respect to repeatability, distinctiveness, and robustness, yet can be computed and compared much faster.