Journal ArticleDOI
WS-OPE: Weakly Supervised 6-D Object Pose Regression Using Relative Multi-Camera Pose Constraints
Vol. 7, Iss. 2, pp. 3703-3710
TL;DR
In this article, the authors use 2-D bounding boxes and object sizes as the only labels and constrain the problem with multiple images of known relative poses during training, which leads to better learning of 6-D pose embeddings in comparison to fully supervised methods.
Abstract:
Precise annotation of 6-D poses in real data is intricate and time-consuming, yet it is an essential requirement for training pose estimation pipelines. We propose a scalable, end-to-end approach to 6-D pose regression with weak supervision that avoids this problem. Our method requires neither 3-D models nor 6-D object poses as ground truth. Instead, we use 2-D bounding boxes and object sizes as the only labels and constrain the problem with multiple images of known relative poses during training. A novel Rotated-IoU loss brings together a pose prediction from an image with labeled 2-D bounding boxes of the corresponding object in other views. Our rotation estimation combines an initial coarse pose classification with an offset regression using a continuous rotation parametrization that allows for direct pose estimation. At test time, the model still uses only a single image to predict a 6-D pose. We observe that multi-view constraints and our rotation representation used during training lead to better learning of 6-D pose embeddings in comparison to fully supervised methods. Experiments on several datasets show that the proposed method is capable of predicting poses of good quality, despite being trained with only weak labels. Direct pose regression without the need for a subsequent refinement stage thereby ensures real-time performance.
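The multi-view constraint described in the abstract can be illustrated with a small sketch. It is not the authors' Rotated-IoU loss: it substitutes a plain axis-aligned IoU, and it assumes a pinhole camera, an object box centered at the object origin, and all corners in front of the camera. All function names, the intrinsics tuple `K = (fx, fy, cx, cy)`, and the convention `X_b = R_ab X_a + t_ab` for the known relative pose are hypothetical choices made for illustration.

```python
from itertools import product


def mat_vec(m, v):
    """Multiply a 3x3 matrix (nested lists) by a 3-vector."""
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]


def mat_mul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def add(u, v):
    """Add two 3-vectors."""
    return [u[i] + v[i] for i in range(3)]


def project_box(R, t, size, K):
    """Project the 8 corners of an object-centered 3-D box and return the
    enclosing axis-aligned 2-D box (u_min, v_min, u_max, v_max).
    Assumes every corner has positive depth (z > 0)."""
    fx, fy, cx, cy = K
    us, vs = [], []
    for sx, sy, sz in product((-0.5, 0.5), repeat=3):
        corner = [sx * size[0], sy * size[1], sz * size[2]]
        x, y, z = add(mat_vec(R, corner), t)
        us.append(fx * x / z + cx)
        vs.append(fy * y / z + cy)
    return (min(us), min(vs), max(us), max(vs))


def iou(a, b):
    """Axis-aligned IoU of two boxes (u_min, v_min, u_max, v_max)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def multiview_box_loss(R_a, t_a, size, R_ab, t_ab, K, box_b):
    """Transfer a pose predicted in camera A to camera B using the known
    relative pose, project the sized box, and compare it with the 2-D
    box labeled in view B. Returns 1 - IoU (simplified stand-in loss)."""
    R_b = mat_mul(R_ab, R_a)
    t_b = add(mat_vec(R_ab, t_a), t_ab)
    return 1.0 - iou(project_box(R_b, t_b, size, K), box_b)
```

In this simplified form the loss is zero when the pose predicted in one view, transferred to another view, reproduces that view's labeled 2-D box, and it grows as the projected box drifts away from the label. Only 2-D boxes, object sizes, and relative camera poses are needed, which matches the weak labels named in the abstract.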
Citations
Proceedings ArticleDOI
OSOP: A Multi-Stage One Shot Object Pose Estimation Framework
TL;DR: This article proposes a one-shot method for object detection and 6-DoF pose estimation that does not require training on target objects and uses a textured 3D query model.
Journal ArticleDOI
Ambiguity-Aware Multi-Object Pose Optimization for Visually-Assisted Robot Manipulation
TL;DR: The authors present an ambiguity-aware 6D object pose estimation network, PrimA6D++, as a generic uncertainty prediction method that handles major challenges in pose estimation, such as occlusion and symmetry, based on the measured ambiguity of the prediction.
Journal ArticleDOI
3D hand pose estimation from a single RGB image by weighting the occlusion and classification
TL;DR: The authors propose a framework for 3D hand pose estimation from a single RGB image that is composed of two blocks: the first formulates pose estimation as a classification problem, and the second estimates the 3D coordinates of the hand joints and focuses more on the details of the image pattern.
Journal ArticleDOI
Weak6D: Weakly Supervised 6D Pose Estimation With Iterative Annotation Resolver
TL;DR: Weak6D employs a weak refinement loss to optimize the pose estimation network with refined object poses, and can directly utilize the captured RGB-D data throughout training.
Book ChapterDOI
WeLSA: Learning to Predict 6D Pose from Weakly Labeled Data Using Shape Alignment
TL;DR: This article proposes a weakly supervised approach for object pose estimation from RGB-D data, using training sets composed of very few images labeled with pose annotations together with weakly labeled images that have ground-truth segmentation masks but no pose labels.
References
Proceedings ArticleDOI
You Only Look Once: Unified, Real-Time Object Detection
TL;DR: Compared to state-of-the-art detection systems, YOLO makes more localization errors but is less likely to predict false positives on background, and outperforms other detection methods, including DPM and R-CNN, when generalizing from natural images to other domains like artwork.
Journal ArticleDOI
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
TL;DR: This work introduces a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals, and further merges RPN and Fast R-CNN into a single network by sharing their convolutional features.
Proceedings ArticleDOI
Fast R-CNN
TL;DR: Fast R-CNN is a Fast Region-based Convolutional Network method for object detection that employs several innovations to improve training and testing speed while also increasing detection accuracy, and achieves a higher mAP on PASCAL VOC 2012.
Proceedings ArticleDOI
Mask R-CNN
TL;DR: This work presents a conceptually simple, flexible, and general framework for object instance segmentation, which extends Faster R-CNN by adding a branch for predicting an object mask in parallel with the existing branch for bounding box recognition.
Journal ArticleDOI
ORB-SLAM: a Versatile and Accurate Monocular SLAM System
TL;DR: A survival of the fittest strategy that selects the points and keyframes of the reconstruction leads to excellent robustness and generates a compact and trackable map that only grows if the scene content changes, allowing lifelong operation.