Journal Article

WS-OPE: Weakly Supervised 6-D Object Pose Regression Using Relative Multi-Camera Pose Constraints

Vol. 7, Iss. 2, pp. 3703-3710
TLDR
In this article, the authors use 2-D bounding boxes and object sizes as the only labels and constrain the problem with multiple images of known relative poses during training, which leads to better learning of 6-D pose embeddings than fully supervised methods.
Abstract
Precise annotation of 6-D poses in real data is intricate and time-consuming, yet it is an essential requirement for training pose estimation pipelines. To avoid this problem, we propose a scalable, end-to-end approach to 6-D pose regression with weak supervision. Our method requires neither 3-D models nor 6-D object poses as ground truth. Instead, we use 2-D bounding boxes and object sizes as the only labels and constrain the problem with multiple images of known relative poses during training. A novel Rotated-IoU loss relates a pose prediction from one image to the labeled 2-D bounding boxes of the corresponding object in other views. Our rotation estimation combines an initial coarse pose classification with an offset regression using a continuous rotation parametrization that allows for direct pose estimation. At test time, the model still uses only a single image to predict a 6-D pose. We observe that the multi-view constraints and our rotation representation used during training lead to better learning of 6-D pose embeddings than fully supervised methods. Experiments on several datasets show that the proposed method predicts poses of good quality despite being trained with only weak labels. Direct pose regression, without the need for a subsequent refinement stage, ensures real-time performance.
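The multi-view constraint described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses a plain axis-aligned IoU as a stand-in for the Rotated-IoU loss, it is not differentiable, and all names (`box_corners`, `transfer_pose`, `project_box`, `iou`), the pinhole intrinsics `K`, and the numeric values are hypothetical. The idea shown is only this: a pose predicted in one camera is moved into a second camera via the known relative pose, the object's 3-D box (built from its labeled size) is projected, and the result is compared with the labeled 2-D box in that second view.

```python
import numpy as np

def box_corners(size):
    """8 corners (8x3) of an object-centred box with extents size = (w, h, d)."""
    signs = np.array([[sx, sy, sz]
                      for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)],
                     dtype=float)
    return 0.5 * signs * np.asarray(size, dtype=float)

def transfer_pose(R_a, t_a, R_ba, t_ba):
    """Re-express an object pose from camera A in camera B.

    Assumes the relative pose maps points as x_b = R_ba @ x_a + t_ba.
    """
    return R_ba @ R_a, R_ba @ t_a + t_ba

def project_box(R, t, size, K):
    """Project the 3-D box under pose (R, t); return the enclosing 2-D box
    as (x1, y1, x2, y2) in pixels. Assumes all corners lie in front of the camera."""
    pts_cam = box_corners(size) @ R.T + t      # object frame -> camera frame
    uvw = pts_cam @ K.T                        # pinhole projection
    uv = uvw[:, :2] / uvw[:, 2:3]
    return np.concatenate([uv.min(axis=0), uv.max(axis=0)])

def iou(a, b):
    """Axis-aligned IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

# Hypothetical example values (not taken from the paper).
K = np.array([[600.0, 0.0, 320.0],
              [0.0, 600.0, 240.0],
              [0.0, 0.0, 1.0]])
size = (0.1, 0.2, 0.1)                             # labeled object size in metres
R_a, t_a = np.eye(3), np.array([0.0, 0.0, 1.0])    # pose predicted in view A
R_ba, t_ba = np.eye(3), np.array([0.05, 0.0, 0.0]) # known relative camera pose

R_b, t_b = transfer_pose(R_a, t_a, R_ba, t_ba)
pred_box_b = project_box(R_b, t_b, size, K)
label_box_b = pred_box_b + 2.0                     # pretend label, shifted by 2 px
print("IoU in view B:", iou(pred_box_b, label_box_b))
```

A training loss in this spirit would penalize `1 - IoU` over all additional views, so that a wrong rotation or translation in the source view shows up as a box mismatch elsewhere without any 6-D pose label being needed.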


Citations
Proceedings Article

OSOP: A Multi-Stage One Shot Object Pose Estimation Framework

TL;DR: This article proposes a one-shot method for object detection and 6-DoF pose estimation that does not require training on target objects and uses a textured 3D query model.
Journal Article

Ambiguity-Aware Multi-Object Pose Optimization for Visually-Assisted Robot Manipulation

TL;DR: This work presents an ambiguity-aware 6D object pose estimation network, PrimA6D++, as a generic uncertainty prediction method that handles the major challenges in pose estimation, such as occlusion and symmetry, based on the measured ambiguity of the prediction.
Journal Article

3D hand pose estimation from a single RGB image by weighting the occlusion and classification

TL;DR: This work proposes a new framework for 3D hand pose estimation from a single RGB image, composed of two blocks: the first formulates pose estimation as a classification problem, and the second estimates the 3D coordinates of the hand joints and focuses more on the details of the image pattern.
Journal Article

Weak6D: Weakly Supervised 6D Pose Estimation With Iterative Annotation Resolver

TL;DR: Weak6D employs a weak refinement loss to optimize the pose estimation network with refined object poses, and can directly use the captured RGB-D data during training.
Book Chapter

WeLSA: Learning to Predict 6D Pose from Weakly Labeled Data Using Shape Alignment

TL;DR: This article proposes a weakly supervised approach for object pose estimation from RGB-D data, using training sets composed of very few labeled images with pose annotations along with weakly labeled images that have ground-truth segmentation masks but no pose labels.