Deep Learning on Monocular Object Pose Detection and Tracking: A Comprehensive Overview
TLDR
This article presents a comprehensive review of recent progress in deep learning-based object pose detection and tracking, taking monocular RGB/RGBD data as input and covering three major tasks: instance-level detection, category-level detection, and monocular object pose tracking.
Abstract:
Object pose detection and tracking has recently attracted increasing attention due to its wide applications in many areas, such as autonomous driving, robotics, and augmented reality. Among methods for object pose detection and tracking, deep learning is the most promising one, having shown better performance than others. However, surveys of the latest developments in deep learning-based methods are lacking. Therefore, this study presents a comprehensive review of recent progress in object pose detection and tracking along the deep learning technical route. To achieve a more thorough introduction, the scope of this study is limited to methods taking monocular RGB/RGBD data as input and covering three major tasks: instance-level monocular object pose detection, category-level monocular object pose detection, and monocular object pose tracking. Metrics, datasets, and methods for both detection and tracking are presented in detail. Comparative results of current state-of-the-art methods on several publicly available datasets are also presented, together with insightful observations and inspiring future research directions.
Citations
Journal ArticleDOI
V2X-ViT: Vehicle-to-Everything Cooperative Perception with Vision Transformer
TL;DR: In this article, a holistic attention model, namely V2X-ViT, is proposed to fuse information across on-road agents (i.e., vehicles and infrastructure) to improve the perception performance of autonomous vehicles.
Book ChapterDOI
Object Level Depth Reconstruction for Category Level 6D Object Pose Estimation from Monocular RGB Image
TL;DR: Zhang et al. propose to directly predict object-level depth from a monocular RGB image by deforming the category-level shape prior into object-level depth and the canonical NOCS representation.
Journal ArticleDOI
i2c-net: Using Instance-Level Neural Networks for Monocular Category-Level 6D Pose Estimation
TL;DR: In this article, a method is proposed to extract the 6D pose of multiple objects belonging to different categories, starting from an instance-level pose estimation network and relying only on RGB images.
Book ChapterDOI
Projecting Product-Aware Cues as Assembly Intentions for Human-Robot Collaboration
TL;DR: In this paper, the authors propose a generalizable information construct for projecting assembly intentions that can cope with different part geometries, using a digital thread framework for on-demand, run-time computation and retrieval of bounding boxes from product CAD models.
References
Proceedings ArticleDOI
Deep Residual Learning for Image Recognition
TL;DR: In this article, the authors propose a residual learning framework to ease the training of networks substantially deeper than those used previously; their model won 1st place in the ILSVRC 2015 classification task.
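The core idea of residual learning can be sketched in a few lines of plain Python: rather than learning a full mapping H(x) directly, each block learns the residual F(x) = H(x) - x and outputs F(x) + x through an identity shortcut. The function names below are illustrative, not from the paper's code.

```python
def residual_block(x, transform):
    """Apply a learned transform F and add the identity shortcut: F(x) + x."""
    fx = transform(x)                        # F(x), e.g. two conv layers in ResNet
    return [f + xi for f, xi in zip(fx, x)]  # element-wise skip connection

# If F is the zero mapping, the block reduces to the identity, which is
# one intuition for why very deep residual stacks remain easy to optimize.
out = residual_block([1.0, 2.0, 3.0], lambda v: [0.0] * len(v))
```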
Journal ArticleDOI
Long short-term memory
TL;DR: A novel, efficient, gradient-based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.
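The "constant error carousel" mentioned above refers to the additive update of the cell state, which lets gradients flow across many time steps. A minimal sketch of one scalar LSTM step, with illustrative (untrained) weights:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, w):
    """One step of a scalar LSTM cell; w holds illustrative gate weights."""
    i = sigmoid(w["wi"] * x + w["ui"] * h_prev)    # input gate
    f = sigmoid(w["wf"] * x + w["uf"] * h_prev)    # forget gate
    o = sigmoid(w["wo"] * x + w["uo"] * h_prev)    # output gate
    g = math.tanh(w["wg"] * x + w["ug"] * h_prev)  # candidate cell update
    c = f * c_prev + i * g   # additive cell update: the constant error carousel
    h = o * math.tanh(c)     # hidden state exposed to the rest of the network
    return h, c
```

Because the cell state `c` is updated additively (gated by `f` and `i`) rather than repeatedly squashed, error signals can propagate over long lags without vanishing.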
Journal ArticleDOI
Distinctive Image Features from Scale-Invariant Keypoints
TL;DR: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene and can robustly identify objects among clutter and occlusion while achieving near real-time performance.
Proceedings ArticleDOI
You Only Look Once: Unified, Real-Time Object Detection
TL;DR: Compared to state-of-the-art detection systems, YOLO makes more localization errors but is less likely to predict false positives on background, and outperforms other detection methods, including DPM and R-CNN, when generalizing from natural images to other domains like artwork.
Journal ArticleDOI
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
TL;DR: This work introduces a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, enabling nearly cost-free region proposals, and further merges RPN and Fast R-CNN into a single network by sharing their convolutional features.