Multi-view self-supervised deep learning for 6D pose estimation in the Amazon Picking Challenge
Citations
1,021 citations
Additional excerpts
...These datasets have further boosted the research of deep learning on 3D point clouds, with an increasing number of methods being proposed to address various problems related to point cloud processing, including 3D shape classification, 3D object detection and tracking, 3D point cloud segmentation, 3D point cloud registration, 6-DOF pose estimation, and 3D reconstruction [16], [17], [18]....
[...]
844 citations
Cites background from "Multi-view self-supervised deep lea..."
...Zeng et al. [112] present an object segmentation approach that leverages multi-view RGB-D data and deep learning techniques....
[...]
711 citations
Cites methods from "Multi-view self-supervised deep lea..."
...We evaluate on the testing split of the Shelf&Tote dataset using the error metric from [44], where we report the percentage of pose predictions with error in orientation smaller than 15° and translations smaller than 5 cm....
[...]
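The thresholded metric described in the excerpt above can be sketched as follows. This is a minimal illustration, not the evaluation code of [44] or of the citing paper. It assumes poses are given as a 3x3 rotation matrix plus a translation in meters, measures orientation error as the geodesic angle between rotations, and reports the two thresholds both separately and jointly, because the excerpt does not say which is intended. The function names (`rotation_angle_deg`, `pose_accuracy`, `rot_z`) are hypothetical.

```python
import math


def rotation_angle_deg(R_pred, R_true):
    """Geodesic angle in degrees between two 3x3 rotation matrices."""
    # trace(R_pred^T R_true) equals the sum of elementwise products.
    trace = sum(R_pred[i][j] * R_true[i][j] for i in range(3) for j in range(3))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_theta))


def pose_accuracy(preds, truths, max_rot_deg=15.0, max_trans_m=0.05):
    """Percentage of predictions under each threshold (assumed defaults: 15 deg, 5 cm).

    preds and truths are equal-length lists of (R, t) pairs, where R is a
    3x3 nested list and t is a length-3 sequence in meters.
    """
    rot_ok = trans_ok = both_ok = 0
    for (R_p, t_p), (R_t, t_t) in zip(preds, truths):
        r = rotation_angle_deg(R_p, R_t) < max_rot_deg
        t = math.dist(t_p, t_t) < max_trans_m
        rot_ok += r
        trans_ok += t
        both_ok += r and t
    n = len(preds)
    return {
        "rotation": 100.0 * rot_ok / n,
        "translation": 100.0 * trans_ok / n,
        "both": 100.0 * both_ok / n,
    }


def rot_z(deg):
    """Hypothetical helper: rotation about the z axis by `deg` degrees."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


# Toy data only: four predictions compared against an identity ground truth.
identity = rot_z(0.0)
truths = [(identity, (0.0, 0.0, 0.0))] * 4
preds = [
    (rot_z(10.0), (0.0, 0.0, 0.01)),  # within both thresholds
    (rot_z(20.0), (0.0, 0.0, 0.01)),  # rotation error too large
    (rot_z(5.0), (0.10, 0.0, 0.0)),   # translation error too large
    (identity, (0.0, 0.0, 0.0)),      # exact match
]
scores = pose_accuracy(preds, truths)
```

The geodesic angle does not account for object symmetries, so a symmetric object could be penalized for an equivalent pose; a real benchmark may handle that case differently.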
...object models to segmentation results from [44]....
[...]
...In our first experiment, the task is to register pre-scanned object models to RGB-D scanning data for the Shelf & Tote benchmark in the Amazon Picking Challenge (APC) setting [44], as illustrated in Fig....
[...]
555 citations
Cites background from "Multi-view self-supervised deep lea..."
...Several datasets and approaches have been introduced for the specific setting in the APC [14], [15]....
[...]
497 citations
Cites background from "Multi-view self-supervised deep lea..."
...More recently, deep learning based approaches in computer vision are being adopted for the task of pose estimation of specific objects [33, 53, 54]....
[...]
References
55,235 citations
"Multi-view self-supervised deep lea..." refers to methods in this paper
...To leverage features trained from a larger image domain, we use the sizable FCN-VGG network architecture from [18] and initialize the network weights using a model pre-trained on ImageNet for 1000-way object classification....
[...]
...More explicitly, we train a VGG architecture [18] Fully Convolutional Network (FCN) [2] to perform 2D object segmentation....
[...]