DeepCut: Joint Subset Partition and Labeling for Multi Person Pose Estimation
Leonid Pishchulin,Eldar Insafutdinov,Siyu Tang,Bjoern Andres,Mykhaylo Andriluka,Peter V. Gehler,Bernt Schiele +6 more
- pp 4929-4937
TLDR
An approach that jointly solves the tasks of detection and pose estimation: it infers the number of persons in a scene, identifies occluded body parts, and disambiguates body parts between people in close proximity of each other is proposed.Abstract:
This paper considers the task of articulated human pose estimation of multiple people in real world images. We propose an approach that jointly solves the tasks of detection and pose estimation: it infers the number of persons in a scene, identifies occluded body parts, and disambiguates body parts between people in close proximity of each other. This joint formulation is in contrast to previous strategies, that address the problem by first detecting people and subsequently estimating their body pose. We propose a partitioning and labeling formulation of a set of body-part hypotheses generated with CNN-based part detectors. Our formulation, an instance of an integer linear program, implicitly performs non-maximum suppression on the set of part candidates and groups them to form configurations of body parts respecting geometric and appearance constraints. Experiments on four different datasets demonstrate state-of-the-art results for both single person and multi person pose estimation1.read more
Citations
More filters
Proceedings ArticleDOI
Realtime Multi-person 2D Pose Estimation Using Part Affinity Fields
TL;DR: Part Affinity Fields (PAFs) as discussed by the authors uses a nonparametric representation to learn to associate body parts with individuals in the image and achieves state-of-the-art performance on the MPII Multi-Person benchmark.
Book ChapterDOI
Stacked Hourglass Networks for Human Pose Estimation
TL;DR: This work introduces a novel convolutional network architecture for the task of human pose estimation that is described as a “stacked hourglass” network based on the successive steps of pooling and upsampling that are done to produce a final set of predictions.
Posted Content
Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields
TL;DR: This work presents an approach to efficiently detect the 2D pose of multiple people in an image using a nonparametric representation, which it refers to as Part Affinity Fields (PAFs), to learn to associate body parts with individuals in the image.
Proceedings ArticleDOI
Deep High-Resolution Representation Learning for Human Pose Estimation
TL;DR: This paper proposes a network that maintains high-resolution representations through the whole process of human pose estimation and empirically demonstrates the effectiveness of the network through the superior pose estimation results over two benchmark datasets: the COCO keypoint detection dataset and the MPII Human Pose dataset.
Journal ArticleDOI
OpenPose: Realtime Multi-Person 2D Pose Estimation Using Part Affinity Fields
TL;DR: OpenPose as mentioned in this paper uses Part Affinity Fields (PAFs) to learn to associate body parts with individuals in the image, which achieves high accuracy and real-time performance.
References
More filters
Proceedings Article
ImageNet Classification with Deep Convolutional Neural Networks
TL;DR: The state-of-the-art performance of CNNs was achieved by Deep Convolutional Neural Networks (DCNNs) as discussed by the authors, which consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax.
Proceedings Article
Very Deep Convolutional Networks for Large-Scale Image Recognition
Karen Simonyan,Andrew Zisserman +1 more
TL;DR: In this paper, the authors investigated the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting and showed that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 layers.
Posted Content
Fast R-CNN
TL;DR: This paper proposes a Fast Region-based Convolutional Network method (Fast R-CNN) for object detection that builds on previous work to efficiently classify object proposals using deep convolutional networks.
Posted Content
Rich feature hierarchies for accurate object detection and semantic segmentation
TL;DR: This paper proposes a simple and scalable detection algorithm that improves mean average precision (mAP) by more than 30% relative to the previous best result on VOC 2012 -- achieving a mAP of 53.3%.
Journal ArticleDOI
Selective Search for Object Recognition
TL;DR: This paper introduces selective search which combines the strength of both an exhaustive search and segmentation, and shows that its selective search enables the use of the powerful Bag-of-Words model for recognition.