DeepCut: Joint Subset Partition and Labeling for Multi Person Pose Estimation

doi:10.1109/CVPR.2016.533

Open Access, Proceedings Article

pp. 4929-4937

TLDR

This paper proposes an approach that jointly solves the tasks of detection and pose estimation: it infers the number of persons in a scene, identifies occluded body parts, and disambiguates body parts between people in close proximity to each other.

Abstract:

This paper considers the task of articulated human pose estimation of multiple people in real-world images. We propose an approach that jointly solves the tasks of detection and pose estimation: it infers the number of persons in a scene, identifies occluded body parts, and disambiguates body parts between people in close proximity to each other. This joint formulation contrasts with previous strategies that address the problem by first detecting people and subsequently estimating their body pose. We propose a partitioning and labeling formulation of a set of body-part hypotheses generated with CNN-based part detectors. Our formulation, an instance of an integer linear program, implicitly performs non-maximum suppression on the set of part candidates and groups them to form configurations of body parts respecting geometric and appearance constraints. Experiments on four different datasets demonstrate state-of-the-art results for both single-person and multi-person pose estimation.
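The joint "select, label, and group" idea in the abstract can be illustrated with a toy sketch. This is not the paper's integer linear program or its solver: it swaps in exhaustive enumeration, and the function name `solve_toy`, the cost tables, and the three candidates are all hypothetical values made up for illustration. It only shows how one objective can suppress a false candidate, label the rest, and group them into a person at the same time.

```python
from itertools import product

HEAD, SHOULDER = 0, 1  # hypothetical part classes


def solve_toy(unary, pairwise):
    """Brute-force joint selection, labeling, and grouping of part candidates.

    unary[d][c]: cost of labeling candidate d as class c (negative = likely).
    pairwise[(d, e)][c][c2]: cost of placing d (class c) and e (class c2)
    in the same person cluster. All costs here are illustrative assumptions.
    """
    n = len(unary)
    n_classes = len(unary[0])
    label_options = [None] + list(range(n_classes))  # None = suppressed
    best_cost, best = float("inf"), None
    for labels in product(label_options, repeat=n):
        for clusters in product(range(n), repeat=n):
            cost = 0.0
            # Unary terms: only selected candidates contribute.
            for d in range(n):
                if labels[d] is not None:
                    cost += unary[d][labels[d]]
            # Pairwise terms: only selected pairs in the same cluster.
            for d in range(n):
                for e in range(d + 1, n):
                    if (labels[d] is not None and labels[e] is not None
                            and clusters[d] == clusters[e]):
                        cost += pairwise[(d, e)][labels[d]][labels[e]]
            if cost < best_cost:
                best_cost, best = cost, (labels, clusters)
    labels, clusters = best
    groups = {}
    for d in range(n):
        if labels[d] is not None:
            groups.setdefault(clusters[d], []).append(d)
    return best_cost, list(labels), sorted(groups.values())


# Candidate 0 looks like a head, candidate 1 like a shoulder,
# candidate 2 is a weak detection that should be suppressed.
unary = [[-2.0, 2.0], [2.0, -2.0], [1.0, 1.0]]
incompatible = [[3.0, 3.0], [3.0, 3.0]]
pairwise = {
    (0, 1): [[3.0, -1.0], [3.0, 3.0]],  # head + shoulder fit together
    (0, 2): incompatible,
    (1, 2): incompatible,
}

cost, labels, groups = solve_toy(unary, pairwise)
print(cost, labels, groups)
```

In this toy instance the weak candidate is left unlabeled and the other two end up in one group. Enumeration grows exponentially with the number of candidates, so it is only suitable for a handful of them.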

Citations

Proceedings Article

Realtime Multi-person 2D Pose Estimation Using Part Affinity Fields

Zhe Cao, +3 more

TL;DR: This work uses a nonparametric representation, Part Affinity Fields (PAFs), to learn to associate body parts with individuals in the image, and achieves state-of-the-art performance on the MPII Multi-Person benchmark.

Book Chapter

Stacked Hourglass Networks for Human Pose Estimation

Alejandro Newell, +2 more

TL;DR: This work introduces a novel convolutional network architecture for the task of human pose estimation that is described as a “stacked hourglass” network based on the successive steps of pooling and upsampling that are done to produce a final set of predictions.

Posted Content

Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields

Zhe Cao, +3 more

- 24 Nov 2016 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: This work presents an approach to efficiently detect the 2D pose of multiple people in an image using a nonparametric representation, which it refers to as Part Affinity Fields (PAFs), to learn to associate body parts with individuals in the image.

Proceedings Article

Deep High-Resolution Representation Learning for Human Pose Estimation

Ke Sun, +3 more

TL;DR: This paper proposes a network that maintains high-resolution representations through the whole process of human pose estimation and empirically demonstrates the effectiveness of the network through the superior pose estimation results over two benchmark datasets: the COCO keypoint detection dataset and the MPII Human Pose dataset.

Journal Article

OpenPose: Realtime Multi-Person 2D Pose Estimation Using Part Affinity Fields

Zhe Cao, +4 more

- 01 Jan 2021 -

IEEE Transactions on Pattern Analysis an...

TL;DR: OpenPose uses Part Affinity Fields (PAFs) to learn to associate body parts with individuals in the image, achieving high accuracy and real-time performance.

References

Proceedings Article

ImageNet Classification with Deep Convolutional Neural Networks

Alex Krizhevsky, +2 more

TL;DR: This work achieves state-of-the-art ImageNet classification performance with a deep convolutional neural network that consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax.

Proceedings Article

Very Deep Convolutional Networks for Large-Scale Image Recognition

Karen Simonyan, +1 more

TL;DR: In this paper, the authors investigated the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting and showed that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 layers.

Posted Content

Fast R-CNN

Ross Girshick

- 30 Apr 2015 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: This paper proposes a Fast Region-based Convolutional Network method (Fast R-CNN) for object detection that builds on previous work to efficiently classify object proposals using deep convolutional networks.

Posted Content

Rich feature hierarchies for accurate object detection and semantic segmentation

Ross Girshick, +3 more

- 11 Nov 2013 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: This paper proposes a simple and scalable detection algorithm that improves mean average precision (mAP) by more than 30% relative to the previous best result on VOC 2012 -- achieving a mAP of 53.3%.

Journal Article

Selective Search for Object Recognition

Jasper Uijlings, +3 more

- 01 Sep 2013 -

International Journal of Computer Vision

TL;DR: This paper introduces selective search which combines the strength of both an exhaustive search and segmentation, and shows that its selective search enables the use of the powerful Bag-of-Words model for recognition.

Related Papers (5)

Realtime Multi-person 2D Pose Estimation Using Part Affinity Fields

Stacked Hourglass Networks for Human Pose Estimation

2D Human Pose Estimation: New Benchmark and State of the Art Analysis

Convolutional Pose Machines

Deep Residual Learning for Image Recognition