scispace - formally typeset
Proceedings ArticleDOI

Strong Appearance and Expressive Spatial Models for Human Pose Estimation

Reads0
Chats0
TLDR
This paper demonstrates that even a basic tree-structure spatial human body model achieves state-of-the-art performance when augmented with the proper appearance representation, and shows that the combination of the best performing appearance model with a flexible image-conditioned spatial model achieves the best result.
Abstract
Typical approaches to articulated pose estimation combine spatial modelling of the human body with appearance modelling of body parts. This paper aims to push the state-of-the-art in articulated pose estimation in two ways. First we explore various types of appearance representations aiming to substantially improve the body part hypotheses. And second, we draw on and combine several recently proposed powerful ideas such as more flexible spatial models as well as image-conditioned spatial models. In a series of experiments we draw several important conclusions: (1) we show that the proposed appearance representations are complementary, (2) we demonstrate that even a basic tree-structure spatial human body model achieves state-of-the-art performance when augmented with the proper appearance representation, and (3) we show that the combination of the best performing appearance model with a flexible image-conditioned spatial model achieves the best result, significantly improving over the state of the art, on the ``Leeds Sports Poses'' and ``Parse'' benchmarks.

read more

Content maybe subject to copyright    Report

Citations
More filters
Book ChapterDOI

Stacked Hourglass Networks for Human Pose Estimation

TL;DR: This work introduces a novel convolutional network architecture for the task of human pose estimation that is described as a “stacked hourglass” network based on the successive steps of pooling and upsampling that are done to produce a final set of predictions.
Proceedings ArticleDOI

Convolutional Pose Machines

TL;DR: In this paper, a convolutional network is incorporated into the pose machine framework for learning image features and image-dependent spatial models for the task of pose estimation, which can implicitly model long-range dependencies between variables in structured prediction tasks such as articulated pose estimation.
Proceedings ArticleDOI

2D Human Pose Estimation: New Benchmark and State of the Art Analysis

TL;DR: A novel benchmark "MPII Human Pose" is introduced that makes a significant advance in terms of diversity and difficulty, a contribution that is required for future developments in human body models.
Posted Content

Stacked Hourglass Networks for Human Pose Estimation

TL;DR: Stacked hourglass networks as mentioned in this paper were proposed for human pose estimation, where features are processed across all scales and consolidated to best capture the various spatial relationships associated with the body, and repeated bottom-up, top-down processing with intermediate supervision is critical to improving the performance of the network.
Posted Content

Joint Training of a Convolutional Network and a Graphical Model for Human Pose Estimation

TL;DR: This paper proposes a new hybrid architecture that consists of a deep Convolu-tional Network and a Markov Random Field and shows how this architecture is successfully applied to the challenging problem of articulated human pose estimation in monocular images.
References
More filters
Proceedings ArticleDOI

Histograms of oriented gradients for human detection

TL;DR: It is shown experimentally that grids of histograms of oriented gradient (HOG) descriptors significantly outperform existing feature sets for human detection, and the influence of each stage of the computation on performance is studied.
Journal ArticleDOI

Object Detection with Discriminatively Trained Part-Based Models

TL;DR: An object detection system based on mixtures of multiscale deformable part models that is able to represent highly variable object classes and achieves state-of-the-art results in the PASCAL object detection challenges is described.
Journal ArticleDOI

A performance evaluation of local descriptors

TL;DR: It is observed that the ranking of the descriptors is mostly independent of the interest region detector and that the SIFT-based descriptors perform best and Moments and steerable filters show the best performance among the low dimensional descriptors.
Proceedings ArticleDOI

Real-time human pose recognition in parts from single depth images

TL;DR: This work takes an object recognition approach, designing an intermediate body parts representation that maps the difficult pose estimation problem into a simpler per-pixel classification problem, and generates confidence-scored 3D proposals of several body joints by reprojecting the classification result and finding local modes.
Proceedings ArticleDOI

A performance evaluation of local descriptors

TL;DR: It is observed that the ranking of the descriptors is mostly independent of the interest region detector and that the SIFT-based descriptors perform best and Moments and steerable filters show the best performance among the low dimensional descriptors.