3D Human Pose Estimation from a Single Image via Distance Matrix Regression

doi:10.1109/CVPR.2017.170

- pp 1561-1570

TLDR

In this paper, a 2D-to-3D distance matrix regression model is proposed for 3D human pose estimation from a single image, where the 2D position of the N body joints is first detected using a CNN-based detector, and then these observations are used to infer 3D pose.

Abstract:

This paper addresses the problem of 3D human pose estimation from a single image. We follow a standard two-step pipeline by first detecting the 2D position of the N body joints, and then using these observations to infer 3D pose. For the first step, we use a recent CNN-based detector. For the second step, most existing approaches perform 2N-to-3N regression of the Cartesian joint coordinates. We show that more precise pose estimates can be obtained by representing both the 2D and 3D human poses using N×N distance matrices, and formulating the problem as a 2D-to-3D distance matrix regression. To learn such a regressor, we leverage simple neural network architectures which, by construction, enforce positivity and symmetry of the predicted matrices. The approach also has the advantage of naturally handling missing observations and of allowing the position of non-observed joints to be hypothesized. Quantitative results on the HumanEva and Human3.6M datasets demonstrate consistent performance gains over the state of the art. Qualitative evaluation on the in-the-wild images of the LSP dataset, using the regressor learned on Human3.6M, reveals very promising generalization results.
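The distance matrix representation described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names (`edm`, `enforce_symmetry_positivity`, `classical_mds`), the 14-joint random pose, and the orthographic projection are assumptions made for illustration only. The abstract does not say how 3D coordinates are recovered from a predicted matrix, so classical multidimensional scaling is used here as one standard option.

```python
import numpy as np


def edm(joints):
    """Pairwise Euclidean distance matrix for an (N, d) array of joints."""
    diff = joints[:, None, :] - joints[None, :, :]
    return np.linalg.norm(diff, axis=-1)


def enforce_symmetry_positivity(raw):
    """Map an arbitrary N x N array to a symmetric, non-negative matrix.

    Illustrative stand-in for an output layer that guarantees these
    properties by construction; the paper's actual layer may differ.
    """
    sym = 0.5 * (raw + raw.T)      # average with the transpose
    out = np.maximum(sym, 0.0)     # ReLU-style clamp to non-negative values
    np.fill_diagonal(out, 0.0)     # a joint is at distance 0 from itself
    return out


def classical_mds(dist, dim=3):
    """Recover coordinates from a distance matrix via classical MDS.

    The result is defined only up to rotation, translation and reflection.
    """
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    top = np.argsort(eigvals)[::-1][:dim]
    scale = np.sqrt(np.clip(eigvals[top], 0.0, None))
    return eigvecs[:, top] * scale


rng = np.random.default_rng(0)
pose3d = rng.normal(size=(14, 3))   # hypothetical 14-joint pose
pose2d = pose3d[:, :2]              # assumed orthographic projection

d2 = edm(pose2d)                    # N x N input representation
d3 = edm(pose3d)                    # N x N regression target
recovered = classical_mds(d3)       # 3D joints from the distance matrix
```

Because a distance matrix is invariant to rigid motions and mirroring, the recovered pose matches the original only in its pairwise distances. A practical system would need an extra step to fix the reflection ambiguity.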

Citations

Proceedings Article

End-to-End Recovery of Human Shape and Pose

Angjoo Kanazawa, +3 more

TL;DR: This work introduces an adversary trained to tell whether human body shape and pose parameters are real or not using a large database of 3D human meshes, and produces a richer and more useful mesh representation that is parameterized by shape and 3D joint angles.

Proceedings Article

A Simple Yet Effective Baseline for 3d Human Pose Estimation

Julieta Martinez, +3 more

TL;DR: In this paper, a relatively simple deep feed-forward network was proposed to estimate 3D human pose from 2D joint locations with a remarkably low error rate, achieving state-of-the-art results on Human3.6M.

Proceedings Article

Monocular 3D Human Pose Estimation in the Wild Using Improved CNN Supervision

Dushyant Mehta, +6 more

TL;DR: In this article, a CNN-based approach for 3D human body pose estimation from single RGB images is proposed to address the issue of limited generalizability of models trained solely on the starkly limited publicly available 3D pose data.

Proceedings Article

Learning to Estimate 3D Human Pose and Shape from a Single Color Image

Georgios Pavlakos, +3 more

TL;DR: This work addresses the problem of estimating the full body 3D human pose and shape from a single color image and proposes an efficient and effective direct prediction method based on ConvNets, incorporating a parametric statistical body shape model (SMPL) within an end-to-end framework.

Proceedings Article

Learning to Estimate 3D Hand Pose from Single RGB Images

Christian Zimmermann, +1 more

TL;DR: In this paper, the authors propose a deep network that learns a network-implicit 3D articulation prior; together with keypoints detected in the images, this prior yields good estimates of the 3D pose.

References

Proceedings Article

Adam: A Method for Stochastic Optimization

Diederik P. Kingma, +1 more

TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.

Proceedings Article

Fully convolutional networks for semantic segmentation

Jonathan Long, +2 more

TL;DR: The key insight is to build “fully convolutional” networks that take input of arbitrary size and produce correspondingly-sized output with efficient inference and learning.

Proceedings Article

Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification

Kaiming He, +3 more

TL;DR: In this paper, a Parametric Rectified Linear Unit (PReLU) was proposed to improve model fitting with nearly zero extra computational cost and little overfitting risk; the resulting networks achieved a 4.94% top-5 test error on the ImageNet 2012 classification dataset.

Proceedings Article

FlowNet: Learning Optical Flow with Convolutional Networks

Alexey Dosovitskiy, +8 more

TL;DR: In this paper, the authors propose and compare two architectures, a generic one and one that includes a layer correlating feature vectors at different image locations, and show that networks trained on unrealistic synthetic data still generalize very well to existing datasets such as Sintel and KITTI.

Journal Article

Modern Multidimensional Scaling: Theory and Applications

Ingwer Borg, +1 more

- 01 Sep 2003 -

Journal of Educational Measurement

TL;DR: This work explains the four purposes of multidimensional scaling; special solutions, degeneracies, and local minima; and how to avoid trivial solutions in unfolding.

Related Papers (5)

Human3.6M: Large Scale Datasets and Predictive Methods for 3D Human Sensing in Natural Environments

Catalin Ionescu, +3 more

- 01 Jul 2014 -

IEEE Transactions on Pattern Analysis an...

A Simple Yet Effective Baseline for 3d Human Pose Estimation

Keep It SMPL: Automatic Estimation of 3D Human Pose and Shape from a Single Image

2D Human Pose Estimation: New Benchmark and State of the Art Analysis

Stacked Hourglass Networks for Human Pose Estimation