Open Access, Proceedings Article

RepNet: Weakly Supervised Training of an Adversarial Reprojection Network for 3D Human Pose Estimation

TLDR
RepNet introduces a reprojection layer that maps the estimated 3D pose back to 2D, yielding a reprojection loss that can be used for weakly supervised training.
Abstract
This paper addresses the problem of 3D human pose estimation from single images. While for a long time human skeletons were parameterized and fitted to the observation by satisfying a reprojection error, nowadays researchers directly use neural networks to infer the 3D pose from the observations. However, most of these approaches ignore the fact that a reprojection constraint has to be satisfied and are prone to overfitting. We tackle the overfitting problem by ignoring 2D to 3D correspondences. This efficiently avoids a simple memorization of the training data and allows for weakly supervised training. One part of the proposed reprojection network (RepNet) learns a mapping from a distribution of 2D poses to a distribution of 3D poses using an adversarial training approach. Another part of the network estimates the camera. This allows for the definition of a network layer that performs the reprojection of the estimated 3D pose back to 2D, which results in a reprojection loss function. Our experiments show that RepNet generalizes well to unknown data and outperforms state-of-the-art methods when applied to unseen data. Moreover, our implementation runs in real time on a standard desktop PC.
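The reprojection step described in the abstract can be illustrated with a small sketch. The abstract does not state the camera model, so this example assumes a weak-perspective camera represented as a 2x3 matrix; the function names `reproject` and `reprojection_loss` are hypothetical and chosen only for illustration. In actual training this computation would presumably be a differentiable layer inside a deep learning framework rather than plain Python.

```python
import math


def reproject(camera, pose_3d):
    """Project 3D joints to 2D with a 2x3 camera matrix.

    Assumption: a weak-perspective camera (not stated in the abstract).
    camera:  2 rows x 3 columns
    pose_3d: list of (x, y, z) joints
    returns: list of (u, v) joints
    """
    return [
        tuple(sum(row[k] * joint[k] for k in range(3)) for row in camera)
        for joint in pose_3d
    ]


def reprojection_loss(pose_2d, camera, pose_3d):
    """Frobenius norm of the difference between observed and reprojected joints."""
    projected = reproject(camera, pose_3d)
    squared = sum(
        (u - pu) ** 2 + (v - pv) ** 2
        for (u, v), (pu, pv) in zip(pose_2d, projected)
    )
    return math.sqrt(squared)


# Toy example: three joints and a camera that drops depth and scales by 0.5.
camera = [[0.5, 0.0, 0.0],
          [0.0, 0.5, 0.0]]
pose_3d = [(0.0, 0.0, 2.0), (2.0, 0.0, 3.0), (2.0, 4.0, 1.0)]
pose_2d = [(0.0, 0.0), (1.0, 0.0), (1.0, 2.0)]

print(reprojection_loss(pose_2d, camera, pose_3d))  # 0.0, the pose reprojects exactly
```

Because the loss compares only the observed 2D pose with the reprojection of the estimated 3D pose, it needs no 3D ground truth for that sample, which is what makes it usable for weakly supervised training.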


Citations
Proceedings Article

Deep Kinematics Analysis for Monocular 3D Human Pose Estimation

TL;DR: It is shown that optimizing the kinematic structure of noisy 2D inputs is critical for obtaining accurate 3D estimates, and a targeted ablation study shows that each step is critical for the next one to obtain promising results.
Journal Article

Visual Perception Enabled Industry Intelligence: State of the Art, Challenges and Prospects

TL;DR: Previous research on and applications of visual perception in industrial fields such as product surface defect detection, intelligent agricultural production, intelligent driving, image synthesis, and event reconstruction are reviewed.
Posted Content

Pose2Mesh: Graph Convolutional Network for 3D Human Pose and Mesh Recovery from a 2D Human Pose

TL;DR: Pose2Mesh is a novel graph convolutional neural network (GraphCNN)-based system that estimates the 3D coordinates of human mesh vertices directly from the 2D human pose, which avoids the representation issues, while fully exploiting the mesh topology using a GraphCNN in a coarse-to-fine manner.
Book Chapter

Pose2Mesh: Graph Convolutional Network for 3D Human Pose and Mesh Recovery from a 2D Human Pose

TL;DR: Pose2Mesh is a graph convolutional neural network (GraphCNN)-based system that estimates the 3D coordinates of human mesh vertices directly from the 2D human pose.
Journal Article

Deep 3D human pose estimation: A review

TL;DR: A thorough review of existing deep learning based works for 3D pose estimation is provided, the advantages and disadvantages of these methods are summarized, and the commonly-used benchmark datasets are explored for comparison and analysis.
References
Journal Article

Generative Adversarial Nets

TL;DR: This work proposes a new framework for estimating generative models via an adversarial process, in which two models are trained simultaneously: a generative model G that captures the data distribution and a discriminative model D that estimates the probability that a sample came from the training data rather than G.
Proceedings Article

Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification

TL;DR: This paper proposes the Parametric Rectified Linear Unit (PReLU), which improves model fitting with nearly zero extra computational cost and little overfitting risk; the resulting networks achieve a 4.94% top-5 test error on the ImageNet 2012 classification dataset.
Proceedings Article

Wasserstein Generative Adversarial Networks

TL;DR: This work introduces a new algorithm named WGAN, an alternative to traditional GAN training that can improve the stability of learning, get rid of problems like mode collapse, and provide meaningful learning curves useful for debugging and hyperparameter searches.
Proceedings Article

Realtime Multi-person 2D Pose Estimation Using Part Affinity Fields

TL;DR: This work uses a nonparametric representation, Part Affinity Fields (PAFs), to learn to associate body parts with individuals in the image, and achieves state-of-the-art performance on the MPII Multi-Person benchmark.
Book Chapter

Stacked Hourglass Networks for Human Pose Estimation

TL;DR: This work introduces a novel convolutional network architecture for the task of human pose estimation that is described as a “stacked hourglass” network based on the successive steps of pooling and upsampling that are done to produce a final set of predictions.