MoFA: Model-based Deep Convolutional Face Autoencoder for Unsupervised Monocular Reconstruction

Open Access, Posted Content

- 30 Mar 2017 -

arXiv: Computer Vision and Pattern Recog...

TLDR

A novel model-based deep convolutional autoencoder that addresses the highly challenging problem of reconstructing a 3D human face from a single in-the-wild color image and can be trained end-to-end in an unsupervised manner, which renders training on very large real world data feasible.

Abstract:

In this work we propose a novel model-based deep convolutional autoencoder that addresses the highly challenging problem of reconstructing a 3D human face from a single in-the-wild color image. To this end, we combine a convolutional encoder network with an expert-designed generative model that serves as decoder. The core innovation is our new differentiable parametric decoder that encapsulates image formation analytically based on a generative model. Our decoder takes as input a code vector with exactly defined semantic meaning that encodes detailed face pose, shape, expression, skin reflectance and scene illumination. Due to this new way of combining CNN-based with model-based face reconstruction, the CNN-based encoder learns to extract semantically meaningful parameters from a single monocular input image. For the first time, a CNN encoder and an expert-designed generative model can be trained end-to-end in an unsupervised manner, which renders training on very large (unlabeled) real world data feasible. The obtained reconstructions compare favorably to current state-of-the-art approaches in terms of quality and richness of representation.
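The decoder described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the block sizes, the `toy` model, the ambient-gain lighting, per-vertex shading, and all function names below are hypothetical simplifications chosen for illustration. The sketch shows only the idea that a code vector with fixed semantic slots (shape, expression, reflectance, illumination, pose) is mapped analytically to projected geometry and color, and that a photometric loss against the input image can supply the training signal without labels. In an actual training setup the same operations would be written in an automatic-differentiation framework so that gradients flow back into the CNN encoder.

```python
import math

# Illustrative block sizes (hypothetical; a real face model uses many more).
BLOCKS = [("shape", 3), ("expression", 2), ("reflectance", 3),
          ("illumination", 3), ("pose", 6)]
CODE_SIZE = sum(size for _, size in BLOCKS)


def split_code(code):
    """Slice the flat code vector into its named semantic blocks."""
    if len(code) != CODE_SIZE:
        raise ValueError(f"expected {CODE_SIZE} values, got {len(code)}")
    parts, start = {}, 0
    for name, size in BLOCKS:
        parts[name] = list(code[start:start + size])
        start += size
    return parts


def affine(mean, basis, coeffs):
    """Evaluate mean + basis @ coeffs for flat Python lists."""
    return [m + sum(b * c for b, c in zip(row, coeffs))
            for m, row in zip(mean, basis)]


def rotate(p, angles):
    """Rotate point p by Euler angles (rx, ry, rz), applied in x, y, z order."""
    x, y, z = p
    rx, ry, rz = angles
    y, z = y * math.cos(rx) - z * math.sin(rx), y * math.sin(rx) + z * math.cos(rx)
    x, z = x * math.cos(ry) + z * math.sin(ry), -x * math.sin(ry) + z * math.cos(ry)
    x, y = x * math.cos(rz) - y * math.sin(rz), x * math.sin(rz) + y * math.cos(rz)
    return x, y, z


def decode(code, model, focal=1.0):
    """Map a code vector to projected 2D vertices and shaded RGB colors."""
    c = split_code(code)
    # Geometry: mean shape plus shape and expression offsets (flat xyz list).
    geom = affine(model["mean_shape"], model["shape_basis"], c["shape"])
    geom = affine(geom, model["expr_basis"], c["expression"])
    # Reflectance: mean albedo plus a linear offset (flat rgb list).
    albedo = affine(model["mean_albedo"], model["refl_basis"], c["reflectance"])
    gain = [1.0 + g for g in c["illumination"]]  # assumption: ambient RGB gain only
    angles, trans = c["pose"][:3], c["pose"][3:]
    points, colors = [], []
    for i in range(0, len(geom), 3):
        x, y, z = rotate(geom[i:i + 3], angles)
        x, y, z = x + trans[0], y + trans[1], z + trans[2]
        points.append((focal * x / z, focal * y / z))  # perspective projection
        colors.append(tuple(a * g for a, g in zip(albedo[i:i + 3], gain)))
    return points, colors


def photometric_loss(colors, observed):
    """Mean squared color difference between the rendering and the input."""
    diffs = [(a - b) ** 2 for p, q in zip(colors, observed) for a, b in zip(p, q)]
    return sum(diffs) / len(diffs)


# Toy two-vertex model; every number here is made up for illustration.
toy = {
    "mean_shape": [-1.0, 0.0, 5.0, 1.0, 0.0, 5.0],
    "shape_basis": [[0.1, 0.0, 0.0]] * 6,
    "expr_basis": [[0.0, 0.1]] * 6,
    "mean_albedo": [0.5] * 6,
    "refl_basis": [[0.1, 0.0, 0.0]] * 6,
}
points, colors = decode([0.0] * CODE_SIZE, toy)
```

With an all-zero code the sketch returns the projected mean face with its mean colors. Changing a single entry of the code alters only the corresponding property, which is the sense in which each slot of the code vector carries a defined meaning.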

Citations

Proceedings Article

A morphable model for the synthesis of 3D faces

Volker Blanz, +1 more

Proceedings Article

RetinaFace: Single-Shot Multi-Level Face Localisation in the Wild

Jiankang Deng, +4 more

TL;DR: A novel single-shot, multi-level face localisation method, named RetinaFace, which unifies face box prediction, 2D facial landmark localisation and 3D vertices regression under one common target: point regression on the image plane.

Proceedings Article

Deepfake Video Detection Using Recurrent Neural Networks

David Guera, +1 more

TL;DR: A temporal-aware pipeline to automatically detect deepfake videos is proposed that uses a convolutional neural network to extract frame-level features and a recurrent neural network that learns to classify if a video has been subject to manipulation or not.

Proceedings Article

Learning to Estimate 3D Human Pose and Shape from a Single Color Image

Georgios Pavlakos, +3 more

TL;DR: This work addresses the problem of estimating the full body 3D human pose and shape from a single color image and proposes an efficient and effective direct prediction method based on ConvNets, incorporating a parametric statistical body shape model (SMPL) within an end-to-end framework.

Proceedings Article

Soft Rasterizer: A Differentiable Renderer for Image-Based 3D Reasoning

Shichen Liu, +3 more

TL;DR: This work proposes a truly differentiable rendering framework that is able to directly render colorized mesh using differentiable functions and back-propagate efficient supervision signals to mesh vertices and their attributes from various forms of image representations, including silhouette, shading and color images.

References

Proceedings Article

ImageNet Classification with Deep Convolutional Neural Networks

Alex Krizhevsky, +2 more

TL;DR: A deep convolutional neural network with five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers ending in a 1000-way softmax achieved state-of-the-art performance on ImageNet classification.

Journal Article

Reducing the Dimensionality of Data with Neural Networks

Geoffrey E. Hinton, +1 more

- 28 Jul 2006 -

Science

TL;DR: This article describes an effective way of initializing the weights that allows deep autoencoder networks to learn low-dimensional codes that work much better than principal components analysis as a tool to reduce the dimensionality of data.

Posted Content

Caffe: Convolutional Architecture for Fast Feature Embedding

Yangqing Jia, +7 more

- 20 Jun 2014 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: Caffe is a BSD-licensed C++ library with Python and MATLAB bindings for training and deploying general-purpose convolutional neural networks and other deep models efficiently on commodity architectures.

Proceedings Article

Spatial transformer networks

Max Jaderberg, +3 more

TL;DR: This work introduces a new learnable module, the Spatial Transformer, which explicitly allows the spatial manipulation of data within the network, and can be inserted into existing convolutional architectures, giving neural networks the ability to actively spatially transform feature maps.

Labeled Faces in the Wild: A Database for Studying Face Recognition in Unconstrained Environments

Gary B. Huang, +3 more

TL;DR: The database contains labeled face photographs spanning the range of conditions typically encountered in everyday life, and exhibits “natural” variability in factors such as pose, lighting, race, accessories, occlusions, and background.

IEEE Transactions on Visualization and C...

Related Papers (5)

A morphable model for the synthesis of 3D faces

Regressing Robust and Discriminative 3D Morphable Models with a Very Deep Neural Network

A 3D Face Model for Pose and Illumination Invariant Face Recognition

Face2Face: Real-Time Face Capture and Reenactment of RGB Videos

FaceWarehouse: A 3D Facial Expression Database for Visual Computing