MoFA: Model-Based Deep Convolutional Face Autoencoder for Unsupervised Monocular Reconstruction

doi:10.1109/ICCV.2017.401

Open Access, Proceedings Article

Ayush Tewari, +6 more

- pp 3735-3744

TL;DR: A novel model-based deep convolutional autoencoder reconstructs a 3D human face from a single in-the-wild color image and can be trained end-to-end in an unsupervised manner, which makes training on very large real-world data feasible.

Abstract:

In this work, we propose a novel model-based deep convolutional autoencoder that addresses the highly challenging problem of reconstructing a 3D human face from a single in-the-wild color image. To this end, we combine a convolutional encoder network with an expert-designed generative model that serves as the decoder. The core innovation is the differentiable parametric decoder, which encapsulates image formation analytically based on a generative model. Our decoder takes as input a code vector with exactly defined semantic meaning that encodes detailed face pose, shape, expression, skin reflectance, and scene illumination. Because of this new way of combining CNN-based and model-based face reconstruction, the CNN-based encoder learns to extract semantically meaningful parameters from a single monocular input image. For the first time, a CNN encoder and an expert-designed generative model can be trained end-to-end in an unsupervised manner, which renders training on very large (unlabeled) real-world data feasible. The obtained reconstructions compare favorably to current state-of-the-art approaches in terms of quality and richness of representation.
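
The idea of a code vector with fixed semantic meaning feeding an analytic decoder can be illustrated with a small sketch. This is not the authors' implementation. The layout sizes, the names `CODE_LAYOUT`, `split_code`, `decode_linear`, and `photometric_loss`, and the use of a plain linear basis are assumptions made for illustration only; the abstract does not specify dimensions or the exact image formation model.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical layout of the semantic code vector.
# The group names follow the abstract; the sizes are illustrative assumptions.
CODE_LAYOUT: List[Tuple[str, int]] = [
    ("pose", 6),
    ("shape", 4),
    ("expression", 3),
    ("reflectance", 4),
    ("illumination", 9),
]
CODE_SIZE = sum(size for _, size in CODE_LAYOUT)


def split_code(code: Sequence[float]) -> Dict[str, List[float]]:
    """Slice a flat encoder output into named, semantically fixed groups."""
    if len(code) != CODE_SIZE:
        raise ValueError(f"expected {CODE_SIZE} values, got {len(code)}")
    parts: Dict[str, List[float]] = {}
    offset = 0
    for name, size in CODE_LAYOUT:
        parts[name] = list(code[offset:offset + size])
        offset += size
    return parts


def decode_linear(
    mean: Sequence[float],
    basis: Sequence[Sequence[float]],
    coeffs: Sequence[float],
) -> List[float]:
    """Evaluate mean + sum_k coeffs[k] * basis[k], a generic linear model.

    Assumption: a linear basis stands in for the expert-designed
    generative model; it is differentiable in the coefficients.
    """
    out = list(mean)
    for coeff, direction in zip(coeffs, basis):
        for i, value in enumerate(direction):
            out[i] += coeff * value
    return out


def photometric_loss(rendered: Sequence[float], observed: Sequence[float]) -> float:
    """Mean squared difference between decoder output and the input image.

    Comparing the reconstruction with the input itself is what removes
    the need for labels in an autoencoder setup.
    """
    return sum((r - o) ** 2 for r, o in zip(rendered, observed)) / len(observed)
```

In a full system the encoder would be a CNN that produces the flat code, and the decoder would also apply pose, illumination, and rendering steps; the sketch covers only the slicing and one linear component.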

Citations

Journal Article

Learning Nonlinear Soft-Tissue Dynamics for Interactive Avatars

Dan Casas, +1 more

TL;DR: A novel method to enrich existing vertex-based human body models by adding soft-tissue dynamics, which learns to predict per-vertex 3D offsets that reproduce nonlinear mesh deformation effects as a function of pose information, enabling the synthesis of realistic 3D mesh animations.

Posted Content

FACSIMILE: Fast and Accurate Scans From an Image in Less Than a Second

David Smith, +4 more

- 02 Sep 2019 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: FACSIMILE (FAX) is a method that estimates a detailed body from a single photo, lowering the bar for creating virtual representations of humans; it is easy to implement and fast to execute, making it easily deployable.

Proceedings Article

Illumination-Invariant Face Recognition With Deep Relit Face Images

Ha Le, +1 more

TL;DR: A deep face relighting algorithm is proposed and used as a data augmentation method: training face recognition algorithms on the relit images enriches the training data with illumination variations and makes face templates more robust to them.

Journal Article

An empirical rig for jaw animation

Gaspard Zoss, +3 more

- 30 Jul 2018 -

ACM Transactions on Graphics

TL;DR: This work constructs a novel jaw rig that preserves the intuitive control while providing more accurate jaw positioning, and places anatomically correct limits on the motion, preventing the rig from entering physiologically infeasible poses.

Journal Article

Deep incremental learning for efficient high-fidelity face tracking

Chenglei Wu, +2 more

- 04 Dec 2018 -

ACM Transactions on Graphics

TL;DR: This paper introduces the decomposition layer in the Geo-Tex VAE architecture which decomposes the facial deformation into global and local components and trains the global deformation with a fully-connected network and the local deformations with convolutional layers.

References

Proceedings Article

ImageNet Classification with Deep Convolutional Neural Networks

Alex Krizhevsky, +2 more

TL;DR: A deep convolutional neural network achieved state-of-the-art ImageNet classification performance; it consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax.

Journal Article

Reducing the Dimensionality of Data with Neural Networks

Geoffrey E. Hinton, +1 more

- 28 Jul 2006 -

Science

TL;DR: This article describes an effective way of initializing the weights that allows deep autoencoder networks to learn low-dimensional codes that work much better than principal components analysis as a tool to reduce the dimensionality of data.

Posted Content

Caffe: Convolutional Architecture for Fast Feature Embedding

Yangqing Jia, +7 more

- 20 Jun 2014 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: Caffe is a BSD-licensed C++ library with Python and MATLAB bindings for training and deploying general-purpose convolutional neural networks and other deep models efficiently on commodity architectures.

Proceedings Article

Caffe: Convolutional Architecture for Fast Feature Embedding

Yangqing Jia, +7 more

TL;DR: Caffe provides multimedia scientists and practitioners with a clean and modifiable framework for state-of-the-art deep learning algorithms and a collection of reference models for training and deploying general-purpose convolutional neural networks and other deep models efficiently on commodity architectures.

Proceedings Article

Deep Learning Face Attributes in the Wild

Ziwei Liu, +3 more

TL;DR: A novel deep learning framework for attribute prediction in the wild that cascades two CNNs, LNet and ANet, which are fine-tuned jointly with attribute tags, but pre-trained differently.

Related Papers (5)

A morphable model for the synthesis of 3D faces

A 3D Face Model for Pose and Illumination Invariant Face Recognition

FaceWarehouse: A 3D Facial Expression Database for Visual Computing

Face Alignment Across Large Poses: A 3D Solution

Deep Residual Learning for Image Recognition