scispace - formally typeset
Open Access · Proceedings Article · DOI

MoFA: Model-Based Deep Convolutional Face Autoencoder for Unsupervised Monocular Reconstruction

TLDR
A novel model-based deep convolutional autoencoder that addresses the highly challenging problem of reconstructing a 3D human face from a single in-the-wild color image and can be trained end-to-end in an unsupervised manner, which renders training on very large real-world data feasible.
Abstract
In this work we propose a novel model-based deep convolutional autoencoder that addresses the highly challenging problem of reconstructing a 3D human face from a single in-the-wild color image. To this end, we combine a convolutional encoder network with an expert-designed generative model that serves as decoder. The core innovation is the differentiable parametric decoder that encapsulates image formation analytically based on a generative model. Our decoder takes as input a code vector with exactly defined semantic meaning that encodes detailed face pose, shape, expression, skin reflectance and scene illumination. Due to this new way of combining CNN-based and model-based face reconstruction, the CNN-based encoder learns to extract semantically meaningful parameters from a single monocular input image. For the first time, a CNN encoder and an expert-designed generative model can be trained end-to-end in an unsupervised manner, which renders training on very large (unlabeled) real-world data feasible. The obtained reconstructions compare favorably to current state-of-the-art approaches in terms of quality and richness of representation.
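The pipeline the abstract describes — a CNN encoder that maps an image to a semantically structured code vector, followed by a differentiable, expert-designed generative decoder — can be sketched as below. This is a minimal illustrative sketch, not the paper's actual components: the dimensions, the single linear layer standing in for the deep CNN encoder, and the random morphable-model bases are all assumptions, and the decoder stops at per-vertex attributes rather than implementing the full analytic image formation (rigid pose, projection, spherical-harmonics shading, rasterization) used in MoFA.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (not the paper's actual values).
N_VERTS = 100                          # toy mesh vertex count
DIM_SHAPE, DIM_EXPR, DIM_REFL = 80, 64, 80
DIM_POSE, DIM_ILLUM = 6, 27            # rotation+translation; 3x9 SH coefficients
CODE_DIM = DIM_POSE + DIM_SHAPE + DIM_EXPR + DIM_REFL + DIM_ILLUM

# Stand-in "encoder": one linear map from image pixels to the code vector.
# In the paper this is a deep CNN trained end-to-end.
W_enc = rng.standard_normal((CODE_DIM, 64 * 64 * 3)) * 0.01

def encode(image):
    """Map a flattened color image to a semantically structured code vector."""
    code = W_enc @ image.ravel()
    splits = np.cumsum([DIM_POSE, DIM_SHAPE, DIM_EXPR, DIM_REFL])
    pose, shape, expr, refl, illum = np.split(code, splits)
    return pose, shape, expr, refl, illum

# Stand-in generative model: linear bases for geometry and reflectance,
# analogous to a 3D morphable model (mean plus principal components).
mean_geo = rng.standard_normal(3 * N_VERTS)
B_shape = rng.standard_normal((3 * N_VERTS, DIM_SHAPE))
B_expr = rng.standard_normal((3 * N_VERTS, DIM_EXPR))
mean_refl = rng.standard_normal(3 * N_VERTS)
B_refl = rng.standard_normal((3 * N_VERTS, DIM_REFL))

def decode(pose, shape, expr, refl, illum):
    """Differentiable decoder: every step is a linear/analytic operation,
    so gradients of a photometric loss can flow back through the code
    to the encoder. Pose/illumination use and rendering are omitted here."""
    geometry = mean_geo + B_shape @ shape + B_expr @ expr
    reflectance = mean_refl + B_refl @ refl
    return geometry.reshape(N_VERTS, 3), reflectance.reshape(N_VERTS, 3)

image = rng.random((64, 64, 3))
pose, shape, expr, refl, illum = encode(image)
verts, colors = decode(pose, shape, expr, refl, illum)
```

The key property this sketch illustrates is that the code vector has a fixed, named layout (pose, shape, expression, reflectance, illumination), so every entry the encoder produces has an exact semantic meaning, and the decoder consumes it analytically rather than through learned weights.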



Citations
Posted Content

Personalized Face Modeling for Improved Face Reconstruction and Motion Retargeting

TL;DR: Experimental results show that the proposed end-to-end framework accurately captures fine-grained facial dynamics in a wide range of conditions and efficiently decouples the learned face model from facial motion, resulting in more accurate face reconstruction and facial motion retargeting than state-of-the-art methods.
Posted Content

A Review of 3D Face Reconstruction From a Single Image.

TL;DR: This paper reviews the recent literature on 3D face reconstruction from a single image, a problem that has attracted substantial research attention and a large number of publications.
Proceedings ArticleDOI

Spatially Multi-conditional Image Generation

TL;DR: TLAM, the method proposed in this paper, uses a transformer-like architecture operating pixel-wise that receives the available labels as input tokens and merges them in a learned homogeneous label space; the merged labels are then used for image generation via conditional generative adversarial training.
Book ChapterDOI

Learning 3D Face Reconstruction with a Pose Guidance Network

TL;DR: Li et al. propose a self-supervised approach to monocular 3D face reconstruction with a pose guidance network (PGN) that can learn from both faces with fully labeled 3D landmarks and unlimited unlabeled in-the-wild face images.
Proceedings ArticleDOI

Screen-space Regularization on Differentiable Rasterization

TL;DR: Wang et al. propose a screen-space regularization method that targets the unbalanced deformation caused by limited viewpoints, and apply the regularization to both multi-view deformation and single-view reconstruction tasks.
References
Proceedings Article

ImageNet Classification with Deep Convolutional Neural Networks

TL;DR: This paper achieves state-of-the-art image classification with a deep convolutional neural network consisting of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax.
Journal ArticleDOI

Reducing the Dimensionality of Data with Neural Networks

TL;DR: This article describes an effective way of initializing the weights that allows deep autoencoder networks to learn low-dimensional codes that work much better than principal component analysis as a tool for reducing the dimensionality of data.
Proceedings ArticleDOI

Caffe: Convolutional Architecture for Fast Feature Embedding

TL;DR: Caffe provides multimedia scientists and practitioners with a clean and modifiable framework for state-of-the-art deep learning algorithms and a collection of reference models for training and deploying general-purpose convolutional neural networks and other deep models efficiently on commodity architectures.
Proceedings ArticleDOI

Deep Learning Face Attributes in the Wild

TL;DR: This paper proposes a novel deep learning framework for attribute prediction in the wild that cascades two CNNs, LNet and ANet, which are fine-tuned jointly with attribute tags but pre-trained differently.