MoFA: Model-Based Deep Convolutional Face Autoencoder for Unsupervised Monocular Reconstruction
Ayush Tewari, Michael Zollhöfer, Hyeongwoo Kim, Pablo Garrido, Florian Bernard, Patrick Pérez, Christian Theobalt
pp. 3735–3744
TLDR
A novel model-based deep convolutional autoencoder that addresses the highly challenging problem of reconstructing a 3D human face from a single in-the-wild color image, and that can be trained end-to-end in an unsupervised manner, which renders training on very large real-world data feasible.
Abstract
In this work we propose a novel model-based deep convolutional autoencoder that addresses the highly challenging problem of reconstructing a 3D human face from a single in-the-wild color image. To this end, we combine a convolutional encoder network with an expert-designed generative model that serves as decoder. The core innovation is the differentiable parametric decoder that encapsulates image formation analytically based on a generative model. Our decoder takes as input a code vector with exactly defined semantic meaning that encodes detailed face pose, shape, expression, skin reflectance and scene illumination. Due to this new way of combining CNN-based and model-based face reconstruction, the CNN-based encoder learns to extract semantically meaningful parameters from a single monocular input image. For the first time, a CNN encoder and an expert-designed generative model can be trained end-to-end in an unsupervised manner, which renders training on very large (unlabeled) real-world data feasible. The obtained reconstructions compare favorably to current state-of-the-art approaches in terms of quality and richness of representation.
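The pipeline in the abstract — a CNN encoder producing a semantically structured code vector, and a parametric decoder that evaluates a generative face model from that code — can be illustrated with a minimal numpy sketch. All dimensions, weight matrices, and function names below are hypothetical stand-ins (the paper's actual encoder is a deep CNN and its decoder also performs analytic rendering with reflectance and illumination, which is omitted here):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical code layout: the paper's actual model dimensions differ.
N_VERTS = 100                            # mesh vertices
D_SHAPE, D_EXPR, D_REFL = 80, 64, 80     # morphable-model coefficients
D_POSE, D_LIGHT = 6, 27                  # rigid pose + SH illumination
CODE_DIM = D_SHAPE + D_EXPR + D_REFL + D_POSE + D_LIGHT

# Stand-in for the CNN encoder: any map from image pixels to a code vector.
W_enc = rng.normal(0.0, 0.01, (CODE_DIM, 64 * 64 * 3))

def encode(image):
    """Map an input image to a semantically structured code vector."""
    return W_enc @ image.ravel()

# Stand-in linear morphable-model bases used by the parametric decoder.
mean_shape = rng.normal(size=(N_VERTS, 3))
B_shape = rng.normal(0.0, 0.1, (N_VERTS * 3, D_SHAPE))
B_expr = rng.normal(0.0, 0.1, (N_VERTS * 3, D_EXPR))

def decode_geometry(code):
    """Decoder (geometry part only): slice the code into its semantic
    parts and evaluate the linear generative model analytically."""
    alpha = code[:D_SHAPE]                     # shape coefficients
    delta = code[D_SHAPE:D_SHAPE + D_EXPR]     # expression coefficients
    verts = mean_shape.ravel() + B_shape @ alpha + B_expr @ delta
    return verts.reshape(N_VERTS, 3)

image = rng.random((64, 64, 3))
code = encode(image)
verts = decode_geometry(code)
print(code.shape, verts.shape)
```

Because the decoder is an analytic, differentiable function of the code, a photometric loss between the rendered and input images can be backpropagated through it into the encoder, which is what makes unsupervised end-to-end training possible.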
Citations
Posted Content
Personalized Face Modeling for Improved Face Reconstruction and Motion Retargeting
TL;DR: Experimental results show that the proposed end-to-end framework accurately captures fine-grained facial dynamics in a wide range of conditions and efficiently decouples the learned face model from facial motion, resulting in more accurate face reconstruction and facial motion retargeting than state-of-the-art methods.
Posted Content
A Review of 3D Face Reconstruction From a Single Image.
TL;DR: A review of the recent literature on 3D face reconstruction from a single image, a problem that has attracted substantial research attention and a large number of publications.
Proceedings ArticleDOI
Spatially Multi-conditional Image Generation
TL;DR: TLAM uses a transformer-like architecture operating pixel-wise, which receives the available labels as input tokens and merges them in a learned homogeneous label space; the merged labels are then used for image generation via conditional generative adversarial training.
Book ChapterDOI
Learning 3D Face Reconstruction with a Pose Guidance Network
TL;DR: Li et al. propose a self-supervised approach to monocular 3D face reconstruction with a pose guidance network (PGN), which can learn from both faces with fully labeled 3D landmarks and unlimited unlabeled in-the-wild face images.
Proceedings ArticleDOI
Screen-space Regularization on Differentiable Rasterization
TL;DR: Wang et al. propose a screen-space regularization method targeting the unbalanced deformation caused by limited viewpoints, and apply the regularization to both multi-view deformation and single-view reconstruction tasks.
References
Proceedings Article
ImageNet Classification with Deep Convolutional Neural Networks
TL;DR: A deep convolutional neural network consisting of five convolutional layers, some followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax, which achieved state-of-the-art image classification performance.
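The layer stack summarized above (five convolutional layers, some with max-pooling, feeding three fully-connected layers) can be sketched by walking the spatial sizes with the standard convolution output formula. The kernel/stride/pad values below follow the published AlexNet-style configuration for a 227×227 input; treat the sketch as a shape walkthrough, not a full implementation:

```python
def conv_out(size, kernel, stride=1, pad=0):
    """Standard output-size formula for a convolution or pooling window."""
    return (size + 2 * pad - kernel) // stride + 1

# (kernel, stride, pad, out_channels, pool_after) for the five conv layers.
layers = [
    (11, 4, 0, 96,  True),   # conv1, followed by 3x3 stride-2 max-pool
    (5,  1, 2, 256, True),   # conv2 + pool
    (3,  1, 1, 384, False),  # conv3
    (3,  1, 1, 384, False),  # conv4
    (3,  1, 1, 256, True),   # conv5 + pool
]

size, channels = 227, 3
for k, s, p, c, pool in layers:
    size = conv_out(size, k, s, p)
    if pool:
        size = conv_out(size, 3, 2)  # overlapping 3x3 stride-2 pooling
    channels = c

flat = size * size * channels
print(size, channels, flat)  # 6 256 9216 -> FC-4096, FC-4096, FC-1000 softmax
```

The flattened 6×6×256 = 9216-dimensional feature vector is what the three fully-connected layers consume before the final 1000-way softmax.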
Journal ArticleDOI
Reducing the Dimensionality of Data with Neural Networks
TL;DR: Describes an effective way of initializing weights that allows deep autoencoder networks to learn low-dimensional codes which work much better than principal components analysis as a tool for reducing the dimensionality of data.
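The entry above contrasts learned autoencoder codes with principal components analysis. A minimal numpy sketch of the PCA baseline (the linear method the deep autoencoder improves upon) — data, dimensions, and variable names here are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 20))            # 500 samples, 20 features

# PCA via SVD: center the data, project onto the top-k right singular vectors.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
k = 5
codes = Xc @ Vt[:k].T                     # k-dimensional linear codes
recon = codes @ Vt[:k] + X.mean(axis=0)   # reconstruction from the codes

# A deep autoencoder replaces the single linear map Vt[:k] with stacked
# nonlinear encoder/decoder layers, initialized by layer-wise pretraining.
err = np.mean((X - recon) ** 2)
print(codes.shape, err)
```

PCA gives the optimal *linear* k-dimensional codes in the least-squares sense; the cited paper's point is that nonlinear deep autoencoders, once properly initialized, find codes with lower reconstruction error than this linear baseline.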
Posted Content
Caffe: Convolutional Architecture for Fast Feature Embedding
Yangqing Jia, Evan Shelhamer, Jeff Donahue, Sergey Karayev, Jonathan Long, Ross Girshick, Sergio Guadarrama, Trevor Darrell
TL;DR: Caffe is a BSD-licensed C++ library with Python and MATLAB bindings for training and deploying general-purpose convolutional neural networks and other deep models efficiently on commodity architectures.
Proceedings ArticleDOI
Deep Learning Face Attributes in the Wild
TL;DR: A novel deep learning framework for attribute prediction in the wild that cascades two CNNs, LNet and ANet, which are fine-tuned jointly with attribute tags, but pre-trained differently.