Open Access | Proceedings Article

MoFA: Model-Based Deep Convolutional Face Autoencoder for Unsupervised Monocular Reconstruction

TLDR
A novel model-based deep convolutional autoencoder that addresses the highly challenging problem of reconstructing a 3D human face from a single in-the-wild color image and can be trained end-to-end in an unsupervised manner, which renders training on very large real world data feasible.
Abstract
In this work, we propose a novel model-based deep convolutional autoencoder that addresses the highly challenging problem of reconstructing a 3D human face from a single in-the-wild color image. To this end, we combine a convolutional encoder network with an expert-designed generative model that serves as the decoder. The core innovation is the differentiable parametric decoder, which encapsulates image formation analytically based on a generative model. Our decoder takes as input a code vector with exactly defined semantic meaning that encodes detailed face pose, shape, expression, skin reflectance, and scene illumination. Because of this new way of combining CNN-based and model-based face reconstruction, the CNN-based encoder learns to extract semantically meaningful parameters from a single monocular input image. For the first time, a CNN encoder and an expert-designed generative model can be trained end-to-end in an unsupervised manner, which makes training on very large (unlabeled) real-world data feasible. The obtained reconstructions compare favorably to current state-of-the-art approaches in terms of quality and richness of representation.
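The idea of a code vector with fixed semantic slots feeding an analytic decoder can be illustrated with a toy sketch. The following is not the authors' implementation: the slot sizes, the linear face model, the random bases, and all function names are hypothetical and chosen only for illustration, and the sketch covers geometry and rigid pose only (no reflectance, illumination, or rendering).

```python
import numpy as np

# Hypothetical toy sizes; the abstract does not state the real dimensions.
SIZES = {"pose": 6, "shape": 4, "expression": 3, "reflectance": 4, "illumination": 9}
N_VERTS = 5  # hypothetical number of mesh vertices


def split_code(code):
    """Slice a flat code vector into named, semantically meaningful parts."""
    parts, start = {}, 0
    for name, size in SIZES.items():
        parts[name] = code[start:start + size]
        start += size
    return parts


def rotation_from_axis_angle(r):
    """Rodrigues' formula: axis-angle 3-vector -> 3x3 rotation matrix."""
    theta = np.linalg.norm(r)
    if theta < 1e-12:
        return np.eye(3)
    k = r / theta
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)


def decode_geometry(code, mean, shape_basis, expr_basis):
    """Toy parametric decoder: linear face model followed by a rigid pose.

    Assumed parameterization: pose = 3 axis-angle values + 3 translation values.
    """
    p = split_code(code)
    verts = mean + shape_basis @ p["shape"] + expr_basis @ p["expression"]
    verts = verts.reshape(N_VERTS, 3)
    R = rotation_from_axis_angle(p["pose"][:3])
    t = p["pose"][3:]
    return verts @ R.T + t


# Random stand-ins for a face model (a real model would supply these).
rng = np.random.default_rng(0)
mean = rng.normal(size=3 * N_VERTS)
shape_basis = rng.normal(size=(3 * N_VERTS, SIZES["shape"]))
expr_basis = rng.normal(size=(3 * N_VERTS, SIZES["expression"]))

# An all-zero code decodes to the mean face in its canonical pose.
code = np.zeros(sum(SIZES.values()))
verts = decode_geometry(code, mean, shape_basis, expr_basis)
```

Every step here is a composition of linear maps and smooth functions, so the same computation written in an automatic-differentiation framework could pass gradients from an image-space loss back to an encoder, which is the property the abstract relies on for end-to-end training.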


Citations
Journal Article

Learning Nonlinear Soft-Tissue Dynamics for Interactive Avatars

TL;DR: A novel method to enrich existing vertex-based human body models by adding soft-tissue dynamics, which learns to predict per-vertex 3D offsets that reproduce nonlinear mesh deformation effects as a function of pose information, enabling the synthesis of realistic 3D mesh animations.
Posted Content

FACSIMILE: Fast and Accurate Scans From an Image in Less Than a Second

TL;DR: Proposes FACSIMILE (FAX), a method that estimates a detailed body from a single photo, lowering the bar for creating virtual representations of humans; it is easy to implement and fast to execute, making it easily deployable.
Proceedings Article

Illumination-Invariant Face Recognition With Deep Relit Face Images

TL;DR: A deep face relighting algorithm is proposed and used as a data augmentation method that enriches training data with illumination variations; training face recognition algorithms on the authors' relit images makes face templates more robust to illumination changes.
Journal Article

An empirical rig for jaw animation

TL;DR: This work constructs a novel jaw rig that preserves the intuitive control while providing more accurate jaw positioning, and places anatomically correct limits on the motion, preventing the rig from entering physiologically infeasible poses.
Journal Article

Deep incremental learning for efficient high-fidelity face tracking

TL;DR: This paper introduces a decomposition layer in the Geo-Tex VAE architecture that decomposes the facial deformation into global and local components, training the global deformation with a fully-connected network and the local deformations with convolutional layers.