Open Access Posted Content

X2Face: A network for controlling face generation by using images, audio, and pose codes

TL;DR
A neural network model is proposed that controls the pose and expression of a given face using another face or another modality (e.g. audio), and that can be used for lightweight, sophisticated video and image editing.
Abstract
The objective of this paper is a neural network model that controls the pose and expression of a given face, using another face or modality (e.g. audio). This model can then be used for lightweight, sophisticated video and image editing. We make the following three contributions. First, we introduce a network, X2Face, that can control a source face (specified by one or more frames) using another face in a driving frame, producing a generated frame with the identity of the source frame but the pose and expression of the face in the driving frame. Second, we propose a method for training the network in a fully self-supervised manner using a large collection of video data. Third, we show that the generation process can be driven by other modalities, such as audio or pose codes, without any further training of the network. The generation results for driving a face with another face are compared to state-of-the-art self-supervised and supervised methods. We show that our approach is more robust than other methods, as it makes fewer assumptions about the input data. We also show examples of using our framework for video face editing.
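The abstract describes warp-based generation: rather than synthesising pixels from scratch, the generated frame is produced by re-sampling the source pixels at locations predicted from the driving signal. Below is a minimal numpy sketch of that core sampling operation only — `bilinear_sample` is a hypothetical stand-in for the differentiable sampler, and the hand-built grids replace what the X2Face networks would actually predict.

```python
import numpy as np

def bilinear_sample(image, grid):
    """Sample a grayscale `image` (H, W) at float coordinates `grid` (H, W, 2),
    where grid[..., 0] is the source row and grid[..., 1] the source column.
    Warp-based generation produces the output frame by re-arranging source
    pixels with such a grid instead of synthesising them directly."""
    H, W = image.shape
    r = np.clip(grid[..., 0], 0, H - 1)
    c = np.clip(grid[..., 1], 0, W - 1)
    r0, c0 = np.floor(r).astype(int), np.floor(c).astype(int)
    r1, c1 = np.minimum(r0 + 1, H - 1), np.minimum(c0 + 1, W - 1)
    wr, wc = r - r0, c - c0
    # Interpolate between the four surrounding pixels.
    top = image[r0, c0] * (1 - wc) + image[r0, c1] * wc
    bot = image[r1, c0] * (1 - wc) + image[r1, c1] * wc
    return top * (1 - wr) + bot * wr

# The identity grid reproduces the source; a shifted grid "drives" the pixels.
img = np.arange(16, dtype=float).reshape(4, 4)
rows, cols = np.meshgrid(np.arange(4.0), np.arange(4.0), indexing="ij")
identity = np.stack([rows, cols], axis=-1)
```

In the full model, one network maps the source frame(s) to an identity-preserving embedded face and a second network maps the driving frame (or audio/pose code) to the sampling grid; both are trained end-to-end through this differentiable sampler.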


Citations
Proceedings Article

A morphable model for the synthesis of 3D faces

Volker Blanz, Thomas Vetter
Journal Article

Text-based editing of talking-head video

TL;DR: This work proposes a novel method to edit talking-head video based on its transcript to produce a realistic output video in which the dialogue of the speaker has been modified, while maintaining a seamless audio-visual flow (i.e. no jump cuts).
Journal Article

Neural style-preserving visual dubbing

TL;DR: In this article, a recurrent generative adversarial network (GAN) is used to capture the spatio-temporal co-activation of facial expressions, enabling the facial expressions of the target actor to be generated and modified while preserving their style.
Proceedings Article

StyleHEAT: One-Shot High-Resolution Editable Talking Face Generation via Pre-trained StyleGAN

TL;DR: In this paper, a pre-trained StyleGAN is used for one-shot, high-resolution talking face generation; the latent feature space of StyleGAN is investigated and found to have useful spatial transformation properties.
References
Proceedings Article

Image-to-Image Translation with Conditional Adversarial Networks

TL;DR: Conditional adversarial networks are investigated as a general-purpose solution to image-to-image translation problems, and the approach is demonstrated to be effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks.
Proceedings Article
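The pix2pix formulation combines the conditional adversarial term with a weighted L1 reconstruction term. The sketch below states that combined generator objective with numpy scalars; `pix2pix_generator_loss` and the weight `lam` are illustrative names, not the paper's code.

```python
import numpy as np

def pix2pix_generator_loss(d_fake, fake, target, lam=100.0):
    """Sketch of a pix2pix-style generator objective: an adversarial term
    (the conditional discriminator should score the generated image as real)
    plus a lambda-weighted L1 term pulling the output toward the target."""
    eps = 1e-12                          # avoid log(0)
    adv = -np.log(d_fake + eps).mean()   # want D(x, G(x)) -> 1
    l1 = np.abs(fake - target).mean()    # L1 encourages low-frequency fidelity
    return adv + lam * l1

# A generator that matches the target and fools D incurs (near-)zero loss;
# a poor generator is penalised by both terms.
target = np.ones((2, 2))
good = pix2pix_generator_loss(np.array([1.0]), target, target)
bad = pix2pix_generator_loss(np.array([0.5]), np.zeros((2, 2)), target)
```

The large L1 weight reflects the design choice that the adversarial term supplies high-frequency realism while L1 anchors the overall structure.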

Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks

TL;DR: CycleGAN learns a mapping G : X → Y such that the distribution of images G(X) is indistinguishable from the distribution of Y under an adversarial loss; because this mapping is under-constrained, it is coupled with an inverse mapping F : Y → X and a cycle-consistency loss enforcing F(G(x)) ≈ x.
Proceedings Article
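The cycle-consistency idea is compact enough to state directly: after a round trip through both mappings, each sample should return to itself. A minimal numpy sketch, using toy invertible functions in place of learned networks:

```python
import numpy as np

def cycle_consistency_loss(x, y, G, F):
    """L1 cycle-consistency loss in the CycleGAN style: F(G(x)) should
    recover x and G(F(y)) should recover y, where G maps domain X -> Y
    and F maps Y -> X."""
    return np.abs(F(G(x)) - x).mean() + np.abs(G(F(y)) - y).mean()

# Toy mappings that are exact inverses: a perfect cycle gives zero loss.
G = lambda a: 2.0 * a + 1.0
F = lambda b: (b - 1.0) / 2.0
x = np.array([0.0, 1.0, 2.0])
y = np.array([3.0, 5.0, 7.0])
```

In the full method this term is added to the two adversarial losses; on its own it only constrains the mappings to be mutual inverses, not to match the target distributions.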

Poisson image editing

TL;DR: Using generic interpolation machinery based on solving Poisson equations, a variety of novel tools are introduced for seamless editing of image regions, which permits the seamless importation of both opaque and transparent source image regions into a destination region.
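The interpolation machinery referred to here solves a discrete Poisson equation: inside the edited region the result keeps the gradients of the source patch while matching the destination on the boundary. A simplified numpy sketch using Jacobi iteration (the paper solves the same sparse linear system directly; `poisson_blend` is an illustrative name and the mask is assumed not to touch the image border):

```python
import numpy as np

def poisson_blend(dst, src, mask, iters=500):
    """Seamless cloning in the spirit of Poisson image editing: inside
    `mask`, iterate toward the solution of the discrete Poisson equation
    with the Laplacian of `src` as guidance and `dst` as the boundary."""
    f = dst.astype(float).copy()
    src = src.astype(float)
    # Discrete Laplacian of the source: 4*g(p) minus its 4-neighbours.
    lap = (4 * src[1:-1, 1:-1] - src[:-2, 1:-1] - src[2:, 1:-1]
           - src[1:-1, :-2] - src[1:-1, 2:])
    inside = mask[1:-1, 1:-1]
    for _ in range(iters):
        nb = f[:-2, 1:-1] + f[2:, 1:-1] + f[1:-1, :-2] + f[1:-1, 2:]
        # Jacobi update: 4*f(p) - sum(neighbours of f) = lap(src) at p.
        f[1:-1, 1:-1] = np.where(inside, (nb + lap) / 4.0, f[1:-1, 1:-1])
    return f

# Pasting a flat (zero-gradient) patch into a constant destination changes
# nothing: the region inherits the destination's appearance seamlessly.
dst = np.ones((6, 6))
src = np.zeros((6, 6))
mask = np.zeros((6, 6), dtype=bool)
mask[2:4, 2:4] = True
out = poisson_blend(dst, src, mask)
```

This is why the technique imports source *structure* rather than source *colour*: only gradients of the patch survive, while absolute intensities are dictated by the surrounding destination.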
Journal Article

Dlib-ml: A Machine Learning Toolkit

TL;DR: dlib-ml contains an extensible linear algebra toolkit with built-in BLAS support, implementations of algorithms for performing inference in Bayesian networks, and kernel-based methods for classification, regression, clustering, anomaly detection, and feature ranking.
Proceedings Article

InfoGAN: interpretable representation learning by information maximizing generative adversarial nets

TL;DR: InfoGAN is an information-theoretic extension of the GAN that learns disentangled representations in a completely unsupervised manner; on the CelebA face dataset it discovers visual concepts such as hair styles, the presence of eyeglasses, and emotions.