paGAN: real-time avatars using dynamic textures

doi:10.1145/3272127.3275075

Journal Article

Koki Nagano, +8 more

04 Dec 2018

ACM Transactions on Graphics

Vol. 37, Iss. 6, pp. 258

TLDR

This work produces state-of-the-art image and video synthesis and is, to the authors' knowledge, the first able to generate a dynamically textured avatar with a mouth interior, all from a single image.

Abstract:

With the rising interest in personalized VR and gaming experiences comes the need to create high quality 3D avatars that are both low-cost and variegated. Due to this, building dynamic avatars from a single unconstrained input image is becoming a popular application. While previous techniques that attempt this require multiple input images or rely on transferring dynamic facial appearance from a source actor, we are able to do so using only one 2D input image without any form of transfer from a source image. We achieve this using a new conditional Generative Adversarial Network design that allows fine-scale manipulation of any facial input image into a new expression while preserving its identity. Our photoreal avatar GAN (paGAN) can also synthesize the unseen mouth interior and control the eye-gaze direction of the output, as well as produce the final image from a novel viewpoint. The method is even capable of generating fully-controllable temporally stable video sequences, despite not using temporal information during training. After training, we can use our network to produce dynamic image-based avatars that are controllable on mobile devices in real time. To do this, we compute a fixed set of output images that correspond to key blendshapes, from which we extract textures in UV space. Using a subject's expression blendshapes at run-time, we can linearly blend these key textures together to achieve the desired appearance. Furthermore, we can use the mouth interior and eye textures produced by our network to synthesize on-the-fly avatar animations for those regions. Our work produces state-of-the-art quality image and video synthesis, and is the first to our knowledge that is able to generate a dynamically textured avatar with a mouth interior, all from a single image.
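The run-time step described above, linearly blending key textures according to expression blendshape weights, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `blend_textures`, the flattened-list texture representation, the neutral-plus-offsets formulation, and the final clamp to [0, 1] are all assumptions made for illustration.

```python
from typing import List, Sequence

# Hypothetical representation: a UV-space texture flattened to a list of
# floats in [0, 1]. A real system would use image arrays on the GPU.
Texture = List[float]


def blend_textures(
    neutral: Texture,
    key_textures: Sequence[Texture],
    weights: Sequence[float],
) -> Texture:
    """Blend key expression textures with blendshape weights.

    Assumed formulation (not taken from the paper):
        T = T_neutral + sum_i w_i * (T_i - T_neutral)
    """
    if len(key_textures) != len(weights):
        raise ValueError("need exactly one weight per key texture")
    for tex in key_textures:
        if len(tex) != len(neutral):
            raise ValueError("all textures must have the same size")

    blended = list(neutral)
    for tex, w in zip(key_textures, weights):
        for i, (t, n) in enumerate(zip(tex, neutral)):
            # Add this key texture's weighted offset from the neutral texture.
            blended[i] += w * (t - n)

    # Clamp so the result stays a valid texture (an assumption of this sketch).
    return [min(1.0, max(0.0, v)) for v in blended]


if __name__ == "__main__":
    neutral = [0.5, 0.5]
    keys = [[1.0, 0.5], [0.5, 0.0]]  # e.g. two key expression textures
    print(blend_textures(neutral, keys, [0.5, 1.0]))
```

With all weights at zero the sketch returns the neutral texture unchanged, which matches the intuition that an avatar at rest shows its neutral appearance.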
