Open Access Proceedings Article

Face2Face: Real-Time Face Capture and Reenactment of RGB Videos

TLDR
A novel approach for real-time facial reenactment of a monocular target video sequence (e.g., a YouTube video). It addresses the under-constrained problem of facial identity recovery from monocular video by non-rigid model-based bundling and re-renders the manipulated output video in a photo-realistic fashion.
Abstract
We present a novel approach for real-time facial reenactment of a monocular target video sequence (e.g., YouTube video). The source sequence is also a monocular video stream, captured live with a commodity webcam. Our goal is to animate the facial expressions of the target video by a source actor and re-render the manipulated output video in a photo-realistic fashion. To this end, we first address the under-constrained problem of facial identity recovery from monocular video by non-rigid model-based bundling. At run time, we track facial expressions of both source and target video using a dense photometric consistency measure. Reenactment is then achieved by fast and efficient deformation transfer between source and target. The mouth interior that best matches the re-targeted expression is retrieved from the target sequence and warped to produce an accurate fit. Finally, we convincingly re-render the synthesized target face on top of the corresponding video stream such that it seamlessly blends with the real-world illumination. We demonstrate our method in a live setup, where YouTube videos are reenacted in real time.
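
As a rough illustration of two ideas named in the abstract, the sketch below shows (1) a dense photometric error between a synthesized and an observed image and (2) carrying an expression change from a source actor over to a target actor. It is a simplified stand-in under stated assumptions, not the paper's formulation: the face is assumed to be a toy linear expression model, the transfer is a plain coefficient offset rather than the deformation transfer the abstract refers to, and every name (`transfer_expression`, `blend`, `photometric_error`) and every number is hypothetical.

```python
import math


def transfer_expression(src_coeffs, src_rest, tgt_rest):
    """Add the source's change relative to its own rest pose to the target's rest pose.

    Assumption: both actors share the same expression basis, so coefficients
    are directly comparable. This is a simplification for illustration only.
    """
    return [t + (s - s0) for s, s0, t in zip(src_coeffs, src_rest, tgt_rest)]


def blend(neutral, basis, coeffs):
    """Evaluate a toy linear face model: neutral + sum_k coeffs[k] * basis[k].

    `neutral` and each basis entry are flat lists of vertex coordinates.
    """
    out = list(neutral)
    for weight, direction in zip(coeffs, basis):
        for i, d in enumerate(direction):
            out[i] += weight * d
    return out


def photometric_error(rendered, observed):
    """Mean per-pixel Euclidean RGB distance between two equally sized pixel lists."""
    total = 0.0
    for a, b in zip(rendered, observed):
        total += math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return total / len(rendered)


if __name__ == "__main__":
    # Two hypothetical expression directions over two 2D vertices.
    basis = [[0.0, 1.0, 0.0, -1.0], [1.0, 0.0, 1.0, 0.0]]
    target_neutral = [0.0, 0.0, 2.0, 0.0]

    # Source moves 0.5 along the first direction relative to its rest pose.
    new_target_coeffs = transfer_expression([0.75, 0.5], [0.25, 0.5], [0.125, 0.0])
    print(new_target_coeffs)
    print(blend(target_neutral, basis, new_target_coeffs))

    # Two-pixel "images": the second pixel differs in the blue channel only.
    rendered = [(0.5, 0.5, 0.5), (1.0, 0.0, 0.0)]
    observed = [(0.5, 0.5, 0.5), (1.0, 0.0, 1.0)]
    print(photometric_error(rendered, observed))
```

In a tracking loop, an error of this kind would be minimized over the model parameters for each frame; the sketch only evaluates it once for fixed inputs.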


Citations
Proceedings Article

FaceForensics++: Learning to Detect Manipulated Facial Images

TL;DR: In this paper, the realism of state-of-the-art image manipulations, and how difficult it is to detect them, either automatically or by humans, is examined.
Proceedings Article

Scene Representation Networks: Continuous 3D-Structure-Aware Neural Scene Representations

TL;DR: The proposed Scene Representation Networks (SRNs), a continuous, 3D-structure-aware scene representation that encodes both geometry and appearance, are demonstrated by evaluating them for novel view synthesis, few-shot reconstruction, joint shape and appearance interpolation, and unsupervised discovery of a non-rigid face model.
Journal Article

Synthesizing Obama: learning lip sync from audio

TL;DR: Given audio of President Barack Obama, the method synthesizes a high-quality video of him speaking with accurate lip sync and composites it into a target video clip; a recurrent neural network learns the mapping from raw audio features to mouth shapes to produce photorealistic results.
Posted Content

FaceForensics++: Learning to Detect Manipulated Facial Images

TL;DR: This paper proposes an automated benchmark for facial manipulation detection, and shows that the use of additional domain-specific knowledge improves forgery detection to unprecedented accuracy, even in the presence of strong compression, and clearly outperforms human observers.
Journal Article

Deferred neural rendering: image synthesis using neural textures

TL;DR: This work proposes Neural Textures, learned feature maps trained as part of the scene capture process, which can be used to coherently re-render or manipulate existing video content in both static and dynamic environments at real-time rates.
References
Proceedings Article

A morphable model for the synthesis of 3D faces

TL;DR: A new technique for modeling textured 3D faces by transforming the shape and texture of the examples into a vector space representation, which regulates the naturalness of modeled faces, avoiding faces with an “unlikely” appearance.
Journal Article

Deformation transfer for triangle meshes

TL;DR: This paper demonstrates the method of deformation transfer by retargeting full body key poses, applying scanned facial deformations onto a digital character, and remapping rigid and non-rigid animation sequences from one mesh onto another.
Journal Article

FaceWarehouse: A 3D Facial Expression Database for Visual Computing

TL;DR: FaceWarehouse, a database of 3D facial expressions for visual computing applications, offers a much richer matching collection of expressions than previous 3D facial databases, enabling depiction of most human facial actions.
Journal Article

Deformable Model Fitting by Regularized Landmark Mean-Shift

TL;DR: This work proposes a principled optimization strategy in which nonparametric representations of the landmark location likelihoods are maximized within a hierarchy of smoothed estimates; it is shown to outperform some common existing methods on the task of generic face fitting.
Proceedings Article

Video Rewrite: driving visual speech with audio

TL;DR: Video Rewrite is the first facial-animation system to automate all the labeling and assembly tasks required to resync existing footage to a new soundtrack.