LookinGood: enhancing performance capture with real-time neural re-rendering

doi:10.1145/3272127.3275099

Open Access · Journal Article

Ricardo Martin-Brualla, +16 more

04 Dec 2018 · ACM Transactions on Graphics, Vol. 37, Iss. 6, pp. 255

TLDR

The paper augments real-time performance capture systems with a deep architecture that takes a rendering from an arbitrary viewpoint and jointly performs completion, super-resolution, and denoising of the imagery in real time.

Abstract:

Motivated by augmented and virtual reality applications such as telepresence, there has been a recent focus on real-time performance capture of humans in motion. However, given the real-time constraint, these systems often suffer from artifacts in geometry and texture, such as holes and noise in the final rendering, poor lighting, and low-resolution textures. We take the novel approach of augmenting such real-time performance capture systems with a deep architecture that takes a rendering from an arbitrary viewpoint and jointly performs completion, super-resolution, and denoising of the imagery in real time. We call this approach neural (re-)rendering, and our live system "LookinGood". Our deep architecture is trained to produce high-resolution, high-quality images from a coarse rendering in real time. First, we propose a self-supervised training method that does not require manual ground-truth annotation. We contribute a specialized reconstruction error that uses semantic information to focus on relevant parts of the subject, e.g., the face. We also introduce a salient reweighing scheme for the loss function that is able to discard outliers. We specifically design the system for virtual and augmented reality headsets, where the consistency between the left and right eye plays a crucial role in the final user experience. Finally, we generate temporally stable results by explicitly minimizing the difference between two consecutive frames. We tested the proposed system in two different scenarios: one involving a single RGB-D sensor and upper-body reconstruction of an actor, the other consisting of full-body 360° capture. Through extensive experimentation, we demonstrate how our system generalizes across unseen sequences and subjects.
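To make three of the loss ideas in the abstract concrete, the following is a minimal NumPy sketch, not the paper's actual formulation. Every function name, the `face_gain` value, the mean-plus-`k`-standard-deviations outlier rule, and the exact form of the temporal term are assumptions chosen for illustration; the abstract only states that semantic information focuses the reconstruction error, that a reweighing scheme discards outliers, and that differences between consecutive frames are minimized.

```python
import numpy as np


def semantic_weights(body_mask, face_mask, face_gain=4.0):
    """Hypothetical per-pixel weights: 0 on background, 1 on body, face_gain on face."""
    return body_mask * (1.0 + (face_gain - 1.0) * face_mask)


def outlier_weights(pred, target, k=3.0):
    """Hypothetical reweighing: zero out pixels whose error exceeds mean + k * std."""
    err = np.abs(pred - target).mean(axis=-1)  # (H, W) per-pixel error
    return (err <= err.mean() + k * err.std()).astype(np.float64)


def weighted_l1(pred, target, weights):
    """L1 reconstruction error averaged with per-pixel weights of shape (H, W)."""
    err = np.abs(pred - target).mean(axis=-1)
    return float((weights * err).sum() / max(weights.sum(), 1e-8))


def temporal_loss(pred_t, pred_prev, target_t, target_prev):
    """Assumed temporal term: penalize frame-to-frame change in the output
    that is not also present in the target frames."""
    return float(np.abs((pred_t - pred_prev) - (target_t - target_prev)).mean())


# Toy data: two 8x8 RGB frames; the prediction has one grossly wrong pixel.
rng = np.random.default_rng(0)
target_prev = rng.random((8, 8, 3))
target_t = rng.random((8, 8, 3))
pred_prev, pred_t = target_prev.copy(), target_t.copy()
pred_t[0, 0] += 5.0  # simulated outlier, e.g. a bad pixel in the training target

body = np.ones((8, 8))
face = np.zeros((8, 8))
face[2:4, 2:4] = 1.0  # pretend the face occupies a 2x2 region

weights = semantic_weights(body, face) * outlier_weights(pred_t, target_t)
recon = weighted_l1(pred_t, target_t, weights)
temporal = temporal_loss(pred_t, pred_prev, target_t, target_prev)
```

In this toy setup the single corrupted pixel is rejected by the outlier weights, so it no longer contributes to the reconstruction term, while face pixels count four times as much as other body pixels. The stereo-consistency term mentioned in the abstract would additionally require rendering both eye views and is left out of the sketch.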

Citations

Everybody Dance Now

NeRF in the Wild: Neural Radiance Fields for Unconstrained Photo Collections

Neural Sparse Voxel Fields

Neural Body: Implicit Neural Representations with Structured Latent Codes for Novel View Synthesis of Dynamic Humans

Neural volumes: learning dynamic renderable volumes from images

References

Adam: A Method for Stochastic Optimization

ImageNet: A large-scale hierarchical image database

U-Net: Convolutional Networks for Biomedical Image Segmentation

Generative Adversarial Nets

Related Papers (5)

Deferred neural rendering: image synthesis using neural textures

Image-to-Image Translation with Conditional Adversarial Networks

U-Net: Convolutional Networks for Biomedical Image Segmentation

Perceptual Losses for Real-Time Style Transfer and Super-Resolution

The Unreasonable Effectiveness of Deep Features as a Perceptual Metric