scispace - formally typeset
Open AccessJournal ArticleDOI

LookinGood: enhancing performance capture with real-time neural re-rendering

Reads0
Chats0
TLDR
The novel approach to augment such real-time performance capture systems with a deep architecture that takes a rendering from an arbitrary viewpoint, and jointly performs completion, super resolution, and denoising of the imagery in real- time is taken.
Abstract
Motivated by augmented and virtual reality applications such as telepresence, there has been a recent focus in real-time performance capture of humans under motion. However, given the real-time constraint, these systems often suffer from artifacts in geometry and texture such as holes and noise in the final rendering, poor lighting, and low-resolution textures. We take the novel approach to augment such real-time performance capture systems with a deep architecture that takes a rendering from an arbitrary viewpoint, and jointly performs completion, super resolution, and denoising of the imagery in real-time. We call this approach neural (re-)rendering, and our live system "LookinGood". Our deep architecture is trained to produce high resolution and high quality images from a coarse rendering in real-time. First, we propose a self-supervised training method that does not require manual ground-truth annotation. We contribute a specialized reconstruction error that uses semantic information to focus on relevant parts of the subject, e.g. the face. We also introduce a salient reweighing scheme of the loss function that is able to discard outliers. We specifically design the system for virtual and augmented reality headsets where the consistency between the left and right eye plays a crucial role in the final user experience. Finally, we generate temporally stable results by explicitly minimizing the difference between two consecutive frames. We tested the proposed system in two different scenarios: one involving a single RGB-D sensor, and upper body reconstruction of an actor, the second consisting of full body 360° capture. Through extensive experimentation, we demonstrate how our system generalizes across unseen sequences and subjects.

read more

Citations
More filters
Proceedings ArticleDOI

Everybody Dance Now

TL;DR: This paper presents a simple method for “do as I do” motion transfer: given a source video of a person dancing, it is shown that it can transfer that performance to a novel (amateur) target after only a few minutes of the target subject performing standard moves.
Posted Content

NeRF in the Wild: Neural Radiance Fields for Unconstrained Photo Collections

TL;DR: A learning-based method for synthesizing novel views of complex scenes using only unstructured collections of in-the-wild photographs, and applies it to internet photo collections of famous landmarks, to demonstrate temporally consistent novel view renderings that are significantly closer to photorealism than the prior state of the art.
Posted Content

Neural Sparse Voxel Fields

TL;DR: This work introduces Neural Sparse Voxel Fields (NSVF), a new neural scene representation for fast and high-quality free-viewpoint rendering that is over 10 times faster than the state-of-the-art (namely, NeRF) at inference time while achieving higher quality results.
Proceedings ArticleDOI

Neural Body: Implicit Neural Representations with Structured Latent Codes for Novel View Synthesis of Dynamic Humans

TL;DR: In this paper, the authors propose Neural Body, a new human body representation which assumes that learned neural representations at different frames share the same set of latent codes anchored to a deformable mesh, so that the observations across frames can be naturally integrated.
Journal ArticleDOI

Neural volumes: learning dynamic renderable volumes from images

TL;DR: This work presents a learning-based approach to representing dynamic objects inspired by the integral projection model used in tomographic imaging, and learns a latent representation of a dynamic scene that enables us to produce novel content sequences not seen during training.
References
More filters
Proceedings Article

Adam: A Method for Stochastic Optimization

TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
Proceedings ArticleDOI

ImageNet: A large-scale hierarchical image database

TL;DR: A new database called “ImageNet” is introduced, a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than the current image datasets.
Book ChapterDOI

U-Net: Convolutional Networks for Biomedical Image Segmentation

TL;DR: Neber et al. as discussed by the authors proposed a network and training strategy that relies on the strong use of data augmentation to use the available annotated samples more efficiently, which can be trained end-to-end from very few images and outperforms the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopic stacks.
Journal ArticleDOI

Generative Adversarial Nets

TL;DR: A new framework for estimating generative models via an adversarial process, in which two models are simultaneously train: a generative model G that captures the data distribution and a discriminative model D that estimates the probability that a sample came from the training data rather than G.
Posted Content

Adam: A Method for Stochastic Optimization

TL;DR: In this article, the adaptive estimates of lower-order moments are used for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimate of lowerorder moments.
Related Papers (5)