Open Access Proceedings Article DOI

Novel View Synthesis of Dynamic Scenes With Globally Coherent Depths From a Monocular Camera

TLDR
The authors combine the depth from single view (DSV) and the depth from multi-view stereo (DMV): DSV is complete, i.e., a depth is assigned to every pixel, yet view-variant in its scale, while DMV is view-invariant yet incomplete.
Abstract
This paper presents a new method to synthesize an image from arbitrary views and times given a collection of images of a dynamic scene. A key challenge for novel view synthesis arises from dynamic scene reconstruction, where epipolar geometry does not apply to the local motion of dynamic contents. To address this challenge, we propose to combine the depth from single view (DSV) and the depth from multi-view stereo (DMV), where DSV is complete, i.e., a depth is assigned to every pixel, yet view-variant in its scale, while DMV is view-invariant yet incomplete. Our insight is that although its scale and quality are inconsistent with other views, the depth estimation from a single view can be used to reason about the globally coherent geometry of dynamic contents. We cast this problem as learning to correct the scale of DSV, and to refine each depth with locally consistent motions between views to form a coherent depth estimation. We integrate these tasks into a depth fusion network in a self-supervised fashion. Given the fused depth maps, we synthesize a photorealistic virtual view at a specific location and time with our deep blending network, which completes the scene and renders the virtual view. We evaluate our method of depth estimation and view synthesis on diverse real-world dynamic scenes and show outstanding performance over existing methods.
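The scale-correction step described above can be illustrated with a minimal sketch. This is not the paper's learned depth fusion network; it is a hand-rolled least-squares baseline, under the assumption that the monocular depth (DSV) differs from the multi-view depth (DMV) by a single global scale, and that DMV provides a validity mask of pixels where it is defined. All function and variable names here are illustrative.

```python
import numpy as np

def align_dsv_to_dmv(dsv, dmv, valid):
    """Fit a global scale s minimizing ||s * dsv - dmv||^2
    over the pixels where DMV is valid (closed-form least squares)."""
    d1 = dsv[valid].astype(np.float64)
    d2 = dmv[valid].astype(np.float64)
    s = np.dot(d1, d2) / np.dot(d1, d1)
    return s * dsv

def fuse_depth(dsv, dmv, valid):
    """Naive fusion: keep DMV where it exists (view-invariant),
    fill the remaining pixels with the rescaled DSV (complete)."""
    scaled = align_dsv_to_dmv(dsv, dmv, valid)
    fused = scaled.copy()
    fused[valid] = dmv[valid]
    return fused
```

In the paper this correction is learned by a network and further refined with locally consistent motion between views; the closed-form scale fit above only conveys the core idea of anchoring the complete-but-scale-ambiguous DSV to the metric frame of the incomplete DMV.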



Citations
Posted Content

D-NeRF: Neural Radiance Fields for Dynamic Scenes

TL;DR: D-NeRF is introduced, a method that extends neural radiance fields to a dynamic domain, allowing to reconstruct and render novel images of objects under rigid and non-rigid motions from a single camera moving around the scene.
Posted Content

Neural Scene Flow Fields for Space-Time View Synthesis of Dynamic Scenes

TL;DR: A method to perform novel view and time synthesis of dynamic scenes, requiring only a monocular video with known camera poses as input, is presented, and a new representation that models the dynamic scene as a time-variant continuous function of appearance, geometry, and 3D scene motion is introduced.
Proceedings Article DOI

D-NeRF: Neural Radiance Fields for Dynamic Scenes

TL;DR: In this paper, a method is introduced that extends neural radiance fields to a dynamic domain, allowing novel images of objects under rigid and non-rigid motions to be reconstructed and rendered from a single camera moving around the scene.
Posted Content

Space-time Neural Irradiance Fields for Free-Viewpoint Video

TL;DR: A method that learns a spatiotemporal neural irradiance field for dynamic scenes from a single video using the scene depth estimated from video depth estimation methods, aggregating contents from individual frames into a single global representation.
References
Book

Multiple view geometry in computer vision

TL;DR: In this book, the authors provide comprehensive background material and explain how to apply the methods and implement the algorithms directly in a unified framework, covering geometric principles and how to represent objects algebraically so they can be computed and applied.
Journal Article DOI

"GrabCut": interactive foreground extraction using iterated graph cuts

TL;DR: A more powerful, iterative version of the graph-cut optimisation is developed, and the power of the iterative algorithm is used to substantially simplify the user interaction needed for a given quality of result.
Posted Content

The Unreasonable Effectiveness of Deep Features as a Perceptual Metric

TL;DR: A new dataset of human perceptual similarity judgments is introduced and it is found that deep features outperform all previous metrics by large margins on this dataset, and suggests that perceptual similarity is an emergent property shared across deep visual representations.
Proceedings Article DOI

Structure-from-Motion Revisited

TL;DR: This work proposes a new SfM technique that improves upon the state of the art to make a further step towards building a truly general-purpose pipeline.