Topic

View synthesis

About: View synthesis is a research topic. Over its lifetime, 1,701 publications have been published within this topic, receiving 42,333 citations.


Papers
Posted Content
TL;DR: In this article, a time-conditioned neural radiance field is proposed that represents scene dynamics using a set of compact latent codes, together with two strategies for efficient training of the network: hierarchical training and importance sampling.
Abstract: We propose a novel approach for 3D video synthesis that is able to represent multi-view video recordings of a dynamic real-world scene in a compact, yet expressive representation that enables high-quality view synthesis and motion interpolation. Our approach takes the high quality and compactness of static neural radiance fields in a new direction: to a model-free, dynamic setting. At the core of our approach is a novel time-conditioned neural radiance field that represents scene dynamics using a set of compact latent codes. To exploit the fact that changes between adjacent frames of a video are typically small and locally consistent, we propose two novel strategies for efficient training of our neural network: 1) an efficient hierarchical training scheme, and 2) an importance sampling strategy that selects the next rays for training based on the temporal variation of the input videos. In combination, these two strategies significantly boost the training speed, lead to fast convergence of the training process, and enable high-quality results. Our learned representation is highly compact and able to represent a 10 second, 30 FPS multi-view video recording from 18 cameras with a model size of just 28 MB. We demonstrate that our method can render high-fidelity wide-angle novel views at over 1K resolution, even for highly complex and dynamic scenes. We perform an extensive qualitative and quantitative evaluation that shows that our approach outperforms the current state of the art. We include additional video and information at: this https URL

13 citations
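
As a rough illustration of the two ideas in the paper above, here is a minimal PyTorch sketch of a radiance field conditioned on per-frame latent codes, plus a helper that weights rays by temporal variation. The class and function names, latent size, and layer widths are hypothetical choices for illustration; the paper's actual model also uses positional encoding and the hierarchical training scheme, which are omitted here.

```python
import torch
import torch.nn as nn


class TimeConditionedNeRF(nn.Module):
    """Minimal sketch: an MLP radiance field conditioned on a compact
    learned latent code per video frame (sizes are illustrative)."""

    def __init__(self, num_frames: int, latent_dim: int = 64, hidden: int = 256):
        super().__init__()
        # One compact latent code per frame, optimized jointly with the MLP.
        self.frame_codes = nn.Embedding(num_frames, latent_dim)
        self.mlp = nn.Sequential(
            nn.Linear(3 + 3 + latent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),  # RGB + density
        )

    def forward(self, xyz, view_dir, frame_idx):
        z = self.frame_codes(frame_idx)            # (B, latent_dim)
        h = torch.cat([xyz, view_dir, z], dim=-1)  # (B, 6 + latent_dim)
        out = self.mlp(h)
        rgb = torch.sigmoid(out[..., :3])          # colors in [0, 1]
        sigma = torch.relu(out[..., 3:])           # non-negative density
        return rgb, sigma


def temporal_sampling_weights(frames: torch.Tensor, eps: float = 1e-3) -> torch.Tensor:
    """Sketch of importance sampling by temporal variation: pixels that
    change more between adjacent frames get sampled more often.
    `frames` has shape (T, H, W, 3); returns per-pixel weights (H, W)."""
    diff = (frames[1:] - frames[:-1]).abs().mean(dim=(0, -1))  # (H, W)
    weights = diff + eps  # keep static regions sampleable
    return weights / weights.sum()
```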

Posted Content
TL;DR: In this paper, a sparse neural radiance grid (SNeRG) is proposed that "bakes" a trained NeRF into a representation enabling real-time rendering of photorealistic images of a 3D scene from unobserved viewpoints.
Abstract: Neural volumetric representations such as Neural Radiance Fields (NeRF) have emerged as a compelling technique for learning to represent 3D scenes from images with the goal of rendering photorealistic images of the scene from unobserved viewpoints. However, NeRF's computational requirements are prohibitive for real-time applications: rendering views from a trained NeRF requires querying a multilayer perceptron (MLP) hundreds of times per ray. We present a method to train a NeRF, then precompute and store (i.e. "bake") it as a novel representation called a Sparse Neural Radiance Grid (SNeRG) that enables real-time rendering on commodity hardware. To achieve this, we introduce 1) a reformulation of NeRF's architecture, and 2) a sparse voxel grid representation with learned feature vectors. The resulting scene representation retains NeRF's ability to render fine geometric details and view-dependent appearance, is compact (averaging less than 90 MB per scene), and can be rendered in real-time (higher than 30 frames per second on a laptop GPU). Actual screen captures are shown in our video.

13 citations
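
To make the "baking" idea above concrete, the sketch below evaluates a trained field on a dense grid and keeps only occupied voxels. The `query_fn` interface, threshold, and resolution are assumptions for illustration, not SNeRG's actual API; the real representation also stores learned view-dependence features that a tiny MLP decodes per pixel at render time.

```python
import numpy as np


def bake_sparse_grid(query_fn, resolution=256, bound=1.0, sigma_thresh=5.0):
    """Hypothetical sketch of a baking step: evaluate a trained field on
    a dense grid and keep only voxels above a density threshold.
    `query_fn(points) -> (features, sigma)` is an assumed interface."""
    axis = np.linspace(-bound, bound, resolution, dtype=np.float32)
    grid = np.stack(np.meshgrid(axis, axis, axis, indexing="ij"), axis=-1)
    points = grid.reshape(-1, 3)  # (R^3, 3); in practice queried in chunks

    features, sigma = query_fn(points)   # (N, F), (N,)
    occupied = sigma > sigma_thresh      # sparsify: drop empty space

    # Store only the occupied voxel indices and their features/densities.
    return np.nonzero(occupied)[0], features[occupied], sigma[occupied]
```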

Proceedings ArticleDOI
01 Jul 2020
TL;DR: It is found that large disparities in a scene are the main source of challenge for light field view interpolation methods, and that a basic backward warping based on depth estimated from optical flow performs comparably to typically more complex learning-based methods.
Abstract: Light field view interpolation provides a solution that reduces the prohibitive size of a dense light field. This paper examines state-of-the-art light field view interpolation methods with a comprehensive benchmark on challenging scenarios specific to interpolation tasks. Each method is analyzed in terms of its strengths and weaknesses in handling different challenges. We find that large disparities in a scene are the main source of challenge for light field view interpolation methods. We also find that a basic backward warping based on depth estimated from optical flow provides performance comparable to typically more complex learning-based methods.

13 citations
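
The "basic backward warping" baseline above can be sketched in a few lines with OpenCV: given a per-pixel disparity for the target view (derived, for example, from optical flow between adjacent light field views), each target pixel samples the source view at a shifted position. This is a simplified, horizontal-baseline-only sketch; the function name and arguments are illustrative, and practical pipelines also handle occlusions and blend multiple source views.

```python
import numpy as np
import cv2


def backward_warp(src_view, disparity, baseline_ratio):
    """Backward warping along a horizontal light-field baseline: each
    target pixel samples the source view at x + disparity * ratio."""
    h, w = disparity.shape
    xs, ys = np.meshgrid(np.arange(w, dtype=np.float32),
                         np.arange(h, dtype=np.float32))
    map_x = xs + disparity.astype(np.float32) * baseline_ratio
    map_y = ys  # no vertical shift for a horizontal baseline
    return cv2.remap(src_view, map_x, map_y,
                     interpolation=cv2.INTER_LINEAR,
                     borderMode=cv2.BORDER_REPLICATE)
```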

Proceedings ArticleDOI
TL;DR: Results show that commonly used objective metrics can fail to assess synthesized views in accordance with human judgment.
Abstract: This paper considers the reliability of usual assessment methods when evaluating virtual synthesized views in the multi-view video context. Virtual views are generated by Depth Image Based Rendering (DIBR) algorithms. Because DIBR algorithms involve geometric transformations, new types of artifacts arise. The question is whether commonly used methods are able to deal with such artifacts. This paper investigates how well usual metrics correlate with human judgment. The experiments consist of assessing seven different view synthesis algorithms by subjective and objective methods. Three different 3D video sequences are used in the tests. The resulting virtual synthesized sequences are assessed through objective metrics and subjective protocols. Results show that usual objective metrics can fail to assess synthesized views in accordance with human judgment.

13 citations
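
A minimal version of the evaluation methodology above: compute an objective metric per synthesized sequence and rank-correlate it with subjective mean opinion scores (MOS). The function below uses PSNR and Spearman correlation as stand-in choices; the paper evaluates several metrics and subjective protocols, and a weak correlation is what signals that a metric fails to reflect human judgment of DIBR artifacts.

```python
from scipy.stats import spearmanr
from skimage.metrics import peak_signal_noise_ratio


def metric_vs_subjective(references, synthesized, mos):
    """Rank-correlate an objective metric (here PSNR) with subjective
    MOS over a set of synthesized sequences (uint8 images assumed)."""
    psnr = [peak_signal_noise_ratio(ref, syn)
            for ref, syn in zip(references, synthesized)]
    rho, p_value = spearmanr(psnr, mos)  # low rho -> metric disagrees
    return rho, p_value                  # with human judgment
```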

Proceedings ArticleDOI
02 Nov 1998
TL;DR: A novel automatic method for view synthesis (or image transfer) from a number of uncalibrated images, based on edge transfer, is presented and used to generate immersive video objects called 3D video sprites.
Abstract: This paper presents a novel automatic method for view synthesis (or image transfer) from a number of uncalibrated images based on edge transfer. The edge-based technique is of general practical relevance because it overcomes most of the problems encountered in other approaches that either rely upon dense correspondence, work in projective space, or need explicit camera calibration. The method has been used to generate immersive video objects which we call 3D video sprites. They consist of a number of synchronous video streams and mark-up information that allows virtual viewpoints with respect to a live-action video to be rendered and combined with traditional 3D virtual environments.

13 citations
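
As a toy illustration only of placing transferred edges in a virtual view, the sketch below linearly interpolates already-matched edge points between two views. This deliberately glosses over the hard parts the paper addresses (automatic edge matching and rendering from uncalibrated images); the function name and interface are hypothetical, not the paper's actual transfer.

```python
import numpy as np


def interpolate_matched_edges(edges_view_a, edges_view_b, alpha):
    """Toy sketch: given corresponding edge points in two views as
    (N, 2) pixel-coordinate arrays, place them in a virtual view at
    blend factor alpha in [0, 1] by linear interpolation."""
    assert edges_view_a.shape == edges_view_b.shape
    return (1.0 - alpha) * edges_view_a + alpha * edges_view_b
```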


Network Information
Related Topics (5)
Image segmentation
79.6K papers, 1.8M citations
86% related
Feature (computer vision)
128.2K papers, 1.7M citations
86% related
Object detection
46.1K papers, 1.3M citations
85% related
Convolutional neural network
74.7K papers, 2M citations
85% related
Feature extraction
111.8K papers, 2.1M citations
84% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    54
2022    117
2021    189
2020    158
2019    114
2018    102