Topic

View synthesis

About: View synthesis is a research topic. Over its lifetime, 1,701 publications have been published within this topic, receiving 42,333 citations.


Papers
Proceedings Article
01 Jan 2020
TL;DR: Continuous Object Representation Networks (CORN), a conditional architecture that encodes an input image's geometry and appearance into a 3D-consistent scene representation, achieve performance comparable to state-of-the-art approaches that use direct supervision.
Abstract: Novel View Synthesis (NVS) is concerned with synthesizing views under camera viewpoint transformations from one or multiple input images. NVS requires explicit reasoning about 3D object structure and unseen parts of the scene to synthesize convincing results. As a result, current approaches typically rely on supervised training with either ground truth 3D models or multiple target images. We propose Continuous Object Representation Networks (CORN), a conditional architecture that encodes an input image's geometry and appearance that map to a 3D consistent scene representation. We can train CORN with only two source images per object by combining our model with a neural renderer. A key feature of CORN is that it requires no ground truth 3D models or target view supervision. Regardless, CORN performs well on challenging tasks such as novel view synthesis and single-view 3D reconstruction and achieves performance comparable to state-of-the-art approaches that use direct supervision. For up-to-date information, data, and code, please see our project page: this https URL.
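The following is a minimal sketch (not the authors' released code) of the setup the abstract describes: an image encoder produces latent geometry and appearance codes, those codes condition a coordinate-based scene MLP, and a differentiable renderer supplies the only training signal by comparing a rendered view against a second source image. The module names, layer sizes, and the `encoder`/`render` stand-ins are assumptions for illustration.

```python
# Hedged sketch of the CORN-style training setup described in the abstract.
# All names and sizes are illustrative assumptions, not the authors' code.
import torch
import torch.nn as nn

class ConditionalSceneMLP(nn.Module):
    """Maps a 3D point plus per-object latent codes to color and density."""
    def __init__(self, latent_dim=256, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3 + 2 * latent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),  # RGB + density
        )

    def forward(self, points, geom_code, app_code):
        # points: (N, 3); codes are broadcast to every query point
        codes = torch.cat([geom_code, app_code]).expand(points.shape[0], -1)
        return self.net(torch.cat([points, codes], dim=-1))

def training_step(encoder, scene_mlp, render, img1, img2, cam2, optimizer):
    """Encode source view 1, render toward source view 2's camera, and compare
    against source view 2; no 3D ground truth or target-view labels are used.
    `encoder` and `render` are stand-ins for an image CNN and a neural renderer."""
    geom_code, app_code = encoder(img1)
    pred = render(scene_mlp, geom_code, app_code, cam2)   # differentiable rendering
    loss = torch.nn.functional.mse_loss(pred, img2)       # photometric loss only
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```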

11 citations

Proceedings ArticleDOI
20 Jun 2021
TL;DR: Wang et al. construct a volume under the target view and design a source-view visibility estimation (SVE) module to determine the visibility of the target-view voxels in each source view.
Abstract: We address the problem of novel view synthesis (NVS) from a few sparse source view images. Conventional image-based rendering methods estimate scene geometry and synthesize novel views in two separate steps. However, erroneous geometry estimation will decrease NVS performance as view synthesis highly depends on the quality of estimated scene geometry. In this paper, we propose an end-to-end NVS framework to eliminate the error propagation issue. To be specific, we construct a volume under the target view and design a source-view visibility estimation (SVE) module to determine the visibility of the target-view voxels in each source view. Next, we aggregate the visibility of all source views to achieve a consensus volume. Each voxel in the consensus volume indicates a surface existence probability. Then, we present a soft ray-casting (SRC) mechanism to find the most front surface in the target view (i.e., depth). Specifically, our SRC traverses the consensus volume along viewing rays and then estimates a depth probability distribution. We then warp and aggregate source view pixels to synthesize a novel view based on the estimated source-view visibility and target-view depth. At last, our network is trained in an end-to-end self-supervised fashion, thus significantly alleviating error accumulation in view synthesis. Experimental results demonstrate that our method generates novel views in higher quality compared to the state-of-the-art.
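As a rough illustration of the soft ray-casting (SRC) idea, the sketch below turns per-sample surface-existence probabilities along each target-view ray into a depth probability distribution and an expected depth, using a transmittance-style weighting so that the front-most surface dominates. Tensor shapes and the exact weighting are assumptions, not the paper's implementation.

```python
# Hedged sketch of a soft ray-casting step. Assumes the consensus volume has
# already been resampled along each target-view ray: probs[r, k] is the
# surface-existence probability at depth depths[k] for ray r.
import torch

def soft_ray_casting(probs, depths, eps=1e-8):
    """Turn per-sample surface probabilities into a depth distribution and an
    expected depth per ray; earlier (nearer) surfaces dominate via transmittance."""
    # probs: (R, K) in [0, 1]; depths: (K,) sorted near-to-far
    transmittance = torch.cumprod(
        torch.cat([torch.ones_like(probs[:, :1]), 1.0 - probs[:, :-1]], dim=1),
        dim=1,
    )                                   # probability that no earlier surface was hit
    weights = probs * transmittance     # (R, K) unnormalized depth distribution
    weights = weights / (weights.sum(dim=1, keepdim=True) + eps)
    expected_depth = (weights * depths[None, :]).sum(dim=1)   # (R,)
    return weights, expected_depth
```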

11 citations

Journal ArticleDOI
TL;DR: In this article, a hierarchical image superpixel algorithm is adapted to maintain structural characteristics of the scene during image reconstruction, yielding state-of-the-art view synthesis results on common assessment metrics.
Abstract: View synthesis allows observers to explore static scenes using aligned color images and depth maps captured in a preset camera path. Among the options, depth-image-based rendering (DIBR) approaches have been effective and efficient since only one pair of color and depth map is required, saving storage and bandwidth. The present work proposes a novel DIBR pipeline for view synthesis that properly tackles the different artifacts that arise from 3D warping, such as cracks, disocclusions, ghosts, and out-of-field areas. A key aspect of our contributions relies on the adaptation and usage of a hierarchical image superpixel algorithm that helps to maintain structural characteristics of the scene during image reconstruction. We compare our approach with state-of-the-art methods and show that it attains the best average results in two common assessment metrics under public still-image and video-sequence datasets. Visual results are also provided, illustrating the potential of our technique in real-world applications.
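For context, the sketch below shows the forward 3D-warping step that any DIBR pipeline starts from; the pixels it fails to cover are exactly the cracks and disocclusions that the paper's superpixel-guided reconstruction then fills. The camera model and variable names are simplifying assumptions.

```python
# Hedged sketch of forward 3D warping with a depth map (the input stage of a
# DIBR pipeline). Pinhole cameras and variable names are assumptions.
import numpy as np

def forward_warp(color, depth, K_src, K_tgt, R, t):
    """Warp a color image into a target view using its depth map.
    color: (H, W, 3), depth: (H, W) metric depth, K_*: 3x3 intrinsics,
    R, t: rotation/translation from the source to the target camera."""
    H, W = depth.shape
    ys, xs = np.mgrid[0:H, 0:W]
    pix = np.stack([xs, ys, np.ones_like(xs)], axis=-1).reshape(-1, 3).T  # 3 x HW
    pts = np.linalg.inv(K_src) @ pix * depth.reshape(1, -1)   # back-project to 3D
    pts = R @ pts + t.reshape(3, 1)                           # move to target frame
    proj = K_tgt @ pts
    u, v, z = proj[0] / proj[2], proj[1] / proj[2], proj[2]

    out = np.zeros_like(color)
    zbuf = np.full((H, W), np.inf)
    src_colors = color.reshape(-1, 3)
    for i in range(u.shape[0]):              # z-buffer: keep the nearest sample
        ui, vi = int(round(u[i])), int(round(v[i]))
        if 0 <= ui < W and 0 <= vi < H and 0 < z[i] < zbuf[vi, ui]:
            zbuf[vi, ui] = z[i]
            out[vi, ui] = src_colors[i]
    # Pixels never written (zbuf == inf) are the cracks/disocclusions that the
    # superpixel-guided reconstruction stage must later fill.
    return out, np.isinf(zbuf)
```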

11 citations

Proceedings ArticleDOI
01 Nov 2012
TL;DR: An efficient coding algorithm for depth map images and videos, based on view synthesis distortion estimation, is proposed, together with a quantization scheme for residual depth data that adaptively assigns bits according to block complexity.
Abstract: An efficient coding algorithm for depth map images and videos, based on view synthesis distortion estimation, is proposed in this work. We first analyze how a depth error is related to a disparity error and how the disparity vector error affects the energy spectral density of a synthesized color video in the frequency domain. Based on the analysis, we propose an estimation technique to predict the view synthesis distortion without requiring the actual synthesis of intermediate view frames. To encode the depth information efficiently, we employ a Lagrangian cost function to minimize the view synthesis distortion subject to the constraint on a transmission bit rate. In addition, we develop a quantization scheme for residual depth data, which adaptively assigns bits according to block complexities. Simulation results demonstrate that the proposed depth video coding algorithm provides significantly better R-D performance than conventional algorithms.
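A hedged sketch of the rate-distortion decision the abstract describes follows: for each candidate coding mode of a depth block, an estimate of the view-synthesis distortion (no intermediate view is rendered) is combined with the rate in a Lagrangian cost J = D + λR. The specific distortion model below, a depth error scaled into a disparity error that multiplies local color-gradient energy, is an illustrative first-order assumption rather than the paper's exact derivation.

```python
# Hedged sketch of Lagrangian mode selection driven by an estimated
# view-synthesis distortion. The distortion model is an illustrative assumption.

def estimate_synthesis_distortion(depth_error_sq, disparity_scale, color_grad_energy):
    """Approximate synthesized-view distortion from a depth (hence disparity)
    error: a small pixel shift costs roughly the squared shift times the local
    color gradient energy (first-order approximation)."""
    disparity_error_sq = (disparity_scale ** 2) * depth_error_sq
    return disparity_error_sq * color_grad_energy

def best_mode(candidate_modes, lam):
    """Pick the coding mode minimizing J = D_synth + lambda * R for one depth
    block. Each mode is a dict with precomputed statistics (hypothetical keys)."""
    def cost(mode):
        d = estimate_synthesis_distortion(
            mode["depth_error_sq"], mode["disparity_scale"], mode["color_grad_energy"]
        )
        return d + lam * mode["rate_bits"]
    return min(candidate_modes, key=cost)
```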

11 citations

Proceedings ArticleDOI
06 Jul 2020
TL;DR: This work captures a novel light-field dataset featuring both a high spatial resolution and a high dynamic range (HDR) to enable the community to research and develop efficient reconstruction and tone-mapping algorithms for a hyper-realistic visual experience.
Abstract: Light-field (LF) imaging has various advantages over the traditional 2D photography, providing angular information of the real world scene by separately recording light rays in different directions. Despite the directional light information which enables new capabilities such as depth estimation, post-capture refocusing, and 3D modelling, currently available light-field datasets are very restricted in terms of spatial-resolution and dynamic range. In this work, we address this problem by capturing a novel light-field dataset featuring both a high spatial resolution and a high dynamic range (HDR). This dataset should enable the community to research and develop efficient reconstruction and tone-mapping algorithms for a hyper-realistic visual experience. The dataset consists of six static light-fields that are captured by a high-quality digital camera mounted on two precise linear axes using exposure bracketing at each view point. To demonstrate the usefulness of such a dataset, we also performed a thorough analysis on local and global tone-mapping of natural data in the context of novel view-rendering. The rendered results are compared and evaluated both visually and quantitatively. To our knowledge, the recorded dataset is the first attempt to jointly capture high-resolution and HDR light-fields.
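As a rough illustration of the per-viewpoint processing such a dataset enables, the sketch below merges exposure-bracketed captures into a linear HDR image and applies one common global tone-mapping operator (Reinhard's global operator). The weighting function and the choice of operator are assumptions for illustration, not the paper's protocol.

```python
# Hedged sketch: merge an exposure bracket into a relative-radiance HDR image,
# then apply a simple global tone-mapping operator. Assumes a linear camera
# response; the weighting function is an illustrative choice.
import numpy as np

def merge_exposures(images, exposure_times, eps=1e-6):
    """images: list of (H, W, 3) arrays in [0, 1], one per exposure time (s).
    Returns a relative-radiance HDR image via a weighted average of
    exposure-normalized pixels."""
    num = np.zeros_like(images[0], dtype=np.float64)
    den = np.zeros_like(images[0], dtype=np.float64)
    for img, t in zip(images, exposure_times):
        w = 1.0 - np.abs(2.0 * img - 1.0)   # trust mid-tones, downweight clipped pixels
        num += w * (img / t)
        den += w
    return num / (den + eps)

def reinhard_global(hdr, key=0.18, eps=1e-6):
    """Global tone mapping: scale by the log-average luminance, then compress
    with L / (1 + L)."""
    lum = 0.2126 * hdr[..., 0] + 0.7152 * hdr[..., 1] + 0.0722 * hdr[..., 2]
    log_avg = np.exp(np.mean(np.log(lum + eps)))
    scaled = (key / log_avg) * hdr
    return scaled / (1.0 + scaled)
```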

11 citations


Network Information
Related Topics (5)
Image segmentation: 79.6K papers, 1.8M citations (86% related)
Feature (computer vision): 128.2K papers, 1.7M citations (86% related)
Object detection: 46.1K papers, 1.3M citations (85% related)
Convolutional neural network: 74.7K papers, 2M citations (85% related)
Feature extraction: 111.8K papers, 2.1M citations (84% related)
Performance
Metrics
Number of papers in the topic in previous years:

Year    Papers
2023    54
2022    117
2021    189
2020    158
2019    114
2018    102