Topic

View synthesis

About: View synthesis is a research topic. Over its lifetime, 1701 publications have been published within this topic, receiving 42333 citations.


Papers
Proceedings ArticleDOI
03 Jun 2018
TL;DR: This paper presents a method for view synthesis from multiple views and their depth maps for free navigation in Virtual Reality with six degrees of freedom (6DoF) and 360 video (3DoF+), including synthesizing views corresponding to stepping in or out of the scene.
Abstract: This paper presents a method for view synthesis from multiple views and their depth maps for free navigation in Virtual Reality with six degrees of freedom (6DoF) and 360 video (3DoF+), including synthesizing views corresponding to stepping in or out of the scene. Such scenarios should support large-baseline view synthesis, typically going beyond the view synthesis involved in light field displays [1]. Our method accepts an unlimited number of reference views as input, instead of the usual left and right reference views. Increasing the number of reference views overcomes problems such as occlusions, surfaces tangential to the camera axis, and artifacts in low-quality depth maps. We outperform MPEG’s reference software, VSRS [2], with a gain of up to 2.5 dB in PSNR when using four reference views.

37 citations
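To make the depth-image-based rendering (DIBR) behind this kind of multi-reference view synthesis concrete, below is a minimal sketch that forward-warps one reference view into a virtual camera using its depth map, then blends several warped references by keeping the nearest surface at each pixel. The camera model, per-pixel splatting, and nearest-depth blending are simplifying assumptions for illustration, not the paper's pipeline or VSRS.

```python
import numpy as np

def warp_to_virtual_view(image, depth, K_ref, pose_ref, K_virt, pose_virt):
    """Forward-warp `image` (H, W, 3) with per-pixel `depth` (H, W) from the
    reference camera into the virtual camera.
    `K_*` are 3x3 intrinsics; `pose_*` are 4x4 camera-to-world matrices."""
    h, w = depth.shape
    ys, xs = np.mgrid[0:h, 0:w]
    pix = np.stack([xs, ys, np.ones_like(xs)], axis=-1).reshape(-1, 3).T.astype(float)

    # Back-project reference pixels to 3-D world points using the depth map.
    rays = np.linalg.inv(K_ref) @ pix                      # 3 x N camera rays
    pts_cam = rays * depth.reshape(1, -1)                  # scale rays by depth
    pts_world = pose_ref @ np.vstack([pts_cam, np.ones((1, pts_cam.shape[1]))])

    # Re-project the world points into the virtual camera.
    pts_virt = np.linalg.inv(pose_virt) @ pts_world
    z = pts_virt[2]
    proj = K_virt @ pts_virt[:3]
    u = np.round(proj[0] / np.maximum(z, 1e-9)).astype(int)
    v = np.round(proj[1] / np.maximum(z, 1e-9)).astype(int)

    # Z-buffered splatting: nearer surfaces overwrite farther ones.
    warped = np.zeros_like(image)
    zbuf = np.full((h, w), np.inf)
    colors = image.reshape(-1, image.shape[-1])
    valid = (z > 0) & (u >= 0) & (u < w) & (v >= 0) & (v < h)
    for i in np.flatnonzero(valid):
        if z[i] < zbuf[v[i], u[i]]:
            zbuf[v[i], u[i]] = z[i]
            warped[v[i], u[i]] = colors[i]
    return warped, zbuf

def blend_references(warped_views, zbufs):
    """Blend several warped reference views: at every pixel keep the view whose
    warped surface is nearest, so extra references fill each other's holes."""
    z = np.stack(zbufs)                                    # (V, H, W)
    best = np.argmin(z, axis=0)                            # winning view index
    views = np.stack(warped_views)                         # (V, H, W, 3)
    h, w = best.shape
    return views[best, np.arange(h)[:, None], np.arange(w)[None, :]]
```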

Journal ArticleDOI
TL;DR: By utilizing texture found in temporally adjacent frames, this work proposes to fill disocclusions in a faithful way, i.e., using texture that a real camera would observe in place of the virtual camera, to reduce the amount of artifacts introduced into the filling region.
Abstract: Disocclusion filling is a critical problem in depth-based view synthesis. Exposed regions in the target view that correspond to occluded areas in the reference view have to be filled in a meaningful way. Current approaches aim to do this in a plausible way, mostly inspired by image inpainting techniques. However, disocclusion filling is a video-based problem, which offers more information than just the current frame. By utilizing texture found in temporally adjacent frames, we propose to fill disocclusions in a faithful way, i.e., using texture that a real camera would observe in place of the virtual camera. Only if faithful information is not available do we fall back to plausible filling. Our approach is designed for single-view video-plus-depth, where neighboring camera views are not available for disocclusion filling. In contrast to previous approaches, our method uses superpixels instead of square patches as filling entities to reduce the amount of artifacts introduced into the filled region. Despite its importance, faithfulness has not yet received due attention. Our experiments show that situations are common in which simple plausible filling does not lead to satisfying results; it is thus important to stress faithful disocclusion filling, and our current work is an attempt in this direction.

37 citations
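As a rough illustration of the faithful-first strategy described above, the sketch below fills disocclusion holes from temporally adjacent frames wherever the missing background was actually observed, and only falls back to plausible inpainting otherwise, using superpixels rather than square patches as the decision units. The SLIC segmentation, coverage threshold, and cv2.inpaint fallback are stand-ins chosen for this example, not the paper's exact algorithm; the temporal frames are assumed to be already aligned to the current virtual view.

```python
import numpy as np
import cv2
from skimage.segmentation import slic

def fill_disocclusions(synth_frame, hole_mask, temporal_frames, temporal_masks,
                       n_segments=400, faithful_threshold=0.6):
    """synth_frame: (H, W, 3) uint8 warped view containing holes.
    hole_mask: (H, W) bool, True on disoccluded pixels.
    temporal_frames / temporal_masks: textures observed in adjacent frames,
    already aligned to the current virtual view, with their validity masks."""
    out = synth_frame.copy()

    # Superpixels, not square patches, act as the filling entities.
    labels = slic(synth_frame, n_segments=n_segments, compactness=10,
                  start_label=1)

    # Collect "faithful" texture: what a real camera saw in adjacent frames.
    faithful = np.zeros_like(synth_frame)
    faithful_valid = np.zeros_like(hole_mask)
    for frame, valid in zip(temporal_frames, temporal_masks):
        take = valid & ~faithful_valid
        faithful[take] = frame[take]
        faithful_valid |= take

    inpaint_mask = np.zeros(hole_mask.shape, dtype=np.uint8)
    for label in np.unique(labels[hole_mask]):
        holes = (labels == label) & hole_mask
        coverage = faithful_valid[holes].mean()
        if coverage >= faithful_threshold:
            # Faithful fill: copy texture the adjacent frames actually observed.
            copy = holes & faithful_valid
            out[copy] = faithful[copy]
            inpaint_mask[holes & ~faithful_valid] = 255
        else:
            # Plausible fallback: hand this superpixel's holes to inpainting.
            inpaint_mask[holes] = 255

    if inpaint_mask.any():
        out = cv2.inpaint(out, inpaint_mask, 5, cv2.INPAINT_TELEA)
    return out
```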

Proceedings ArticleDOI
01 Dec 2010
TL;DR: This paper manipulates depth values themselves, without causing severe synthesized-view distortion, in order to maximize sparsity in the transform domain for compression gain, and designs a heuristic to push the resulting LP solution away from constraint boundaries to avoid quantization errors.
Abstract: Compression of depth maps is important for the “image plus depth” representation of multiview images, which enables synthesis of novel intermediate views via depth-image-based rendering (DIBR) at the decoder. Previous depth map coding schemes exploit unique depth characteristics to compactly and faithfully reproduce the original signal. In contrast, given that depth maps are not directly viewed but are only used for view synthesis, in this paper we manipulate the depth values themselves, without causing severe synthesized-view distortion, in order to maximize sparsity in the transform domain for compression gain. We formulate the sparsity maximization problem as an ℓ0-norm optimization. Since ℓ0-norm optimization is hard in general, we first find a sparse representation by iteratively solving a weighted ℓ1 minimization via linear programming (LP). We then design a heuristic to push the resulting LP solution away from constraint boundaries to avoid quantization errors. Using JPEG as an example transform codec, we show that our approach gains up to 2.5 dB in rate-distortion performance for the interpolated view.

36 citations
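To sketch the core "reweight ℓ1, solve an LP" loop in a self-contained way, the toy below perturbs one depth block within a per-pixel tolerance (standing in for the bound on synthesized-view distortion) so that its 2-D DCT becomes sparser. The block size, tolerance, reweighting schedule, and SciPy LP solver are illustrative assumptions; the paper's actual constraint set and its boundary-pushing heuristic are not reproduced here.

```python
import numpy as np
from scipy.fft import dct
from scipy.optimize import linprog

def dct2_matrix(block_size):
    """Orthonormal 2-D DCT as an (n x n) matrix acting on a row-major
    flattened block, built from the 1-D DCT-II matrix via a Kronecker product."""
    D = dct(np.eye(block_size), norm="ortho", axis=0)
    return np.kron(D, D)

def sparsify_depth_block(block, tol, n_iters=4, eps=1e-3):
    """Return a perturbed depth block whose DCT coefficients are sparser,
    with every sample kept within +/- tol of its original value."""
    n = block.size
    T = dct2_matrix(block.shape[0])
    d0 = block.astype(float).ravel()
    lo, hi = d0 - tol, d0 + tol
    weights = np.ones(n)
    d = d0.copy()

    for _ in range(n_iters):
        # LP variables x = [d (n), t (n)]; minimise sum_i t_i subject to
        #   +w_i (T d)_i <= t_i,  -w_i (T d)_i <= t_i,  lo <= d <= hi.
        WT = weights[:, None] * T
        A_ub = np.block([[ WT, -np.eye(n)],
                         [-WT, -np.eye(n)]])
        b_ub = np.zeros(2 * n)
        c = np.concatenate([np.zeros(n), np.ones(n)])
        bounds = [(l, h) for l, h in zip(lo, hi)] + [(0, None)] * n
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
        if not res.success:
            break
        d = res.x[:n]
        # Reweight: coefficients already near zero get a larger penalty,
        # the standard l1 surrogate for driving the l0 count down.
        weights = 1.0 / (np.abs(T @ d) + eps)

    return d.reshape(block.shape)

# Example: one 8x8 piecewise-constant depth block, tolerance of 2 depth levels.
block = (np.arange(64).reshape(8, 8) // 9 * 16 + 60).astype(float)
sparse_block = sparsify_depth_block(block, tol=2.0)
```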

Proceedings ArticleDOI
07 Oct 2018
TL;DR: This paper addresses the problem of depth estimation for every viewpoint of a dense light field, exploiting information from only a sparse set of views, and proposes a method that computes disparity (or equivalently depth) for every viewpoint while taking occlusions into account.
Abstract: This paper addresses the problem of depth estimation for every viewpoint of a dense light field, exploiting information from only a sparse set of views. This problem is particularly relevant for applications such as light field reconstruction from a subset of views, view synthesis, and compression. Unlike most existing methods for scene depth estimation from light fields, the proposed algorithm computes disparity (or equivalently depth) for every viewpoint while taking occlusions into account. In addition, it preserves the continuity of the depth space and does not require prior knowledge of the depth range. Experiments show that, for both synthetic and real light fields, our algorithm achieves performance competitive with state-of-the-art algorithms that exploit the entire light field and usually generate the depth map for the center viewpoint only.

36 citations
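The two-plane light-field geometry such methods build on can be illustrated with a small reprojection routine: given a disparity map at one reference viewpoint, the sketch below propagates it to any other viewpoint, keeping the nearest surface where several pixels collide, so pixels occluded in the reference stay unfilled. This shows only the geometry; the paper's estimation from a sparse set of views and its occlusion reasoning are considerably more involved.

```python
import numpy as np

def propagate_disparity(disp_ref, ref_uv, target_uv):
    """disp_ref: (H, W) disparity at the reference viewpoint, expressed in
    pixels per unit of angular baseline.
    ref_uv, target_uv: (u, v) angular coordinates of the two viewpoints."""
    h, w = disp_ref.shape
    du = target_uv[0] - ref_uv[0]
    dv = target_uv[1] - ref_uv[1]
    ys, xs = np.mgrid[0:h, 0:w]

    # A scene point seen at (x, y) with disparity d in the reference view
    # appears at (x + d * du, y + d * dv) in the target view.
    xt = np.round(xs + disp_ref * du).astype(int)
    yt = np.round(ys + disp_ref * dv).astype(int)
    inside = (xt >= 0) & (xt < w) & (yt >= 0) & (yt < h)

    disp_target = np.full((h, w), np.nan)
    xt, yt, d = xt[inside], yt[inside], disp_ref[inside]
    for i in np.argsort(d):                  # paint far-to-near: larger
        disp_target[yt[i], xt[i]] = d[i]     # disparity (nearer) wins
    return disp_target                       # NaN marks pixels unseen in the reference
```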

Proceedings ArticleDOI
Jae Shin Yoon, Kihwan Kim, Orazio Gallo, Hyun Soo Park, Jan Kautz
14 Jun 2020
TL;DR: This work combines the depth from single view (DSV) and the depth from multi-view stereo (DMV), where DSV is complete, i.e., a depth is assigned to every pixel, yet view-variant in its scale, while DMV is view-invariant yet incomplete.
Abstract: This paper presents a new method to synthesize an image from arbitrary views and times, given a collection of images of a dynamic scene. A key challenge for novel view synthesis arises from dynamic scene reconstruction, where epipolar geometry does not apply to the local motion of dynamic content. To address this challenge, we propose to combine the depth from single view (DSV) and the depth from multi-view stereo (DMV), where DSV is complete, i.e., a depth is assigned to every pixel, yet view-variant in its scale, while DMV is view-invariant yet incomplete. Our insight is that, although its scale and quality are inconsistent with other views, the depth estimated from a single view can be used to reason about the globally coherent geometry of dynamic content. We cast this problem as learning to correct the scale of DSV and to refine each depth with locally consistent motions between views to form a coherent depth estimate. We integrate these tasks into a depth fusion network in a self-supervised fashion. Given the fused depth maps, we synthesize a photorealistic virtual view at a specific location and time with our deep blending network, which completes the scene and renders the virtual view. We evaluate our depth estimation and view synthesis on diverse real-world dynamic scenes and show outstanding performance over existing methods.

36 citations
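The scale correction at the heart of the DSV/DMV fusion can be approximated in closed form: fit a scale and shift that align the complete but scale-ambiguous single-view depth to the incomplete but view-invariant multi-view depth on the pixels where the latter exists, then use the aligned single-view depth to fill the remaining holes. The least-squares fit and hard switch below are simplifications of what the paper learns with its self-supervised depth fusion network; the function and variable names are illustrative.

```python
import numpy as np

def fuse_depths(dsv, dmv, dmv_valid):
    """dsv: (H, W) depth from single view, complete but in an arbitrary scale.
    dmv: (H, W) depth from multi-view stereo, metric but with holes.
    dmv_valid: (H, W) bool, True where dmv carries a reliable value."""
    # Solve min_{a,b} || a * dsv + b - dmv ||^2 over the valid DMV pixels.
    x = dsv[dmv_valid].astype(float)
    y = dmv[dmv_valid].astype(float)
    A = np.stack([x, np.ones_like(x)], axis=1)
    (a, b), *_ = np.linalg.lstsq(A, y, rcond=None)

    # Scale-corrected single-view depth, now consistent with the MVS depth.
    aligned_dsv = a * dsv + b

    # Keep the metric MVS depth where it exists; fill its holes with the
    # aligned single-view depth to obtain a complete, coherent map.
    fused = np.where(dmv_valid, dmv, aligned_dsv)
    return fused, aligned_dsv
```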


Network Information
Related Topics (5)
Image segmentation: 79.6K papers, 1.8M citations, 86% related
Feature (computer vision): 128.2K papers, 1.7M citations, 86% related
Object detection: 46.1K papers, 1.3M citations, 85% related
Convolutional neural network: 74.7K papers, 2M citations, 85% related
Feature extraction: 111.8K papers, 2.1M citations, 84% related
Performance
Metrics
No. of papers in the topic in previous years
Year: Papers
2023: 54
2022: 117
2021: 189
2020: 158
2019: 114
2018: 102