Topic

View synthesis

About: View synthesis is a research topic. Over its lifetime, 1701 publications have been published within this topic, receiving 42333 citations.


Papers
Proceedings ArticleDOI
17 Nov 2019
TL;DR: A collection of recent work on 3D deep learning and view synthesis: PyTorch3D for accelerating 3D deep learning, Mesh R-CNN, SynSin for end-to-end view synthesis from a single image, and fast differentiable raycasting for neural rendering using sphere-based representations.
Abstract:
1. Accelerating 3D Deep Learning with PyTorch3D, arXiv 2007.08501
2. Mesh R-CNN, ICCV 2019
3. SynSin: End-to-end View Synthesis from a Single Image, CVPR 2020
4. Fast Differentiable Raycasting for Neural Rendering using Sphere-based Representations, arXiv 2004.07484

13 citations

Dissertation
01 Jan 2002
TL;DR: This thesis describes methods that generate a digital three-dimensional model of a visual scene's surfaces, using a set of calibrated photographs taken of the scene, and describes post-processing methods that refine surface reconstructions to improve model fidelity.
Abstract: This thesis describes methods that generate a digital three-dimensional (3D) model of a visual scene's surfaces, using a set of calibrated photographs taken of the scene. The 3D model is then rendered to produce views of the scene from new viewpoints. In the literature, this is known as 3D scene reconstruction for new view synthesis. This thesis introduces novel approaches that improve upon the quality, efficiency, and applicability of existing methods. To achieve a high-quality reconstruction, it is essential to know which cameras have visibility of local areas on the surface. Accordingly, we present a related pair of techniques for computing visibility during a volumetric reconstruction. We then describe post-processing methods that refine surface reconstructions to improve model fidelity. We explore different representations for modeling the 3D surface during reconstruction. We introduce a method of warping the 3D space to represent large-scale scenes. We also investigate a level set approach, which embeds the 3D surface as the zero level set of a volumetrically sampled function. Finally, we present a view-dependent representation that can be computed at interactive rates.

13 citations
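
As a hedged illustration of the level set representation mentioned in the abstract above, the sketch below samples a signed distance function on a voxel grid and extracts its zero level set as a triangle mesh. The spherical distance function and the use of scikit-image's marching_cubes are illustrative assumptions, not details from the thesis.

import numpy as np
from skimage.measure import marching_cubes  # standard mesh extraction; not from the thesis

# Sample a signed distance function phi on a regular 3D grid.
# phi here is a sphere of radius 0.5, chosen only for illustration.
n = 64
xs = np.linspace(-1.0, 1.0, n)
X, Y, Z = np.meshgrid(xs, xs, xs, indexing="ij")
phi = np.sqrt(X**2 + Y**2 + Z**2) - 0.5

# The surface is the zero level set {x : phi(x) = 0}; extracting a
# triangle mesh from the sampled volume makes it renderable from
# new viewpoints.
verts, faces, normals, _ = marching_cubes(phi, level=0.0, spacing=(xs[1] - xs[0],) * 3)
print(verts.shape, faces.shape)  # (V, 3) vertex positions, (F, 3) triangle indices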

Proceedings ArticleDOI
28 Jun 2009
TL;DR: An improved VSP-based MVC scheme built on three techniques: view extrapolation, which makes VSP applicable to almost all camera views; projective rectification, which improves synthesis quality when neighboring camera planes are not parallel; and synthesis bias correction, which uses past synthesis biases to improve the synthesis quality of the current frame.
Abstract: Current view synthesis prediction (VSP) techniques for multiview video coding (MVC) rely on disparity-based view interpolation or depth-based 3D warping. The former cannot be applied to every camera view, whereas the latter may require coding of the depth information of a scene. To avoid these constraints, we propose an improved VSP-based MVC scheme based on the following three techniques: 1) view extrapolation, which allows VSP to be applicable to almost all camera views, 2) projective rectification, which improves the synthesis quality when neighboring camera planes are not parallel, and 3) synthesis bias correction, which uses the past synthesis biases to improve the synthesis quality of the current frame. Experimental results demonstrate that our scheme offers PSNR gains of up to 1.6 dB compared to the current MVC standard.

13 citations
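
The depth-based 3D warping that the abstract above contrasts with view extrapolation can be sketched in a few lines: back-project each source pixel using its depth, apply the rigid transform between cameras, and re-project into the destination view. The function below is a minimal numpy sketch under assumed pinhole camera conventions; the matrices in the usage example are placeholders, not values from the paper.

import numpy as np

def warp_pixels(depth, K_src, K_dst, R, t):
    """Map each source pixel to its coordinates in the destination view.

    depth : (H, W) per-pixel depth in the source camera
    K_src, K_dst : (3, 3) intrinsic matrices
    R, t : rotation (3, 3) and translation (3,) from source to destination
    Returns (H, W, 2) destination pixel coordinates.
    """
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T  # (3, H*W) homogeneous pixels

    rays = np.linalg.inv(K_src) @ pix          # unit-depth rays in the source camera
    pts = rays * depth.reshape(1, -1)          # 3D points: scale each ray by its depth
    pts_dst = R @ pts + t.reshape(3, 1)        # rigid transform into the destination frame
    proj = K_dst @ pts_dst                     # project with destination intrinsics
    return (proj[:2] / proj[2:3]).T.reshape(H, W, 2)  # perspective divide -> (H, W, 2)

# Sanity check with identical cameras and a 0.1 m sideways shift at depth 2.0:
# pixels move by fx * tx / z = 500 * 0.1 / 2.0 = 25 px horizontally.
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
uv = warp_pixels(np.full((480, 640), 2.0), K, K, np.eye(3), np.array([0.1, 0.0, 0.0]))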

Proceedings ArticleDOI
30 Jul 2015
TL;DR: This paper proposes an improved DASH-based IMVS scheme over wireless networks that allows virtual views to be generated at either the cloud-based server or the client, and can adaptively select the optimal approach based on the network condition and the cost of the cloud.
Abstract: Interactive multiview video streaming (IMVS) allows viewers to periodically switch viewpoint. Its user experience can be further enhanced by creating virtual views from neighboring coded views using view synthesis techniques. Dynamic adaptive streaming over HTTP (DASH) is a new standard that can adjust the quality of video streaming according to the network condition. In this paper, we propose an improved DASH-based IMVS scheme over wireless networks. The main contributions are twofold. First, our scheme allows virtual views to be generated at either the cloud-based server or the client, and can adaptively select the optimal approach based on the network condition and the cost of the cloud. Second, scalable video coding is used in our system. Simulations with the NS3 tool demonstrate the advantage of our proposed scheme over the existing approach with client-based view synthesis and single-layer video coding.

13 citations
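
As a toy sketch of the adaptive placement decision described above, the rule below chooses between server-side and client-side view synthesis from a measured bandwidth, the client's CPU load, and a per-view cloud cost. The thresholds and the cost model are invented for illustration; the paper's actual selection is driven by its own network measurements and cloud cost formulation.

def choose_synthesis_site(bandwidth_mbps, client_cpu_load, cloud_cost_per_view):
    # Server-side synthesis sends one synthesized stream (lighter on
    # bandwidth) but incurs cloud cost; client-side synthesis needs the
    # reference views plus depth, so it wants more bandwidth and spare CPU.
    client_feasible = bandwidth_mbps >= 8.0 and client_cpu_load < 0.7
    if client_feasible and cloud_cost_per_view > 0.01:
        return "client"
    return "server"

print(choose_synthesis_site(bandwidth_mbps=12.0,
                            client_cpu_load=0.4,
                            cloud_cost_per_view=0.05))  # -> "client"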

Journal ArticleDOI
TL;DR: A method of dense-view synthesis based on unsupervised learning is presented, which can synthesize arbitrary virtual views from multiple freely posed views captured in a real 3D scene using an end-to-end trained network.
Abstract: Three-dimensional (3D) light field display, as a potential future display method, has attracted considerable attention. However, certain issues remain to be addressed, especially the capture of dense views of real 3D scenes. Using sparse cameras together with a view synthesis algorithm has become a practical approach. Supervised convolutional neural networks (CNNs) have been used to synthesize virtual views, but the large set of training target views they require is sometimes difficult to obtain, and the training positions are relatively fixed. Novel views can also be synthesized by the unsupervised network MPVN, but that method imposes strict requirements on capturing multiple uniform horizontal viewpoints, which is impractical. Here, a method of dense-view synthesis based on unsupervised learning is presented, which can synthesize arbitrary virtual views from multiple freely posed views captured in a real 3D scene. The posed views are reprojected to the target position and fed into the neural network. The network outputs a color tower and a selection tower indicating the scene distribution along the depth direction, and a single image is yielded by the weighted summation of the two towers. The network is trained end to end without supervision by minimizing the reconstruction errors of the posed views. A virtual view can be predicted at high quality by reprojecting the posed views to the desired position, and a sequence of dense virtual views can be generated for 3D light-field display by repeated prediction. Experimental results demonstrate the validity of the proposed network: the PSNR of synthesized views is around 30 dB and the SSIM is over 0.90. Since the cameras may be placed at freely chosen poses, there are no strict physical constraints, and the proposed method can be used flexibly for real-scene capture. We believe this approach will contribute to wide application of 3D light-field displays in the future.

13 citations
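
The two-tower fusion step described in the abstract above can be sketched as a per-pixel softmax over depth planes followed by a weighted sum, in the style of DeepStereo-like color/selection towers. The tensor shapes and the softmax normalization are assumptions for illustration, not the paper's exact formulation.

import numpy as np

def fuse_towers(color_tower, selection_tower):
    """color_tower: (D, H, W, 3) RGB per depth plane; selection_tower: (D, H, W) logits."""
    # Normalize the selection tower across the depth dimension so the
    # per-pixel weights over the D planes sum to one (stable softmax).
    logits = selection_tower - selection_tower.max(axis=0, keepdims=True)
    weights = np.exp(logits)
    weights /= weights.sum(axis=0, keepdims=True)            # (D, H, W)
    # Weighted summation over depth planes yields the synthesized view.
    return (weights[..., None] * color_tower).sum(axis=0)    # (H, W, 3)

D, H, W = 16, 4, 4
img = fuse_towers(np.random.rand(D, H, W, 3), np.random.randn(D, H, W))
print(img.shape)  # (4, 4, 3)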


Network Information
Related Topics (5)
Image segmentation: 79.6K papers, 1.8M citations, 86% related
Feature (computer vision): 128.2K papers, 1.7M citations, 86% related
Object detection: 46.1K papers, 1.3M citations, 85% related
Convolutional neural network: 74.7K papers, 2M citations, 85% related
Feature extraction: 111.8K papers, 2.1M citations, 84% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    54
2022    117
2021    189
2020    158
2019    114
2018    102