
View synthesis

About: View synthesis is a research topic. Over its lifetime, 1,701 publications have been published on this topic, receiving 42,333 citations.


Papers
Journal ArticleDOI
TL;DR: This paper proposes a novel no-reference image quality assessment method for 3-D synthesized views (called NIQSV+), which can evaluate the quality of synthesized views by measuring the typical synthesis distortions: blurry regions, black holes, and stretching, with access to neither the reference image nor the depth map.
Abstract: Benefiting from multi-view video plus depth and depth-image-based rendering technologies, only a limited number of views of a real 3-D scene need to be captured, compressed, and transmitted. However, quality assessment of synthesized views is very challenging, since new types of distortions, inherently different from texture coding errors, are inevitably produced by view synthesis and depth map compression, and the corresponding original views (reference views) are usually not available. Thus, full-reference quality metrics cannot be used for synthesized views. In this paper, we propose a novel no-reference image quality assessment method for 3-D synthesized views, called NIQSV+. This blind metric can evaluate the quality of synthesized views by measuring the typical synthesis distortions (blurry regions, black holes, and stretching) with access to neither the reference image nor the depth map. To evaluate the performance of the proposed method, we compare it with four full-reference 3-D (synthesized-view-dedicated) metrics, five full-reference 2-D metrics, and three no-reference 2-D metrics. In terms of correlation with subjective scores, our experimental results show that the proposed no-reference metric approaches the best of the state-of-the-art full-reference and no-reference 3-D metrics, and significantly outperforms the widely used no-reference and full-reference 2-D metrics. In terms of its approximation of human ranking, the proposed metric achieves the best performance in the experimental test.

68 citations
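
To make the idea of blind synthesis-distortion measurement concrete, here is a minimal NumPy/SciPy sketch of a toy blind metric in the same spirit. It is not the NIQSV+ algorithm itself: the three cues (a re-blur test for blur, a count of fill-valued pixels for black holes, and a count of horizontally constant pixel pairs for stretching) and their weights are illustrative assumptions only.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def toy_blind_quality(img, hole_value=0.0, blur_sigma=1.5):
    """Toy no-reference quality score for a synthesized view.

    img: grayscale float array in [0, 1]. The cues and weights below are
    illustrative stand-ins for blur / black-hole / stretching measures,
    not the actual NIQSV+ formulation.
    """
    # Blur cue: a sharp image changes a lot under re-blurring, while an
    # already-blurry image barely changes.
    sharpness = np.mean(np.abs(img - gaussian_filter(img, sigma=blur_sigma)))
    blur_penalty = np.exp(-50.0 * sharpness)

    # Black-hole cue: fraction of pixels left at the disocclusion fill value.
    hole_penalty = np.mean(img == hole_value)

    # Stretching cue: fraction of horizontally constant pixel pairs, a
    # crude proxy for pixel stretching near disoccluded borders.
    stretch_penalty = np.mean(np.abs(np.diff(img, axis=1)) < 1e-3)

    # Higher is better; the weights are arbitrary.
    return 1.0 - (0.4 * blur_penalty + 0.4 * hole_penalty + 0.2 * stretch_penalty)
```

Like the paper's metric, this sketch needs neither a reference image nor a depth map; unlike the paper's metric, it has not been validated against subjective scores.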

Journal ArticleDOI
TL;DR: This paper considers the problem of reconstructing visually realistic 3D models of dynamic semitransparent scenes, such as fire, from a very small set of simultaneous views, and reduces reconstruction to a convex combination of sheet-like density fields, each of which is derived from the density sheet of two input views.
Abstract: This paper considers the problem of reconstructing visually realistic 3D models of dynamic semitransparent scenes, such as fire, from a very small set of simultaneous views (even two). We show that this problem is equivalent to a severely underconstrained computerized tomography problem, for which traditional methods break down. Our approach is based on the observation that every pair of photographs of a semitransparent scene defines a unique density field, called a density sheet, that 1) concentrates all its density on one connected, semitransparent surface, 2) reproduces the two photos exactly, and 3) is the most spatially compact density field that does so. From this observation, we reduce reconstruction to a convex combination of sheet-like density fields, each of which is derived from the density sheet of two input views. We have applied this method specifically to the problem of reconstructing 3D models of fire. Experimental results suggest that this method enables high-quality view synthesis without overfitting artifacts.

68 citations
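
The property this reduction leans on is that image formation in an emissive, semitransparent medium is (to a good approximation) a linear projection of density, so a convex combination of density fields that each reproduce the photos reproduces them as well. A toy NumPy check of that linearity, with orthographic sums standing in for the real projection model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two toy density fields on a voxel grid, stand-ins for the "density
# sheets" derived from different pairs of input views.
sheet_a = rng.random((8, 8, 8))
sheet_b = rng.random((8, 8, 8))

def project(density, axis=0):
    # Orthographic line integral: a simplified image-formation model
    # for an emissive, semitransparent medium such as fire.
    return density.sum(axis=axis)

# Projection is linear, so a convex combination of density fields renders
# to the same convex combination of their images: if both sheets
# reproduce the photos, so does any convex combination of them.
w = 0.3
combo = w * sheet_a + (1 - w) * sheet_b
assert np.allclose(project(combo),
                   w * project(sheet_a) + (1 - w) * project(sheet_b))
```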

01 Jan 2006
TL;DR: It was found that although the use of the texture prior improves the resulting rendered images, the initial photoconsistent estimate without use of the prior is of very good visual quality.
Abstract: For this report, the view synthesis algorithm from the paper of the same title by Fitzgibbon et al. [1] was implemented. In this report, the geometric and probabilistic background of the algorithm, as well as the optimizations required to make the problem more tractable, are succinctly detailed. Results are then presented and analyzed. It was found that although the use of the texture prior improves the resulting rendered images, the initial photoconsistent estimate without use of the prior is already of very good visual quality. There remains, however, considerable room for improvement in computational performance.

68 citations
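
The "photoconsistent estimate" mentioned above rests on a standard idea: a scene-point hypothesis is good when all input views agree on the color they observe there. Below is a minimal variance-based photoconsistency cost, a generic textbook formulation rather than the exact cost used by Fitzgibbon et al. [1].

```python
import numpy as np

def photoconsistency_cost(samples):
    """Variance-based photoconsistency of one scene-point hypothesis.

    samples: (n_views, 3) array of RGB values obtained by projecting the
    hypothesis into each input view. Low variance means the views agree,
    i.e. the hypothesis is photoconsistent.
    """
    samples = np.asarray(samples, dtype=float)
    return float(np.mean(np.var(samples, axis=0)))

# A consistent point (all views see nearly the same color) scores low;
# an inconsistent one scores high.
consistent = [[0.80, 0.20, 0.10], [0.81, 0.19, 0.10], [0.79, 0.21, 0.11]]
inconsistent = [[0.80, 0.20, 0.10], [0.10, 0.90, 0.30], [0.50, 0.50, 0.50]]
assert photoconsistency_cost(consistent) < photoconsistency_cost(inconsistent)
```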

Book ChapterDOI
23 Aug 2020
TL;DR: A new DeepMPI representation is introduced, motivated by observations on the sparsity structure of the plenoptic function, that allows for real-time synthesis of photorealistic views that are continuous in both space and across changes in lighting.
Abstract: Many popular tourist landmarks are captured in a multitude of online, public photos. These photos represent a sparse and unstructured sampling of the plenoptic function for a particular scene. In this paper, we present a new approach to novel view synthesis under time-varying illumination from such data. Our approach builds on the recent multi-plane image (MPI) format for representing local light fields under fixed viewing conditions. We introduce a new DeepMPI representation, motivated by observations on the sparsity structure of the plenoptic function, that allows for real-time synthesis of photorealistic views that are continuous in both space and across changes in lighting. Our method can synthesize the same compelling parallax and view-dependent effects as previous MPI methods, while simultaneously interpolating along changes in reflectance and illumination with time. We show how to learn a model of these effects in an unsupervised way from an unstructured collection of photos without temporal registration, demonstrating significant improvements over recent work in neural rendering. More information can be found at crowdsampling.io.

68 citations
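
The rendering step that MPI-based methods, DeepMPI included, share is back-to-front alpha ("over") compositing of fronto-parallel RGBA planes. A minimal NumPy version of that compositing step follows; the learned DeepMPI representation and its illumination interpolation are not reproduced here.

```python
import numpy as np

def composite_mpi(rgba_planes):
    """Back-to-front 'over' compositing of multi-plane image layers.

    rgba_planes: (D, H, W, 4) array of RGBA planes ordered back to front.
    Returns the rendered (H, W, 3) image. This is the standard MPI
    rendering equation, not the DeepMPI network itself.
    """
    out = np.zeros(rgba_planes.shape[1:3] + (3,))
    for plane in rgba_planes:  # iterate from the farthest plane forward
        rgb, alpha = plane[..., :3], plane[..., 3:4]
        out = rgb * alpha + out * (1.0 - alpha)
    return out

# Example: render 4 random planes of an 8x8 slice.
planes = np.random.default_rng(1).random((4, 8, 8, 4))
image = composite_mpi(planes)  # (8, 8, 3)
```

Rendering a novel view amounts to reprojecting (homography-warping) each plane into the target camera before compositing; only the compositing half is shown here.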

Posted Content
Yurui Ren, Xiaoming Yu, Junming Chen, Thomas H. Li, Ge Li
TL;DR: This paper proposes a differentiable global-flow local-attention framework that reassembles inputs at the feature level in order to transform a source person image to a target pose; the results of both subjective and objective experiments demonstrate the superiority of this model.
Abstract: Pose-guided person image generation aims to transform a source person image to a target pose. This task requires spatial manipulation of the source data. However, convolutional neural networks are limited by their lack of ability to spatially transform inputs. In this paper, we propose a differentiable global-flow local-attention framework to reassemble the inputs at the feature level. Specifically, our model first calculates the global correlations between sources and targets to predict flow fields. Then, the flowed local patch pairs are extracted from the feature maps to calculate local attention coefficients. Finally, we warp the source features using a content-aware sampling method with the obtained local attention coefficients. The results of both subjective and objective experiments demonstrate the superiority of our model. In addition, further results on video animation and view synthesis show that our model is applicable to other tasks requiring spatial transformation. Our source code is available at this https URL.

67 citations
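
The spatial transformation such models rely on is differentiable warping of feature maps by a predicted flow field. Below is a minimal NumPy sketch of the underlying bilinear sampling step; the paper's full model additionally predicts the flow from global source-target correlations and applies local attention over flowed patch pairs, neither of which is shown here.

```python
import numpy as np

def warp_bilinear(feat, flow):
    """Warp a feature map with a dense flow field via bilinear sampling.

    feat: (H, W, C) source features; flow: (H, W, 2) per-target-pixel
    (dx, dy) offsets into the source. Generic flow-based sampling, not
    the paper's global-flow / local-attention module.
    """
    H, W, _ = feat.shape
    ys, xs = np.mgrid[0:H, 0:W].astype(float)
    x = np.clip(xs + flow[..., 0], 0, W - 1)
    y = np.clip(ys + flow[..., 1], 0, H - 1)
    x0, y0 = np.floor(x).astype(int), np.floor(y).astype(int)
    x1, y1 = np.minimum(x0 + 1, W - 1), np.minimum(y0 + 1, H - 1)
    wx, wy = (x - x0)[..., None], (y - y0)[..., None]
    top = feat[y0, x0] * (1 - wx) + feat[y0, x1] * wx
    bottom = feat[y1, x0] * (1 - wx) + feat[y1, x1] * wx
    return top * (1 - wy) + bottom * wy

# Sanity check: a zero flow field returns the input unchanged.
feat = np.random.default_rng(2).random((6, 6, 3))
assert np.allclose(warp_bilinear(feat, np.zeros((6, 6, 2))), feat)
```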


Network Information
Related Topics (5)

Image segmentation: 79.6K papers, 1.8M citations, 86% related
Feature (computer vision): 128.2K papers, 1.7M citations, 86% related
Object detection: 46.1K papers, 1.3M citations, 85% related
Convolutional neural network: 74.7K papers, 2M citations, 85% related
Feature extraction: 111.8K papers, 2.1M citations, 84% related
Performance Metrics

No. of papers in the topic in previous years:

Year    Papers
2023    54
2022    117
2021    189
2020    158
2019    114
2018    102