Topic

View synthesis

About: View synthesis is a research topic. Over its lifetime, 1,701 publications have been published within this topic, receiving 42,333 citations.


Papers
Proceedings ArticleDOI
10 Feb 2022
TL;DR: PVSeRF, a learning framework that reconstructs neural radiance fields from single-view RGB images, is presented; introducing geometry-aware features is shown to achieve a better disentanglement between appearance and geometry, i.e. recovering more accurate geometry and synthesizing higher-quality images of novel views.
Abstract: We present PVSeRF, a learning framework that reconstructs neural radiance fields from single-view RGB images, for novel view synthesis. Previous solutions, such as pixelNeRF, rely only on pixel-aligned features and suffer from feature ambiguity issues. As a result, they struggle with the disentanglement of geometry and appearance, leading to implausible geometries and blurry results. To address this challenge, we propose to incorporate explicit geometry reasoning and combine it with pixel-aligned features for radiance field prediction. Specifically, in addition to pixel-aligned features, we further constrain the radiance field learning to be conditioned on i) voxel-aligned features learned from a coarse volumetric grid and ii) fine surface-aligned features extracted from a regressed point cloud. We show that the introduction of such geometry-aware features helps to achieve a better disentanglement between appearance and geometry, i.e. recovering more accurate geometries and synthesizing higher quality images of novel views. Extensive experiments against state-of-the-art methods on ShapeNet benchmarks demonstrate the superiority of our approach for single-image novel view synthesis.
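To make the conditioning idea concrete, here is a minimal sketch in PyTorch of a radiance-field MLP that predicts colour and density at a 3D query point from pixel-, voxel-, and surface-aligned feature vectors gathered for that point. The feature dimensions, layer sizes, and class name are illustrative assumptions, not the paper's actual PVSeRF architecture.

```python
import torch
import torch.nn as nn

class ConditionedRadianceField(nn.Module):
    """Predicts (RGB, density) at a 3D point, conditioned on pixel-,
    voxel-, and surface-aligned feature vectors for that point."""
    def __init__(self, d_pixel=256, d_voxel=32, d_surface=32, d_hidden=128):
        super().__init__()
        d_in = 3 + d_pixel + d_voxel + d_surface  # xyz + all conditioning features
        self.mlp = nn.Sequential(
            nn.Linear(d_in, d_hidden), nn.ReLU(),
            nn.Linear(d_hidden, d_hidden), nn.ReLU(),
            nn.Linear(d_hidden, 4),  # 3 RGB channels + 1 density
        )

    def forward(self, xyz, f_pixel, f_voxel, f_surface):
        h = torch.cat([xyz, f_pixel, f_voxel, f_surface], dim=-1)
        out = self.mlp(h)
        rgb = torch.sigmoid(out[..., :3])
        sigma = torch.relu(out[..., 3:])
        return rgb, sigma

# Toy usage: 1024 sample points along camera rays, with dummy features.
pts = torch.randn(1024, 3)
rgb, sigma = ConditionedRadianceField()(
    pts, torch.randn(1024, 256), torch.randn(1024, 32), torch.randn(1024, 32))
```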

5 citations

Proceedings ArticleDOI
06 Jul 2020
TL;DR: This work proposes to leverage a learning-based view synthesis method, which takes into account the light field structure to generate high-quality side information, and demonstrates that the proposed view synthesis-based approach can achieve similar performance, while substantially reducing the number of key views to be transmitted.
Abstract: Light field imaging is becoming a key technology that provides users with a realistic visual experience through dynamic viewpoint shifting. This ability comes at the cost of capturing huge amounts of information, making compression and transmission a challenge. In conventional light field coding schemes, efficient coding hinges on encoder complexity: a complicated prediction process is used at the encoder side to exploit the redundancy present in the light field image. We employ Distributed Source Coding (DSC) for light field images, which can largely remove the computational burden from the encoding side at the expense of increased computational complexity at the decoder side. The efficiency of DSC is heavily dependent on the quality of the side information at the decoder. Therefore, we propose to leverage a learning-based view synthesis method that takes the light field structure into account to generate high-quality side information. We compare our approach to Distributed Video Coding and Distributed Multi-view Video Coding schemes adapted to the light field framework, as well as to a relevant standards-based approach, and demonstrate that the proposed view-synthesis-based approach achieves similar performance while substantially reducing the number of key views to be transmitted.
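As a rough illustration of the decoder-side idea, the sketch below synthesizes side information for a non-key view from two key views and measures the residual that Wyner-Ziv correction bits would have to cover. The synthesize_view blend is only a stand-in for the learned view-synthesis network, and all names, shapes, and values are assumptions for illustration.

```python
import numpy as np

def synthesize_view(left_key, right_key, alpha):
    """Placeholder for a learned view-synthesis model: here just a blend."""
    return (1.0 - alpha) * left_key + alpha * right_key

def side_information_residual(true_view, left_key, right_key, alpha=0.5):
    """The closer the synthesized side information is to the true view,
    the fewer Wyner-Ziv correction bits the DSC decoder needs."""
    side_info = synthesize_view(left_key, right_key, alpha)
    return np.abs(true_view.astype(np.float64) - side_info)

# Toy example with random 8-bit "views" standing in for light field views.
rng = np.random.default_rng(0)
left, right, middle = (rng.integers(0, 256, (64, 64)) for _ in range(3))
residual = side_information_residual(middle, left, right)
print("mean absolute residual:", residual.mean())
```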

5 citations

Proceedings ArticleDOI
10 Jun 2013
TL;DR: This paper proposes a representation that captures the dependencies between pixels in different frames in the form of connections in a graph, which leads to more accurate view synthesis, when compared to conventional lossy coding of depth maps operating at the same bit rate.
Abstract: In this paper, we design a new approach for coding the geometry information in a multiview image scenario. As an alternative to depth-based schemes, we propose a representation that captures the dependencies between pixels in different frames in the form of connections in a graph. In our approach it is possible to directly perform compression by simplifying the graph, which provides a more direct control of the effect of coding on geometry representation. Our method leads to more accurate view synthesis, when compared to conventional lossy coding of depth maps operating at the same bit rate.
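A minimal sketch of the general idea follows, assuming a graph whose edges connect corresponding pixels across views and whose simplification controls the geometry bit budget. The one-scanline setup, edge-weight heuristic, and keep ratio are illustrative assumptions, not the paper's actual graph construction or coding scheme.

```python
def build_correspondence_graph(disparities):
    """Connect each pixel in view 0 to its matching pixel in view 1,
    using a per-pixel horizontal disparity (one scanline for brevity)."""
    edges = []
    for x, d in enumerate(disparities):
        edges.append(((0, x), (1, x - d), abs(d)))  # (node, node, weight)
    return edges

def simplify_graph(edges, keep_ratio=0.5):
    """Coding by graph simplification: keep only the most significant
    connections, directly controlling the geometry bit budget."""
    edges = sorted(edges, key=lambda e: e[2], reverse=True)
    return edges[: int(len(edges) * keep_ratio)]

graph = build_correspondence_graph([0, 1, 1, 2, 2, 3, 1, 0])
print(simplify_graph(graph, keep_ratio=0.25))
```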

5 citations

Proceedings ArticleDOI
16 Feb 2012
TL;DR: This paper proposes a low-complexity computing technique based on groups of pixels that speeds up depth map generation by a factor of 30 and reduces view synthesis computation time by 60%.
Abstract: Choosing one's own viewpoint when watching a video program has long been a desire of viewers. To achieve this goal, view synthesis and depth map generation are two fundamental techniques. View synthesis is a signal processing procedure that creates dense virtual views from sparse real views: each object in a frame is warped to the proper position according to its depth, giving viewers the perception of a changing viewpoint. Hence, the correctness of the depth map influences view synthesis quality. To increase depth map accuracy, this paper proposes an edge-adaptive block matching scheme combined with an unreliable-region repairing approach; the former avoids local minima in stereo matching, and the latter repairs errors caused by occluded regions. For view synthesis, this paper proposes a warping method that detects errors caused by boundary mismatches between corresponding depth and color images, improving the quality of the synthesized view, along with a compensative-filling method that fixes tiny cracks due to round-off errors. Because of these two features, the proposed view synthesis is more robust to errors in the depth maps than previous schemes. Both depth generation and view synthesis are computationally intensive, so this paper also proposes a low-complexity computing technique based on groups of pixels, which speeds up depth map generation by a factor of 30 and reduces view synthesis computation time by 60%.
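The sketch below illustrates depth-based forward warping with a simple crack-filling pass on a single scanline. The integer disparities and nearest-neighbour filling are assumptions made for brevity; they stand in for the paper's warping and compensative-filling methods rather than reproduce them.

```python
import numpy as np

def warp_scanline(colors, disparities):
    """Warp each source pixel to its virtual-view position; rounding can
    leave 1-pixel cracks, marked here as -1."""
    out = np.full_like(colors, -1)
    for x, (c, d) in enumerate(zip(colors, disparities)):
        tx = int(round(x + d))
        if 0 <= tx < len(out):
            out[tx] = c
    return out

def fill_cracks(warped):
    """Compensative filling: patch tiny cracks from the nearest valid pixel."""
    filled = warped.copy()
    for x in range(len(filled)):
        if filled[x] == -1 and x > 0:
            filled[x] = filled[x - 1]
    return filled

line = np.arange(10)                              # toy colors 0..9
disp = np.array([0, 0, 1, 1, 1, 2, 2, 2, 2, 2])   # toy per-pixel disparity
print(fill_cracks(warp_scanline(line, disp)))
```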

5 citations

Posted Content
TL;DR: This work presents an approach for aggregating a sparse set of views of an object into a semi-implicit 3D representation in the form of a volumetric feature grid, and shows that computing a symmetry-aware mapping from pixels to the canonical coordinate system allows information to be better propagated to unseen regions.
Abstract: We present an approach for aggregating a sparse set of views of an object in order to compute a semi-implicit 3D representation in the form of a volumetric feature grid. Key to our approach is an object-centric canonical 3D coordinate system into which views can be lifted, without explicit camera pose estimation, and then combined -- in a manner that can accommodate a variable number of views and is view order independent. We show that computing a symmetry-aware mapping from pixels to the canonical coordinate system allows us to better propagate information to unseen regions, as well as to robustly overcome pose ambiguities during inference. Our aggregate representation enables us to perform 3D inference tasks like volumetric reconstruction and novel view synthesis, and we use these tasks to demonstrate the benefits of our aggregation approach as compared to implicit or camera-centric alternatives.
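A minimal sketch of the order-independent aggregation is shown below, assuming lift_view is a placeholder for the learned, symmetry-aware pixel-to-canonical mapping. The grid size, feature dimension, and mean pooling are illustrative choices; any permutation-invariant pooling over a variable number of views would fit the description.

```python
import torch

def lift_view(image, grid_size=16, feat_dim=8):
    """Placeholder: map one view's pixels into a canonical volumetric
    feature grid (learned, pose-free, and symmetry-aware in the paper)."""
    return torch.randn(grid_size, grid_size, grid_size, feat_dim)

def aggregate(views):
    """Mean over lifted grids: handles any number of views and is
    invariant to their order."""
    lifted = torch.stack([lift_view(v) for v in views], dim=0)
    return lifted.mean(dim=0)

grid = aggregate([torch.zeros(3, 64, 64) for _ in range(4)])  # 4 input views
print(grid.shape)  # torch.Size([16, 16, 16, 8])
```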

5 citations


Network Information
Related Topics (5)
Image segmentation: 79.6K papers, 1.8M citations (86% related)
Feature (computer vision): 128.2K papers, 1.7M citations (86% related)
Object detection: 46.1K papers, 1.3M citations (85% related)
Convolutional neural network: 74.7K papers, 2M citations (85% related)
Feature extraction: 111.8K papers, 2.1M citations (84% related)
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    54
2022    117
2021    189
2020    158
2019    114
2018    102