Topic

View synthesis

About: View synthesis is a research topic. Over its lifetime, 1,701 publications have been published on this topic, receiving 42,333 citations.


Papers
Patent
17 Aug 2012
TL;DR: In this paper, a video coder is configured to code information indicating whether view synthesis prediction is enabled for video data; when the information indicates that view synthesis prediction is not enabled, the coder may code the current picture using at least one of intra-prediction, temporal inter-prediction, and inter-view prediction, without reference to any view synthesis pictures.
Abstract: In one example, a video coder is configured to code information indicative of whether view synthesis prediction is enabled for video data. When the information indicates that view synthesis prediction is enabled for the video data, the video coder may generate a view synthesis picture using the video data and code at least a portion of a current picture relative to the view synthesis picture. That portion of the current picture may comprise, for example, a block (e.g., a PU, a CU, a macroblock, or a partition of a macroblock), a slice, a tile, a wavefront, or the entirety of the current picture. On the other hand, when the information indicates that view synthesis prediction is not enabled for the video data, the video coder may code the current picture using at least one of intra-prediction, temporal inter-prediction, and inter-view prediction, without reference to any view synthesis pictures.
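The coding decision described above amounts to a branch on a signaled flag. Below is a minimal sketch of that logic, assuming hypothetical helper objects and methods (`coder`, `synthesize_view`, `code_block`, `choose_mode`); it illustrates the idea rather than the patent's actual implementation:

```python
def code_current_picture(coder, video_data, current_picture,
                         synthesize_view, vsp_enabled):
    """Sketch: code a picture with or without view synthesis prediction (VSP)."""
    # The flag itself is coded first so the decoder can mirror the choice.
    coder.code_syntax_element("vsp_enabled_flag", vsp_enabled)

    if vsp_enabled:
        # Generate a synthesized reference picture from the video data, then
        # predict portions of the current picture (a block, slice, tile,
        # wavefront, or the whole picture) relative to it.
        vsp_picture = synthesize_view(video_data)
        for block in current_picture.blocks():
            coder.code_block(block, reference=vsp_picture)
    else:
        # No VSP picture is generated or referenced; fall back to the
        # conventional prediction modes.
        for block in current_picture.blocks():
            mode = coder.choose_mode(block,
                                     modes=("intra", "temporal", "inter_view"))
            coder.code_block(block, mode=mode)
```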

22 citations

Posted Content
TL;DR: In this article, a view synthesis procedure followed by stereo matching is proposed to solve monocular depth estimation in an end-to-end fashion, so that geometrical constraints can be explicitly imposed during inference and the demand for labelled depth data can be greatly alleviated.
Abstract: Previous monocular depth estimation methods take a single view and directly regress the expected results. Though recent advances have been made by applying geometrically inspired loss functions during training, the inference procedure does not explicitly impose any geometrical constraint. These models therefore rely purely on the quality of the data and the effectiveness of learning to generalize, which either leads to suboptimal results or demands a huge amount of expensive ground-truth depth labels to produce reasonable results. In this paper, we show for the first time that the monocular depth estimation problem can be reformulated as two sub-problems, a view synthesis procedure followed by stereo matching, with two intriguing properties: i) geometrical constraints can be explicitly imposed during inference; ii) the demand for labelled depth data can be greatly alleviated. We show that the whole pipeline can still be trained in an end-to-end fashion and that this new formulation plays a critical role in advancing the performance. The resulting model outperforms all previous monocular depth estimation methods, as well as the stereo block matching method, on the challenging KITTI dataset while using only a small amount of real training data. The model also generalizes well to other monocular depth estimation benchmarks. We also discuss the implications and advantages of solving monocular depth estimation using stereo methods.
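The reformulation decomposes monocular depth estimation into two stages linked by classical stereo geometry. A minimal sketch follows, assuming hypothetical PyTorch-style networks `view_synthesis_net` and `stereo_matching_net` that consume and return tensors; this is not the authors' code:

```python
def estimate_depth(left_image, view_synthesis_net, stereo_matching_net,
                   focal_length, baseline):
    """Sketch: monocular depth as view synthesis followed by stereo matching."""
    # Stage 1: hallucinate the right view of a virtual stereo pair from the
    # single input image.
    right_image = view_synthesis_net(left_image)

    # Stage 2: estimate disparity between the real and synthesized views, so
    # inference explicitly passes through the epipolar (stereo) geometry.
    disparity = stereo_matching_net(left_image, right_image)

    # Standard stereo relation: depth = focal_length * baseline / disparity.
    return focal_length * baseline / disparity.clamp(min=1e-6)
```

The geometric constraint enters through the final depth-from-disparity relation, which holds by construction regardless of how the networks were trained.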

22 citations

Proceedings ArticleDOI
26 May 2013
TL;DR: This paper enhances the inter-view consistency of multiview depth imagery by first classifying the color information in the multiview color imagery, modeling color with a mixture of Dirichlet distributions whose parameters are estimated in a Bayesian framework with variational inference.
Abstract: High quality view synthesis is a prerequisite for future free-viewpoint television. It will enable viewers to move freely in a dynamic real world scene. Depth image based rendering algorithms will play a pivotal role when synthesizing an arbitrary number of novel views using only a subset of captured views and their corresponding depth maps. Usually, each depth map is estimated individually by stereo-matching algorithms and, hence, lacks inter-view consistency. This inconsistency affects the quality of view synthesis negatively. This paper enhances the inter-view consistency of multiview depth imagery. First, our approach classifies the color information in the multiview color imagery by modeling color with a mixture of Dirichlet distributions, where the model parameters are estimated in a Bayesian framework with variational inference. Second, using the resulting color clusters, we classify the corresponding depth values in the multiview depth imagery. Each clustered depth image is subject to further sub-clustering. Finally, the resulting mean of each sub-cluster is used to enhance the depth imagery at multiple viewpoints. Experiments show that our approach improves the average quality of virtual views by up to 0.8 dB when compared to views synthesized using conventionally estimated depth maps.
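The pipeline in this abstract (color clustering, per-cluster depth sub-clustering, replacement by sub-cluster means) can be sketched as follows, assuming a fitted color mixture model with a `predict` method and substituting simple quantile binning for the paper's sub-clustering scheme:

```python
import numpy as np

def enhance_depth_consistency(colors, depths, color_mixture, n_sub=4):
    """Sketch: colors/depths are per-pixel values pooled across all views."""
    # Step 1: assign every pixel from every view to a color cluster.
    labels = color_mixture.predict(colors)

    enhanced = depths.copy()
    for k in np.unique(labels):
        idx = np.where(labels == k)[0]
        d = depths[idx]
        # Step 2: sub-cluster the depth values within the color cluster (here
        # via quantile bins, standing in for the paper's sub-clustering).
        edges = np.quantile(d, np.linspace(0.0, 1.0, n_sub + 1))
        sub = np.clip(np.searchsorted(edges, d, side="right") - 1, 0, n_sub - 1)
        # Step 3: replace each depth by its sub-cluster mean, tying
        # corresponding pixels across views to a common value.
        for s in range(n_sub):
            mask = sub == s
            if mask.any():
                enhanced[idx[mask]] = d[mask].mean()
    return enhanced
```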

22 citations

Book ChapterDOI
23 Aug 2020
TL;DR: In this article, the color and depth of the visible surface of the 3D scene are first synthesized and then used to impose explicit constraints on the multiple-plane image (MPI) representation prediction process.
Abstract: We tackle a new problem of semantic view synthesis—generating free-viewpoint rendering of a synthesized scene using a semantic label map as input. We build upon recent advances in semantic image synthesis and view synthesis for handling photographic image content generation and view extrapolation. Direct application of existing image/view synthesis methods, however, results in severe ghosting/blurry artifacts. To address the drawbacks, we propose a two-step approach. First, we focus on synthesizing the color and depth of the visible surface of the 3D scene. We then use the synthesized color and depth to impose explicit constraints on the multiple-plane image (MPI) representation prediction process. Our method produces sharp contents at the original view and geometrically consistent renderings across novel viewpoints. The experiments on numerous indoor and outdoor images show favorable results against several strong baselines and validate the effectiveness of our approach.
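The two-step structure described above can be summarized in a short sketch; the networks and renderer below are hypothetical stand-ins for the paper's models:

```python
def semantic_view_synthesis(semantic_map, color_depth_net, mpi_net,
                            render_mpi, target_pose):
    """Sketch: semantic label map -> free-viewpoint rendering via an MPI."""
    # Step 1: synthesize color and depth of the visible surface at the
    # original (input) viewpoint from the semantic label map.
    color, depth = color_depth_net(semantic_map)

    # Step 2: predict a multiple-plane image (MPI), with the synthesized
    # color and depth imposing explicit constraints on the prediction.
    mpi = mpi_net(color, depth)

    # Novel views come from warping and alpha-compositing the MPI planes.
    return render_mpi(mpi, target_pose)
```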

22 citations

Journal ArticleDOI
TL;DR: A new approach is proposed that shifts most of the burden due to interactivity from the decoder to the encoder by anticipating the decoder's navigation and sending auxiliary information that guarantees temporal and inter-view consistency; this comes at an additional cost in transmission rate and storage.
Abstract: Multiview video with interactive 2D look-around at the receiver is a challenging application, with several issues in terms of effective use of storage and bandwidth resources, reactivity of the system, quality of the viewing experience, and system complexity. The impression of 3D immersion is highly dependent on the smoothness of the navigation and thus on the number of 2D viewpoints. The classical decoding system for generating virtual views first projects a reference or encoded frame to a given viewpoint and then fills in the holes due to potential occlusions. This last step remains a complex operation requiring specific software or hardware at the receiver, as well as a certain amount of information from neighboring frames to ensure consistency between the virtual images. In this work we propose a new approach that shifts most of the burden due to interactivity from the decoder to the encoder, by anticipating the navigation of the decoder and sending auxiliary information that guarantees temporal and inter-view consistency. This leads to an additional cost in terms of transmission rate and storage, which we minimize by using optimization techniques based on user behavior modeling. We show by experiments that the proposed system represents a valid solution for interactive multiview systems with classical decoders.
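As a toy illustration of the encoder-side trade-off described in this abstract, the sketch below greedily selects which auxiliary (hole-filling) data to transmit by weighing rate cost against the probability, under a user-behavior model, that the viewer navigates to the corresponding virtual view. All names and the greedy scheme are illustrative assumptions, not the paper's actual optimization:

```python
def select_auxiliary_info(candidates, view_probability, rate_budget):
    """Sketch: candidates is a list of (view_id, rate_cost, distortion_gain)."""
    # Rank by expected distortion reduction per bit, weighted by how likely
    # the user is to actually navigate to that virtual view.
    ranked = sorted(candidates,
                    key=lambda c: view_probability[c[0]] * c[2] / c[1],
                    reverse=True)
    selected, spent = [], 0.0
    for view_id, rate, _ in ranked:
        if spent + rate <= rate_budget:   # greedy knapsack approximation
            selected.append(view_id)
            spent += rate
    return selected
```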

22 citations


Network Information
Related Topics (5)

Topic                           Papers     Citations   Related
Image segmentation              79.6K      1.8M        86%
Feature (computer vision)       128.2K     1.7M        86%
Object detection                46.1K      1.3M        85%
Convolutional neural network    74.7K      2M          85%
Feature extraction              111.8K     2.1M        84%
Performance
Metrics
No. of papers in the topic in previous years

Year    Papers
2023    54
2022    117
2021    189
2020    158
2019    114
2018    102