Topic

View synthesis

About: View synthesis is a research topic. Over the lifetime, 1,701 publications have been published within this topic, receiving 42,333 citations.


Papers
Book Chapter
08 Sep 2018
TL;DR: A two-stage deep network for pose-guided human-image generation performs coarse view prediction followed by a refinement stage; a masked version of the structural similarity loss is introduced so that the network focuses on generating a higher-quality view.
Abstract: View synthesis aims at generating a novel, unseen view of an object. This is a challenging task in the presence of occlusions and asymmetries. In this paper, we present View-Disentangled Generator (VDG), a two-stage deep network for pose-guided human-image generation that performs coarse view prediction followed by a refinement stage. In the first stage, the network predicts the output from a target human pose, the source image and the corresponding human pose, which are processed separately in different branches. This enables the network to learn a disentangled representation of the source and target views. In the second stage, the coarse output from the first stage is refined by adversarial training. Specifically, we introduce a masked version of the structural similarity loss that helps the network focus on generating a higher-quality view. Experiments on Market-1501 and DeepFashion demonstrate the effectiveness of the proposed generator.
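A masked structural-similarity term of this kind can be sketched as follows. This is a minimal PyTorch illustration assuming images in [0, 1] and a binary foreground mask; the function name and defaults are hypothetical, not the authors' released implementation.

```python
import torch
import torch.nn.functional as F

def masked_ssim_loss(pred, target, mask, window=11, c1=0.01 ** 2, c2=0.03 ** 2):
    """Structural-similarity loss restricted to a binary mask (e.g. the
    foreground person), so training focuses on the region that matters.
    pred, target: (B, C, H, W) images in [0, 1]; mask: (B, 1, H, W) in {0, 1}."""
    pad = window // 2
    mu_x = F.avg_pool2d(pred, window, 1, pad)
    mu_y = F.avg_pool2d(target, window, 1, pad)
    sigma_x = F.avg_pool2d(pred * pred, window, 1, pad) - mu_x ** 2
    sigma_y = F.avg_pool2d(target * target, window, 1, pad) - mu_y ** 2
    sigma_xy = F.avg_pool2d(pred * target, window, 1, pad) - mu_x * mu_y
    ssim_map = ((2 * mu_x * mu_y + c1) * (2 * sigma_xy + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (sigma_x + sigma_y + c2))
    # 1 - SSIM so that higher similarity gives lower loss; keep masked pixels only
    loss_map = (1.0 - ssim_map) * mask
    return loss_map.sum() / mask.sum().clamp(min=1.0)
```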

16 citations

Proceedings Article
20 Jul 2022
TL;DR: This work proposes an end-to-end framework that addresses two core challenges in modeling and driving full-body avatars of real people, and introduces texel-aligned features, a localised representation that simultaneously leverages the structural prior of a skeleton-based parametric model and observed sparse image signals.
Abstract: Photorealistic telepresence requires both high-fidelity body modeling and faithful driving to enable dynamically synthesized appearance that is indistinguishable from reality. In this work, we propose an end-to-end framework that addresses two core challenges in modeling and driving full-body avatars of real people. One challenge is driving an avatar while staying faithful to details and dynamics that cannot be captured by a global low-dimensional parameterization such as body pose. Our approach supports driving of clothed avatars with wrinkles and motion that a real driving performer exhibits beyond the training corpus. Unlike existing global state representations or non-parametric screen-space approaches, we introduce texel-aligned features—a localised representation which can leverage both the structural prior of a skeleton-based parametric model and observed sparse image signals at the same time. Another challenge is modeling a temporally coherent clothed avatar, which typically requires precise surface tracking. To circumvent this, we propose a novel volumetric avatar representation by extending mixtures of volumetric primitives to articulated objects. By explicitly incorporating articulation, our approach naturally generalizes to unseen poses. We also introduce a localized viewpoint conditioning, which leads to a large improvement in generalization of view-dependent appearance. The proposed volumetric representation does not require high-quality mesh tracking as a prerequisite and brings significant quality improvements compared to mesh-based counterparts. In our experiments, we carefully examine our design choices and demonstrate the efficacy of our approach, outperforming the state-of-the-art methods on challenging driving scenarios.
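As a rough illustration of what "texel-aligned" means here, the sketch below gathers encoder features from a single driving image at the projected positions of the texels of a posed body template, assuming pinhole geometry and ignoring visibility. All names and inputs are hypothetical; this is not the paper's pipeline.

```python
import torch
import torch.nn.functional as F

def texel_aligned_features(texel_xyz, feat_map, K, R, t):
    """Gather driving-image features at the projected location of every texel
    of a posed body template (single camera, batch size 1, occlusion ignored).
    texel_xyz: (H, W, 3) world-space positions of the UV texels (assumed input)
    feat_map:  (1, C, Hi, Wi) encoder features of the driving image
    K: (3, 3) intrinsics; R: (3, 3), t: (3,) world-to-camera extrinsics
    Returns a (1, C, H, W) texel-aligned feature image."""
    H, W, _ = texel_xyz.shape
    cam = texel_xyz.reshape(-1, 3) @ R.T + t            # world -> camera space
    uv = cam @ K.T                                      # pinhole projection
    uv = uv[:, :2] / uv[:, 2:].clamp(min=1e-6)          # perspective divide
    _, _, Hi, Wi = feat_map.shape
    # normalise pixel coordinates to [-1, 1] as expected by grid_sample
    grid = torch.stack([2 * uv[:, 0] / (Wi - 1) - 1,
                        2 * uv[:, 1] / (Hi - 1) - 1], dim=-1).reshape(1, H, W, 2)
    return F.grid_sample(feat_map, grid, align_corners=True)
```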

16 citations

Proceedings Article
01 Sep 2012
TL;DR: This work proposes a DIBR algorithm with advanced inpainting methods that enhances the visual experience by taking spatial and temporal texture consistency into account, and shows objective and subjective gains over the state of the art.
Abstract: Depth image-based rendering (DIBR) techniques are advanced tools in 3-D video (3DV) applications that are used to synthesize a number of additional views in a multiview-video-plus-depth (MVD) representation. The MVD format consists of video and depth sequences for a limited number of original camera views of the same scene. An inherent problem of the view synthesis concept is image information that is occluded in the original views and becomes visible in the extrapolated views. To handle these disocclusions, we propose a DIBR algorithm with advanced inpainting methods. Our renderer enhances the visual experience by taking spatial and temporal texture consistency problems into account. In order to compensate for the global motion in a sequence, image registration is incorporated into the framework. The proposed method shows objective and subjective gains compared to the state of the art.
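For context, the core DIBR step that produces these disocclusions can be sketched as follows: forward-warp the reference view with its depth map and mark pixels that receive no sample as holes. This is a minimal rectified-stereo sketch with a z-buffer, not the renderer proposed in the paper; the function name and baseline convention are assumptions.

```python
import numpy as np

def dibr_warp(color, depth, K, baseline):
    """Forward-warp a reference view to a horizontally shifted virtual camera
    (rectified setup). Pixels that receive no source sample remain holes
    (disocclusions) for a later inpainting step.
    color: (H, W, 3) image; depth: (H, W) metric depth > 0;
    K: (3, 3) intrinsics; baseline: horizontal camera shift in metres."""
    H, W = depth.shape
    fx = K[0, 0]
    disparity = fx * baseline / np.maximum(depth, 1e-6)   # per-pixel shift
    out = np.zeros_like(color)
    zbuf = np.full((H, W), np.inf)
    ys, xs = np.mgrid[0:H, 0:W]
    xt = np.round(xs - disparity).astype(int)             # target column
    valid = (xt >= 0) & (xt < W)
    for y, x, xv in zip(ys[valid], xs[valid], xt[valid]):
        if depth[y, x] < zbuf[y, xv]:                      # z-buffer: keep nearest
            zbuf[y, xv] = depth[y, x]
            out[y, xv] = color[y, x]
    holes = np.isinf(zbuf)                                 # disoccluded pixels
    return out, holes
```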

16 citations

Proceedings Article
02 Nov 2015
TL;DR: A depth-aided patch-based inpainting method is proposed to fill the disocclusion holes that appear when synthesizing virtual views from RGB-D scenes; it is efficient compared to state-of-the-art approaches.
Abstract: In this paper we propose a depth-aided patch-based inpainting method to perform the disocclusion of holes that appear when synthesizing virtual views from RGB-D scenes. Depth information is added to each key step of the classical patch-based algorithm from [Criminisi et al. 2004] to guide the synthesis of missing structures and textures. These contributions result in a new inpainting method that compares favourably to state-of-the-art approaches (both in visual quality and computational burden), while requiring only a single easy-to-adjust additional parameter.
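One typical way to inject depth into the priority step of a Criminisi-style inpainter is sketched below: the classical confidence and data terms are multiplied by a factor that favours background (far-depth) patches, so holes are completed from behind the foreground. This is a hedged illustration of the general idea, not the authors' exact formulation; the weighting and parameters are hypothetical.

```python
import numpy as np

def patch_priority(confidence, data_term, depth, p, half=4, sigma=0.5):
    """Priority of filling the patch centred at fill-front pixel p = (row, col),
    with a depth factor that favours background patches, so disocclusions are
    completed from behind the foreground object. Assumes p lies at least
    `half` pixels away from the image border."""
    r, c = p
    patch = np.s_[r - half:r + half + 1, c - half:c + half + 1]
    C = confidence[patch].mean()          # classical confidence term
    D = data_term[r, c]                   # classical data (isophote) term
    # hypothetical depth term: sigmoid of how far the patch lies behind the mean depth
    Z = 1.0 / (1.0 + np.exp(-(depth[patch].mean() - depth.mean()) / sigma))
    return C * D * Z
```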

16 citations

Journal Article
TL;DR: In this letter, a fast MRF-based hole filling method is proposed for view synthesis, which is formulated as an energy minimization problem and is solved with loopy belief propagation (LBP).
Abstract: Hole filling is one of the key issues in generating a virtual view from a video-plus-depth sequence by depth-image-based rendering. Hole filling methods based on Markov random fields (MRF) are a practical approach for view synthesis, but traditional ones may introduce foreground textures into the hole regions and suffer from high computational complexity. In this letter, a fast MRF-based hole filling method is proposed for view synthesis, which is formulated as an energy minimization problem and solved with loopy belief propagation (LBP). The energy function is optimized by employing depth information to prevent foreground textures from filling the holes. Furthermore, the LBP process maintains visual consistency in the synthesized view by retaining all useful candidate labels. In addition, an efficient belief propagation strategy is developed to optimize the LBP process, whose computational complexity is reduced to linear in the number of candidate labels. Experimental results demonstrate the effectiveness of the proposed method, with low running time and good visual consistency.
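The general shape of such an MRF formulation is sketched below: a data term per hole pixel plus a pairwise smoothness term between neighbouring hole pixels, which loopy belief propagation then minimises. The depth-based penalty and all constants are hypothetical placeholders for what the letter describes, not its actual cost functions.

```python
import numpy as np

def mrf_energy(labels, candidates, bg_depth, neighbours, lam=0.2, penalty=1e3):
    """Energy of one labelling of the hole pixels: a per-pixel data term plus a
    pairwise smoothness term between neighbouring hole pixels.
    labels[p]        : chosen candidate index for hole pixel p
    candidates[p][k] : (patch, depth) of the k-th candidate patch for pixel p
    bg_depth[p]      : background depth around pixel p
    neighbours       : list of (p, q) pairs of adjacent hole pixels
    All weights are hypothetical placeholders."""
    energy = 0.0
    for p, k in labels.items():
        _, cand_depth = candidates[p][k]
        if cand_depth < bg_depth[p]:       # candidate looks like foreground:
            energy += penalty              # penalise it to keep it out of the hole
    for p, q in neighbours:
        patch_p, _ = candidates[p][labels[p]]
        patch_q, _ = candidates[q][labels[q]]
        # smoothness: candidate patches of neighbouring pixels should agree
        energy += lam * float(np.sum((patch_p - patch_q) ** 2))
    return energy
```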

16 citations


Network Information
Related Topics (5)
Image segmentation: 79.6K papers, 1.8M citations, 86% related
Feature (computer vision): 128.2K papers, 1.7M citations, 86% related
Object detection: 46.1K papers, 1.3M citations, 85% related
Convolutional neural network: 74.7K papers, 2M citations, 85% related
Feature extraction: 111.8K papers, 2.1M citations, 84% related
Performance
Metrics
No. of papers in the topic in previous years

Year    Papers
2023    54
2022    117
2021    189
2020    158
2019    114
2018    102