Topic

View synthesis

About: View synthesis is a research topic. Over its lifetime, 1,701 publications have been published within this topic, receiving 42,333 citations.


Papers
Posted Content
TL;DR: This paper proposes bowtie networks that jointly learn 3D geometric and semantic representations with feedback in the loop and instantiates it on the illustrative dual-task of joint few-shot recognition and novel-view synthesis.
Abstract: Generative modeling has recently shown great promise in computer vision, but its success is often limited to separate tasks. In this paper, motivated by multi-task learning of shareable feature representations, we consider a novel problem of learning a shared generative model across various tasks. We instantiate it on the illustrative dual-task of joint few-shot recognition and novel-view synthesis: given only one or few images of a novel object from arbitrary views with only category annotation, we aim to simultaneously learn an object classifier and generate images of the object from new viewpoints. To this end, we propose bowtie networks that jointly learn 3D geometric and semantic representations with feedback in the loop. Experimental evaluation on challenging fine-grained recognition datasets demonstrates that our synthesized images are realistic from multiple viewpoints and significantly improve recognition performance as ways of data augmentation, especially in the low-data regime. We further show that our approach is flexible and can be easily extended to incorporate other tasks, such as style guided synthesis.

6 citations

Posted Content
TL;DR: Presents a novel view synthesis approach based on stereo vision and CNNs that decomposes the problem into two sub-tasks, view-dependent geometry estimation and texture inpainting, both of which can be learned effectively with CNNs.
Abstract: Novel view synthesis is an important problem in computer vision and graphics. Over the years a large number of solutions have been put forward to solve the problem. However, the large-baseline novel view synthesis problem is far from being "solved". Recent works have attempted to use Convolutional Neural Networks (CNNs) to solve view synthesis tasks. Due to the difficulty of learning scene geometry and interpreting camera motion, CNNs are often unable to generate realistic novel views. In this paper, we present a novel view synthesis approach based on stereo-vision and CNNs that decomposes the problem into two sub-tasks: view dependent geometry estimation and texture inpainting. Both tasks are structured prediction problems that could be effectively learned with CNNs. Experiments on the KITTI Odometry dataset show that our approach is more accurate and significantly faster than the current state-of-the-art. The code and supplementary material will be publicly available.

6 citations

Proceedings Article
12 Oct 2020
TL;DR: This paper introduces a novel multi-view supervision and an explicit rotational loss during the learning process, enabling the model to preserve detailed body parts and to achieve consistency between adjacent synthesized views.
Abstract: Human novel view synthesis aims to synthesize target views of a human subject given input images taken from one or more reference viewpoints. Despite significant advances in model-free novel view synthesis, existing methods present two major limitations when applied to complex shapes like humans. First, these methods mainly focus on simple and symmetric objects, e.g., cars and chairs, limiting their performances to fine-grained and asymmetric shapes. Second, existing methods cannot guarantee visual consistency across different adjacent views of the same object. To solve these problems, we present in this paper a learning framework for the novel view synthesis of human subjects, which explicitly enforces consistency across different generated views of the subject. Specifically, we introduce a novel multi-view supervision and an explicit rotational loss during the learning process, enabling the model to preserve detailed body parts and to achieve consistency between adjacent synthesized views. To show the superior performance of our approach, we present qualitative and quantitative results on the Multi-View Human Action (MVHA) dataset we collected (consisting of 3D human models animated with different Mocap sequences and captured from 54 different viewpoints), the Pose-Varying Human Model (PVHM) dataset, and ShapeNet. The qualitative and quantitative results demonstrate that our approach outperforms the state-of-the-art baselines in both per-view synthesis quality, and in preserving rotational consistency and complex shapes (e.g. fine-grained details, challenging poses) across multiple adjacent views in a variety of scenarios, for both humans and rigid objects.

6 citations

Proceedings Article
Hochul Cho1, Jangyoon Kim1, Woontack Woo1
23 Mar 2019
TL;DR: Proposes a novel view synthesis process that references multiple 360 images by reconstructing a large-scale, real-world-based virtual data map and performs weighted blending to interpolate multiple novel view images.
Abstract: We present a novel view synthesis method that allows users to experience a large-scale Six-Degree-of-Freedom (6-DOF) virtual environment. Our main contributions are the construction of a large-scale 6-DOF virtual environment using multiple 360 images as well as synthesis of a scene from novel viewpoints. Novel view synthesis from a single 360 image can give a free-viewpoint experience with full 6-DOF head motion to players, but the movable space is limited to the context of that image. We propose a novel view synthesis process that references multiple 360 images by reconstructing a large-scale, real-world-based virtual data map and performs weighted blending to interpolate multiple novel view images. Our results show that our approach provides a wider area of virtual environment as well as smooth transitions between reference 360 images.

6 citations
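
The weighted blending mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not state how the weights are chosen, so the inverse-distance weighting here is an assumption, and the names `blend_weights` and `blend_views` are hypothetical. Images are plain nested lists of grayscale values to keep the example self-contained.

```python
import math


def blend_weights(novel_pos, ref_positions, eps=1e-6):
    """Inverse-distance weights (an assumed scheme), normalized to sum to 1.

    Reference cameras closer to the novel viewpoint get larger weights.
    """
    inv = [1.0 / (math.dist(novel_pos, p) + eps) for p in ref_positions]
    total = sum(inv)
    return [w / total for w in inv]


def blend_views(views, weights):
    """Per-pixel weighted average of equally sized grayscale images.

    Each view is a list of rows; each row is a list of pixel values.
    The views are assumed to be already warped to the novel viewpoint.
    """
    height, width = len(views[0]), len(views[0][0])
    return [
        [sum(w * v[y][x] for v, w in zip(views, weights)) for x in range(width)]
        for y in range(height)
    ]


# Two hypothetical reference cameras at 1 m and 3 m from the novel viewpoint.
weights = blend_weights((0.0, 0.0, 0.0), [(1.0, 0.0, 0.0), (3.0, 0.0, 0.0)])
blended = blend_views([[[0.0, 8.0]], [[4.0, 8.0]]], weights)
```

Because the weights are normalized, pixels on which all warped views agree are left unchanged, and the result moves smoothly toward the nearest reference view as the viewpoint approaches it.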

Journal Article
TL;DR: A two-way interaction is built between the analysis and reconstruction stages, which provides the tradeoff between the final image quality and amount of data transmitted, in a low-complexity solution enabling online processing capability while preserving the MPEG-4 compatibility of the I3D representation.
Abstract: A new approach for compact representation, MPEG-4 encoding, and reconstruction of video objects captured by an uncalibrated system of multiple cameras is presented. The method is based on the incomplete 3-D (I3D) technique, which was initially investigated for stereo video objects captured by parallel cameras. Non-overlapping portions of the object are extracted from the reference views, each view having the corresponding portion with the highest resolution. This way, the redundancy of the initial multiview data is reduced. The areas which are extracted from the basis views are denoted as areas of interest. The output of the analysis stage, i.e., the areas of interest and the corresponding parts of the disparity fields are encoded in the MPEG-4 bitstream. Disparity fields define the correspondence relations between the reference views. The view synthesis is performed by disparity-oriented reprojection of the areas of interest into the virtual view plane and can be seen as an intermediate postprocessing stage between the decoder and the scene compositor. This work performs an extension from parallel stereo views to arbitrary configured multi-views with new analysis and synthesis algorithms. Moreover, a two-way interaction is built between the analysis and reconstruction stages, which provides the tradeoff between the final image quality and amount of data transmitted. The focus is on a low-complexity solution enabling online processing capability while preserving the MPEG-4 compatibility of the I3D representation. It is finally shown that our method yields quite convincing results despite the minimal data used and the approximations involved.

6 citations
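
As a rough illustration of disparity-oriented reprojection, the sketch below forward-warps one scanline of a reference view into a virtual view. It is not the I3D method from the abstract above: it assumes a rectified, parallel camera pair, a virtual camera at fraction `alpha` of the baseline, and the sign convention `x' = x - alpha * d`. The function name `reproject_row` is hypothetical.

```python
def reproject_row(row, disparity, alpha):
    """Forward-warp one scanline to a virtual view at baseline fraction alpha.

    row:       pixel values of the reference scanline
    disparity: per-pixel disparity in pixels (larger means closer)
    alpha:     0.0 reproduces the reference view, 1.0 the other view
    Returns a list where None marks holes (disoccluded pixels).
    """
    width = len(row)
    out = [None] * width
    best = [float("-inf")] * width  # largest disparity written per target pixel
    for x, (value, d) in enumerate(zip(row, disparity)):
        target = round(x - alpha * d)
        # When two source pixels land on the same target, the closer one wins.
        if 0 <= target < width and d > best[target]:
            out[target] = value
            best[target] = d
    return out


# Foreground pixels (disparity 2) shift left and occlude the background.
warped = reproject_row([10, 20, 30, 40], [0, 0, 2, 2], alpha=0.5)
```

The holes left by the warp are the regions a real system would fill from another reference view or by inpainting.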


Network Information
Related Topics (5)
Image segmentation: 79.6K papers, 1.8M citations, 86% related
Feature (computer vision): 128.2K papers, 1.7M citations, 86% related
Object detection: 46.1K papers, 1.3M citations, 85% related
Convolutional neural network: 74.7K papers, 2M citations, 85% related
Feature extraction: 111.8K papers, 2.1M citations, 84% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    54
2022    117
2021    189
2020    158
2019    114
2018    102