Topic

View synthesis

About: View synthesis is a research topic. Over its lifetime, 1,701 publications have been published within this topic, receiving 42,333 citations.


Papers
Proceedings ArticleDOI
21 Aug 2007
TL;DR: A new approach for generating super-resolution stereoscopic and multi-view video from monocular video; it extends the realistic stereo view synthesis (RSVS) approach, which uses structure-from-motion techniques and image-based rendering to generate the desired stereoscopic views for each point in time.
Abstract: This paper presents a new approach for generating super-resolution stereoscopic and multi-view video from monocular video. Such multi-view video is used, for instance, with multi-user 3D displays or auto-stereoscopic displays with head tracking to create a depth impression of the observed scenery. Our approach is an extension of the realistic stereo view synthesis (RSVS) approach, which is based on structure-from-motion techniques and image-based rendering to generate the desired stereoscopic views for each point in time. The extension relies on an additional super-resolution mode which utilizes a number of frames of the original video sequence to generate a virtual stereo frame with higher resolution. The algorithm is tested on several TV broadcast videos, as well as on sequences captured with a single handheld camera and sequences from the well-known BBC documentary "Planet Earth". Finally, simulation results show that RSVS is quite suitable for super-resolution 2D-3D conversion.

27 citations
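The super-resolution mode described above combines several frames of the original sequence into one higher-resolution virtual frame. A minimal 1-D sketch of that idea, assuming each low-res frame samples a common high-res grid at known, nonnegative integer subpixel shifts (the function name and this trivial registration model are illustrative assumptions; the paper aligns frames via structure from motion):

```python
import numpy as np

def multiframe_superres_1d(frames, shifts, scale):
    """Naive multi-frame super-resolution on a 1-D signal.

    Each low-res frame is assumed to sample the same high-res signal
    at a known integer offset (`shifts`, in high-res units, >= 0).
    Samples are scattered onto the high-res grid and averaged.
    """
    n_hi = len(frames[0]) * scale
    acc = np.zeros(n_hi)
    cnt = np.zeros(n_hi)
    for frame, s in zip(frames, shifts):
        idx = np.arange(len(frame)) * scale + s   # target positions on the high-res grid
        valid = idx < n_hi
        acc[idx[valid]] += np.asarray(frame, dtype=float)[valid]
        cnt[idx[valid]] += 1
    cnt[cnt == 0] = 1   # unobserved positions stay 0 (holes in a real system)
    return acc / cnt
```

With two frames sampling the even and odd positions of an 8-sample signal, the full-resolution signal is recovered exactly.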

Posted Content
TL;DR: This work explores spherical view synthesis for learning monocular 360° depth in a self-supervised manner, demonstrates its feasibility, and shows how to better exploit the expressiveness of traditional CNNs when applied to the equirectangular domain in an efficient manner.
Abstract: Learning-based approaches for depth perception are limited by the availability of clean training data. This has led to the utilization of view synthesis as an indirect objective for learning depth estimation using efficient data acquisition procedures. Nonetheless, most research focuses on pinhole-based monocular vision, with scarce works presenting results for omnidirectional input. In this work, we explore spherical view synthesis for learning monocular 360° depth in a self-supervised manner and demonstrate its feasibility. Under a purely geometrically derived formulation, we present results for horizontal and vertical baselines, as well as for the trinocular case. Further, we show how to better exploit the expressiveness of traditional CNNs when applied to the equirectangular domain in an efficient manner. Finally, given the availability of ground truth depth data, our work is uniquely positioned to compare view synthesis against direct supervision in a consistent and fair manner. The results indicate that alternative research directions might be better suited to enable higher-quality depth perception. Our data, models and code are publicly available at this https URL.

27 citations
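The self-supervised objective above scores a predicted depth map by how well it warps one view onto another. A simplified 1-D pinhole, horizontal-baseline version (the paper's formulation is spherical/equirectangular; the function name and the plain L1 loss here are illustrative assumptions):

```python
import numpy as np

def photometric_loss(target, source, depth, focal, baseline):
    """Self-supervised view-synthesis loss (1-D pinhole sketch).

    Predicted depth Z induces a disparity d = f * B / Z; the source row
    is backward-warped onto the target grid and compared with an L1
    photometric error. Gradients w.r.t. depth would drive learning.
    """
    x = np.arange(len(target))
    disparity = focal * baseline / depth            # pixels
    src_x = x - disparity                           # corresponding source coordinates
    warped = np.interp(src_x, np.arange(len(source)), source)  # linear sampling, edges clamped
    return np.mean(np.abs(warped - target))
```

When the predicted depth is correct, the warped source reproduces the target and the loss is zero.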

01 Jan 2003
TL;DR: The paper presents a multi-state statistical decision model with Kalman-filtering-based tracking for head pose detection and face orientation estimation, enabling simultaneous capture of the driver's head pose and driving view.
Abstract: Driver distraction is an important issue in developing a new generation of telematic systems. Our research is focused on the development of novel machine vision systems which can provide a better understanding of the state of the driver and driving conditions. In this paper we discuss in detail the development of a system which allows simultaneous capture of the driver's head pose and driving view. The system utilizes a full 360-degree panoramic field of view using a single video stream. The integrated machine vision system includes modules for perspective transformation, feature extraction, head detection, head pose estimation, and driving view synthesis. The paper presents a multi-state statistical decision model with Kalman-filtering-based tracking for head pose detection and face orientation estimation. The basic feasibility and robustness of the approach are demonstrated with the help of a series of systematic experimental studies.

27 citations
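The Kalman-filtering-based tracking stage mentioned above can be sketched as one predict/update cycle of a constant-velocity filter on a single pose angle (a generic textbook sketch, not the authors' multi-state decision model; all parameter values are assumptions):

```python
import numpy as np

def kalman_step(x, P, z, dt=1.0, q=1e-3, r=0.1):
    """One predict/update cycle of a constant-velocity Kalman filter.

    State x = [angle, angular_velocity]; z is a noisy angle measurement.
    q and r are the assumed process and measurement noise levels.
    """
    F = np.array([[1.0, dt], [0.0, 1.0]])   # constant-velocity motion model
    H = np.array([[1.0, 0.0]])              # we observe the angle only
    Q = q * np.eye(2)
    # Predict
    x = F @ x
    P = F @ P @ F.T + Q
    # Update
    y = z - H @ x                           # innovation
    S = H @ P @ H.T + r                     # innovation covariance
    K = P @ H.T / S                         # Kalman gain
    x = x + (K * y).ravel()
    P = (np.eye(2) - K @ H) @ P
    return x, P
```

Fed a constant measurement, the estimated angle converges to it while the velocity estimate settles near zero.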

Proceedings ArticleDOI
19 May 2013
TL;DR: Experimental results show that texture can be recovered in large disocclusions and that the proposed method has better visual quality than existing methods.
Abstract: The quality of a view synthesized by Depth-Image-Based Rendering (DIBR) depends heavily on hole filling, especially for synthesized views with large disocclusions. Many hole-filling methods have been proposed to improve synthesized-view quality, and inpainting is the most popular approach for recovering disocclusions. However, conventional inpainting either blurs the hole regions via diffusion or propagates foreground information into the disocclusion regions, creating annoying artifacts in the synthesized virtual views. This paper proposes a depth-aided exemplar-based inpainting method for recovering large disocclusions. It consists of two processes: warped depth map filling and warped color image filling. Since a depth map can be considered a grey-scale image without texture, it is much easier to fill. Disoccluded regions of the color image are predicted based on the associated filled depth map. Regions with texture lying near the background have higher priority to be filled than other regions, and disoccluded regions are filled by propagating the background texture through exemplar-based inpainting. Thus, artifacts created by diffusion or by using foreground information for prediction are eliminated. Experimental results show that texture can be recovered in large disocclusions and that the proposed method has better visual quality than existing methods.

27 citations
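The core idea above, filling disocclusions from the background side rather than the foreground, can be sketched in 1-D (a heavily simplified stand-in for the paper's exemplar-based inpainting; the function and the depth convention, larger value = farther, are assumptions):

```python
import numpy as np

def fill_holes_background(color, depth, hole):
    """Depth-aided hole filling (1-D sketch).

    Disocclusion holes appear along foreground/background boundaries;
    copying from the neighbor with the *larger* depth (background)
    avoids smearing foreground texture into the hole.
    """
    color = color.astype(float).copy()
    depth = depth.astype(float).copy()
    for i in np.where(hole)[0]:
        # nearest originally non-hole neighbor on each side
        left = i - 1
        while left >= 0 and hole[left]:
            left -= 1
        right = i + 1
        while right < len(color) and hole[right]:
            right += 1
        cands = [j for j in (left, right) if 0 <= j < len(color)]
        src = max(cands, key=lambda j: depth[j])   # pick the background side
        color[i], depth[i] = color[src], depth[src]
    return color, depth
```

On a hole between a far region (depth 5) and a near region (depth 1), the far region's color is propagated into the hole.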

Proceedings ArticleDOI
03 Dec 2010
TL;DR: This paper presents a complex framework for 3D video, where not only the 3D format and new coding methods are investigated, but also view synthesis and the provision of high-quality depth maps, e.g. via depth estimation.
Abstract: The introduction of the first 3D systems for digital cinema and home entertainment is based on stereo technology. For efficiently supporting new display types, depth-enhanced formats and coding technology are required, as introduced in this overview paper. First, we discuss the necessity of a generic 3D video format, as the current state of the art in multi-view video coding cannot support different types of multi-view displays at the same time. Therefore, a generic depth-enhanced 3D format is developed, in which any number of views can be generated from one bit stream. This, however, requires a complex framework for 3D video, where not only the 3D format and new coding methods are investigated, but also view synthesis and the provision of high-quality depth maps, e.g. via depth estimation. We present this framework and discuss the interdependencies between the different modules.

27 citations
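The view-synthesis module of such a depth-enhanced 3D pipeline can be sketched as a 1-D forward warp with a z-buffer (an illustrative simplification; real DIBR warps 2-D images with full camera intrinsics and handles rounding and blending far more carefully):

```python
import numpy as np

def dibr_warp(color, depth, focal, baseline, width=None):
    """Forward warping step of depth-image-based rendering (1-D sketch).

    Each pixel shifts by its disparity d = f * B / Z into the virtual
    view; a z-buffer keeps the nearest (smallest-depth) pixel when two
    pixels land on the same position. Unassigned positions stay NaN:
    these are the disocclusion holes a hole-filling stage must recover.
    """
    width = width or len(color)
    out = np.full(width, np.nan)
    zbuf = np.full(width, np.inf)
    for x, (c, z) in enumerate(zip(color, depth)):
        d = int(round(focal * baseline / z))
        tx = x + d
        if 0 <= tx < width and z < zbuf[tx]:
            out[tx], zbuf[tx] = c, z
    return out
```

When a near pixel and a far pixel map to the same target position, the z-buffer keeps the near one, and positions no source pixel reaches remain holes.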


Network Information
Related Topics (5)
Image segmentation: 79.6K papers, 1.8M citations, 86% related
Feature (computer vision): 128.2K papers, 1.7M citations, 86% related
Object detection: 46.1K papers, 1.3M citations, 85% related
Convolutional neural network: 74.7K papers, 2M citations, 85% related
Feature extraction: 111.8K papers, 2.1M citations, 84% related
Performance Metrics
No. of papers in the topic in previous years
Year    Papers
2023    54
2022    117
2021    189
2020    158
2019    114
2018    102