Topic

View synthesis

About: View synthesis is a research topic. Over its lifetime, 1,701 publications have been published within this topic, receiving 42,333 citations.


Papers
Journal ArticleDOI
08 Oct 2021 - Sensors
TL;DR: Zhang et al. propose EnSoft3D (Enhanced Soft 3D Reconstruction), a multi-view stereo matching method for obtaining dense, high-quality depth images.
Abstract: In this paper, we propose a multi-view stereo matching method, EnSoft3D (Enhanced Soft 3D Reconstruction), to obtain dense and high-quality depth images. Multi-view stereo is an active research area with wide applications. Motivated by the Soft3D reconstruction method, we introduce a new multi-view stereo matching scheme. The original Soft3D method was introduced for novel view synthesis; it also reconstructs occlusion-aware depth by integrating the matching costs of Plane Sweep Stereo (PSS) with soft visibility volumes. However, Soft3D has an inherent limitation: erroneous PSS matching costs are never updated. To overcome this limitation, the proposed scheme introduces an update process for the PSS matching costs. From the object surface consensus volume, an inverse consensus kernel is derived, and the PSS matching costs are iteratively updated using this kernel. EnSoft3D reconstructs a highly accurate 3D depth image because the multi-view matching costs and soft visibility are updated simultaneously. The performance of the proposed method is evaluated on structured and unstructured benchmark datasets: disparity error is measured to verify 3D reconstruction accuracy, and both PSNR and SSIM are measured to verify the simultaneous enhancement of view synthesis.
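
To make the Plane Sweep Stereo building block concrete, here is a minimal sketch (not the authors' code) of a two-view, fronto-parallel cost volume with SAD matching costs; function and parameter names are illustrative. EnSoft3D's contribution sits on top of such a volume, iteratively re-weighting the costs with a soft-visibility/consensus kernel.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def sad_cost_volume(ref, src, max_disp, patch=5):
    """Cost volume C[d, y, x]: patch-aggregated |ref - shift(src, d)|."""
    ref = ref.astype(np.float32)
    src = src.astype(np.float32)
    h, w = ref.shape
    costs = np.empty((max_disp, h, w), dtype=np.float32)
    for d in range(max_disp):
        # Each disparity d is one fronto-parallel plane hypothesis.
        shifted = np.empty_like(src)
        shifted[:, d:] = src[:, :w - d]
        shifted[:, :d] = src[:, :1]                  # pad the left border
        diff = np.abs(ref - shifted)
        costs[d] = uniform_filter(diff, size=patch)  # aggregate over a patch
    return costs

# Winner-take-all disparity from the volume; EnSoft3D instead iterates,
# re-weighting these costs with a soft-visibility/consensus kernel
# before selection:
# disparity = sad_cost_volume(left_gray, right_gray, max_disp=64).argmin(axis=0)
```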

5 citations

Posted Content
TL;DR: RGBD-Net predicts the depth map and color image at the target pose in a multi-scale manner; the obtained depth maps also enable reconstruction of more accurate 3D point clouds than existing multi-view stereo methods.
Abstract: We address the problem of novel view synthesis from an unstructured set of reference images. A new method called RGBD-Net is proposed to predict the depth map and the color image at the target pose in a multi-scale manner. The reference views are warped to the target pose to obtain multi-scale plane sweep volumes, which are then passed to our first module, a hierarchical depth regression network that predicts the depth map of the novel view. Second, a depth-aware generator network refines the warped novel views and renders the final target image. These two networks can be trained with or without depth supervision. In experimental evaluation, RGBD-Net not only produces novel views of higher quality than previous state-of-the-art methods, but also yields depth maps that enable reconstruction of more accurate 3D point clouds than existing multi-view stereo methods. The results indicate that RGBD-Net generalizes well to previously unseen data.
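
The geometric step behind the multi-scale plane sweep volumes is warping each reference view onto a depth-hypothesis plane in the target camera. Below is a minimal sketch of the plane-induced homography; the names and the camera convention (R, t map target-frame points into the reference frame) are assumptions, not the paper's code.

```python
import numpy as np

def plane_sweep_homography(K_ref, K_tgt, R, t, depth):
    """Homography sending target-view pixels to reference-view pixels for
    the fronto-parallel plane n^T X = depth (n = [0, 0, 1]) in the target
    frame, assuming X_ref = R @ X_tgt + t."""
    n = np.array([[0.0, 0.0, 1.0]])   # plane normal as a row vector
    H = K_ref @ (R + (t.reshape(3, 1) @ n) / depth) @ np.linalg.inv(K_tgt)
    return H / H[2, 2]                # normalize projective scale

# One slice of the plane sweep volume: resample the reference image at the
# pixels H maps to (cv2.WARP_INVERSE_MAP treats H as the dst->src mapping):
# slice_d = cv2.warpPerspective(ref_img, plane_sweep_homography(K, K, R, t, d),
#                               (w, h), flags=cv2.WARP_INVERSE_MAP)
```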

5 citations

Journal ArticleDOI
TL;DR: A semi-automatic depth estimation algorithm for Free-viewpoint TV (FTV) that extends an automatic depth estimation method with additional manually created data supplied for one or more frames.
Abstract: In this paper, we propose a semi-automatic depth estimation algorithm for Free-viewpoint TV (FTV). The proposed method is an extension of an automatic depth estimation method whereby additional manually created data is input for one or multiple frames. Automatic depth estimation methods generally have difficulty obtaining good depth results around object edges and in areas with low texture. The goal of our method is to improve the depth in these areas and reduce view synthesis artifacts in Depth Image Based Rendering. High-quality view synthesis is very important in applications such as FTV and 3DTV. We define three types of manual input data providing disparity initialization, object segmentation information, and motion information. This data is input as images, which we refer to as manual disparity map, manual edge map, and manual static map, respectively. For evaluation, we used MPEG multi-view videos to demonstrate that our algorithm can significantly improve the depth maps and, as a result, reduce view synthesis artifacts.
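
As a concrete illustration of the Depth Image Based Rendering step whose artifacts the manual maps aim to reduce, here is a minimal forward-warping sketch for rectified, horizontally aligned views; it is an assumed simplification, not the paper's renderer.

```python
import numpy as np

def dibr_forward_warp(color, disparity, alpha=1.0):
    """Forward-warp `color` to a virtual view; `alpha` scales the baseline."""
    h, w = disparity.shape
    out = np.zeros_like(color)
    filled = np.zeros((h, w), dtype=bool)
    # Visit pixels far-to-near so nearer surfaces overwrite farther ones.
    order = np.argsort(disparity, axis=None)
    ys, xs = np.unravel_index(order, (h, w))
    xt = np.round(xs - alpha * disparity[ys, xs]).astype(int)
    ok = (xt >= 0) & (xt < w)               # drop splats outside the image
    out[ys[ok], xt[ok]] = color[ys[ok], xs[ok]]
    filled[ys[ok], xt[ok]] = True
    return out, filled   # `~filled` marks disocclusion holes to inpaint
```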

5 citations

Book ChapterDOI
08 Jan 2011
TL;DR: This chapter focuses on models that integrate information across multiple training images, so that cues from different views are combined in the modeling phase and exploited in the recognition phase.
Abstract: Stereo correspondence refers to the matches between two images of the same object or scene taken from different viewpoints. It is one of the most active research topics in computer vision, as it plays a central role in 3D object recognition, object categorization, view synthesis, scene reconstruction, and many other applications. An image pair with different viewpoints is known as stereo images when the baseline and camera parameters are given. Given stereo images, approaches for finding stereo correspondences generally split into two categories: one based on sparse local features matched between the images, and the other based on dense pixel-to-pixel matched regions. The former has proven effective for 3D object recognition and categorization, while the latter is better suited to view synthesis and scene reconstruction. This chapter focuses on the former, both because of the increasing interest in 3D object recognition in recent years and because feature-based methods have recently made substantial progress thanks to several state-of-the-art local feature descriptors. The study of object recognition using stereo vision often requires a training set, which offers stereo images for developing the model of each object considered, and a test set, which offers images with variations in viewpoint, scale, illumination, and occlusion for evaluating the model. Many methods built on local descriptors treat each image from stereo or multiple views as a single instance without exploring the relationships between these instances, ending up with models of multiple independent instances. Using such a model for object recognition amounts to matching a training image against a test image. This chapter is, however, especially interested in models that integrate information across multiple training images. The central concern is how to extract local features from stereo or multiple images so that information from different views can be integrated in the modeling phase and applied in the recognition phase. This chapter is composed of the following contents:
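
For the sparse, feature-based branch the chapter focuses on, a minimal baseline looks like the sketch below, using OpenCV's ORB descriptor and a Lowe-style ratio test; the chapter's multi-view models would then integrate such matches across training images rather than treating each image independently.

```python
import cv2

def sparse_stereo_matches(img_left, img_right, ratio=0.75):
    """Return (left, right) pixel pairs matched between a stereo pair."""
    orb = cv2.ORB_create(nfeatures=2000)
    kp1, des1 = orb.detectAndCompute(img_left, None)
    kp2, des2 = orb.detectAndCompute(img_right, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
    good = []
    for pair in matcher.knnMatch(des1, des2, k=2):
        # Keep a match only if it clearly beats the runner-up (ratio test).
        if len(pair) == 2 and pair[0].distance < ratio * pair[1].distance:
            good.append(pair[0])
    # Each pair is usable for triangulation once camera parameters are known.
    return [(kp1[m.queryIdx].pt, kp2[m.trainIdx].pt) for m in good]
```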

5 citations

Proceedings ArticleDOI
01 Nov 2018
TL;DR: This work addresses the problem of finding point correspondences in a multispectral imaging setup and proposes a learned image transformation that maps one image modality to the respective target image, conditioned on the data of the original spectral band.
Abstract: The precise determination of correspondences between pairs of images is still a fundamental building block of many computer vision systems. Despite the maturity of modern feature matchers, multispectral methods still lack robustness and speed. We focus on the problem of finding point correspondences in a multispectral imaging setup. Most methods aim at invariant feature transforms (e.g., multi-modal descriptors), which come at the cost of reduced discriminative power. We model the appearance change by learning an image transformation that maps one image modality to the respective target image, conditioned on the data of the original spectral band. This approach is coupled with a pipeline of state-of-the-art matching methods with view synthesis of increasing complexity and algorithm run-time. We evaluate the approach on a wide spectrum of multispectral datasets, including near-infrared, color-infrared, and night-and-day thermal infrared imagery. The proposed approach provides significant improvements in speed and robustness compared to standard multi-modal registration approaches. In addition, the approach fits well into existing system designs. Applications are numerous and include multispectral sensor fusion, multispectral odometry systems, multispectral segmentation, and multispectral super-resolution methods.
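
A minimal sketch of the pipeline idea: first translate one modality toward the other's appearance with a learned image-to-image generator, then apply an ordinary feature matcher. The `generator` interface here is an assumption standing in for any trained translation network (e.g., a pix2pix-style model); it is not the authors' implementation.

```python
import cv2

def match_multispectral(rgb_img, ir_img, generator, ratio=0.8):
    """Match an RGB image against an IR image via modality translation."""
    # 1) Map the IR image into a pseudo-RGB appearance.
    #    Assumed interface: uint8 ndarray in, uint8 ndarray out.
    pseudo_rgb = generator(ir_img)
    # 2) With appearance roughly aligned, a standard descriptor works.
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(rgb_img, None)
    kp2, des2 = sift.detectAndCompute(pseudo_rgb, None)
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    good = []
    for pair in matcher.knnMatch(des1, des2, k=2):
        # Ratio test to discard ambiguous correspondences.
        if len(pair) == 2 and pair[0].distance < ratio * pair[1].distance:
            good.append(pair[0])
    return kp1, kp2, good
```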

5 citations


Network Information
Related Topics (5)
  Image segmentation             79.6K papers, 1.8M citations   86% related
  Feature (computer vision)     128.2K papers, 1.7M citations   86% related
  Object detection               46.1K papers, 1.3M citations   85% related
  Convolutional neural network   74.7K papers, 2M citations     85% related
  Feature extraction            111.8K papers, 2.1M citations   84% related
Performance Metrics
No. of papers in the topic in previous years:

  Year   Papers
  2023   54
  2022   117
  2021   189
  2020   158
  2019   114
  2018   102