scispace - formally typeset
Search or ask a question
Topic

Depth map

About: Depth map is a research topic. Over the lifetime, 8449 publications have been published within this topic receiving 135608 citations.


Papers
More filters
Posted Content
TL;DR: In this paper, an unsupervised framework was proposed to learn a deep CNN for single view depth prediction without requiring a pre-training stage or annotated ground truth depths, by training the network in a manner analogous to an autoencoder.
Abstract: A significant weakness of most current deep Convolutional Neural Networks is the need to train them using vast amounts of manu- ally labelled data. In this work we propose a unsupervised framework to learn a deep convolutional neural network for single view depth predic- tion, without requiring a pre-training stage or annotated ground truth depths. We achieve this by training the network in a manner analogous to an autoencoder. At training time we consider a pair of images, source and target, with small, known camera motion between the two such as a stereo pair. We train the convolutional encoder for the task of predicting the depth map for the source image. To do so, we explicitly generate an inverse warp of the target image using the predicted depth and known inter-view displacement, to reconstruct the source image; the photomet- ric error in the reconstruction is the reconstruction loss for the encoder. The acquisition of this training data is considerably simpler than for equivalent systems, requiring no manual annotation, nor calibration of depth sensor to camera. We show that our network trained on less than half of the KITTI dataset (without any further augmentation) gives com- parable performance to that of the state of art supervised methods for single view depth estimation.

830 citations

Posted Content
TL;DR: In this article, a fully convolutional architecture, encompassing residual learning, is proposed to model the ambiguous mapping between monocular images and depth maps, which can be trained end-to-end and does not rely on post-processing techniques such as CRFs or other additional refinement steps.
Abstract: This paper addresses the problem of estimating the depth map of a scene given a single RGB image. We propose a fully convolutional architecture, encompassing residual learning, to model the ambiguous mapping between monocular images and depth maps. In order to improve the output resolution, we present a novel way to efficiently learn feature map up-sampling within the network. For optimization, we introduce the reverse Huber loss that is particularly suited for the task at hand and driven by the value distributions commonly present in depth maps. Our model is composed of a single architecture that is trained end-to-end and does not rely on post-processing techniques, such as CRFs or other additional refinement steps. As a result, it runs in real-time on images or videos. In the evaluation, we show that the proposed model contains fewer parameters and requires fewer training data than the current state of the art, while outperforming all approaches on depth estimation. Code and models are publicly available.

827 citations

Journal ArticleDOI
TL;DR: A novel systematic approach to enhance underwater images by a dehazing algorithm, to compensate the attenuation discrepancy along the propagation path, and to take the influence of the possible presence of an artifical light source into consideration is proposed.
Abstract: Light scattering and color change are two major sources of distortion for underwater photography. Light scattering is caused by light incident on objects reflected and deflected multiple times by particles present in the water before reaching the camera. This in turn lowers the visibility and contrast of the image captured. Color change corresponds to the varying degrees of attenuation encountered by light traveling in the water with different wavelengths, rendering ambient underwater environments dominated by a bluish tone. No existing underwater processing techniques can handle light scattering and color change distortions suffered by underwater images, and the possible presence of artificial lighting simultaneously. This paper proposes a novel systematic approach to enhance underwater images by a dehazing algorithm, to compensate the attenuation discrepancy along the propagation path, and to take the influence of the possible presence of an artifical light source into consideration. Once the depth map, i.e., distances between the objects and the camera, is estimated, the foreground and background within a scene are segmented. The light intensities of foreground and background are compared to determine whether an artificial light source is employed during the image capturing process. After compensating the effect of artifical light, the haze phenomenon and discrepancy in wavelength attenuation along the underwater propagation path to camera are corrected. Next, the water depth in the image scene is estimated according to the residual energy ratios of different color channels existing in the background light. Based on the amount of attenuation corresponding to each light wavelength, color change compensation is conducted to restore color balance. The performance of the proposed algorithm for wavelength compensation and image dehazing (WCID) is evaluated both objectively and subjectively by utilizing ground-truth color patches and video downloaded from the Youtube website. Both results demonstrate that images with significantly enhanced visibility and superior color fidelity are obtained by the WCID proposed.

782 citations

Book ChapterDOI
08 Sep 2018
TL;DR: This work presents an end-to-end deep learning architecture for depth map inference from multi-view images that flexibly adapts arbitrary N-view inputs using a variance-based cost metric that maps multiple features into one cost feature.
Abstract: We present an end-to-end deep learning architecture for depth map inference from multi-view images. In the network, we first extract deep visual image features, and then build the 3D cost volume upon the reference camera frustum via the differentiable homography warping. Next, we apply 3D convolutions to regularize and regress the initial depth map, which is then refined with the reference image to generate the final output. Our framework flexibly adapts arbitrary N-view inputs using a variance-based cost metric that maps multiple features into one cost feature. The proposed MVSNet is demonstrated on the large-scale indoor DTU dataset. With simple post-processing, our method not only significantly outperforms previous state-of-the-arts, but also is several times faster in runtime. We also evaluate MVSNet on the complex outdoor Tanks and Temples dataset, where our method ranks first before April 18, 2018 without any fine-tuning, showing the strong generalization ability of MVSNet.

746 citations

Journal ArticleDOI
Paul J. Besl1
01 Dec 1988
TL;DR: In this survey, the relative capabilities of different sensors and sensing methods are evaluated using a figure of merit based on range accuracy, depth of field, and image acquisition time.
Abstract: Active, optical range imaging systems collect three-dimensional coordinate data from object surfaces. These systems can be useful in a wide variety of automation applications, including shape acquisition, bin picking, assembly, inspection, gauging, robot navigation, medical diagnosis, cartography, and military tasks. The range-imaging sensors in such systems are unique imaging devices in that the image data points explicitly represent scene surface geometry in a sampled form. At least six different optical principles have been used to actively obtain range images: (1) radar, (2) triangulation, (3) moire, (4) holographic interferometry, (5) lens focusing, and (6) diffraction. The relative capabilities of different sensors and sensing methods are evaluated using a figure of merit based on range accuracy, depth of field, and image acquisition time.

670 citations


Network Information
Related Topics (5)
Image segmentation
79.6K papers, 1.8M citations
91% related
Feature extraction
111.8K papers, 2.1M citations
91% related
Convolutional neural network
74.7K papers, 2M citations
91% related
Feature (computer vision)
128.2K papers, 1.7M citations
90% related
Image processing
229.9K papers, 3.5M citations
88% related
Performance
Metrics
No. of papers in the topic in previous years
YearPapers
202382
2022229
2021480
2020685
2019797
2018654