Open Access · Proceedings Article · DOI

A Two-Streamed Network for Estimating Fine-Scaled Depth Maps from Single RGB Images

TLDR
In this article, a fast-to-train two-streamed CNN is proposed to predict depth and depth gradients, which are then fused together into an accurate and detailed depth map.
Abstract
Estimating depth from a single RGB image is an ill-posed and inherently ambiguous problem. State-of-the-art deep learning methods can now estimate accurate 2D depth maps, but when the maps are projected into 3D, they lack local detail and are often highly distorted. We propose a fast-to-train two-streamed CNN that predicts depth and depth gradients, which are then fused together into an accurate and detailed depth map. We also define a novel set loss over multiple images; by regularizing the estimation between a common set of images, the network is less prone to overfitting and achieves better accuracy than competing methods. Experiments on the NYU Depth v2 dataset show that our depth predictions are competitive with the state-of-the-art and lead to faithful 3D projections.
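The abstract does not spell out the fusion step; one common way to combine a depth estimate with predicted depth gradients is to minimize a least-squares energy that balances fidelity to both. The NumPy sketch below is an illustrative stand-in, not the authors' implementation; the energy form, the weight `lam`, and the plain gradient-descent solver are all assumptions:

```python
import numpy as np

def fwd_grad(d):
    """Forward-difference gradients; last column/row of gx/gy are zero."""
    gx = np.zeros_like(d)
    gy = np.zeros_like(d)
    gx[:, :-1] = d[:, 1:] - d[:, :-1]
    gy[:-1, :] = d[1:, :] - d[:-1, :]
    return gx, gy

def fwd_grad_adj(gx, gy):
    """Adjoint (transpose) of fwd_grad, needed for the energy gradient."""
    r = np.zeros_like(gx)
    r[:, 1:] += gx[:, :-1]
    r[:, :-1] -= gx[:, :-1]
    r[1:, :] += gy[:-1, :]
    r[:-1, :] -= gy[:-1, :]
    return r

def fuse(d0, gx0, gy0, lam=1.0, iters=200):
    """Minimize ||d - d0||^2 + lam * ||grad(d) - (gx0, gy0)||^2 by gradient descent."""
    d = d0.copy()
    step = 0.5 / (1.0 + 8.0 * lam)  # below 2/L for this quadratic, so the energy decreases
    for _ in range(iters):
        gx, gy = fwd_grad(d)
        grad_e = 2.0 * (d - d0) + 2.0 * lam * fwd_grad_adj(gx - gx0, gy - gy0)
        d -= step * grad_e
    return d
```

With exact gradients taken from a detailed target and a noisy estimate as `d0`, the minimizer stays anchored to `d0` while its gradients follow the detailed target.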


Citations
Proceedings Article · DOI

Deep Ordinal Regression Network for Monocular Depth Estimation

TL;DR: The Deep Ordinal Regression Network (DORN) discretizes depth and recasts depth network learning as an ordinal regression problem; training the network with an ordinal regression loss achieves much higher accuracy and faster convergence.
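The discretization behind DORN is a spacing-increasing discretization (SID): bin edges are placed uniformly in log-depth, so nearby depths get fine bins and distant depths coarse ones. A minimal sketch (the depth range and bin count below are illustrative, not the paper's settings):

```python
import numpy as np

def sid_thresholds(alpha, beta, k):
    """k+1 bin edges, uniform in log space over [alpha, beta]."""
    i = np.arange(k + 1)
    return np.exp(np.log(alpha) + i * (np.log(beta) - np.log(alpha)) / k)

def ordinal_labels(depth, edges):
    """Ordinal target per pixel: how many interior edges the depth exceeds."""
    return (depth[..., None] >= edges[1:-1]).sum(axis=-1)
```

The count formulation matches ordinal regression: each interior edge contributes one binary "is the depth beyond this threshold?" decision.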
Proceedings Article · DOI

Densely Connected Pyramid Dehazing Network

TL;DR: Zhang et al. propose a Densely Connected Pyramid Dehazing Network (DCPDN), which jointly learns the transmission map, the atmospheric light, and the dehazed image.
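The three quantities DCPDN learns jointly come from the standard atmospheric scattering model, I(x) = J(x)·t(x) + A·(1 − t(x)); given a transmission map t and atmospheric light A, the clear image J follows in closed form. A minimal sketch (the clipping floor `t_min` is an assumption for numerical safety, not from the paper):

```python
import numpy as np

def dehaze(hazy, transmission, airlight, t_min=0.1):
    """Invert the scattering model I = J*t + A*(1-t) for the scene radiance J."""
    t = np.clip(transmission, t_min, 1.0)[..., None]  # broadcast over color channels
    return (hazy - airlight * (1.0 - t)) / t
```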
Proceedings Article · DOI

Enforcing Geometric Constraints of Virtual Normal for Depth Prediction

TL;DR: The authors design a loss term that enforces one simple type of geometric constraint: virtual normal directions determined by three randomly sampled points in the reconstructed 3D space.
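A virtual normal is simply the unit normal of the plane through three non-collinear reconstructed 3D points; the loss compares these directions between prediction and ground truth. A minimal sketch of the normal computation (point sampling, collinearity checks, and the full loss are omitted):

```python
import numpy as np

def virtual_normal(p0, p1, p2, eps=1e-8):
    """Unit normal of the plane spanned by three 3D points."""
    n = np.cross(p1 - p0, p2 - p0)
    return n / (np.linalg.norm(n) + eps)
```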
Posted Content

From big to small: Multi-scale local planar guidance for monocular depth estimation

TL;DR: This paper proposes a network architecture with novel local planar guidance layers at multiple stages of the decoding phase, outperforming state-of-the-art methods by a significant margin on challenging benchmarks.
Proceedings Article · DOI

Deep Depth Completion of a Single RGB-D Image

TL;DR: In this article, a deep network is trained to predict surface normals and occlusion boundaries, which are then combined with the raw depth observations from the RGB-D camera to solve for the depth of all pixels, including those missing in the original observation.
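The paper solves a global optimization guided by predicted normals and boundaries; as a much simpler stand-in for the "solve for all pixels" idea, missing depths can be filled by harmonic interpolation, i.e. repeatedly averaging 4-neighbors while holding observed pixels fixed. This sketch is an illustrative simplification, not the authors' method:

```python
import numpy as np

def complete_depth(depth, mask, iters=300):
    """Fill pixels where mask is False by Jacobi iterations of the Laplace
    equation; observed pixels (mask True) stay fixed throughout."""
    d = np.where(mask, depth, depth[mask].mean())  # initialize holes with the observed mean
    for _ in range(iters):
        padded = np.pad(d, 1, mode='edge')
        avg = (padded[:-2, 1:-1] + padded[2:, 1:-1] +
               padded[1:-1, :-2] + padded[1:-1, 2:]) / 4.0
        d = np.where(mask, depth, avg)  # update holes only
    return d
```

Because linear ramps are discretely harmonic, a hole cut into a planar depth map is recovered exactly in the limit.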
References
Proceedings Article · DOI

Deep Residual Learning for Image Recognition

TL;DR: In this article, the authors proposed a residual learning framework to ease the training of networks that are substantially deeper than those used previously, which won first place in the ILSVRC 2015 classification task.
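The residual idea is that a block outputs F(x) + x rather than F(x), so the identity mapping is trivially representable and very deep stacks stay trainable. A toy sketch, with a single dense layer standing in for the paper's convolutional layers:

```python
import numpy as np

def residual_block(x, weight):
    """y = relu(x @ W) + x : the skip connection adds the input back."""
    return np.maximum(x @ weight, 0.0) + x
```

With all-zero weights the block reduces exactly to the identity, which is the property that makes deep residual stacks easy to optimize.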
Proceedings Article

Very Deep Convolutional Networks for Large-Scale Image Recognition

TL;DR: In this paper, the authors investigated the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting and showed that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 layers.
Book Chapter · DOI

U-Net: Convolutional Networks for Biomedical Image Segmentation

TL;DR: Ronneberger et al. propose a network and training strategy that relies on strong use of data augmentation to use the available annotated samples more efficiently; it can be trained end-to-end from very few images and outperforms the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopic stacks.
Proceedings Article · DOI

Fully convolutional networks for semantic segmentation

TL;DR: The key insight is to build “fully convolutional” networks that take input of arbitrary size and produce correspondingly-sized output with efficient inference and learning.
Proceedings Article · DOI

Image-to-Image Translation with Conditional Adversarial Networks

TL;DR: Conditional adversarial networks are investigated as a general-purpose solution to image-to-image translation problems and it is demonstrated that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks.