scispace - formally typeset
Open AccessProceedings Article

Learning Depth from Single Monocular Images

TLDR
This work begins by collecting a training set of monocular images (of unstructured outdoor environments which include forests, trees, buildings, etc.) and their corresponding ground-truth depthmaps, and applies supervised learning to predict the depthmap as a function of the image.
Abstract
We consider the task of depth estimation from a single monocular image. We take a supervised learning approach to this problem, in which we begin by collecting a training set of monocular images (of unstructured outdoor environments which include forests, trees, buildings, etc.) and their corresponding ground-truth depthmaps. Then, we apply supervised learning to predict the depthmap as a function of the image. Depth estimation is a challenging problem, since local features alone are insufficient to estimate depth at a point, and one needs to consider the global context of the image. Our model uses a discriminatively-trained Markov Random Field (MRF) that incorporates multiscale local- and global-image features, and models both depths at individual points as well as the relation between depths at different points. We show that, even on unstructured scenes, our algorithm is frequently able to recover fairly accurate depthmaps.

read more

Content maybe subject to copyright    Report

Citations
More filters

Computer vision : a modern approach = 计算机视觉 : 一种现代的方法

David Forsyth, +1 more
TL;DR: Comprehensive and up-to-date, this book includes essential topics that either reflect practical significance or are of theoretical importance and describes numerous important application areas such as image based rendering and digital libraries.
Proceedings Article

Depth Map Prediction from a Single Image using a Multi-Scale Deep Network

TL;DR: In this article, two deep network stacks are employed to make a coarse global prediction based on the entire image, and another to refine this prediction locally, which achieves state-of-the-art results on both NYU Depth and KITTI.
Proceedings ArticleDOI

Deeper Depth Prediction with Fully Convolutional Residual Networks

TL;DR: A fully convolutional architecture, encompassing residual learning, to model the ambiguous mapping between monocular images and depth maps is proposed and a novel way to efficiently learn feature map up-sampling within the network is presented.
Journal ArticleDOI

Make3D: Learning 3D Scene Structure from a Single Still Image

TL;DR: This work considers the problem of estimating detailed 3D structure from a single still image of an unstructured environment and uses a Markov random field (MRF) to infer a set of "plane parameters" that capture both the 3D location and 3D orientation of the patch.
Proceedings ArticleDOI

Deep Ordinal Regression Network for Monocular Depth Estimation

TL;DR: Deep Ordinal Regression Network (DORN) as discussed by the authors discretizes depth and recast depth network learning as an ordinal regression problem by training the network using an ordinary regression loss, which achieves much higher accuracy and faster convergence in synch.
References
More filters
Journal ArticleDOI

A taxonomy and evaluation of dense two-frame stereo correspondence algorithms

TL;DR: This paper has designed a stand-alone, flexible C++ implementation that enables the evaluation of individual components and that can easily be extended to include new algorithms.

Computer vision : a modern approach = 计算机视觉 : 一种现代的方法

David Forsyth, +1 more
TL;DR: Comprehensive and up-to-date, this book includes essential topics that either reflect practical significance or are of theoretical importance and describes numerous important application areas such as image based rendering and digital libraries.
Book

Computer Vision: A Modern Approach

TL;DR: The accessible presentation of this book gives both a general view of the entire computer vision enterprise and also offers sufficient detail to be able to build useful applications as discussed by the authors, which includes essential topics that either reflect practical significance or are of theoretical importance.
Proceedings ArticleDOI

Multiscale conditional random fields for image labeling

TL;DR: An approach to include contextual features for labeling images, in which each pixel is assigned to one of a finite set of labels, are incorporated into a probabilistic framework, which combines the outputs of several components.
Proceedings ArticleDOI

High speed obstacle avoidance using monocular vision and reinforcement learning

TL;DR: An approach in which supervised learning is first used to estimate depths from single monocular images, which is able to learn monocular vision cues that accurately estimate the relative depths of obstacles in a scene is presented.
Related Papers (5)