Open Access Proceedings Article
Learning Depth from Single Monocular Images
Ashutosh Saxena, Sung Hwan Chung, Andrew Y. Ng +2 more
- Vol. 18, pp 1161-1168
TLDR
This work begins by collecting a training set of monocular images (of unstructured outdoor environments which include forests, trees, buildings, etc.) and their corresponding ground-truth depthmaps, and applies supervised learning to predict the depthmap as a function of the image.
Abstract:
We consider the task of depth estimation from a single monocular image. We take a supervised learning approach to this problem, in which we begin by collecting a training set of monocular images (of unstructured outdoor environments which include forests, trees, buildings, etc.) and their corresponding ground-truth depthmaps. Then, we apply supervised learning to predict the depthmap as a function of the image. Depth estimation is a challenging problem, since local features alone are insufficient to estimate depth at a point, and one needs to consider the global context of the image. Our model uses a discriminatively-trained Markov Random Field (MRF) that incorporates multiscale local- and global-image features, and models both depths at individual points as well as the relation between depths at different points. We show that, even on unstructured scenes, our algorithm is frequently able to recover fairly accurate depthmaps.
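The idea of modelling both depths at individual points and the relation between depths at different points can be illustrated with a toy example. The sketch below is not the paper's model: it assumes a single-scale Gaussian MRF on a 4-connected grid, fixed rather than learned parameters, and a per-pixel depth guess standing in for the prediction from image features. The names `map_depths`, `local_estimates`, `sigma_data`, and `sigma_smooth` are hypothetical, introduced only for illustration.

```python
def map_depths(local_estimates, sigma_data=1.0, sigma_smooth=0.5, sweeps=200):
    """MAP depths for a toy Gaussian MRF on a 4-connected grid (illustrative only).

    Minimizes  sum_i    (d_i - u_i)^2 / (2 * sigma_data^2)
             + sum_i~j  (d_i - d_j)^2 / (2 * sigma_smooth^2)
    where u_i is a per-pixel depth guess (assumed to come from local features)
    and i~j ranges over neighbouring pixels.
    """
    rows, cols = len(local_estimates), len(local_estimates[0])
    w_data = 1.0 / sigma_data ** 2      # weight of the per-point term
    w_smooth = 1.0 / sigma_smooth ** 2  # weight of the pairwise term
    depths = [row[:] for row in local_estimates]  # start from the local guesses
    for _ in range(sweeps):
        for r in range(rows):
            for c in range(cols):
                neighbours = [
                    depths[rr][cc]
                    for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                    if 0 <= rr < rows and 0 <= cc < cols
                ]
                # Closed-form minimizer of the energy in depths[r][c]
                # with all other depths held fixed (Gauss-Seidel update).
                depths[r][c] = (
                    w_data * local_estimates[r][c] + w_smooth * sum(neighbours)
                ) / (w_data + w_smooth * len(neighbours))
    return depths


# A flat 3x3 patch at depth 5.0 with one outlying local guess of 9.0.
noisy = [[5.0, 5.0, 5.0], [5.0, 9.0, 5.0], [5.0, 5.0, 5.0]]
smoothed = map_depths(noisy)
```

In this toy setting the pairwise term pulls the outlying centre value toward its neighbours, while the per-point term keeps every depth anchored to its own local guess. A smaller `sigma_smooth` enforces stronger agreement between neighbouring depths.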
Citations
Computer vision : a modern approach = 计算机视觉 : 一种现代的方法
David Forsyth, Jean Ponce +1 more
TL;DR: Comprehensive and up-to-date, this book includes essential topics that either reflect practical significance or are of theoretical importance and describes numerous important application areas such as image based rendering and digital libraries.
Proceedings Article
Depth Map Prediction from a Single Image using a Multi-Scale Deep Network
TL;DR: Two deep network stacks are employed: one makes a coarse global prediction based on the entire image, and the other refines this prediction locally. The method achieves state-of-the-art results on both NYU Depth and KITTI.
Proceedings Article
Deeper Depth Prediction with Fully Convolutional Residual Networks
TL;DR: A fully convolutional architecture, encompassing residual learning, is proposed to model the ambiguous mapping between monocular images and depth maps, and a novel way to efficiently learn feature map up-sampling within the network is presented.
Journal Article
Make3D: Learning 3D Scene Structure from a Single Still Image
TL;DR: This work considers the problem of estimating detailed 3D structure from a single still image of an unstructured environment and uses a Markov random field (MRF) to infer a set of "plane parameters" that capture both the 3D location and 3D orientation of the patch.
Proceedings Article
Deep Ordinal Regression Network for Monocular Depth Estimation
TL;DR: The Deep Ordinal Regression Network (DORN) discretizes depth and recasts depth network learning as an ordinal regression problem; training the network using an ordinary regression loss achieves much higher accuracy and faster convergence in synch.
References
Journal Article
A taxonomy and evaluation of dense two-frame stereo correspondence algorithms
TL;DR: This paper has designed a stand-alone, flexible C++ implementation that enables the evaluation of individual components and that can easily be extended to include new algorithms.
Book
Computer Vision: A Modern Approach
David Forsyth, Jean Ponce +1 more
TL;DR: The accessible presentation of this book gives a general view of the entire computer vision enterprise and offers sufficient detail to build useful applications; it includes essential topics that either reflect practical significance or are of theoretical importance.
Proceedings Article
Multiscale conditional random fields for image labeling
TL;DR: An approach to including contextual features for labeling images, in which each pixel is assigned to one of a finite set of labels; the features are incorporated into a probabilistic framework that combines the outputs of several components.
Proceedings Article
High speed obstacle avoidance using monocular vision and reinforcement learning
TL;DR: An approach is presented in which supervised learning is first used to estimate depths from single monocular images; it is able to learn monocular vision cues that accurately estimate the relative depths of obstacles in a scene.