Deep Ordinal Regression Network for Monocular Depth Estimation
Citations
3,627 citations
954 citations
791 citations
Cites background from "Deep Ordinal Regression Network for..."
...Various approaches, such as combining local predictions [19, 55], non-parametric scene sampling [24], through to end-to-end supervised learning [9, 31, 10] have been explored....
[...]
567 citations
Cites background from "Deep Ordinal Regression Network for..."
...Beyond 2D image recognition, there is increasingly growing interest in 3D vision [18, 36, 6, 34, 4] for applications of autonomous driving, augmented reality, robotics, etc....
[...]
447 citations
Cites background or methods from "Deep Ordinal Regression Network for..."
...Recent algorithms like DORN [10] combine multi-scale features with ordinal regression to predict pixel depth with remarkably low errors....
[...]
...We use the state-of-the-art monocular depth estimator DORN [10], which is trained by the authors on 23,488 KITTI images....
[...]
...Our methods with pseudo-LiDAR estimated by PSMNET⋆ [3] (stereo) or DORN [10] (monocular) are in blue....
[...]
...One interesting comparison is between approaches using pseudo-LiDAR with monocular depth (DORN) and stereo depth (PSMNET)....
[...]
...While DORN has been trained with almost ten times more images than PSMNET (and some of them overlap with the validation data), the results with PSMNET dominate....
[...]
References
123,388 citations
49,914 citations
30,811 citations
12,531 citations
10,189 citations
"Deep Ordinal Regression Network for..." refers to methods in this paper
...Following some recent scene parsing networks [60, 4, 62], we advocate removing the last few downsampling operators of DCNNs and inserting holes to filters in the subsequent conv layers, called dilated convolution, to enlarge the field-of-view of filters without decreasing spatial resolution or increasing number of parameters....
[...]
...Inspired by recent advances in scene parsing [60, 4, 62], we first remove subsampling in the last few pooling layers and apply dilated convolutions to obtain large receptive fields....
[...]
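The dilated convolution described in the excerpts above can be illustrated with a minimal sketch. This is not the paper's implementation: it is a one-dimensional, pure-Python simplification of the two-dimensional operation used in DCNNs, and the function names `receptive_field` and `dilated_conv1d` are hypothetical, chosen only for this example. It assumes an odd kernel size, stride 1, and zero padding. The point it shows is that raising the dilation widens the span of input each filter sees, while the number of weights and the output length stay the same.

```python
def receptive_field(kernel_size, dilation):
    """Number of input positions spanned by one dilated filter."""
    return dilation * (kernel_size - 1) + 1


def dilated_conv1d(signal, weights, dilation=1):
    """Zero-padded, stride-1 dilated convolution (cross-correlation form).

    Illustrative assumption: the kernel size is odd, so symmetric padding
    keeps the output the same length as the input.
    """
    k = len(weights)
    if k % 2 == 0:
        raise ValueError("this sketch assumes an odd kernel size")
    pad = dilation * (k - 1) // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(weights):
            # Taps are spaced `dilation` samples apart ("holes" in the filter).
            acc += w * padded[i + j * dilation]
        out.append(acc)
    return out


impulse = [0, 0, 0, 0, 1, 0, 0, 0, 0]
weights = [1, 2, 3]  # three parameters regardless of dilation

dense = dilated_conv1d(impulse, weights, dilation=1)
dilated = dilated_conv1d(impulse, weights, dilation=2)

print(receptive_field(3, 1), dense)    # spans 3 input positions
print(receptive_field(3, 2), dilated)  # spans 5 input positions
```

With dilation 2, the same three weights respond to inputs spread over five positions instead of three, and both outputs keep the input's nine samples. That is the sense in which the field of view grows without lowering resolution or adding parameters.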