Deep Virtual Stereo Odometry: Leveraging Deep Depth Prediction for Monocular Direct Sparse Odometry

doi:10.1007/978-3-030-01237-3_50

Open Access, Book Chapter

- pp 835-852

TLDR

Deep Virtual Stereo Odometry incorporates deep depth predictions into Direct Sparse Odometry (DSO) as direct virtual stereo measurements, using a novel deep network that refines the depth predicted from a single image in a two-stage process.

Abstract:

Monocular visual odometry approaches that purely rely on geometric cues are prone to scale drift and require sufficient motion parallax in successive frames for motion estimation and 3D reconstruction. In this paper, we propose to leverage deep monocular depth prediction to overcome limitations of geometry-based monocular visual odometry. To this end, we incorporate deep depth predictions into Direct Sparse Odometry (DSO) as direct virtual stereo measurements. For depth prediction, we design a novel deep network that refines predicted depth from a single image in a two-stage process. We train our network in a semi-supervised way on photoconsistency in stereo images and on consistency with accurate sparse depth reconstructions from Stereo DSO. Our deep predictions outperform state-of-the-art approaches for monocular depth on the KITTI benchmark. Moreover, our Deep Virtual Stereo Odometry clearly exceeds previous monocular and deep-learning based methods in accuracy. It even achieves comparable performance to the state-of-the-art stereo methods, while only relying on a single camera.
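To make the two ingredients named in the abstract more concrete, the following is a minimal NumPy sketch of (a) the rectified-stereo relation that turns a predicted disparity into a metric depth, and (b) a plain L1 photoconsistency error between a left image and a right image warped by a left disparity map. It is an illustration under stated assumptions, not the authors' implementation: the function names, the focal length and baseline values, and the use of a bare L1 term are all hypothetical choices made here, and the actual training loss in the paper may contain further terms.

```python
import numpy as np


def disparity_to_depth(disparity, focal_px, baseline_m, eps=1e-6):
    """Rectified-stereo relation: depth = focal * baseline / disparity.

    focal_px and baseline_m are assumed camera parameters (hypothetical here).
    """
    return focal_px * baseline_m / np.maximum(disparity, eps)


def warp_right_to_left(right, disp_left):
    """Reconstruct the left view by sampling the right image at x - d(x, y).

    Uses linear interpolation along each row; samples outside the image
    are clamped to the border column.
    """
    h, w = right.shape
    xs = np.arange(w, dtype=np.float64)[None, :] - disp_left
    xs = np.clip(xs, 0.0, w - 1.0)
    x0 = np.floor(xs).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    frac = xs - x0
    rows = np.arange(h)[:, None]
    return (1.0 - frac) * right[rows, x0] + frac * right[rows, x1]


def photometric_l1(left, right, disp_left):
    """Mean absolute intensity difference between left and warped right."""
    return float(np.mean(np.abs(left - warp_right_to_left(right, disp_left))))


# Toy example: a horizontal intensity ramp seen with a constant 2-pixel disparity.
right = np.tile(np.arange(8, dtype=np.float64), (4, 1))
true_disp = np.full((4, 8), 2.0)
left = warp_right_to_left(right, true_disp)

loss_true = photometric_l1(left, right, true_disp)
loss_zero = photometric_l1(left, right, np.zeros((4, 8)))
depth = disparity_to_depth(np.array([35.0]), focal_px=700.0, baseline_m=0.5)
```

In this toy setup the error is lower for the correct disparity than for a zero disparity, which is the signal a photoconsistency term provides during training. The depth conversion shows why a disparity predicted against a known stereo baseline carries metric scale, which a purely monocular geometric method lacks.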

Citations

Proceedings Article

Digging Into Self-Supervised Monocular Depth Estimation

Clément Godard, +3 more

TL;DR: In this paper, the authors propose a set of improvements that together yield both quantitatively and qualitatively better depth maps than competing self-supervised methods. They demonstrate the effectiveness of each component in isolation and show high-quality, state-of-the-art results on the KITTI benchmark.

Posted Content

Digging Into Self-Supervised Monocular Depth Estimation

Clément Godard, +3 more

- 04 Jun 2018 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: It is shown that a surprisingly simple model, and associated design choices, lead to superior predictions, and together result in both quantitatively and qualitatively improved depth maps compared to competing self-supervised methods.

Proceedings Article

D3VO: Deep Depth, Deep Pose and Deep Uncertainty for Monocular Visual Odometry

Nan Yang, +3 more

TL;DR: This paper proposes a self-supervised monocular depth estimation network trained on stereo videos without any external supervision, which aligns the training image pairs to similar lighting conditions using predictive brightness transformation parameters.

Proceedings Article

3D Packing for Self-Supervised Monocular Depth Estimation

Vitor Guizilini, +4 more

TL;DR: This paper proposes a self-supervised monocular depth estimation method combining geometry with a new deep network, PackNet, learned only from unlabeled monocular videos. PackNet leverages symmetrical packing and unpacking blocks to jointly learn to compress and decompress detail-preserving representations using 3D convolutions.

Journal Article

CubeSLAM: Monocular 3-D Object SLAM

Shichao Yang, +1 more

- 07 May 2019 -

IEEE Transactions on Robotics

TL;DR: The SLAM method achieves state-of-the-art monocular camera pose estimation and, at the same time, improves 3-D object detection accuracy.

References

Proceedings Article

Deep Residual Learning for Image Recognition

Kaiming He, +3 more

TL;DR: In this article, the authors proposed a residual learning framework to ease the training of networks that are substantially deeper than those used previously, which won the 1st place on the ILSVRC 2015 classification task.

Proceedings Article

Adam: A Method for Stochastic Optimization

Diederik P. Kingma, +1 more

TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.

Journal Article

Image quality assessment: from error visibility to structural similarity

Zhou Wang, +3 more

- 01 Apr 2004 -

IEEE Transactions on Image Processing

TL;DR: In this article, a structural similarity index is proposed for image quality assessment based on the degradation of structural information. Its promise is demonstrated through comparison with both subjective ratings and objective methods on a database of images compressed with JPEG and JPEG2000.

Proceedings Article

Fully convolutional networks for semantic segmentation

Jonathan Long, +2 more

TL;DR: The key insight is to build “fully convolutional” networks that take input of arbitrary size and produce correspondingly-sized output with efficient inference and learning.

Book

Multiple view geometry in computer vision

Richard Hartley, +1 more

TL;DR: In this article, the authors provide comprehensive background material and explain how to apply the methods and implement the algorithms directly in a unified framework, including geometric principles and how to represent objects algebraically so they can be computed and applied.

