Open Access · Posted Content

Fully Differentiable and Interpretable Model for VIO with 4 Trainable Parameters.

TLDR
In this article, the authors propose a fully differentiable, interpretable, and lightweight monocular visual-inertial odometry (VIO) model that contains only 4 trainable parameters, in contrast to learning-based approaches whose non-interpretable, heavy models hinder generalization.
Abstract
Monocular visual-inertial odometry (VIO) is a critical problem in robotics and autonomous driving. Traditional methods solve this problem with filtering or optimization; while fully interpretable, they rely on manual intervention and empirical parameter tuning. Learning-based approaches, on the other hand, allow for end-to-end training but require large amounts of training data to learn millions of parameters, and such non-interpretable, heavy models hinder generalization. In this paper, we propose a fully differentiable, interpretable, and lightweight monocular VIO model that contains only 4 trainable parameters. Specifically, we first adopt an Unscented Kalman Filter as a differentiable layer to predict pitch and roll, where the covariance matrices of the noise are learned to filter out the noise of the raw IMU data. Second, the refined pitch and roll are used to retrieve a gravity-aligned bird's-eye-view (BEV) image of each frame via differentiable camera projection. Finally, a differentiable pose estimator estimates the remaining 4-DoF poses between the BEV frames. Our method allows the covariance matrices to be learned end-to-end, supervised by the pose estimation loss, demonstrating superior performance to empirical baselines. Experimental results on synthetic and real-world datasets show that our simple approach is competitive with state-of-the-art methods and generalizes well to unseen scenes.
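To make the pipeline concrete, the sketch below illustrates the overall idea in PyTorch: the only trainable parameters are the four (log-)variances that feed the attitude filter, while everything else is a fixed, differentiable operation. This is a minimal sketch under stated assumptions, not the authors' implementation: the class and function names are invented for illustration, a simplified steady-state blending filter stands in for the paper's Unscented Kalman Filter, and the BEV warp and 4-DoF pose estimator are omitted.

```python
import torch
import torch.nn as nn

class TinyDifferentiableVIO(nn.Module):
    """Illustrative sketch (not the paper's code): only the IMU noise
    variances are trainable; every other operation is fixed and differentiable."""

    def __init__(self):
        super().__init__()
        # The 4 trainable parameters: log-variances of the process (gyro) and
        # measurement (accelerometer) noise for pitch and roll. The log
        # parameterisation keeps the variances positive.
        self.log_var = nn.Parameter(torch.zeros(4))  # [q_pitch, q_roll, r_pitch, r_roll]

    def filter_pitch_roll(self, gyro, acc, dt=0.01):
        """Simplified differentiable attitude filter (stand-in for the UKF layer).

        gyro, acc: (T, 3) raw IMU sequences. Returns (T, 2) filtered pitch/roll.
        The learned variances set the blending gain between gyro integration
        (process model) and the tilt implied by the accelerometer (measurement).
        """
        var = torch.exp(self.log_var)
        gain = var[:2] / (var[:2] + var[2:])  # blend gain from the variance ratio (heuristic)
        # Pitch/roll implied by the gravity direction seen by the accelerometer.
        acc_pitch = torch.atan2(-acc[:, 0], torch.sqrt(acc[:, 1] ** 2 + acc[:, 2] ** 2))
        acc_roll = torch.atan2(acc[:, 1], acc[:, 2])
        meas = torch.stack([acc_pitch, acc_roll], dim=-1)  # (T, 2)

        est, out = meas[0], []
        for t in range(gyro.shape[0]):
            pred = est + gyro[t, :2] * dt          # propagate with the gyro rates
            est = pred + gain * (meas[t] - pred)   # correct with the accelerometer tilt
            out.append(est)
        return torch.stack(out)                    # would feed the BEV warp / pose estimator


# Usage sketch: gradients from a (stand-in) pose loss reach the 4 variances,
# which is how the noise covariances can be learned end to end.
model = TinyDifferentiableVIO()
gyro = torch.randn(100, 3) * 0.05
acc = torch.randn(100, 3) * 0.1 + torch.tensor([0.0, 0.0, 9.81])
pitch_roll = model.filter_pitch_roll(gyro, acc)
loss = pitch_roll.pow(2).mean()  # placeholder for the pose estimation loss
loss.backward()                  # d(loss)/d(log_var) is well defined
```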


Trending Questions (1)
How to compute the covariance of the VIO state in optimization-based methods?

The paper does not provide information on how to compute the covariance of the VIO state in optimization-based methods.
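For general context only (the paper itself does not address this): in optimization-based VIO/SLAM back ends, the covariance of the estimated state is commonly approximated by inverting the Gauss-Newton information matrix assembled from the residual Jacobians at the converged estimate, i.e. Sigma ≈ (sum_i J_i^T W_i J_i)^(-1), with W_i the inverse measurement covariance of residual i. The function below is a hypothetical, dense illustration of that recipe; real systems (e.g. sliding-window solvers) recover marginal blocks with sparse factorization rather than a full inverse.

```python
import numpy as np

def state_covariance(jacobians, meas_covs):
    """Generic sketch: covariance of an optimized state from factor Jacobians.

    jacobians: list of (m_i, n) residual Jacobians evaluated at the optimum.
    meas_covs: list of (m_i, m_i) measurement covariances for each residual.
    Returns the (n, n) state covariance as the inverse of the information
    matrix H = sum_i J_i^T W_i J_i, where W_i = meas_cov_i^{-1}.
    """
    n = jacobians[0].shape[1]
    H = np.zeros((n, n))
    for J, S in zip(jacobians, meas_covs):
        H += J.T @ np.linalg.inv(S) @ J
    return np.linalg.inv(H)  # real systems use sparse factorization / marginalization
```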