VidLoc: A Deep Spatio-Temporal Model for 6-DoF Video-Clip Relocalization
Ronald Clark, Sen Wang, Andrew Markham, Niki Trigoni, Hongkai Wen
- pp. 2652-2660
TLDR: In this article, a recurrent model is proposed for 6-DoF localization of video clips; the resulting pose estimates are smoothed and the localization error is drastically reduced.

Abstract:
Machine learning techniques, namely convolutional neural networks (CNNs) and regression forests, have recently shown great promise in performing 6-DoF localization of monocular images. However, in most cases image sequences, rather than only single images, are readily available. Yet none of the proposed learning-based approaches exploit the valuable constraint of temporal smoothness, often leading to situations where the per-frame error is larger than the camera motion. In this paper we propose a recurrent model for performing 6-DoF localization of video clips. We find that, even by considering only short sequences (20 frames), the pose estimates are smoothed and the localization error can be drastically reduced. Finally, we consider means of obtaining probabilistic pose estimates from our model. We evaluate our method on openly available real-world autonomous driving and indoor localization datasets.
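VidLoc obtains temporal smoothness implicitly through its recurrent layers rather than through an explicit filter, but the intuition behind the abstract's claim can be illustrated with a much simpler toy: averaging per-frame pose estimates over a short window suppresses independent per-frame noise. The sketch below (function name, tuple layout, and the windowed-average choice are illustrative assumptions, not from the paper) smooths a sequence of 6-DoF poses with a centered moving average.

```python
from statistics import mean

def smooth_poses(poses, window=5):
    """Centered moving average over a sequence of 6-DoF poses.

    Each pose is a 6-tuple (x, y, z, roll, pitch, yaw); the window
    is truncated at the sequence boundaries. This is only a toy
    stand-in for the temporal fusion a recurrent model learns.
    """
    half = window // 2
    out = []
    for i in range(len(poses)):
        lo, hi = max(0, i - half), min(len(poses), i + half + 1)
        out.append(tuple(mean(p[d] for p in poses[lo:hi]) for d in range(6)))
    return out
```

Note that naive averaging of Euler angles breaks down near angle wrap-around; a real system would smooth orientations on the rotation manifold (e.g. via quaternions), which is one reason a learned recurrent model is preferable to a hand-tuned filter.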
Citations
Proceedings Article
Benchmarking 6DOF Outdoor Visual Localization in Changing Conditions
Torsten Sattler, Will Maddern, Carl Toft, Akihiko Torii, Lars Hammarstrand, Erik Stenborg, Daniel Safari, Masatoshi Okutomi, Marc Pollefeys, Josef Sivic, Fredrik Kahl, Tomas Pajdla
TL;DR: This paper introduces the first benchmark datasets specifically designed for analyzing the impact of day-night changes, weather, and seasonal variations on visual localization, and highlights the need for sequence-based localization approaches and better local features.
Journal Article
The ApolloScape Open Dataset for Autonomous Driving and Its Application
TL;DR: This paper provides a sensor fusion scheme integrating camera videos, consumer-grade motion sensors (GPS/IMU), and a 3D semantic map in order to achieve robust self-localization and semantic segmentation for autonomous driving.
Proceedings Article
Geometry-Aware Learning of Maps for Camera Localization
TL;DR: In this article, the authors propose to represent maps as a deep neural network called MapNet, which enables learning a data-driven map representation and fusing additional sensory inputs and constraints for camera localization.
Proceedings Article
CodeSLAM - Learning a Compact, Optimisable Representation for Dense Visual SLAM
TL;DR: In this paper, a compact, dense representation of scene geometry is presented, conditioned on the intensity data from a single image and generated from a code consisting of a small number of parameters; its use is demonstrated in a keyframe-based monocular dense SLAM system.
Proceedings Article
L3-Net: Towards Learning Based LiDAR Localization for Autonomous Driving
TL;DR: This work innovatively implements the use of various deep neural network structures to establish a learning-based LiDAR localization system that achieves centimeter-level localization accuracy, comparable to prior state-of-the-art systems with hand-crafted pipelines.
References
Journal Article
Long short-term memory
TL;DR: A novel, efficient, gradient based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.
Proceedings Article
Very Deep Convolutional Networks for Large-Scale Image Recognition
Karen Simonyan, Andrew Zisserman
TL;DR: This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.
Proceedings Article
Going deeper with convolutions
Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich
TL;DR: This paper introduces Inception, a deep convolutional neural network architecture that achieved a new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14).
Journal Article
Bidirectional recurrent neural networks
Mike Schuster, Kuldip K. Paliwal
TL;DR: It is shown how the proposed bidirectional structure can be easily modified to allow efficient estimation of the conditional posterior probability of complete symbol sequences without making any explicit assumption about the shape of the distribution.
Proceedings Article
KinectFusion: Real-time dense surface mapping and tracking
Richard Newcombe, Shahram Izadi, Otmar Hilliges, David Molyneaux, David Kim, Andrew J. Davison, Pushmeet Kohli, Jamie Shotton, Steve Hodges, Andrew Fitzgibbon
TL;DR: A system for accurate real-time mapping of complex and arbitrary indoor scenes in variable lighting conditions, using only a moving low-cost depth camera and commodity graphics hardware, which fuses all of the depth data streamed from a Kinect sensor into a single global implicit surface model of the observed scene in real time.