Open Access · Proceedings Article

Deep reinforcement learning with successor features for navigation across similar environments

TL;DR
In this paper, a successor-feature-based deep reinforcement learning algorithm is proposed to transfer navigation knowledge from previously mastered navigation tasks to new problem instances, which substantially decreases the required learning time after the first task instance has been solved, making it easily adaptable to changing environments.
Abstract
In this paper we consider the problem of robot navigation in simple maze-like environments where the robot has to rely on its onboard sensors to perform the navigation task. In particular, we are interested in solutions to this problem that do not require localization, mapping or planning. Additionally, we require that our solution can quickly adapt to new situations (e.g., changing navigation goals and environments). To meet these criteria we frame this problem as a sequence of related reinforcement learning tasks. We propose a successor-feature-based deep reinforcement learning algorithm that can learn to transfer knowledge from previously mastered navigation tasks to new problem instances. Our algorithm substantially decreases the required learning time after the first task instance has been solved, which makes it easily adaptable to changing environments. We validate our method in both simulated and real robot experiments with a Robotino and compare it to a set of baseline methods including classical planning-based navigation.
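The transfer mechanism described in the abstract rests on the successor-feature decomposition, where the action-value function factors as Q(s, a) = ψ(s, a)·w under a reward that is linear in features, r(s, a) = φ(s, a)·w. A minimal tabular sketch of this idea follows; all names and dimensions (`n_states`, `phi_dim`, the TD step) are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

gamma = 0.9
n_states, n_actions, phi_dim = 4, 2, 3

rng = np.random.default_rng(0)
phi = rng.standard_normal((n_states, n_actions, phi_dim))  # reward features
psi = np.zeros((n_states, n_actions, phi_dim))             # successor features

def td_update_psi(s, a, s_next, a_next, alpha=0.1):
    """One TD(0) step on the successor features:
    psi(s,a) <- psi(s,a) + alpha * (phi(s,a) + gamma * psi(s',a') - psi(s,a))."""
    target = phi[s, a] + gamma * psi[s_next, a_next]
    psi[s, a] += alpha * (target - psi[s, a])

# Transfer to a new task: only the reward weights w change, so Q-values for
# the new task are recovered from the already-learned psi without relearning
# the environment dynamics.
w_new = rng.standard_normal(phi_dim)
Q = psi @ w_new   # shape (n_states, n_actions)
```

This separation is what lets learning time drop after the first task instance: ψ captures environment dynamics once, and each new goal only requires fitting a new reward-weight vector w.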


Citations
Journal Article

The hippocampus as a predictive map

TL;DR: It is argued that entorhinal grid cells encode a low-dimensionality basis set for the predictive representation, useful for suppressing noise in predictions and extracting multiscale structure for hierarchical planning.
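The "predictive map" of this citation refers to the successor representation (SR); its standard definition, stated here for context rather than taken from the page itself, is the expected discounted future occupancy of each state:

```latex
% Successor representation under policy \pi: expected discounted
% occupancy of state s' starting from state s.
M^{\pi}(s, s') = \mathbb{E}_{\pi}\!\left[\, \sum_{t=0}^{\infty} \gamma^{t}\, \mathbb{1}[s_t = s'] \;\middle|\; s_0 = s \right]
% Values then factor through the SR:
V^{\pi}(s) = \sum_{s'} M^{\pi}(s, s')\, R(s')
```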
Proceedings Article

Virtual-to-real deep reinforcement learning: Continuous control of mobile robots for mapless navigation

TL;DR: In this paper, a mapless motion planner is proposed by taking the sparse 10-dimensional range findings and the target position with respect to the mobile robot coordinate frame as input and the continuous steering commands as output.
Proceedings Article

Cognitive Mapping and Planning for Visual Navigation

TL;DR: A neural architecture for navigation in novel environments that learns to map from first-person views and plans a sequence of actions towards goals in the environment, and can also achieve semantically specified goals, such as go to a chair.
Journal Article

Reinforcement learning for control: Performance, stability, and deep approximators

TL;DR: This review mainly covers artificial-intelligence approaches to RL, from the viewpoint of the control engineer, and explains how approximate representations of the solution make RL feasible for problems with continuous states and control actions.
Journal Article

Voronoi-Based Multi-Robot Autonomous Exploration in Unknown Environments via Deep Reinforcement Learning

TL;DR: A novel cooperative exploration strategy is proposed for multiple mobile robots, which reduces the overall task completion time and energy costs compared to conventional methods, and which enables the control policy to learn from human demonstration data, thus improving learning speed and performance.
References
Proceedings Article

Adam: A Method for Stochastic Optimization

TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
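The adaptive lower-order moment estimates mentioned in this TL;DR are the core of the Adam update rule. A compact sketch follows; the hyperparameter defaults (`lr`, `beta1`, `beta2`, `eps`) are the commonly used values, assumed here rather than quoted from the page:

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update: adaptive estimates of the first and second moments
    of the gradient, with bias correction for the zero initialization."""
    m = beta1 * m + (1 - beta1) * grad            # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * grad ** 2       # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                  # bias-corrected moments
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Usage: minimize f(x) = x^2, whose gradient is 2x.
theta, m, v = np.array([1.0]), np.zeros(1), np.zeros(1)
for t in range(1, 2001):
    theta, m, v = adam_step(theta, 2 * theta, m, v, t, lr=0.01)
```

Because the per-parameter step is normalized by the second-moment estimate, the effective step size is roughly bounded by `lr`, which is one reason Adam is robust to gradient scale.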
Book

Reinforcement Learning: An Introduction

TL;DR: This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning, which ranges from the history of the field's intellectual foundations to the most recent developments and applications.
Journal Article

Human-level control through deep reinforcement learning

TL;DR: This work bridges the divide between high-dimensional sensory inputs and actions, resulting in the first artificial agent that is capable of learning to excel at a diverse array of challenging tasks.
Posted Content

Distilling the Knowledge in a Neural Network

TL;DR: This work shows that it can significantly improve the acoustic model of a heavily used commercial system by distilling the knowledge in an ensemble of models into a single model and introduces a new type of ensemble composed of one or more full models and many specialist models which learn to distinguish fine-grained classes that the full models confuse.
Book

Planning Algorithms
