Open AccessProceedings Article
Prioritized Experience Replay
TLDR
Prioritized experience replay as mentioned in this paper is a framework for prioritizing experience, so as to replay important transitions more frequently, and therefore learn more efficiently, achieving human-level performance across many Atari games.Abstract:
Experience replay lets online reinforcement learning agents remember and reuse experiences from the past. In prior work, experience transitions were uniformly sampled from a replay memory. However, this approach simply replays transitions at the same frequency that they were originally experienced, regardless of their significance. In this paper we develop a framework for prioritizing experience, so as to replay important transitions more frequently, and therefore learn more efficiently. We use prioritized experience replay in Deep Q-Networks (DQN), a reinforcement learning algorithm that achieved human-level performance across many Atari games. DQN with prioritized experience replay achieves a new state-of-the-art, outperforming DQN with uniform replay on 41 out of 49 games.read more
Citations
More filters
Proceedings Article
Asynchronous methods for deep reinforcement learning
Volodymyr Mnih,Adrià Puigdomènech Badia,Mehdi Mirza,Alex Graves,Tim Harley,Timothy P. Lillicrap,David Silver,Koray Kavukcuoglu +7 more
TL;DR: A conceptually simple and lightweight framework for deep reinforcement learning that uses asynchronous gradient descent for optimization of deep neural network controllers and shows that asynchronous actor-critic succeeds on a wide variety of continuous motor control problems as well as on a new task of navigating random 3D mazes using a visual input.
Journal ArticleDOI
Building machines that learn and think like people.
TL;DR: In this article, a review of recent progress in cognitive science suggests that truly human-like learning and thinking machines will have to reach beyond current engineering trends in both what they learn and how they learn it.
Posted Content
Dueling Network Architectures for Deep Reinforcement Learning
TL;DR: This paper presents a new neural network architecture for model-free reinforcement learning that leads to better policy evaluation in the presence of many similar-valued actions and enables the RL agent to outperform the state-of-the-art on the Atari 2600 domain.
Posted Content
Addressing Function Approximation Error in Actor-Critic Methods
TL;DR: This paper builds on Double Q-learning, by taking the minimum value between a pair of critics to limit overestimation, and draws the connection between target networks and overestimation bias.
References
More filters
Journal ArticleDOI
The arcade learning environment: an evaluation platform for general agents
TL;DR: The Arcade Learning Environment (ALE) as discussed by the authors is a platform for evaluating the development of general, domain-independent AI technology, which provides an interface to hundreds of Atari 2600 game environments, each one different, interesting, and designed to be a challenge for human players.
Journal ArticleDOI
A Review on Ensembles for the Class Imbalance Problem: Bagging-, Boosting-, and Hybrid-Based Approaches
TL;DR: A taxonomy for ensemble-based methods to address the class imbalance where each proposal can be categorized depending on the inner ensemble methodology in which it is based is proposed and a thorough empirical comparison is developed by the consideration of the most significant published approaches to show whether any of them makes a difference.
Posted Content
Dueling Network Architectures for Deep Reinforcement Learning
TL;DR: This paper presents a new neural network architecture for model-free reinforcement learning that leads to better policy evaluation in the presence of many similar-valued actions and enables the RL agent to outperform the state-of-the-art on the Atari 2600 domain.
Journal ArticleDOI
Self-Improving Reactive Agents Based on Reinforcement Learning, Planning and Teaching
TL;DR: This paper compares eight reinforcement learning frameworks: Adaptive heuristic critic (AHC) learning due to Sutton, Q-learning due to Watkins, and three extensions to both basic methods for speeding up learning and two extensions are experience replay, learning action models for planning, and teaching.
Proceedings Article
Torch7: A Matlab-like Environment for Machine Learning
TL;DR: Torch7 is a versatile numeric computing framework and machine learning library that extends Lua that can easily be interfaced to third-party software thanks to Lua’s light interface.
Related Papers (5)
Human-level control through deep reinforcement learning
Mastering the game of Go with deep neural networks and tree search
David Silver,Aja Huang,Chris J. Maddison,Arthur Guez,Laurent Sifre,George van den Driessche,Julian Schrittwieser,Ioannis Antonoglou,Veda Panneershelvam,Marc Lanctot,Sander Dieleman,Dominik Grewe,John Nham,Nal Kalchbrenner,Ilya Sutskever,Timothy P. Lillicrap,Madeleine Leach,Koray Kavukcuoglu,Thore Graepel,Demis Hassabis +19 more