Open Access
Technical Note:Q-Learning
C. J. C. H. Watkins
- pp 55-68
Reads0
Chats0
About:
The article was published on 1993-01-01 and is currently open access. It has received 2697 citations till now.read more
Citations
More filters
Journal ArticleDOI
Human-level control through deep reinforcement learning
Volodymyr Mnih,Koray Kavukcuoglu,David Silver,Andrei Rusu,Joel Veness,Marc G. Bellemare,Alex Graves,Martin Riedmiller,Andreas K. Fidjeland,Georg Ostrovski,Stig Petersen,Charles Beattie,Amir Sadik,Ioannis Antonoglou,Helen King,Dharshan Kumaran,Daan Wierstra,Shane Legg,Demis Hassabis +18 more
TL;DR: This work bridges the divide between high-dimensional sensory inputs and actions, resulting in the first artificial agent that is capable of learning to excel at a diverse array of challenging tasks.
Journal ArticleDOI
Deep learning in neural networks
TL;DR: This historical survey compactly summarizes relevant work, much of it from the previous millennium, review deep supervised learning, unsupervised learning, reinforcement learning & evolutionary computation, and indirect search for short programs encoding deep and large networks.
Journal ArticleDOI
The free-energy principle: a unified brain theory?
TL;DR: This Review looks at some key brain theories in the biological and physical sciences from the free-energy perspective, suggesting that several global brain theories might be unified within a free- energy framework.
Journal ArticleDOI
A Tutorial on the Cross-Entropy Method
TL;DR: This tutorial presents the CE methodology, the basic algorithm and its modifications, and discusses applications in combinatorial optimization and machine learning.
Book
Multiagent Systems: Algorithmic, Game-Theoretic, and Logical Foundations
Yoav Shoham,Kevin Leyton-Brown +1 more
TL;DR: This exciting and pioneering new overview of multiagent systems, which are online systems composed of multiple interacting intelligent agents, i.e., online trading, offers a newly seen computer science perspective on multi agent systems, while integrating ideas from operations research, game theory, economics, logic, and even philosophy and linguistics.
References
More filters
Journal ArticleDOI
Learning to Predict by the Methods of Temporal Differences
TL;DR: This article introduces a class of incremental learning procedures specialized for prediction – that is, for using past experience with an incompletely known system to predict its future behavior – and proves their convergence and optimality for special cases and relation to supervised-learning methods.
Journal ArticleDOI
Learning from delayed rewards
TL;DR: The invention relates to a circuit for use in a receiver which can receive two-tone/stereo signals which is intended to make a choice between mono or stereo reproduction of signal A or of signal B and vice versa.
Journal ArticleDOI
Self-Improving Reactive Agents Based on Reinforcement Learning, Planning and Teaching
TL;DR: This paper compares eight reinforcement learning frameworks: Adaptive heuristic critic (AHC) learning due to Sutton, Q-learning due to Watkins, and three extensions to both basic methods for speeding up learning and two extensions are experience replay, learning action models for planning, and teaching.
Book ChapterDOI
Integrated architecture for learning, planning, and reacting based on approximating dynamic programming
TL;DR: This paper extends previous work with Dyna, a class of architectures for intelligent systems based on approximating dynamic programming methods, and presents and shows results for two Dyna architectures, based on Watkins's Q-learning, a new kind of reinforcement learning.