scispace - formally typeset
Open AccessJournal ArticleDOI

Asynchronous Stochastic Approximation and Q-Learning

John N. Tsitsiklis
- 01 Sep 1994 - 
- Vol. 16, Iss: 3, pp 185-202
Reads0
Chats0
TLDR
The Q-learning algorithm, a reinforcement learning method for solving Markov decision problems, is studied to establish its convergence under conditions more general than previously available.
Abstract
We provide some general results on the convergence of a class of stochastic approximation algorithms and their parallel and asynchronous variants. We then use these results to study the Q-learning algorithm, a reinforcement learning method for solving Markov decision problems, and establish its convergence under conditions more general than previously available.

read more

Citations
More filters
Book

Reinforcement Learning: An Introduction

TL;DR: This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning, which ranges from the history of the field's intellectual foundations to the most recent developments and applications.
Journal ArticleDOI

Reinforcement learning: a survey

TL;DR: Central issues of reinforcement learning are discussed, including trading off exploration and exploitation, establishing the foundations of the field via Markov decision theory, learning from delayed reinforcement, constructing empirical models to accelerate learning, making use of generalization and hierarchy, and coping with hidden state.
Proceedings Article

Asynchronous methods for deep reinforcement learning

TL;DR: A conceptually simple and lightweight framework for deep reinforcement learning that uses asynchronous gradient descent for optimization of deep neural network controllers and shows that asynchronous actor-critic succeeds on a wide variety of continuous motor control problems as well as on a new task of navigating random 3D mazes using a visual input.
Book

Foundations of Machine Learning

TL;DR: This graduate-level textbook introduces fundamental concepts and methods in machine learning, and provides the theoretical underpinnings of these algorithms, and illustrates key aspects for their application.
Journal ArticleDOI

A Comprehensive Survey of Multiagent Reinforcement Learning

TL;DR: The benefits and challenges of MARL are described along with some of the problem domains where the MARL techniques have been applied, and an outlook for the field is provided.
References
More filters
Journal ArticleDOI

Technical Note : \cal Q -Learning

TL;DR: This paper presents and proves in detail a convergence theorem forQ-learning based on that outlined in Watkins (1989), showing that Q-learning converges to the optimum action-values with probability 1 so long as all actions are repeatedly sampled in all states and the action- values are represented discretely.
Book

Parallel and Distributed Computation: Numerical Methods

TL;DR: This work discusses parallel and distributed architectures, complexity measures, and communication and synchronization issues, and it presents both Jacobi and Gauss-Seidel iterations, which serve as algorithms of reference for many of the computational approaches addressed later.
Journal ArticleDOI

Learning to Predict by the Methods of Temporal Differences

Richard S. Sutton
- 01 Aug 1988 - 
TL;DR: This article introduces a class of incremental learning procedures specialized for prediction – that is, for using past experience with an incompletely known system to predict its future behavior – and proves their convergence and optimality for special cases and relation to supervised-learning methods.
Journal ArticleDOI

Technical Note Q-Learning

TL;DR: In this article, it is shown that Q-learning converges to the optimum action-values with probability 1 so long as all actions are repeatedly sampled in all states and the action values are represented discretely.