Open AccessBook
Introduction to Stochastic Dynamic Programming
Reads0
Chats0
About:
The article was published on 2014-09-25 and is currently open access. It has received 1100 citations till now. The article focuses on the topics: Stochastic programming & Dynamic programming.read more
Citations
More filters
Book
Reinforcement Learning: An Introduction
TL;DR: This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning, which ranges from the history of the field's intellectual foundations to the most recent developments and applications.
Book
Markov Decision Processes: Discrete Stochastic Dynamic Programming
TL;DR: Puterman as discussed by the authors provides a uniquely up-to-date, unified, and rigorous treatment of the theoretical, computational, and applied research on Markov decision process models, focusing primarily on infinite horizon discrete time models and models with discrete time spaces while also examining models with arbitrary state spaces, finite horizon models, and continuous time discrete state models.
Journal ArticleDOI
Technical Note Q-Learning
Chris Watkins,Peter Dayan +1 more
TL;DR: In this article, it is shown that Q-learning converges to the optimum action-values with probability 1 so long as all actions are repeatedly sampled in all states and the action values are represented discretely.
Journal ArticleDOI
Learning to act using real-time dynamic programming
TL;DR: An algorithm based on dynamic programming, which is called Real-Time DP, is introduced, by which an embedded system can improve its performance with experience and illuminate aspects of other DP-based reinforcement learning methods such as Watkins'' Q-Learning algorithm.
Journal ArticleDOI
Recent Advances in Hierarchical Reinforcement Learning
TL;DR: This work reviews several approaches to temporal abstraction and hierarchical organization that machine learning researchers have recently developed and discusses extensions of these ideas to concurrent activities, multiagent coordination, and hierarchical memory for addressing partial observability.