scispace - formally typeset
Search or ask a question
Book

Markov Decision Processes: Discrete Stochastic Dynamic Programming

15 Apr 1994-
TL;DR: Puterman as discussed by the authors provides a uniquely up-to-date, unified, and rigorous treatment of the theoretical, computational, and applied research on Markov decision process models, focusing primarily on infinite horizon discrete time models and models with discrete time spaces while also examining models with arbitrary state spaces, finite horizon models, and continuous time discrete state models.
Abstract: From the Publisher: The past decade has seen considerable theoretical and applied research on Markov decision processes, as well as the growing use of these models in ecology, economics, communications engineering, and other fields where outcomes are uncertain and sequential decision-making processes are needed. A timely response to this increased activity, Martin L. Puterman's new work provides a uniquely up-to-date, unified, and rigorous treatment of the theoretical, computational, and applied research on Markov decision process models. It discusses all major research directions in the field, highlights many significant applications of Markov decision processes models, and explores numerous important topics that have previously been neglected or given cursory coverage in the literature. Markov Decision Processes focuses primarily on infinite horizon discrete time models and models with discrete time spaces while also examining models with arbitrary state spaces, finite horizon models, and continuous-time discrete state models. The book is organized around optimality criteria, using a common framework centered on the optimality (Bellman) equation for presenting results. The results are presented in a "theorem-proof" format and elaborated on through both discussion and examples, including results that are not available in any other book. A two-state Markov decision process model, presented in Chapter 3, is analyzed repeatedly throughout the book and demonstrates many results and algorithms. Markov Decision Processes covers recent research advances in such areas as countable state space models with average reward criterion, constrained models, and models with risk sensitive optimality criteria. It also explores several topics that have received little or no attention in other books, including modified policy iteration, multichain models with average reward criterion, and sensitive optimality. In addition, a Bibliographic Remarks section in each chapter comments on relevant historic
Citations
More filters
Journal ArticleDOI
TL;DR: Central issues of reinforcement learning are discussed, including trading off exploration and exploitation, establishing the foundations of the field via Markov decision theory, learning from delayed reinforcement, constructing empirical models to accelerate learning, making use of generalization and hierarchy, and coping with hidden state.
Abstract: This paper surveys the field of reinforcement learning from a computer-science perspective. It is written to be accessible to researchers familiar with machine learning. Both the historical basis of the field and a broad selection of current work are summarized. Reinforcement learning is the problem faced by an agent that learns behavior through trial-and-error interactions with a dynamic environment. The work described here has a resemblance to work in psychology, but differs considerably in the details and in the use of the word "reinforcement." The paper discusses central issues of reinforcement learning, including trading off exploration and exploitation, establishing the foundations of the field via Markov decision theory, learning from delayed reinforcement, constructing empirical models to accelerate learning, making use of generalization and hierarchy, and coping with hidden state. It concludes with a survey of some implemented systems and an assessment of the practical utility of current methods for reinforcement learning.

6,895 citations

Posted Content
TL;DR: A survey of reinforcement learning from a computer science perspective can be found in this article, where the authors discuss the central issues of RL, including trading off exploration and exploitation, establishing the foundations of RL via Markov decision theory, learning from delayed reinforcement, constructing empirical models to accelerate learning, making use of generalization and hierarchy, and coping with hidden state.
Abstract: This paper surveys the field of reinforcement learning from a computer-science perspective. It is written to be accessible to researchers familiar with machine learning. Both the historical basis of the field and a broad selection of current work are summarized. Reinforcement learning is the problem faced by an agent that learns behavior through trial-and-error interactions with a dynamic environment. The work described here has a resemblance to work in psychology, but differs considerably in the details and in the use of the word ``reinforcement.'' The paper discusses central issues of reinforcement learning, including trading off exploration and exploitation, establishing the foundations of the field via Markov decision theory, learning from delayed reinforcement, constructing empirical models to accelerate learning, making use of generalization and hierarchy, and coping with hidden state. It concludes with a survey of some implemented systems and an assessment of the practical utility of current methods for reinforcement learning.

5,970 citations

Book
25 Apr 2008
TL;DR: Principles of Model Checking offers a comprehensive introduction to model checking that is not only a text suitable for classroom use but also a valuable reference for researchers and practitioners in the field.
Abstract: Our growing dependence on increasingly complex computer and software systems necessitates the development of formalisms, techniques, and tools for assessing functional properties of these systems. One such technique that has emerged in the last twenty years is model checking, which systematically (and automatically) checks whether a model of a given system satisfies a desired property such as deadlock freedom, invariants, and request-response properties. This automated technique for verification and debugging has developed into a mature and widely used approach with many applications. Principles of Model Checking offers a comprehensive introduction to model checking that is not only a text suitable for classroom use but also a valuable reference for researchers and practitioners in the field. The book begins with the basic principles for modeling concurrent and communicating systems, introduces different classes of properties (including safety and liveness), presents the notion of fairness, and provides automata-based algorithms for these properties. It introduces the temporal logics LTL and CTL, compares them, and covers algorithms for verifying these logics, discussing real-time systems as well as systems subject to random phenomena. Separate chapters treat such efficiency-improving techniques as abstraction and symbolic manipulation. The book includes an extensive set of examples (most of which run through several chapters) and a complete set of basic results accompanied by detailed proofs. Each chapter concludes with a summary, bibliographic notes, and an extensive list of exercises of both practical and theoretical nature.

4,905 citations

Book
01 Jan 2001
TL;DR: The book introduces probabilistic graphical models and decision graphs, including Bayesian networks and influence diagrams, and presents a thorough introduction to state-of-the-art solution and analysis algorithms.
Abstract: Probabilistic graphical models and decision graphs are powerful modeling tools for reasoning and decision making under uncertainty. As modeling languages they allow a natural specification of problem domains with inherent uncertainty, and from a computational perspective they support efficient algorithms for automatic construction and query answering. This includes belief updating, finding the most probable explanation for the observed evidence, detecting conflicts in the evidence entered into the network, determining optimal strategies, analyzing for relevance, and performing sensitivity analysis. The book introduces probabilistic graphical models and decision graphs, including Bayesian networks and influence diagrams. The reader is introduced to the two types of frameworks through examples and exercises, which also instruct the reader on how to build these models. The book is a new edition of Bayesian Networks and Decision Graphs by Finn V. Jensen. The new edition is structured into two parts. The first part focuses on probabilistic graphical models. Compared with the previous book, the new edition also includes a thorough description of recent extensions to the Bayesian network modeling language, advances in exact and approximate belief updating algorithms, and methods for learning both the structure and the parameters of a Bayesian network. The second part deals with decision graphs, and in addition to the frameworks described in the previous edition, it also introduces Markov decision processes and partially ordered decision problems. The authors also provide a well-founded practical introduction to Bayesian networks, object-oriented Bayesian networks, decision trees, influence diagrams (and variants hereof), and Markov decision processes. give practical advice on the construction of Bayesian networks, decision trees, and influence diagrams from domain knowledge. give several examples and exercises exploiting computer systems for dealing with Bayesian networks and decision graphs. present a thorough introduction to state-of-the-art solution and analysis algorithms. The book is intended as a textbook, but it can also be used for self-study and as a reference book.

4,566 citations

Journal ArticleDOI
TL;DR: A novel algorithm for solving pomdps off line and how, in some cases, a finite-memory controller can be extracted from the solution to a POMDP is outlined.

4,283 citations

References
More filters
Book
01 Jan 1973

14,545 citations

Book
21 Oct 1957
TL;DR: The more the authors study the information processing aspects of the mind, the more perplexed and impressed they become, and it will be a very long time before they understand these processes sufficiently to reproduce them.
Abstract: From the Publisher: An introduction to the mathematical theory of multistage decision processes, this text takes a functional equation approach to the discovery of optimum policies. Written by a leading developer of such policies, it presents a series of methods, uniqueness and existence theorems, and examples for solving the relevant equations. The text examines existence and uniqueness theorems, the optimal inventory equation, bottleneck problems in multistage production processes, a new formalism in the calculus of variation, strategies behind multistage games, and Markovian decision processes. Each chapter concludes with a problem set that Eric V. Denardo of Yale University, in his informative new introduction, calls a rich lode of applications and research topics. 1957 edition. 37 figures.

14,187 citations

Book
25 Sep 2014

1,100 citations

Book
01 Jan 1982

429 citations

Journal ArticleDOI
TL;DR: A new algorithm for computing optimal ( s , S ) policies is derived based upon a number of new properties of the infinite horizon cost function c as well as a new upper bound for optimal order-up-to levels S * and a new lower bound for ideal reorder levels s *.
Abstract: In this paper, a new algorithm for computing optimal (s, S) policies is derived based upon a number of new properties of the infinite horizon cost function c(s, S) as well as a new upper bound for optimal order-up-to levels S* and a new lower bound for optimal reorder levels s*. The algorithm is simple and easy to understand. Its computational complexity is only 2.4 times that required to evaluate a (specific) single (s, S) policy. The algorithm applies to both periodic review and continuous review inventory systems.

317 citations