Open Access Proceedings Article

Structured reachability analysis for Markov decision processes

TL;DR
A family of algorithms for structured reachability analysis of MDPs, suitable when an initial state (or set of states) is known, that can be used to eliminate variables or variable values from the problem description, reducing the size of the MDP and making it easier to solve.
Abstract
Recent research in decision-theoretic planning has focused on making the solution of Markov decision processes (MDPs) more feasible. We develop a family of algorithms for structured reachability analysis of MDPs that are suitable when an initial state (or set of states) is known. Using compact, structured representations of MDPs (e.g., Bayesian networks), our methods, which vary in the tradeoff between complexity and accuracy, produce structured descriptions of (estimated) reachable states that can be used to eliminate variables or variable values from the problem description, reducing the size of the MDP and making it easier to solve. One contribution of our work is the extension of ideas from GRAPHPLAN to deal with the distributed nature of action representations typically embodied within Bayes nets and the problem of correlated action effects. We also demonstrate that our algorithm can be made more complete by using k-ary constraints instead of binary constraints. Another contribution is the illustration of how the compact representation of reachability constraints can be exploited by several existing (exact and approximate) abstraction algorithms for MDPs.
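To make the idea concrete, here is a minimal sketch of the kind of reachability fixpoint the abstract describes, under a deliberately simplified assumption: each action is encoded as a pair of precondition and effect sets over (variable, value) pairs. This encoding and all function names are illustrative, not the authors' actual Bayes-net machinery.

```python
# Hedged sketch: over-approximate reachability analysis for a factored MDP.
# Assumes each action is given as (preconditions, effects), where both are
# sets of (variable, value) pairs -- a simplification of the Bayes-net
# action representation discussed in the paper.

def reachable_values(initial_state, actions):
    """Fixpoint over the variable values reachable from the initial state.

    initial_state: dict mapping variable -> value
    actions: list of (preconds, effects) pairs, each a set of (var, value)
    Returns a dict mapping variable -> set of reachable values (an
    over-approximation, in the spirit of GRAPHPLAN's relaxed reachability).
    """
    reached = {var: {val} for var, val in initial_state.items()}
    changed = True
    while changed:
        changed = False
        for preconds, effects in actions:
            # Treat an action as applicable if each precondition value has
            # been reached individually; ignoring correlations between
            # values is exactly what makes this an over-approximation.
            if all(val in reached.get(var, set()) for var, val in preconds):
                for var, val in effects:
                    if val not in reached.setdefault(var, set()):
                        reached[var].add(val)
                        changed = True
    return reached

def prune_domains(domains, reached):
    """Drop values never marked reachable; a variable left with a single
    value carries no information and can be eliminated from the MDP."""
    return {var: vals & reached.get(var, set()) for var, vals in domains.items()}
```

Tracking values one at a time corresponds to the cruder end of the tradeoff; the k-ary variant mentioned in the abstract would presumably track joint constraints over tuples of values, capturing correlated action effects at additional cost.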


Citations
Journal Article

Decision-theoretic planning: structural assumptions and computational leverage

TL;DR: In this article, the authors present an overview and synthesis of MDP-related methods, showing how they provide a unifying framework for modeling many classes of planning problems studied in AI.
Journal Article

Stochastic dynamic programming with factored representations

TL;DR: This work uses dynamic Bayesian networks (with decision trees representing the local families of conditional probability distributions) to represent stochastic actions in an MDP, together with a decision-tree representation of rewards, and develops versions of standard dynamic programming algorithms that directly manipulate decision-tree representations of policies and value functions.
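As a hedged illustration of the representation this summary describes (not the paper's actual data structures), a value function stored as a decision tree over state variables lets dynamic programming operate on regions of state space rather than on individual states:

```python
# Illustrative sketch: a decision-tree value function over boolean state
# variables, evaluated without enumerating the flat state space. The class
# and variable names here are hypothetical.

class Leaf:
    def __init__(self, value):
        self.value = value

class Node:
    def __init__(self, var, if_true, if_false):
        self.var, self.if_true, self.if_false = var, if_true, if_false

def evaluate(tree, state):
    """Walk the tree using the state's variable assignments."""
    while isinstance(tree, Node):
        tree = tree.if_true if state[tree.var] else tree.if_false
    return tree.value

# States differing only in variables the tree never tests share a leaf:
V = Node("has_key", Leaf(10.0), Node("near_door", Leaf(2.0), Leaf(0.0)))
print(evaluate(V, {"has_key": False, "near_door": True, "raining": True}))  # 2.0
```

Structured dynamic programming backups combine such trees directly, so their cost scales with the number of distinct leaves rather than the number of states.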
Posted Content

Heuristic Search Value Iteration for POMDPs

TL;DR: Heuristic search value iteration (HSVI) is an anytime algorithm for solving POMDPs that returns a policy and a provable bound on its regret with respect to the optimal policy.
Proceedings Article

Heuristic search value iteration for POMDPs

TL;DR: HSVI is an anytime algorithm that returns a policy and a provable bound on its regret with respect to the optimal policy and is applied to a new rover exploration problem 10 times larger than most POMDP problems in the literature.
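The anytime loop both HSVI summaries describe can be sketched schematically as follows; `Bounds`, `explore_trial`, and the other names are placeholders for HSVI's actual bound representations and heuristic forward search, so this is the shape of the algorithm rather than an implementation.

```python
# Schematic of HSVI's anytime outer loop (all names are placeholders).
# Each trial runs a heuristic forward search that tightens upper and lower
# bounds on the optimal value; the remaining gap at the initial belief b0
# is a provable bound on the regret of the policy derived from the bounds.

def hsvi_outer_loop(b0, bounds, epsilon=1e-3):
    while bounds.upper(b0) - bounds.lower(b0) > epsilon:
        bounds.explore_trial(b0)  # one forward-search trial from b0
    return bounds.upper(b0) - bounds.lower(b0)  # certified regret bound
```

Because the gap only shrinks, the loop can be interrupted at any time and still return a policy with a valid (if looser) regret bound, which is what makes the algorithm anytime.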
Book

Automated Planning and Acting

TL;DR: This book presents a comprehensive paradigm of planning and acting using the most recent and advanced automated-planning techniques, and explains the computational deliberation capabilities that allow an actor to reason about its actions, choose them, organize them purposefully, and act deliberately to achieve an objective.
References
Book

Dynamic Programming

TL;DR: The more the authors study the information processing aspects of the mind, the more perplexed and impressed they become, and it will be a very long time before they understand these processes sufficiently to reproduce them.

Book

Neuro-dynamic programming

TL;DR: This is the first textbook that fully explains the neuro-dynamic programming/reinforcement learning methodology, which is a recent breakthrough in the practical application of neural networks and dynamic programming to complex problems of planning, optimal decision making, and intelligent control.