Open Access Dissertation

Planning and control in stochastic domains with imperfect information

Abstract
Partially observable Markov decision processes (POMDPs) can be used to model complex control problems that include both action-outcome uncertainty and imperfect observability. A control problem within the POMDP framework is expressed as a dynamic optimization problem with a value function that combines costs or rewards from multiple steps. Although the POMDP framework is more expressive than simpler frameworks such as Markov decision processes (MDPs), its associated optimization methods are more computationally demanding, and only very small problems can be solved exactly in practice. The thesis focuses on two approaches that can be used to solve larger problems: approximation methods and exploitation of additional problem structure.

First, a number of new efficient approximation methods and improvements of existing algorithms are proposed. These include (1) the fast informed bound method, based on approximate dynamic programming updates that lead to piecewise linear and convex value functions with a constant number of linear vectors; (2) a grid-based point-interpolation method that supports variable grids; (3) an incremental version of the linear vector method that updates value-function derivatives; and (4) various heuristics for selecting grid points. The new and existing methods are experimentally tested and compared on a set of three infinite-horizon discounted problems of different complexity. The experimental results show that methods that preserve the shape of the value function over updates, such as the newly designed incremental linear vector and fast informed bound methods, tend to outperform other methods on the control performance test.

Second, the thesis presents a number of techniques for exploiting additional structure in the model of complex control problems. These are studied as applied to a medical therapy planning problem: the management of patients with chronic ischemic heart disease. The new extensions proposed include factored and hierarchically structured models that combine the advantages of the POMDP and MDP frameworks and cut down the size and complexity of the information state space.
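The two ingredients the abstract relies on can be sketched concretely: the belief (information) state over which a POMDP controller operates, and the piecewise linear and convex (PWLC) value function represented as a maximum over linear vectors. The sketch below is a minimal illustration under assumed conventions (discrete states, transition tensor `T[a, s, s']` and observation tensor `O[a, s', o]` are hypothetical names), not the thesis's own code.

```python
import numpy as np

def belief_update(b, a, o, T, O):
    """Bayesian belief-state update for a discrete POMDP.

    b: belief over states, shape (S,)
    T: T[a, s, s'] = P(s' | s, a)
    O: O[a, s', o] = P(o | s', a)
    """
    # Unnormalized P(s', o | b, a): predict with T, then weight by observation likelihood.
    b_next = O[a, :, o] * (T[a].T @ b)
    return b_next / b_next.sum()  # renormalize over s'

def pwlc_value(b, alpha_vectors):
    """Evaluate a piecewise linear and convex value function,
    V(b) = max_i alpha_i . b, where each alpha_i is one linear vector."""
    return max(float(alpha @ b) for alpha in alpha_vectors)

# Tiny two-state, one-action example (made-up numbers).
T = np.array([[[0.9, 0.1],
               [0.2, 0.8]]])
O = np.array([[[0.8, 0.2],
               [0.3, 0.7]]])
b = np.array([0.5, 0.5])
b_new = belief_update(b, a=0, o=0, T=T, O=O)

alphas = [np.array([1.0, 0.0]), np.array([0.0, 2.0])]
v = pwlc_value(b, alphas)
```

Exact value iteration can multiply the number of linear vectors at every update; the approximation methods discussed above (e.g. the fast informed bound) keep that number constant, which is what makes the PWLC shape cheap to preserve over updates.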


Citations
Journal Article

Decision-theoretic planning: structural assumptions and computational leverage

TL;DR: In this article, the authors present an overview and synthesis of MDP-related methods, showing how they provide a unifying framework for modeling many classes of planning problems studied in AI.
Journal Article

Value-function approximations for partially observable Markov decision processes

TL;DR: This work surveys various approximation methods, analyzes their properties and relations, and provides new insights into their differences; it also presents a number of new approximation methods and novel refinements of existing techniques.
Journal Article

A Framework for Sequential Planning in Multi-Agent Settings

TL;DR: This paper extends the framework of partially observable Markov decision processes (POMDPs) to multi-agent settings by incorporating the notion of agent models into the state space and expresses the agents' autonomy by postulating that their models are not directly manipulable or observable by other agents.
Proceedings Article

Planning with incomplete information as heuristic search in belief space

TL;DR: The paper makes explicit the formulation of planning as heuristic search, with heuristics derived from problem representations, in the context of planning with incomplete information; it tests the approach over a number of domains and extends it to tasks such as planning with sensing, where standard search algorithms do not apply.