Open Access · Proceedings Article
A heuristic variable grid solution method for POMDPs
Ronen I. Brafman
pp. 727–733
TL;DR: A simple variable-grid solution method which yields good results on relatively large problems with modest computational effort is described.

Abstract: Partially observable Markov decision processes (POMDPs) are an appealing tool for modeling planning problems under uncertainty. They incorporate stochastic action and sensor descriptions and easily capture both goal-oriented and process-oriented tasks. Unfortunately, POMDPs are very difficult to solve: exact methods cannot handle problems with much more than 10 states, so approximate methods must be used. In this paper, we describe a simple variable-grid solution method that yields good results on relatively large problems with modest computational effort.
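The abstract's point that POMDPs "incorporate stochastic action and sensor descriptions" rests on planning in belief space: the agent maintains a probability distribution over states and updates it by Bayes' rule after each action and observation. Grid-based methods like the one described here approximate the value function over that belief space. The following sketch illustrates the standard belief update on a hypothetical two-state toy problem (the states, action, and probabilities are illustrative assumptions, not the paper's benchmark domains):

```python
# Bayesian belief update for a toy 2-state POMDP (hypothetical example).
# b'(s') is proportional to O(o | s', a) * sum_s T(s' | s, a) * b(s)

def belief_update(b, T, O, a, o):
    """Return the posterior belief after taking action a and observing o.

    b: dict mapping state -> probability
    T: T[a][s][s2] = P(s2 | s, a), the stochastic action model
    O: O[a][s2][o] = P(o | s2, a), the stochastic sensor model
    """
    unnorm = {}
    for s2 in b:
        pred = sum(T[a][s][s2] * b[s] for s in b)  # prediction step
        unnorm[s2] = O[a][s2][o] * pred            # correction step
    z = sum(unnorm.values())                       # normalizing constant
    return {s: p / z for s, p in unnorm.items()}

# Toy two-state problem: a "listen" action leaves the state unchanged
# and reports the true state correctly 85% of the time.
states = ("left", "right")
T = {"listen": {s: {s2: 1.0 if s == s2 else 0.0 for s2 in states}
                for s in states}}
O = {"listen": {s2: {o: 0.85 if o == s2 else 0.15 for o in states}
                for s2 in states}}

b0 = {"left": 0.5, "right": 0.5}
b1 = belief_update(b0, T, O, "listen", "left")
# After hearing "left" once, belief in "left" rises from 0.5 to 0.85.
```

Because the belief simplex is continuous, exact value iteration over it blows up quickly; grid methods sidestep this by computing values only at a finite set of belief points and interpolating between them, which is the setting the paper's variable-grid heuristic addresses.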
Citations
Journal Article · DOI
Decision-theoretic planning: structural assumptions and computational leverage
TL;DR: In this article, the authors present an overview and synthesis of MDP-related methods, showing how they provide a unifying framework for modeling many classes of planning problems studied in AI.
Proceedings Article
Point-based value iteration: an anytime algorithm for POMDPs
TL;DR: This paper introduces the Point-Based Value Iteration (PBVI) algorithm for POMDP planning, and presents results on a robotic laser tag problem as well as three test domains from the literature.
Journal Article · DOI
Perseus: randomized point-based value iteration for POMDPs
TL;DR: This work presents a randomized point-based value iteration algorithm called PERSEUS, which backs up only a (randomly selected) subset of points in the belief set, sufficient for improving the value of each belief point in the set.
Journal Article · DOI
The Hidden Information State model: A practical framework for POMDP-based spoken dialogue management
Steve Young, Milica Gasic, Simon Keizer, François Mairesse, Jost Schatzmann, Blaise Thomson, Kai Yu +6 more
TL;DR: This paper explains how Partially Observable Markov Decision Processes (POMDPs) can provide a principled mathematical framework for modelling the inherent uncertainty in spoken dialogue systems and describes a form of approximation called the Hidden Information State model which does scale and which can be used to build practical systems.
Book Chapter · DOI
The cog project: building a humanoid robot
Rodney A. Brooks, Cynthia Breazeal, Matthew Marjanović, Brian Scassellati, Matthew M. Williamson +4 more
TL;DR: This chapter gives a background on the methodology that the authors have used in investigations, highlights the research issues that have been raised during this project, and provides a summary of both the current state of the project and the long-term goals.
References
Book
Dynamic Programming
TL;DR: The more the authors study the information processing aspects of the mind, the more perplexed and impressed they become, and it will be a very long time before they understand these processes sufficiently to reproduce them.
Journal Article · DOI
Planning and Acting in Partially Observable Stochastic Domains
TL;DR: A novel algorithm for solving POMDPs offline is outlined, along with how, in some cases, a finite-memory controller can be extracted from the solution to a POMDP.
Journal Article · DOI
The Optimal Control of Partially Observable Markov Processes over a Finite Horizon
TL;DR: In this article, the optimal control problem for a class of mathematical models in which the system to be controlled is characterized by a finite-state discrete-time Markov process is formulated.
Book Chapter · DOI
Learning policies for partially observable environments: scaling up
TL;DR: This paper discusses several simple solution methods and shows that all are capable of finding near-optimal policies for a selection of extremely small POMDPs taken from the learning literature, but that none are able to solve a slightly larger and noisier problem based on robot navigation.
Journal Article · DOI
The Optimal Control of Partially Observable Markov Processes over the Infinite Horizon: Discounted Costs
TL;DR: The paper develops easily implemented approximations to stationary policies based on finitely transient policies and shows that the concave hull of an approximation can be included in the well-known Howard policy improvement algorithm with subsequent convergence.