Open Access Proceedings Article

A heuristic variable grid solution method for POMDPs

TL;DR
A simple variable-grid solution method which yields good results on relatively large problems with modest computational effort is described.
Abstract
Partially observable Markov decision processes (POMDPs) are an appealing tool for modeling planning problems under uncertainty. They incorporate stochastic action and sensor descriptions and easily capture goal-oriented and process-oriented tasks. Unfortunately, POMDPs are very difficult to solve. Exact methods cannot handle problems with much more than 10 states, so approximate methods must be used. In this paper, we describe a simple variable-grid solution method which yields good results on relatively large problems with modest computational effort.
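Two mechanics behind the abstract are worth making concrete: a POMDP agent tracks a belief (a probability distribution over states) that is updated by Bayes' rule after each action and observation, and grid-based methods approximate the value function at a finite set of belief points instead of over the whole continuous belief simplex. Below is a minimal sketch of both ideas on a hypothetical two-state model; the tensors T and O and the nearest-neighbour lookup are illustrative assumptions, not the paper's actual variable-grid scheme.

```python
import numpy as np

# Hypothetical 2-state, 2-action, 2-observation POMDP (not from the paper).
# T[a, s, s2] = P(s2 | s, a);  O[a, s2, o] = P(o | a, s2)
T = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.5, 0.5]]])
O = np.array([[[0.8, 0.2], [0.3, 0.7]],
              [[0.8, 0.2], [0.3, 0.7]]])

def belief_update(b, a, o):
    """Bayes filter: b'(s') is proportional to O[a, s', o] * sum_s b(s) T[a, s, s']."""
    unnorm = O[a, :, o] * (b @ T[a])
    return unnorm / unnorm.sum()

def grid_value(b, grid_points, grid_values):
    """Crude stand-in for a grid approximation: return the value stored at the
    nearest grid belief point. The paper's method interpolates over a
    variable-resolution grid; this lookup only illustrates the general idea."""
    i = np.argmin(np.linalg.norm(grid_points - b, axis=1))
    return grid_values[i]

b = belief_update(np.array([0.5, 0.5]), a=0, o=1)  # act, then observe
```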


Citations
Journal Article

Decision-theoretic planning: structural assumptions and computational leverage

TL;DR: In this article, the authors present an overview and synthesis of MDP-related methods, showing how they provide a unifying framework for modeling many classes of planning problems studied in AI.
Proceedings Article

Point-based value iteration: an anytime algorithm for POMDPs

TL;DR: This paper introduces the Point-Based Value Iteration (PBVI) algorithm for POMDP planning, and presents results on a robotic laser tag problem as well as three test domains from the literature.
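For orientation, the operation PBVI repeats at each belief point is a point-based Bellman backup, which produces the single alpha vector that is maximal at that point. The sketch below assumes the T/O tensor layout from the earlier snippet plus a hypothetical reward table R[a, s]; it illustrates the standard backup, not code from the paper.

```python
import numpy as np

def point_backup(b, Gamma, T, O, R, gamma=0.95):
    """One point-based Bellman backup at belief b (the core PBVI step).
    Gamma is the current list of alpha vectors (arrays over states);
    returns the new alpha vector that is maximal at b."""
    n_actions, n_states = R.shape
    n_obs = O.shape[2]
    best, best_val = None, -np.inf
    for a in range(n_actions):
        alpha_a = R[a].astype(float).copy()
        for o in range(n_obs):
            # Project each alpha vector back through (a, o):
            # g_ao(s) = gamma * sum_s2 T[a, s, s2] * O[a, s2, o] * g(s2)
            projected = [gamma * (T[a] * O[a, :, o]) @ g for g in Gamma]
            alpha_a += max(projected, key=lambda v: float(b @ v))
        if b @ alpha_a > best_val:
            best, best_val = alpha_a, float(b @ alpha_a)
    return best
```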
Journal Article

Perseus: randomized point-based value iteration for POMDPs

TL;DR: This work presents a randomized point-based value iteration algorithm called PERSEUS, which backs up only a (randomly selected) subset of points in the belief set, sufficient for improving the value of each belief point in the set.
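The mechanism described above is compact enough to sketch directly: keep backing up randomly chosen belief points, let each new alpha vector cover every point whose value it already improves, and end the stage once every point is covered. The sketch assumes numpy beliefs and any point-based backup operator, e.g. the point_backup sketch above.

```python
import random

def value(b, Gamma):
    """Value of belief b under a set of alpha vectors."""
    return max(float(b @ g) for g in Gamma)

def perseus_stage(B, Gamma_old, backup):
    """One PERSEUS improvement stage (sketch): back up random points from B
    until the new vector set is at least as good at every point in B."""
    Gamma_new = []
    todo = list(B)  # belief points not yet improved this stage
    while todo:
        b = random.choice(todo)
        alpha = backup(b, Gamma_old)
        if float(b @ alpha) < value(b, Gamma_old):
            # Never let a point's value decrease: fall back to the best old vector.
            alpha = max(Gamma_old, key=lambda g: float(b @ g))
        Gamma_new.append(alpha)
        todo = [p for p in todo if value(p, Gamma_new) < value(p, Gamma_old)]
    return Gamma_new
```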
Journal Article

The Hidden Information State model: A practical framework for POMDP-based spoken dialogue management

TL;DR: This paper explains how Partially Observable Markov Decision Processes (POMDPs) can provide a principled mathematical framework for modelling the inherent uncertainty in spoken dialogue systems and describes a form of approximation called the Hidden Information State model which does scale and which can be used to build practical systems.
Book Chapter

The Cog project: building a humanoid robot

TL;DR: This chapter gives background on the methodology the authors have used in their investigations, highlights the research issues raised during the project, and summarizes both the current state of the project and its long-term goals.
References
Book

Dynamic Programming

TL;DR: The more the authors study the information processing aspects of the mind, the more perplexed and impressed they become, and it will be a very long time before they understand these processes sufficiently to reproduce them.
Journal Article

Planning and Acting in Partially Observable Stochastic Domains

TL;DR: A novel algorithm for solving POMDPs offline is outlined, along with how, in some cases, a finite-memory controller can be extracted from the solution to a POMDP.
Journal Article

The Optimal Control of Partially Observable Markov Processes over a Finite Horizon

TL;DR: In this article, the authors formulate the optimal control problem for a class of mathematical models in which the system to be controlled is characterized by a finite-state discrete-time Markov process.
Book Chapter

Learning policies for partially observable environments: scaling up

TL;DR: This paper discusses several simple solution methods and shows that all are capable of finding near-optimal policies for a selection of extremely small POMDPs taken from the learning literature, but that none is able to solve a slightly larger and noisier problem based on robot navigation.
Journal Article

The Optimal Control of Partially Observable Markov Processes over the Infinite Horizon: Discounted Costs

TL;DR: The paper develops easily implemented approximations to stationary policies based on finitely transient policies and shows that the concave hull of an approximation can be included in the well-known Howard policy improvement algorithm with subsequent convergence.