Open Access · Proceedings Article
A heuristic variable grid solution method for POMDPs
Ronen I. Brafman
pp. 727–733
TL;DR: A simple variable-grid solution method which yields good results on relatively large problems with modest computational effort is described.

Abstract: Partially observable Markov decision processes (POMDPs) are an appealing tool for modeling planning problems under uncertainty. They incorporate stochastic action and sensor descriptions and easily capture both goal-oriented and process-oriented tasks. Unfortunately, POMDPs are very difficult to solve: exact methods cannot handle problems with much more than 10 states, so approximate methods must be used. In this paper, we describe a simple variable-grid solution method that yields good results on relatively large problems with modest computational effort.
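The abstract's point that POMDPs "incorporate stochastic action and sensor descriptions" rests on planning in belief space: the agent maintains a probability distribution over states and updates it by Bayes' rule after each action and observation. Grid-based methods like the one described here approximate the value function over that belief space. The following sketch illustrates the standard belief update on a hypothetical two-state toy problem (the states, action, and probabilities are illustrative assumptions, not the paper's benchmark domains):

```python
# Bayesian belief update for a toy 2-state POMDP (hypothetical example).
# b'(s') is proportional to O(o | s', a) * sum_s T(s' | s, a) * b(s)

def belief_update(b, T, O, a, o):
    """Return the posterior belief after taking action a and observing o.

    b: dict mapping state -> probability
    T: T[a][s][s2] = P(s2 | s, a), the stochastic action model
    O: O[a][s2][o] = P(o | s2, a), the stochastic sensor model
    """
    unnorm = {}
    for s2 in b:
        pred = sum(T[a][s][s2] * b[s] for s in b)  # prediction step
        unnorm[s2] = O[a][s2][o] * pred            # correction step
    z = sum(unnorm.values())                       # normalizing constant
    return {s: p / z for s, p in unnorm.items()}

# Toy two-state problem: a "listen" action leaves the state unchanged
# and reports the true state correctly 85% of the time.
states = ("left", "right")
T = {"listen": {s: {s2: 1.0 if s == s2 else 0.0 for s2 in states}
                for s in states}}
O = {"listen": {s2: {o: 0.85 if o == s2 else 0.15 for o in states}
                for s2 in states}}

b0 = {"left": 0.5, "right": 0.5}
b1 = belief_update(b0, T, O, "listen", "left")
# After hearing "left" once, belief in "left" rises from 0.5 to 0.85.
```

Because the belief simplex is continuous, exact value iteration over it blows up quickly; grid methods sidestep this by computing values only at a finite set of belief points and interpolating between them, which is the setting the paper's variable-grid heuristic addresses.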
Citations
Journal Article · DOI
Decision-theoretic planning: structural assumptions and computational leverage
TL;DR: In this article, the authors present an overview and synthesis of MDP-related methods, showing how they provide a unifying framework for modeling many classes of planning problems studied in AI.
Proceedings Article
Point-based value iteration: an anytime algorithm for POMDPs
TL;DR: This paper introduces the Point-Based Value Iteration (PBVI) algorithm for POMDP planning, and presents results on a robotic laser tag problem as well as three test domains from the literature.
Journal Article · DOI
Perseus: randomized point-based value iteration for POMDPs
TL;DR: This work presents a randomized point-based value iteration algorithm called PERSEUS, which backs up only a (randomly selected) subset of points in the belief set, sufficient for improving the value of each belief point in the set.
Journal Article · DOI
The Hidden Information State model: A practical framework for POMDP-based spoken dialogue management
Steve Young, Milica Gasic, Simon Keizer, François Mairesse, Jost Schatzmann, Blaise Thomson, Kai Yu +6 more
TL;DR: This paper explains how Partially Observable Markov Decision Processes (POMDPs) can provide a principled mathematical framework for modelling the inherent uncertainty in spoken dialogue systems and describes a form of approximation called the Hidden Information State model which does scale and which can be used to build practical systems.
Book Chapter · DOI
The cog project: building a humanoid robot
Rodney A. Brooks, Cynthia Breazeal, Matthew Marjanović, Brian Scassellati, Matthew M. Williamson +4 more
TL;DR: This chapter gives a background on the methodology that the authors have used in investigations, highlights the research issues that have been raised during this project, and provides a summary of both the current state of the project and the long-term goals.
References
Book
Dynamic Programming
TL;DR: The more the authors study the information processing aspects of the mind, the more perplexed and impressed they become, and it will be a very long time before they understand these processes sufficiently to reproduce them.
Journal Article · DOI
Planning and Acting in Partially Observable Stochastic Domains
TL;DR: A novel algorithm for solving POMDPs offline is outlined, along with how, in some cases, a finite-memory controller can be extracted from the solution to a POMDP.
Journal Article · DOI
The Optimal Control of Partially Observable Markov Processes over a Finite Horizon
TL;DR: In this article, the optimal control problem for a class of mathematical models in which the system to be controlled is characterized by a finite-state discrete-time Markov process is formulated.
Book Chapter · DOI
Learning policies for partially observable environments: scaling up
TL;DR: This paper discusses several simple solution methods and shows that all are capable of finding near-optimal policies for a selection of extremely small POMDPs taken from the learning literature, but that none are able to solve a slightly larger and noisier problem based on robot navigation.
Journal Article · DOI
The Optimal Control of Partially Observable Markov Processes over the Infinite Horizon: Discounted Costs
TL;DR: The paper develops easily implemented approximations to stationary policies based on finitely transient policies and shows that the concave hull of an approximation can be included in the well-known Howard policy improvement algorithm with subsequent convergence.