Open Access

The Witness Algorithm: Solving Partially Observable Markov Decision Processes

TL;DR: It is argued that the witness algorithm is superior to existing algorithms for solving POMDP problems in an important complexity-theoretic sense.
Abstract
Markov decision processes (MDPs) are a mathematical formalization of problems in which a decision-maker must choose how to act to maximize its reward over a series of interactions with its environment. Partially observable Markov decision processes (POMDPs) generalize the MDP framework to the case where the agent must make its decisions in partial ignorance of its current situation. This paper describes the POMDP framework and presents some well-known results from the field. It then presents a novel method called the witness algorithm for solving POMDP problems and analyzes its computational complexity. The paper argues that the witness algorithm is superior to existing algorithms for solving POMDPs in an important complexity-theoretic sense.
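In the POMDP framework the abstract describes, the agent cannot observe its true state and instead maintains a belief, a probability distribution over states, updated by Bayes' rule after each action and observation. A minimal sketch of this standard belief update follows; the 2-state model (matrices `T` and `Z`) is a hypothetical example for illustration, not taken from the paper.

```python
import numpy as np

# Hypothetical 2-state POMDP with a single action, for illustration only.
T = np.array([[0.9, 0.1],    # T[s, s']: P(s' | s, a) transition probabilities
              [0.2, 0.8]])
Z = np.array([[0.85, 0.15],  # Z[s', o]: P(o | s', a) observation probabilities
              [0.30, 0.70]])

def belief_update(b, T, Z, o):
    """Bayes-filter update: b'(s') is proportional to Z[s', o] * sum_s T[s, s'] * b(s)."""
    bp = Z[:, o] * (T.T @ b)   # predict next-state distribution, weight by observation likelihood
    return bp / bp.sum()       # renormalize to a probability distribution

b0 = np.array([0.5, 0.5])      # uniform prior over the two states
b1 = belief_update(b0, T, Z, o=0)
```

Because the belief is a sufficient statistic for the action-observation history, a POMDP can be recast as an MDP over this continuous belief space, which is the setting in which the witness algorithm operates.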


Citations
Monograph

Planning Algorithms: Introductory Material

TL;DR: This coherent and comprehensive book unifies material from several sources, including robotics, control theory, artificial intelligence, and algorithms, and extends it to planning under the differential constraints that arise when automating the motions of virtually any mechanical system.
Journal Article

Planning and Acting in Partially Observable Stochastic Domains

TL;DR: A novel algorithm for solving POMDPs offline is presented, and it is outlined how, in some cases, a finite-memory controller can be extracted from the solution to a POMDP.
Book Chapter

Learning policies for partially observable environments: scaling up

TL;DR: This paper discusses several simple solution methods and shows that all are capable of finding near-optimal policies for a selection of extremely small POMDPs taken from the learning literature, but that none is able to solve a slightly larger and noisier problem based on robot navigation.
Journal Article

The Hidden Information State model: A practical framework for POMDP-based spoken dialogue management

TL;DR: This paper explains how Partially Observable Markov Decision Processes (POMDPs) can provide a principled mathematical framework for modelling the inherent uncertainty in spoken dialogue systems and describes a form of approximation called the Hidden Information State model which does scale and which can be used to build practical systems.

Algorithms for Sequential Decision Making

TL;DR: This thesis shows how to answer the question "What should I do now?"
References
Journal Article

Learning from delayed rewards

Book

Dynamic Programming: Deterministic and Stochastic Models

Proceedings Article

Acting Optimally in Partially Observable Stochastic Domains

TL;DR: The existing algorithms for computing optimal control strategies for partially observable stochastic environments are found to be highly computationally inefficient and a new algorithm is developed that is empirically more efficient.
Journal Article

The Optimal Search for a Moving Target When the Search Path Is Constrained

James N. Eagle
01 Oct 1984
TL;DR: A search is conducted for a target moving in discrete time among a finite number of cells according to a known Markov process and a finite time horizon POMDP solution technique is presented which is simpler than the standard linear programming methods.