AEMS: an anytime online search algorithm for approximate policy refinement in large POMDPs

Open Access, Proceedings Article

Stephane Ross, +1 more

- pp 2592-2598

TLDR

The paper proposes a new anytime online search algorithm that seeks to minimize, as efficiently as possible, the error made by an approximate value function computed offline, and shows how previous online computations can be reused in subsequent time steps to avoid redundant computation.

Abstract:

Solving large Partially Observable Markov Decision Processes (POMDPs) is a complex task that is often intractable. Considerable effort has gone into developing approximate offline algorithms that solve ever larger POMDPs. However, even state-of-the-art approaches fail to solve large POMDPs in reasonable time. Recent developments in online POMDP search suggest that combining offline and online computation is often more efficient and can also considerably reduce the error made by approximate policies computed offline. In the same vein, we propose a new anytime online search algorithm that seeks to minimize, as efficiently as possible, the error made by an approximate value function computed offline. In addition, we show how previous online computations can be reused in subsequent time steps to avoid redundant computation. Our preliminary results indicate that our approach can handle large state and observation spaces efficiently and under real-time constraints.
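
The abstract does not spell out the search procedure, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a toy two-door "Tiger" POMDP, constant lower and upper bounds standing in for a value function computed offline, and a fringe-selection rule (discounted reach probability times the bound gap, following the action with the highest upper bound) chosen here for illustration. All names (`Node`, `expand`, `best_fringe`, `search`, `advance`) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional

GAMMA = 0.95
STATES = (0, 1)        # 0: tiger behind left door, 1: tiger behind right door
ACTIONS = (0, 1, 2)    # 0: listen, 1: open left, 2: open right
OBSERVATIONS = (0, 1)  # 0: hear tiger left, 1: hear tiger right

def trans(a, s, s2):
    """P(s2 | s, a): listening keeps the state, opening a door resets it."""
    return float(s == s2) if a == 0 else 0.5

def obs_prob(a, s2, o):
    """P(o | s2, a): listening is 85% accurate, otherwise uninformative."""
    if a == 0:
        return 0.85 if o == s2 else 0.15
    return 0.5

def reward(a, s):
    if a == 0:
        return -1.0
    return -100.0 if a - 1 == s else 10.0  # opening the tiger's door is penalized

# Assumption: constant bounds stand in for bounds computed offline.
LOWER = -1.0 / (1 - GAMMA)   # value of listening forever
UPPER = 10.0 / (1 - GAMMA)   # best possible reward at every step

@dataclass(eq=False)
class Node:
    belief: tuple
    lower: float = LOWER
    upper: float = UPPER
    parent: Optional["Node"] = None
    children: dict = field(default_factory=dict)  # a -> (r, {o: (p, Node)})

def step_belief(b, a, o):
    """Return (P(o | b, a), Bayes-updated belief)."""
    unnorm = [obs_prob(a, s2, o) * sum(trans(a, s, s2) * b[s] for s in STATES)
              for s2 in STATES]
    p = sum(unnorm)
    return p, tuple(x / p for x in unnorm)

def q_value(node, a, bound):
    """One-step lookahead value of action a using the children's bound."""
    r, kids = node.children[a]
    return r + GAMMA * sum(p * getattr(c, bound) for p, c in kids.values())

def expand(node):
    """Add all action/observation children, then back up bounds to the root."""
    for a in ACTIONS:
        r = sum(node.belief[s] * reward(a, s) for s in STATES)
        kids = {}
        for o in OBSERVATIONS:
            p, b2 = step_belief(node.belief, a, o)
            kids[o] = (p, Node(b2, parent=node))
        node.children[a] = (r, kids)
    while node is not None:
        node.lower = max(q_value(node, a, "lower") for a in ACTIONS)
        node.upper = max(q_value(node, a, "upper") for a in ACTIONS)
        node = node.parent

def best_fringe(node, weight=1.0):
    """Fringe node with the largest weighted bound gap (illustrative heuristic)."""
    if not node.children:
        return weight * (node.upper - node.lower), node
    a = max(ACTIONS, key=lambda act: q_value(node, act, "upper"))
    best = (-1.0, None)
    for p, child in node.children[a][1].values():
        cand = best_fringe(child, weight * GAMMA * p)
        if cand[0] > best[0]:
            best = cand
    return best

def search(root, n_expansions):
    """Anytime loop: each iteration tightens the bounds at the root."""
    for _ in range(n_expansions):
        expand(best_fringe(root)[1])
    return max(ACTIONS, key=lambda act: q_value(root, act, "lower"))

def advance(root, a, o):
    """Reuse the subtree reached by (a, o) as the next root."""
    child = root.children[a][1][o][1]
    child.parent = None
    return child

root = Node((0.5, 0.5))
action = search(root, 200)
```

The loop in `search` can be cut off after any number of expansions, which is what makes the scheme anytime. After acting and receiving observation `o`, calling `advance(root, action, o)` keeps the already-expanded subtree, so its bounds are not recomputed at the next time step.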

