Open Access Proceedings Article

AEMS: an anytime online search algorithm for approximate policy refinement in large POMDPs

Stephane Ross, +1 more
pp. 2592-2598
TLDR
This paper proposes a new anytime online search algorithm that seeks to minimize, as efficiently as possible, the error made by an approximate value function computed offline, and shows how previous online computations can be reused in subsequent time steps to avoid redundant computation.
Abstract
Solving large Partially Observable Markov Decision Processes (POMDPs) is a complex task which is often intractable. Considerable effort has gone into developing approximate offline algorithms to solve ever larger POMDPs. However, even state-of-the-art approaches fail to solve large POMDPs in reasonable time. Recent developments in online POMDP search suggest that combining offline computations with online computations is often more efficient and can also considerably reduce the error made by approximate policies computed offline. In the same vein, we propose a new anytime online search algorithm which seeks to minimize, as efficiently as possible, the error made by an approximate value function computed offline. In addition, we show how previous online computations can be reused in subsequent time steps to avoid redundant computations. Our preliminary results indicate that our approach can handle large state and observation spaces efficiently and under real-time constraints.
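
The abstract does not spell out the search heuristic, so the following is only a minimal sketch of the general idea, under stated assumptions: a belief tree whose nodes carry lower and upper bounds from an offline approximation, an anytime loop that repeatedly expands the fringe node with the largest discounted, probability-weighted bound gap, and reuse of the matching subtree at the next time step. The heuristic weighting, the constant bounds, all names (`Node`, `search`, `reuse_subtree`, and so on) and the Tiger toy problem are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

class Node:
    """Belief node of the search tree with lower/upper bounds on its value."""
    def __init__(self, belief, depth, reach_prob, lower, upper, parent=None):
        self.belief, self.depth, self.reach_prob = belief, depth, reach_prob
        self.lower, self.upper, self.parent = lower, upper, parent
        self.children = {}  # action -> (expected reward, {obs: (prob, Node)})

def belief_update(b, a, o, T, Z):
    """Bayes filter: returns the next belief and P(o | b, a)."""
    unnorm = Z[a][:, o] * (b @ T[a])
    p = float(unnorm.sum())
    return (unnorm / p if p > 0 else unnorm), p

def fringe(node):
    """Yield all unexpanded (leaf) belief nodes under `node`."""
    if not node.children:
        yield node
    for _, kids in node.children.values():
        for _, child in kids.values():
            yield from fringe(child)

def expand(node, T, Z, R, lower_fn, upper_fn):
    """Add one layer of actions and observations below a fringe node."""
    for a in range(len(T)):
        kids = {}
        for o in range(Z[a].shape[1]):
            b2, p = belief_update(node.belief, a, o, T, Z)
            if p > 0:
                kids[o] = (p, Node(b2, node.depth + 1, node.reach_prob * p,
                                   lower_fn(b2), upper_fn(b2), parent=node))
        node.children[a] = (float(node.belief @ R[a]), kids)

def q_value(node, a, gamma, bound):
    """One-step lookahead value of action `a` using the children's bound."""
    r, kids = node.children[a]
    return r + gamma * sum(p * getattr(c, bound) for p, c in kids.values())

def backup(node, gamma):
    """Propagate tightened bounds from an expanded node up to the root."""
    while node is not None:
        node.lower = max(q_value(node, a, gamma, "lower") for a in node.children)
        node.upper = max(q_value(node, a, gamma, "upper") for a in node.children)
        node = node.parent

def search(root, n_expansions, T, Z, R, gamma, lower_fn, upper_fn):
    """Anytime loop (n_expansions >= 1): expand the fringe node with the
    largest weighted bound gap, then back up the bounds."""
    for _ in range(n_expansions):
        best = max(fringe(root), key=lambda n:
                   gamma ** n.depth * n.reach_prob * (n.upper - n.lower))
        expand(best, T, Z, R, lower_fn, upper_fn)
        backup(best, gamma)
    return max(root.children, key=lambda a: q_value(root, a, gamma, "lower"))

def reuse_subtree(root, action, obs):
    """After acting and observing, keep the matching subtree as the new root."""
    _, child = root.children[action][1][obs]
    child.parent = None
    return child

# Toy Tiger problem (assumed example): actions are listen, open-left, open-right.
gamma = 0.95
T = [np.eye(2), np.full((2, 2), 0.5), np.full((2, 2), 0.5)]
Z = [np.array([[0.85, 0.15], [0.15, 0.85]]), np.full((2, 2), 0.5), np.full((2, 2), 0.5)]
R = [np.array([-1.0, -1.0]), np.array([-100.0, 10.0]), np.array([10.0, -100.0])]
lower_fn = lambda b: -100.0 / (1 - gamma)  # crude stand-in for an offline lower bound
upper_fn = lambda b: 10.0 / (1 - gamma)    # crude stand-in for an offline upper bound

b0 = np.array([0.5, 0.5])
root = Node(b0, 0, 1.0, lower_fn(b0), upper_fn(b0))
initial_gap = root.upper - root.lower
action = search(root, 50, T, Z, R, gamma, lower_fn, upper_fn)
```

Because every fringe node under the reused subtree shares the same extra discount and observation-probability factor, the stored `depth` and `reach_prob` values can be left unchanged after `reuse_subtree` without altering which node the loop picks next. A real implementation would replace the constant bounds with an offline value function and keep the fringe in a priority queue instead of rescanning the tree.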

