Book Chapter DOI

Partially Observable Markov Decision Processes

TLDR
This chapter presents the POMDP model by focusing on the differences with fully observable MDPs, and shows how optimal policies for POMDPs can be represented.
Abstract
For reinforcement learning in environments in which an agent has access to a reliable state signal, methods based on the Markov decision process (MDP) have had many successes. In many problem domains, however, an agent suffers from limited sensing capabilities that preclude it from recovering a Markovian state signal from its perceptions. Extending the MDP framework, partially observable Markov decision processes (POMDPs) allow for principled decision making under conditions of uncertain sensing. In this chapter we present the POMDP model by focusing on the differences with fully observable MDPs, and we show how optimal policies for POMDPs can be represented. Next, we give a review of model-based techniques for policy computation, followed by an overview of the available model-free methods for POMDPs. We conclude by highlighting recent trends in POMDP reinforcement learning.
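The abstract notes that POMDPs extend MDPs to uncertain sensing: instead of observing the state directly, the agent maintains a belief (a probability distribution over states) that it updates after each action and observation. As a minimal sketch (not taken from the chapter; the two-state model and all probabilities are assumed for illustration), the standard Bayesian belief update b'(s') ∝ O(o | s', a) · Σ_s T(s' | s, a) · b(s) can be written as:

```python
def belief_update(b, T, O, a, o):
    """Bayesian POMDP belief update.

    b[s]      : prior belief over hidden states
    T[a][s][s2]: probability of moving from s to s2 under action a
    O[a][s2][o]: probability of observing o in state s2 after action a
    Returns the normalized posterior belief after taking a and seeing o.
    """
    n = len(b)
    # Prediction step: push the belief through the transition model.
    predicted = [sum(T[a][s][s2] * b[s] for s in range(n)) for s2 in range(n)]
    # Correction step: weight by the observation likelihood.
    unnorm = [predicted[s2] * O[a][s2][o] for s2 in range(n)]
    z = sum(unnorm)
    return [u / z for u in unnorm]

# Toy two-state example (assumed numbers): a "listen" action that leaves
# the state unchanged but yields a noisy observation of it.
T = {0: [[1.0, 0.0], [0.0, 1.0]]}
O = {0: [[0.85, 0.15], [0.15, 0.85]]}
b = belief_update([0.5, 0.5], T, O, a=0, o=0)
# The belief shifts toward state 0 after evidence for it: b == [0.85, 0.15]
```

This belief state is a sufficient statistic for the observation history, which is what lets POMDP solution methods plan over beliefs rather than over raw histories.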



Citations
Journal Article DOI

Decision-theoretic planning under uncertainty with information rewards for active cooperative perception

TL;DR: This work presents the POMDP with Information Rewards (POMDP-IR) modeling framework, which rewards an agent for reaching a certain level of belief regarding a state feature, and demonstrates its use for active cooperative perception scenarios.
Journal Article

The MADP Toolbox: An Open Source Library for Planning and Learning in (Multi-)Agent Systems

TL;DR: The MultiAgent Decision Process (MADP) toolbox is an open-source software library that supports planning and learning for intelligent agents and multi-agent systems in uncertain environments, including partially observable environments and stochastic transition models.
Posted Content

User-centric Cell-free Massive MIMO Networks: A Survey of Opportunities, Challenges and Solutions.

TL;DR: This survey presents a guide to the key challenges facing the deployment of user-centric cell-free massive MIMO networks and reviews the solutions being proposed for the main bottlenecks facing cell-free communications.
Posted Content

Reinforcement Learning of POMDPs Using Spectral Methods

TL;DR: This work proposes a new reinforcement learning algorithm for partially observable Markov decision processes (POMDPs) based on spectral decomposition methods, and proves an order-optimal regret bound with respect to the optimal memoryless policy and efficient scaling with respect to the dimensionality of observation and action spaces.
Journal Article DOI

Review of mission planning for autonomous marine vehicle fleets

TL;DR: A critical review of the current advances in automated planning for AMV fleets is presented, investigating the limitations of available state‐of‐the‐art tools and providing a road map of the goals and challenges based on analysis of field reports and end user initiatives.
References
Journal Article DOI

Long short-term memory

TL;DR: A novel, efficient, gradient based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.
Book

Reinforcement Learning: An Introduction

TL;DR: This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning, which ranges from the history of the field's intellectual foundations to the most recent developments and applications.
Book

Probabilistic Robotics

TL;DR: This research presents a novel approach to planning and navigation algorithms that exploit statistics gleaned from uncertain, imperfect real-world environments to guide robots toward their goals and around obstacles.
Journal Article DOI

Planning and Acting in Partially Observable Stochastic Domains

TL;DR: A novel algorithm for solving POMDPs offline is outlined, along with how, in some cases, a finite-memory controller can be extracted from the solution to a POMDP.
Journal Article DOI

The Optimal Control of Partially Observable Markov Processes over a Finite Horizon

TL;DR: This article formulates the optimal control problem for a class of mathematical models in which the system to be controlled is characterized by a finite-state discrete-time Markov process.