Book Chapter DOI

Partially Observable Markov Decision Processes

TLDR
This chapter presents the POMDP model by focusing on the differences with fully observable MDPs, and shows how optimal policies for POMDPs can be represented.
Abstract
For reinforcement learning in environments in which an agent has access to a reliable state signal, methods based on the Markov decision process (MDP) have had many successes. In many problem domains, however, an agent suffers from limited sensing capabilities that preclude it from recovering a Markovian state signal from its perceptions. Extending the MDP framework, partially observable Markov decision processes (POMDPs) allow for principled decision making under conditions of uncertain sensing. In this chapter we present the POMDP model by focusing on the differences with fully observable MDPs, and we show how optimal policies for POMDPs can be represented. Next, we give a review of model-based techniques for policy computation, followed by an overview of the available model-free methods for POMDPs. We conclude by highlighting recent trends in POMDP reinforcement learning.
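The abstract notes that POMDPs extend MDPs to uncertain sensing: instead of observing the state directly, the agent maintains a belief (a probability distribution over states) that it updates after each action and observation. As a minimal sketch (not taken from the chapter; the two-state model and all probabilities are assumed for illustration), the standard Bayesian belief update b'(s') ∝ O(o | s', a) · Σ_s T(s' | s, a) · b(s) can be written as:

```python
def belief_update(b, T, O, a, o):
    """Bayesian POMDP belief update.

    b[s]      : prior belief over hidden states
    T[a][s][s2]: probability of moving from s to s2 under action a
    O[a][s2][o]: probability of observing o in state s2 after action a
    Returns the normalized posterior belief after taking a and seeing o.
    """
    n = len(b)
    # Prediction step: push the belief through the transition model.
    predicted = [sum(T[a][s][s2] * b[s] for s in range(n)) for s2 in range(n)]
    # Correction step: weight by the observation likelihood.
    unnorm = [predicted[s2] * O[a][s2][o] for s2 in range(n)]
    z = sum(unnorm)
    return [u / z for u in unnorm]

# Toy two-state example (assumed numbers): a "listen" action that leaves
# the state unchanged but yields a noisy observation of it.
T = {0: [[1.0, 0.0], [0.0, 1.0]]}
O = {0: [[0.85, 0.15], [0.15, 0.85]]}
b = belief_update([0.5, 0.5], T, O, a=0, o=0)
# The belief shifts toward state 0 after evidence for it: b == [0.85, 0.15]
```

This belief state is a sufficient statistic for the observation history, which is what lets POMDP solution methods plan over beliefs rather than over raw histories.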



Citations
Journal Article DOI

Decision-theoretic planning under uncertainty with information rewards for active cooperative perception

TL;DR: This work presents the POMDP with Information Rewards (POMDP-IR) modeling framework, which rewards an agent for reaching a certain level of belief regarding a state feature, and demonstrates its use for active cooperative perception scenarios.
Journal Article

The MADP Toolbox: An Open Source Library for Planning and Learning in (Multi-)Agent Systems

TL;DR: The MultiAgent Decision Process (MADP) toolbox is an open-source software library that supports planning and learning for intelligent agents and multi-agent systems in uncertain environments, including partially observable environments and stochastic transition models.
Posted Content

User-centric Cell-free Massive MIMO Networks: A Survey of Opportunities, Challenges and Solutions.

TL;DR: This survey presents a guide to the key challenges facing the deployment of user-centric cell-free massive MIMO networks and reviews the solutions being proposed for the main bottlenecks facing cell-free communications.
Posted Content

Reinforcement Learning of POMDPs Using Spectral Methods

TL;DR: This work proposes a new reinforcement learning algorithm for partially observable Markov decision processes (POMDPs) based on spectral decomposition methods, and proves an order-optimal regret bound with respect to the optimal memoryless policy and efficient scaling with respect to the dimensionality of observation and action spaces.
Journal Article DOI

Review of mission planning for autonomous marine vehicle fleets

TL;DR: A critical review of the current advances in automated planning for AMV fleets is presented, investigating the limitations of available state‐of‐the‐art tools and providing a road map of the goals and challenges based on analysis of field reports and end user initiatives.
References
Journal Article DOI

Long short-term memory

TL;DR: A novel, efficient, gradient based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.
Book

Reinforcement Learning: An Introduction

TL;DR: This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning, which ranges from the history of the field's intellectual foundations to the most recent developments and applications.
Book

Probabilistic Robotics

TL;DR: This research presents a novel approach to planning and navigation algorithms that exploit statistics gleaned from uncertain, imperfect real-world environments to guide robots toward their goals and around obstacles.
Journal Article DOI

Planning and Acting in Partially Observable Stochastic Domains

TL;DR: A novel algorithm for solving POMDPs offline is outlined, along with how, in some cases, a finite-memory controller can be extracted from the solution to a POMDP.
Journal Article DOI

The Optimal Control of Partially Observable Markov Processes over a Finite Horizon

TL;DR: This article formulates the optimal control problem for a class of mathematical models in which the system to be controlled is characterized by a finite-state discrete-time Markov process.