Open Access · Journal Article · DOI

Risk Sensitive Reinforcement Learning

Ralph Neuneier, +1 more
- Vol. 49, Iss. 2, pp. 1031–1037
TLDR
This risk-sensitive reinforcement learning algorithm is based on a very different philosophy: rather than transforming the return of the process, it transforms the temporal differences during learning. It thereby reflects important properties of the classical exponential utility framework while avoiding that framework's serious drawbacks for learning.
Abstract
Most reinforcement learning algorithms optimize the expected return of a Markov Decision Problem. Practice has taught us that this criterion is not always the most suitable, because many applications require robust control strategies that also take into account the variance of the return. The classical control literature provides several techniques for dealing with risk-sensitive optimization goals, such as the so-called worst-case optimality criterion, which focuses exclusively on risk-avoiding policies, or classical risk-sensitive control, which transforms the returns by exponential utility functions. While the first approach is typically too restrictive, the latter suffers from the absence of an obvious way to design a corresponding model-free reinforcement learning algorithm. Our risk-sensitive reinforcement learning algorithm is based on a very different philosophy: instead of transforming the return of the process, we transform the temporal differences during learning. While our approach reflects important properties of the classical exponential utility framework, it avoids that framework's serious drawbacks for learning. Based on an extended set of optimality equations, we are able to formulate risk-sensitive versions of various well-known reinforcement learning algorithms, which converge with probability one under the usual conditions.
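For readers who want the mechanics, the following is a minimal sketch of the idea in tabular Q-learning form: the temporal difference, rather than the return, is weighted asymmetrically by a risk parameter kappa, with kappa = 0 recovering the standard update. The environment interface (reset/step) and all names are illustrative assumptions for the sketch, not taken from the paper.

import numpy as np

def transform_td(delta, kappa):
    # Asymmetrically weight the temporal difference.
    # kappa in (-1, 1): kappa > 0 penalizes negative surprises more strongly
    # (risk-averse), kappa < 0 favors them (risk-seeking), kappa = 0 gives
    # ordinary Q-learning.
    return (1.0 - kappa) * delta if delta > 0 else (1.0 + kappa) * delta

def risk_sensitive_q_learning(env, n_states, n_actions, kappa=0.5,
                              alpha=0.1, gamma=0.95, episodes=500):
    # Tabular Q-learning where the TD error, not the return, is transformed.
    # env is assumed to expose reset() -> state and
    # step(action) -> (next_state, reward, done).
    Q = np.zeros((n_states, n_actions))
    for _ in range(episodes):
        state = env.reset()
        done = False
        while not done:
            # epsilon-greedy exploration
            if np.random.rand() < 0.1:
                action = np.random.randint(n_actions)
            else:
                action = int(np.argmax(Q[state]))
            next_state, reward, done = env.step(action)
            target = reward + (0.0 if done else gamma * np.max(Q[next_state]))
            delta = target - Q[state, action]
            # risk-sensitive step: update with the transformed TD error
            Q[state, action] += alpha * transform_td(delta, kappa)
            state = next_state
    return Q

With positive kappa, negative surprises are weighted more heavily than positive ones, biasing the learned values, and hence the greedy policy, toward risk-averse behavior; negative kappa does the opposite.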

Citations
Journal Article

A comprehensive survey on safe reinforcement learning

TL;DR: This work categorizes and analyzes two approaches to Safe Reinforcement Learning: modifying the optimality criterion (the classic discounted finite/infinite-horizon return) with a safety factor, and incorporating external knowledge or the guidance of a risk metric.
Journal Article · DOI

Human Insula Activation Reflects Risk Prediction Errors As Well As Risk

TL;DR: Using functional imaging during a simple gambling task, it is shown that an early-onset activation in the human insula correlates significantly with risk prediction error and that its time course is consistent with a role in rapid updating.
Journal Article · DOI

Reinforcement learning: The Good, The Bad and The Ugly

TL;DR: The latest dispatches from the forefront of reinforcement learning are reviewed, some of the territories where lie monsters are mapped out, and the future of reinforcement learning is charted.
Journal Article · DOI

Pupil Dilation Signals Surprise: Evidence for Noradrenaline's Role in Decision Making.

TL;DR: This work demonstrates that the pupil does not signal expected reward or uncertainty per se, but instead signals surprise, that is, errors in judging uncertainty, and analyses this effect with respect to a specific mathematical model of uncertainty and surprise, namely risk and risk prediction error.
Journal Article · DOI

Learning to trade via direct reinforcement

TL;DR: It is demonstrated how direct reinforcement can be used to optimize risk-adjusted investment returns (including the differential Sharpe ratio), while accounting for the effects of transaction costs.
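As a side note on the differential Sharpe ratio mentioned in this citation: it is commonly maintained online from exponential moving estimates of the first and second moments of the returns. The sketch below follows that standard moving-average formulation; it illustrates the cited direct-reinforcement work, not the present paper, and the class and variable names are illustrative assumptions.

class DifferentialSharpe:
    # Online differential Sharpe ratio from exponential moving moments.
    # A tracks the mean return, B the mean squared return; eta is the
    # moving-average adaptation rate.

    def __init__(self, eta=0.01):
        self.eta = eta
        self.A = 0.0  # moving estimate of E[R]
        self.B = 0.0  # moving estimate of E[R^2]

    def update(self, r):
        dA = r - self.A
        dB = r * r - self.B
        denom = (self.B - self.A ** 2) ** 1.5
        # marginal effect of the latest return on the moving Sharpe ratio
        d_sharpe = 0.0 if denom <= 0 else (self.B * dA - 0.5 * self.A * dB) / denom
        self.A += self.eta * dA
        self.B += self.eta * dB
        return d_sharpe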
References
Book

Theory of Games and Economic Behavior

TL;DR: Theory of Games and Economic Behavior is the classic work upon which modern-day game theory is based; it has been widely used to analyze a host of real-world phenomena, from arms races to the optimal policy choices of presidential candidates, and from vaccination policy to major league baseball salary negotiations.
Book

Markov Decision Processes: Discrete Stochastic Dynamic Programming

TL;DR: Puterman provides a uniquely up-to-date, unified, and rigorous treatment of the theoretical, computational, and applied research on Markov decision process models, focusing primarily on infinite-horizon discrete-time models and models with discrete state spaces, while also examining models with arbitrary state spaces, finite-horizon models, and continuous-time discrete-state models.
Book

Dynamic Programming and Optimal Control

TL;DR: The leading and most up-to-date textbook on the far-ranging algorithmic methodology of Dynamic Programming, which can be used for optimal control, Markovian decision problems, planning and sequential decision making under uncertainty, and discrete/combinatorial optimization.
Journal Article · DOI

Risk Aversion in the Small and in the Large

John W. Pratt
- 01 Jan 1964
TL;DR: In this article, a measure of risk aversion in the small, the risk premium or insurance premium for an arbitrary risk, and a natural concept of decreasing risk aversion are discussed and related to one another.
Monograph · DOI

Markov Decision Processes

TL;DR: Markov Decision Processes covers recent research advances in such areas as countable state space models with average reward criterion, constrained models, and models with risk sensitive optimality criteria, and explores several topics that have received little or no attention in other books.