Risk Sensitive Reinforcement Learning
Ralph Neuneier, Oliver Mihatsch
Vol. 49, Iss. 2, pp. 1031-1037
TL;DR
This risk-sensitive reinforcement learning algorithm is based on a very different philosophy and reflects important properties of the classical exponential utility framework, but avoids its serious drawbacks for learning.
Abstract
Most reinforcement learning algorithms optimize the expected return of a Markov Decision Problem. Practice has taught us the lesson that this criterion is not always the most suitable, because many applications require robust control strategies that also take into account the variance of the return. The classical control literature provides several techniques for dealing with risk-sensitive optimization goals: the so-called worst-case optimality criterion, which focuses exclusively on risk-avoiding policies, and classical risk-sensitive control, which transforms the returns by exponential utility functions. While the first approach is typically too restrictive, the latter suffers from the absence of an obvious way to design a corresponding model-free reinforcement learning algorithm.
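For reference, the classical risk-sensitive criterion contrasted above is commonly written as a certainty-equivalent objective; the symbols below (return R, risk parameter beta, policy pi) are standard notation supplied here, not taken from the abstract:

```latex
\max_{\pi} \; \frac{1}{\beta} \, \log \mathbb{E}_{\pi}\!\left[ e^{\beta R} \right]
```

Here beta < 0 yields risk-averse and beta > 0 risk-seeking behavior; a Taylor expansion for small beta shows the criterion behaves approximately like \(\mathbb{E}_{\pi}[R] + \tfrac{\beta}{2}\,\mathrm{Var}_{\pi}[R]\), which is why it captures variance sensitivity. The exponential inside the expectation is also what makes a model-free learning rule hard to derive from this objective directly.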
Our risk-sensitive reinforcement learning algorithm is based on a very different philosophy. Instead of transforming the return of the process, we transform the temporal differences during learning. While our approach reflects important properties of the classical exponential utility framework, we avoid its serious drawbacks for learning. Based on an extended set of optimality equations, we are able to formulate risk-sensitive versions of various well-known reinforcement learning algorithms which converge with probability one under the usual conditions.
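The transformed-temporal-difference idea can be sketched as a small risk-sensitive Q-learning update. This is a minimal illustration, not the paper's exact formulation: the weighting function `chi`, the parameter name `kappa`, and the step sizes below are all assumptions made for the sketch.

```python
import numpy as np

def chi(delta, kappa):
    """Piecewise-linear transform of the TD error (illustrative).

    kappa in (-1, 1): kappa > 0 downweights positive TD errors and
    upweights negative ones (risk-averse); kappa = 0 leaves the TD
    error unchanged, recovering the standard update.
    """
    return (1 - kappa) * delta if delta > 0 else (1 + kappa) * delta

def risk_sensitive_q_update(Q, s, a, r, s_next,
                            alpha=0.1, gamma=0.95, kappa=0.5):
    """One risk-sensitive Q-learning step: the TD error, not the
    return, is passed through the transform chi before updating."""
    delta = r + gamma * np.max(Q[s_next]) - Q[s, a]   # ordinary TD error
    Q[s, a] += alpha * chi(delta, kappa)              # transformed update
    return Q
```

With `kappa = 0` this is plain Q-learning; as `kappa` approaches 1, negative surprises dominate the update, pushing the learned policy toward worst-case behavior, while `kappa` near -1 is risk-seeking.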
Citations
Journal Article
A comprehensive survey on safe reinforcement learning
Javier García, Fernando Fernández
TL;DR: This work categorizes and analyzes two approaches to Safe Reinforcement Learning: the modification of the optimality criterion (the classic discounted finite/infinite horizon) with a safety factor, and the incorporation of external knowledge or the guidance of a risk metric.
Journal Article
Human Insula Activation Reflects Risk Prediction Errors As Well As Risk
TL;DR: Using functional imaging during a simple gambling task, it is shown that an early-onset activation in the human insula correlates significantly with risk prediction error and that its time course is consistent with a role in rapid updating.
Journal Article
Reinforcement learning: The Good, The Bad and The Ugly
Peter Dayan, Yael Niv
TL;DR: The latest dispatches from the forefront of reinforcement learning are reviewed, and some of the territories where monsters lie are mapped out.
Journal Article
Pupil Dilation Signals Surprise: Evidence for Noradrenaline's Role in Decision Making.
TL;DR: This work demonstrates that the pupil does not signal expected reward or uncertainty per se, but instead signals surprise, that is, errors in judging uncertainty, and analyses this effect with respect to a specific mathematical model of uncertainty and surprise, namely risk and risk prediction error.
Journal Article
Learning to trade via direct reinforcement
John Moody, Matthew Saffell
TL;DR: It is demonstrated how direct reinforcement can be used to optimize risk-adjusted investment returns (including the differential Sharpe ratio), while accounting for the effects of transaction costs.
References
Book
Theory of Games and Economic Behavior
TL;DR: Theory of Games and Economic Behavior is the classic work upon which modern-day game theory is based; it has been widely used to analyze a host of real-world phenomena, from arms races to optimal policy choices of presidential candidates, and from vaccination policy to major league baseball salary negotiations.
Book
Markov Decision Processes: Discrete Stochastic Dynamic Programming
TL;DR: Puterman provides a uniquely up-to-date, unified, and rigorous treatment of the theoretical, computational, and applied research on Markov decision process models, focusing primarily on infinite horizon discrete time models and models with discrete state spaces, while also examining models with arbitrary state spaces, finite horizon models, and continuous-time discrete state models.
Book
Dynamic Programming and Optimal Control
TL;DR: The leading and most up-to-date textbook on the far-ranging algorithmic methodology of Dynamic Programming, which can be used for optimal control, Markovian decision problems, planning and sequential decision making under uncertainty, and discrete/combinatorial optimization.
Journal ArticleDOI
Risk Aversion in the Small and in the Large
TL;DR: In this article, a measure of risk aversion in the small, the risk premium or insurance premium for an arbitrary risk, and a natural concept of decreasing risk aversion are discussed and related to one another.
Monograph
Markov Decision Processes
P. Whittle, M. L. Puterman
TL;DR: Markov Decision Processes covers recent research advances in such areas as countable state space models with average reward criterion, constrained models, and models with risk sensitive optimality criteria, and explores several topics that have received little or no attention in other books.