A survey of multi-objective sequential decision-making

doi:10.1613/JAIR.3987

Diederik M. Roijers, +3 more

- 01 Oct 2013 -

Journal of Artificial Intelligence Resea...

- Vol. 48, Iss: 1, pp 67-113

TLDR

This article surveys algorithms designed for sequential decision-making problems with multiple objectives and proposes a taxonomy that classifies multi-objective methods according to the applicable scenario, the nature of the scalarization function, and the type of policies considered.

Abstract:

Sequential decision-making problems with multiple objectives arise naturally in practice and pose unique challenges for research in decision-theoretic planning and learning, which has largely focused on single-objective settings. This article surveys algorithms designed for sequential decision-making problems with multiple objectives. Though there is a growing body of literature on this subject, little of it makes explicit under what circumstances special methods are needed to solve multi-objective problems. Therefore, we identify three distinct scenarios in which converting such a problem to a single-objective one is impossible, infeasible, or undesirable. Furthermore, we propose a taxonomy that classifies multi-objective methods according to the applicable scenario, the nature of the scalarization function (which projects multi-objective values to scalar ones), and the type of policies considered. We show how these factors determine the nature of an optimal solution, which can be a single policy, a convex hull, or a Pareto front. Using this taxonomy, we survey the literature on multi-objective methods for planning and learning. Finally, we discuss key applications of such methods and outline opportunities for future work.
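The distinction the abstract draws between a Pareto front and a convex hull can be illustrated with a small sketch. The code below is not taken from the article: the value vectors in `policy_values` are made-up numbers for five hypothetical policies with two objectives (higher is better), and the helper names are illustrative. It assumes a linear scalarization, i.e. a weighted sum of the objectives, as one possible choice of scalarization function.

```python
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(values: List[Vector]) -> List[Vector]:
    """Keep every value vector that no other vector dominates."""
    return [v for v in values if not any(dominates(u, v) for u in values)]


def linear_scalarize(value: Sequence[float], weights: Sequence[float]) -> float:
    """Project a multi-objective value to a scalar with a weighted sum."""
    return sum(w * x for w, x in zip(weights, value))


def best_under_weights(values: List[Vector], weights: Sequence[float]) -> Vector:
    """Return the value vector that maximizes the linear scalarization."""
    return max(values, key=lambda v: linear_scalarize(v, weights))


# Hypothetical value vectors, one per policy (assumed data, not from the paper).
policy_values = [(1.0, 9.0), (4.0, 4.0), (9.0, 1.0), (3.0, 3.0)]

front = pareto_front(policy_values)

# Policies that are optimal for at least one weight vector on a coarse grid.
linear_optimal = {
    best_under_weights(policy_values, (k / 10, 1 - k / 10)) for k in range(11)
}
```

In this toy data, `(4.0, 4.0)` is undominated and therefore on the Pareto front, yet no weight on the grid selects it, because one of the two extreme policies always scores at least 5 under a weighted sum. This is why the set of solutions worth keeping depends on what is assumed about the scalarization function.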

Citations

Proceedings Article

Efficient solutions for Stochastic Shortest Path Problems with Dead Ends.

Felipe W. Trevizan, +2 more

TL;DR: This work studies a new, perhaps more natural optimization criterion capturing these problems, the Min-Cost given MaxProb (MCMP) criterion, which leads to the minimum expected cost policy among those with maximum success probability, and accurately accounts for the cost and risk of reaching dead ends.
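As a rough illustration of the criterion described in this summary, the sketch below applies a lexicographic selection to a finite list of candidate policies: first maximize success probability, then minimize expected cost among the maximizers. The `PolicySummary` type, the tolerance `tol`, and the numbers are assumptions made for the example; the cited work computes such policies over an SSP model rather than by enumerating candidates.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PolicySummary:
    """Hypothetical summary of a policy: goal probability and expected cost."""
    name: str
    success_prob: float
    expected_cost: float


def mcmp_select(policies: List[PolicySummary], tol: float = 1e-9) -> PolicySummary:
    """Pick the minimum-cost policy among those with maximum success probability."""
    best_prob = max(p.success_prob for p in policies)
    # Keep only policies whose success probability ties the maximum (within tol).
    candidates = [p for p in policies if best_prob - p.success_prob <= tol]
    return min(candidates, key=lambda p: p.expected_cost)


# Assumed example data: 'cheap' is cheapest overall but reaches the goal less often.
candidates = [
    PolicySummary("cheap", success_prob=0.70, expected_cost=3.0),
    PolicySummary("safe_slow", success_prob=0.95, expected_cost=12.0),
    PolicySummary("safe_fast", success_prob=0.95, expected_cost=8.0),
]

chosen = mcmp_select(candidates)
```

Here `chosen` is the `safe_fast` entry: the cheaper policy is excluded because its success probability is lower, and cost only breaks the tie among the remaining two.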

Journal ArticleDOI

Dynamic multi-objective optimisation using deep reinforcement learning: benchmark, algorithm and an application to identify vulnerable zones based on water quality

Mahmudul Hasan, +5 more

- 01 Nov 2019 -

Engineering Applications of Artificial I...

TL;DR: The outcome of the implementation reveals that the proposed parity-Q deep Q network (PQDQN) algorithm is an efficient way to optimise the decision in a dynamic environment and performs better compared to the other state-of-the-art solutions both in the simulated and the real-world scenario.

Journal ArticleDOI

Identification and off-policy learning of multiple objectives using adaptive clustering

Thommen George Karimpanal, +1 more

- 08 Nov 2017 -

Neurocomputing

TL;DR: Using a simulated agent and environment, it is shown that the converged or partially converged value function weights resulting from off-policy learning can be used to accumulate knowledge about multiple objectives without any additional exploration.

Posted Content

Multi-objective Bandits: Optimizing the Generalized Gini Index

Róbert Busa-Fekete, +3 more

- 15 Jun 2017 -

arXiv: Learning

TL;DR: An online gradient descent algorithm is proposed which exploits the convexity of the GGI aggregation function and carefully controls exploration, achieving a distribution-free regret of $\tilde{O}(T^{-1/2})$ with high probability.
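For readers unfamiliar with the aggregation function named here, the following is a minimal sketch of a Generalized Gini Index. It assumes the cost convention, in which the components of a cost vector are sorted in decreasing order and paired with non-increasing positive weights, so the worst objective counts most. The weights `(1.0, 0.5)` and the cost vectors are made up for illustration, and the function name `ggi` is hypothetical.

```python
from typing import Sequence


def ggi(costs: Sequence[float], weights: Sequence[float]) -> float:
    """Generalized Gini Index of a cost vector (lower is better).

    Assumes weights are non-increasing, so the largest cost gets the largest weight.
    """
    if len(costs) != len(weights):
        raise ValueError("costs and weights must have the same length")
    if any(w1 < w2 for w1, w2 in zip(weights, weights[1:])):
        raise ValueError("weights must be non-increasing")
    ordered = sorted(costs, reverse=True)  # worst (largest) cost first
    return sum(w * c for w, c in zip(weights, ordered))


weights = (1.0, 0.5)          # assumed example weights
unbalanced = ggi((1.0, 3.0), weights)
balanced = ggi((2.0, 2.0), weights)
```

Both cost vectors sum to 4, but the balanced one has the lower index, which is the fairness-favoring behavior that motivates this kind of aggregation. The balanced vector is also the midpoint of `(1.0, 3.0)` and `(3.0, 1.0)`, and its index does not exceed the average of theirs, consistent with the convexity mentioned in the summary.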

Book ChapterDOI

Bayesian Best-Arm Identification for Selecting Influenza Mitigation Strategies

Pieter Libin, +6 more

TL;DR: A new sampling technique is presented to optimize the evaluation of preventive strategies using fixed budget best-arm identification algorithms and it is demonstrated that it is possible to identify the optimal strategy using only a limited number of model evaluations.

References

Book

Dynamic Programming

Richard Ernest Bellman

TL;DR: The more the authors study the information processing aspects of the mind, the more perplexed and impressed they become, and it will be a very long time before they understand these processes sufficiently to reproduce them.

Book

Markov Decision Processes: Discrete Stochastic Dynamic Programming

Martin L. Puterman

TL;DR: This book provides an up-to-date, unified, and rigorous treatment of the theoretical, computational, and applied research on Markov decision process models, focusing primarily on infinite-horizon discrete-time models with discrete state spaces, while also examining models with arbitrary state spaces, finite-horizon models, and continuous-time discrete-state models.

Book

Introduction to Reinforcement Learning

Richard S. Sutton, +1 more

TL;DR: In Reinforcement Learning, Richard Sutton and Andrew Barto provide a clear and simple account of the key ideas and algorithms of reinforcement learning.

Book

Evolutionary algorithms for solving multi-objective problems

Gary B. Lamont, +1 more

TL;DR: This paper presents a meta-anatomy of the multi-Criteria Decision Making process, which aims to provide a scaffolding for the future development of multi-criteria decision-making systems.

Proceedings Article

Policy Gradient Methods for Reinforcement Learning with Function Approximation

Richard S. Sutton, +3 more

TL;DR: This paper proves for the first time that a version of policy iteration with arbitrary differentiable function approximation is convergent to a locally optimal policy.
