Open Access · Journal Article · DOI

Tree-based reinforcement learning for optimal water reservoir operation

TL;DR
In this paper, a reinforcement learning approach, called fitted Q-iteration, is presented: it combines the principle of continuous approximation of the value functions with a process of learning off-line from experience to design daily, cyclostationary operating policies.
Abstract
Although it is one of the most popular and extensively studied approaches to designing water reservoir operations, Stochastic Dynamic Programming is plagued by a dual curse that makes it unsuitable for large water systems: the computational requirement grows exponentially with the number of state variables considered (curse of dimensionality), and an explicit model must be available to describe every system transition and the associated rewards/costs (curse of modeling). A variety of simplifications and approximations have been devised in the past which, in many cases, make the resulting operating policies inefficient and of little relevance in practical contexts. In this paper, a reinforcement-learning approach, called fitted Q-iteration, is presented: it combines the principle of continuous approximation of the value functions with a process of learning off-line from experience to design daily, cyclostationary operating policies. The continuous approximation, performed via tree-based regression, makes it possible to mitigate the curse of dimensionality by adopting a very coarse discretization grid with respect to the dense grid required to design an equally performing policy via Stochastic Dynamic Programming. The learning experience, in the form of a data set generated by combining historical observations and model simulations, allows us to overcome the curse of modeling. The Lake Como water system (Italy) is used as the study site to infer general guidelines on appropriate settings for the algorithm parameters and to demonstrate the advantages of the approach in terms of accuracy and computational effectiveness compared to traditional Stochastic Dynamic Programming.
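The core loop described in the abstract can be sketched in code. The following is a minimal, hypothetical illustration of fitted Q-iteration with a tree-based regressor (scikit-learn's `ExtraTreesRegressor` standing in for the paper's tree-based regression): a batch of one-step transitions (state, action, reward, next state) is collected once, and the Q-function is refitted repeatedly against bootstrapped targets. The toy reservoir mass balance, the demand target, and all parameter values below are illustrative assumptions, not the paper's model of Lake Como.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor

rng = np.random.default_rng(0)

# Hypothetical toy reservoir: state = normalized storage, action = release fraction.
actions = np.linspace(0.0, 1.0, 5)          # coarsely discretized release decisions

def step(storage, release_frac, inflow):
    """Toy mass-balance transition; reward penalizes deviation from a demand target."""
    release = release_frac * storage
    next_storage = np.clip(storage - release + inflow, 0.0, 1.0)
    reward = -abs(release - 0.3)            # hypothetical water-demand target
    return next_storage, reward

# Learning experience: a batch of one-step transitions (x, u, r, x'),
# playing the role of the historical/simulated data set in the paper.
X = rng.uniform(0.0, 1.0, size=2000)        # storages
U = rng.choice(actions, size=2000)          # applied releases
inflows = rng.uniform(0.0, 0.4, size=2000)  # sampled inflows
Xn, R = step(X, U, inflows)

gamma, n_iters = 0.95, 20
model = None
for _ in range(n_iters):                    # fitted Q-iteration loop
    if model is None:
        targets = R                          # Q_1 = immediate reward
    else:
        # max over actions of Q_{N-1}(x', u') via the current tree ensemble
        q_next = np.column_stack([
            model.predict(np.column_stack([Xn, np.full_like(Xn, a)]))
            for a in actions
        ])
        targets = R + gamma * q_next.max(axis=1)
    model = ExtraTreesRegressor(n_estimators=30, random_state=0)
    model.fit(np.column_stack([X, U]), targets)

def policy(storage):
    """Greedy policy: the release maximizing the learned Q at a given storage."""
    q = [model.predict([[storage, a]])[0] for a in actions]
    return actions[int(np.argmax(q))]
```

Because the regression is refit from the fixed batch at every iteration, no simulation model is queried during learning, which is the sense in which the approach sidesteps the curse of modeling.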


Citations
Journal Article · DOI

A survey of multi-objective sequential decision-making

TL;DR: This article surveys algorithms designed for sequential decision-making problems with multiple objectives and proposes a taxonomy that classifies multi-objective methods according to the applicable scenario, the nature of the scalarization function, and the type of policies considered.
Journal Article · DOI

A Brief Review of Random Forests for Water Scientists and Practitioners and Their Recent History in Water Resources

TL;DR: This work popularizes RF and their variants for the practicing water scientist, and discusses related concepts and techniques, which have received less attention from the water science and hydrologic communities.
Journal Article · DOI

Curses, Tradeoffs, and Scalable Management: Advancing Evolutionary Multiobjective Direct Policy Search to Improve Water Reservoir Operations

TL;DR: This analysis explores the technical and practical implications of using EMODPS through a careful diagnostic assessment of the effectiveness and reliability of the overall EMODPS solution design as well as of the resulting Pareto-approximate operating policies.
Journal Article · DOI

Position Paper: A general framework for Dynamic Emulation Modelling in environmental problems

TL;DR: The main aim of the paper is to provide an introduction to emulation modelling together with a unified strategy for its application, so that modellers from different disciplines can better appreciate how it may be applied in their area of expertise.
Journal Article · DOI

Robustness Metrics: How Are They Calculated, When Should They Be Used and Why Do They Give Different Results?

TL;DR: A conceptual framework is introduced describing when relative robustness values of decision alternatives obtained using different metrics are likely to agree or disagree; it serves as a measure of how "stable" the ranking of decision alternatives is when determined using different robustness metrics.
References
Journal Article · DOI

Random Forests

TL;DR: Internal estimates monitor error, strength, and correlation and these are used to show the response to increasing the number of features used in the forest, and are also applicable to regression.
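The "internal estimates" this summary refers to are out-of-bag (OOB) estimates: each tree is trained on a bootstrap sample, so the samples it never saw can score it without a held-out set. A minimal sketch with scikit-learn's `RandomForestRegressor` on a hypothetical synthetic dataset (not the paper's data):

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Synthetic regression task purely for illustration.
X, y = make_regression(n_samples=300, n_features=8, noise=0.5, random_state=0)

# oob_score=True asks the forest to compute its internal (out-of-bag) R^2.
rf = RandomForestRegressor(n_estimators=200, oob_score=True, random_state=0)
rf.fit(X, y)
print(rf.oob_score_)   # error estimate obtained without any held-out data
```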
Book

Reinforcement Learning: An Introduction

TL;DR: This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning, which ranges from the history of the field's intellectual foundations to the most recent developments and applications.
Journal Article · DOI

Bagging predictors

Leo Breiman
TL;DR: Tests on real and simulated data sets using classification and regression trees and subset selection in linear regression show that bagging can give substantial gains in accuracy.
Book

Classification and regression trees

Leo Breiman
TL;DR: The methodology used to construct tree-structured rules is the focus of this monograph, which covers the use of trees as a data analysis method and, in a more mathematical framework, proves some of their fundamental properties.
Book

Dynamic Programming

TL;DR: The more the authors study the information processing aspects of the mind, the more perplexed and impressed they become, and it will be a very long time before they understand these processes sufficiently to reproduce them.