Open Access · Posted Content

Gradient Flows for Regularized Stochastic Control Problems

TLDR
This work constructs a gradient flow for the measure-valued control process along which the cost functional is guaranteed to decrease, and shows that, under appropriate conditions, this gradient flow has an invariant measure which is the optimal control for the regularized stochastic control problem.
Abstract
This paper studies stochastic control problems regularized by the relative entropy, where the action space is a space of measures. This setting includes relaxed control problems and problems of finding Markovian controls with the control function replaced by an idealized infinitely wide neural network, and it can be extended to the search for causal optimal transport maps. By exploiting the Pontryagin optimality principle, we identify a suitable metric space on which we construct a gradient flow for the measure-valued control process along which the cost functional is guaranteed to decrease. It is shown that, under appropriate conditions, this gradient flow has an invariant measure which is the optimal control for the regularized stochastic control problem. If the problem is sufficiently convex, the gradient flow converges exponentially fast. Furthermore, the optimal measure-valued control admits a Bayesian interpretation, which means that one can incorporate prior knowledge when solving a stochastic control problem. This work is motivated by a desire to extend the theoretical underpinning for the convergence of stochastic-gradient-type algorithms widely used in the reinforcement learning community to solve control problems.
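To fix ideas, here is a schematic of the objective described above, in illustrative notation (the paper's precise setup may differ): the control is a measure-valued process $(m_t)$, penalized by relative entropy against a reference measure $\gamma$ with weight $\sigma > 0$.

```latex
% Entropy-regularized control objective, schematically (notation illustrative):
% the control is the measure-valued process (m_t); sigma > 0 weights the
% relative-entropy penalty against a fixed reference measure gamma.
J^{\sigma}(m) \;=\; \mathbb{E}\!\left[ \int_0^T \!\left( \int_A f(t, X_t, a)\, m_t(da)
  \;+\; \sigma\, \mathrm{KL}\!\left(m_t \,\middle\|\, \gamma\right) \right) dt \;+\; g(X_T) \right],
\qquad
dX_t = \int_A b(t, X_t, a)\, m_t(da)\, dt + \Sigma\, dW_t .
```

The Pontryagin principle then suggests the minimizer is a Gibbs-type measure, $m_t^*(da) \propto \exp\!\left(-H(t, X_t, a)/\sigma\right)\gamma(da)$ for the associated Hamiltonian $H$; this is the source of the Bayesian interpretation, with $\gamma$ playing the role of a prior.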


Citations
Posted Content

Robust pricing and hedging via neural SDEs.

TL;DR: Combining neural networks with risk models based on classical stochastic differential equations (SDEs) yields a model called a neural SDE, which is an instantiation of a generative model and is closely linked with the theory of causal optimal transport.
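A minimal sketch of the idea, with an invented architecture: the drift and diffusion of the SDE are the outputs of a small neural network, and sample paths come from an Euler–Maruyama scheme. All names and constants below are illustrative, not the paper's.

```python
# Minimal "neural SDE" sketch: drift and diffusion given by a tiny network,
# simulated with Euler-Maruyama. Architecture and constants are made up.
import numpy as np

rng = np.random.default_rng(0)

# One-hidden-layer network mapping (t, x) -> (drift, log-diffusion).
W1, b1 = rng.normal(size=(16, 2)) * 0.3, np.zeros(16)
W2, b2 = rng.normal(size=(2, 16)) * 0.3, np.zeros(2)

def net(t, x):
    h = np.tanh(W1 @ np.array([t, x]) + b1)
    out = W2 @ h + b2
    return out[0], np.exp(out[1])  # drift mu(t,x), diffusion sigma(t,x) > 0

def simulate(x0=1.0, T=1.0, n_steps=100):
    dt = T / n_steps
    x, path = x0, [x0]
    for i in range(n_steps):
        mu, sigma = net(i * dt, x)
        x = x + mu * dt + sigma * np.sqrt(dt) * rng.normal()
        path.append(x)
    return np.array(path)

print(simulate()[-1])  # terminal value of one sample path
```

In practice the weights W1, b1, W2, b2 would be trained, e.g. by matching model prices to market prices, which is the calibration problem such models address.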
Posted Content

Exploratory LQG Mean Field Games with Entropy Regularization.

TL;DR: It is demonstrated that the optimal set of action distributions yields an $\epsilon$-Nash equilibrium for the finite-population entropy-regularized MFG.
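Schematically, entropy regularization replaces actions with action distributions $\pi_t$ and rewards their differential entropy; the notation here is illustrative, not the paper's.

```latex
% Entropy-regularized (exploratory) control objective, schematically:
\min_{\pi}\; \mathbb{E}\!\left[ \int_0^T \left( \int_{\mathbb{R}} c(t, X_t, a)\,\pi_t(da)
  \;-\; \lambda\,\mathcal{H}(\pi_t) \right) dt + g(X_T) \right],
\qquad
\mathcal{H}(\pi) = -\int_{\mathbb{R}} \pi(a)\log \pi(a)\, da .
```

A profile of such action distributions is an $\epsilon$-Nash equilibrium when no single player among the $N$ can reduce its cost by more than $\epsilon$ by deviating unilaterally.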
Proceedings Article

Convergence of policy gradient for entropy regularized MDPs with neural network approximation in the mean-field regime

TL;DR: The results rely on a careful analysis of the nonlinear Fokker–Planck–Kolmogorov equation and extend the pioneering work of Mei et al. (2020) and Agarwal et al. (2020), which quantify the global convergence rate of policy gradient for entropy-regularized MDPs in the tabular setting.
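In the simplest (one-state, tabular softmax) instance of the problem these works analyze, the entropy-regularized policy gradient can be written down exactly. The following sketch uses made-up rewards, temperature, and step size:

```python
# Entropy-regularized policy gradient in the simplest tabular softmax
# setting (a single state); the cited works treat the general MDP and
# mean-field regimes. All constants are illustrative.
import numpy as np

r = np.array([1.0, 0.5, 0.0])   # rewards for 3 actions
tau = 0.1                        # entropy-regularization temperature
theta = np.zeros(3)              # softmax logits

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

for step in range(500):
    pi = softmax(theta)
    # Exact gradient of E_pi[r] + tau * H(pi) w.r.t. softmax logits:
    adv = r - tau * (np.log(pi) + 1.0)
    grad = pi * (adv - pi @ adv)
    theta += 0.5 * grad

print(softmax(theta))  # converges to the Gibbs policy, prop. to exp(r / tau)
```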
Posted Content

Ergodicity of the underdamped mean-field Langevin dynamics

TL;DR: The long time behavior of an underdamped mean-field Langevin (MFL) equation is studied, and a general convergence as well as an exponential convergence rate result under different conditions are provided.
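A minimal particle discretization of underdamped Langevin dynamics with a mean-field interaction, purely illustrative (the potential, interaction term, and constants are ours, not the paper's):

```python
# Particle sketch of underdamped (kinetic) mean-field Langevin dynamics:
# each particle carries position x and velocity v; the drift depends on
# the empirical measure through its mean. Everything here is a toy choice.
import numpy as np

rng = np.random.default_rng(1)
N, dt, gamma, beta = 500, 0.01, 1.0, 1.0  # particles, step, friction, inv. temp.

x = rng.normal(size=N)
v = np.zeros(N)

def force(x):
    # Confining potential V(x) = x^4/4 - x^2/2, plus a weak attraction
    # toward the empirical mean (the mean-field interaction).
    return -(x**3 - x) - 0.5 * (x - x.mean())

for _ in range(2000):
    # dx = v dt,  dv = (force - gamma * v) dt + sqrt(2 gamma / beta) dW
    v += (force(x) - gamma * v) * dt \
         + np.sqrt(2 * gamma / beta * dt) * rng.normal(size=N)
    x += v * dt

print(x.mean(), x.var())  # statistics of the (approximate) invariant law
```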
Journal ArticleDOI

A modified MSA for stochastic control problems

TL;DR: In this article, a modified MSA is shown to converge for general stochastic control problems with control in both the drift and diffusion coefficients, under some additional assumptions. The results are valid without restrictions on the time horizon of the control problem, in contrast to iterative methods based on the theory of forward-backward stochastic differential equations.
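The MSA alternates a forward state pass, a backward adjoint pass, and a Hamiltonian-minimization step for the control; the modification is, in spirit, a penalty keeping each new control close to the previous one. A deterministic, scalar toy version (all dynamics, costs, and constants are made up; the paper treats the stochastic case):

```python
# Schematic (modified) MSA iteration on a discretized scalar LQR problem:
# forward state pass, backward adjoint pass, then a proximally penalized
# Hamiltonian-minimization step. Purely illustrative.
import numpy as np

T, n = 1.0, 50
dt = T / n
rho = 1.0                                # proximal penalty weight

b = lambda x, a: a                        # drift
f = lambda x, a: 0.5 * (x**2 + a**2)      # running cost
g = lambda x: 0.5 * x**2                  # terminal cost

a = np.zeros(n)                           # initial control guess
for it in range(200):
    # Forward: state under the current control.
    x = np.zeros(n + 1)
    x[0] = 1.0
    for k in range(n):
        x[k + 1] = x[k] + b(x[k], a[k]) * dt
    # Backward: discrete adjoint p_k (here db/dx = 0, df/dx = x).
    p = np.zeros(n + 1)
    p[n] = x[n]                           # g'(x_T)
    for k in reversed(range(n)):
        p[k] = p[k + 1] + x[k] * dt
    # Control update: argmin_a  p*b + f + (rho/2)|a - a_old|^2,
    # i.e. solve p + a + rho*(a - a_old) = 0.
    a = (rho * a - p[:n]) / (1.0 + rho)

print(f(x[:n], a).sum() * dt + g(x[n]))   # approximate cost of the last iterate
```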
References
Book

Reinforcement Learning: An Introduction

TL;DR: This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning, ranging from the history of the field's intellectual foundations to the most recent developments and applications.
Book

Dynamic Programming

TL;DR: The more the authors study the information processing aspects of the mind, the more perplexed and impressed they become, and it will be a very long time before they understand these processes sufficiently to reproduce them.
Book

Dynamic Programming and Optimal Control

TL;DR: The leading and most up-to-date textbook on the far-ranging algorithmic methodology of dynamic programming, which can be used for optimal control, Markovian decision problems, planning and sequential decision making under uncertainty, and discrete/combinatorial optimization.
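As a reminder of the basic recursion the book develops in far greater generality, value iteration on a toy Markov decision problem (all numbers invented):

```python
# Value iteration on a 2-state, 2-action MDP: iterate the Bellman
# optimality operator to a fixed point. Transitions and rewards are made up.
import numpy as np

P = np.array([                 # P[a, s, s'] transition probabilities
    [[0.9, 0.1], [0.2, 0.8]],  # action 0
    [[0.5, 0.5], [0.6, 0.4]],  # action 1
])
R = np.array([[1.0, 0.0],      # R[a, s] expected reward
              [0.5, 2.0]])
gamma = 0.95

V = np.zeros(2)
for _ in range(1000):
    Q = R + gamma * P @ V      # Q[a, s] = R[a,s] + gamma * sum_s' P[a,s,s'] V[s']
    V_new = Q.max(axis=0)      # Bellman optimality operator
    if np.max(np.abs(V_new - V)) < 1e-10:
        break
    V = V_new

print(V, Q.argmax(axis=0))     # optimal values and a greedy policy
```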
Book

Brownian Motion and Stochastic Calculus

TL;DR: In this book, the authors present a characterization of continuous local martingales with respect to Brownian motion in terms of Markov properties, including the strong Markov property, and a generalized version of the Itô rule.
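The Itô rule mentioned above, in its basic one-dimensional form (the book proves considerably more general versions):

```latex
% For f in C^{1,2} and X solving dX_t = b_t dt + sigma_t dW_t:
df(t, X_t) \;=\; \partial_t f(t, X_t)\, dt \;+\; \partial_x f(t, X_t)\, dX_t
  \;+\; \tfrac{1}{2}\, \partial_{xx} f(t, X_t)\, \sigma_t^2\, dt .
```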
Book

Optimal Transport: Old and New

TL;DR: In this book, the authors provide a detailed description of the basic properties of optimal transport, including cyclical monotonicity and Kantorovich duality, and three examples of coupling techniques.
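The Kantorovich problem and its duality, as referenced in the summary (stated loosely; see the book for the precise assumptions on the cost $c$ and the underlying spaces):

```latex
% Primal problem over couplings Pi(mu, nu), and its dual over potentials:
\inf_{\pi \in \Pi(\mu, \nu)} \int c(x, y)\, d\pi(x, y)
  \;=\; \sup_{\varphi(x) + \psi(y) \,\le\, c(x, y)}
  \left( \int \varphi \, d\mu + \int \psi \, d\nu \right).
```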