Open Access · Posted Content

Gradient Flows for Regularized Stochastic Control Problems

TLDR
This work constructs a gradient flow for the measure-valued control process along which the cost functional is guaranteed to decrease, and shows that, under appropriate conditions, this gradient flow has an invariant measure which is the optimal control for the regularized stochastic control problem.
Abstract
This paper studies stochastic control problems regularized by the relative entropy, where the action space is a space of measures. This setting includes relaxed control problems and problems of finding Markovian controls with the control function replaced by an idealized infinitely wide neural network, and it can be extended to the search for causal optimal transport maps. By exploiting the Pontryagin optimality principle, we identify a suitable metric space on which we construct a gradient flow for the measure-valued control process along which the cost functional is guaranteed to decrease. It is shown that, under appropriate conditions, this gradient flow has an invariant measure which is the optimal control for the regularized stochastic control problem. If the problem is sufficiently convex, the gradient flow converges exponentially fast. Furthermore, the optimal measure-valued control admits a Bayesian interpretation, which means that one can incorporate prior knowledge when solving a stochastic control problem. This work is motivated by a desire to extend the theoretical underpinning for the convergence of stochastic-gradient-type algorithms widely used in the reinforcement learning community to solve control problems.
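To fix ideas, here is a schematic of the objective described above, in illustrative notation (the paper's precise setup may differ): the control is a measure-valued process $(m_t)$, penalized by relative entropy against a reference measure $\gamma$ with weight $\sigma > 0$.

```latex
% Entropy-regularized control objective, schematically (notation illustrative):
% the control is the measure-valued process (m_t); sigma > 0 weights the
% relative-entropy penalty against a fixed reference measure gamma.
J^{\sigma}(m) \;=\; \mathbb{E}\!\left[ \int_0^T \!\left( \int_A f(t, X_t, a)\, m_t(da)
  \;+\; \sigma\, \mathrm{KL}\!\left(m_t \,\middle\|\, \gamma\right) \right) dt \;+\; g(X_T) \right],
\qquad
dX_t = \int_A b(t, X_t, a)\, m_t(da)\, dt + \Sigma\, dW_t .
```

The Pontryagin principle then suggests the minimizer is a Gibbs-type measure, $m_t^*(da) \propto \exp\!\left(-H(t, X_t, a)/\sigma\right)\gamma(da)$ for the associated Hamiltonian $H$; this is the source of the Bayesian interpretation, with $\gamma$ playing the role of a prior.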


Citations
Posted Content

Robust pricing and hedging via neural SDEs.

TL;DR: Combining neural networks with risk models based on classical stochastic differential equations (SDEs) yields a model called a neural SDE, which is an instantiation of a generative model and is closely linked with the theory of causal optimal transport.
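A minimal sketch of the idea, with an invented architecture: the drift and diffusion of the SDE are the outputs of a small neural network, and sample paths come from an Euler–Maruyama scheme. All names and constants below are illustrative, not the paper's.

```python
# Minimal "neural SDE" sketch: drift and diffusion given by a tiny network,
# simulated with Euler-Maruyama. Architecture and constants are made up.
import numpy as np

rng = np.random.default_rng(0)

# One-hidden-layer network mapping (t, x) -> (drift, log-diffusion).
W1, b1 = rng.normal(size=(16, 2)) * 0.3, np.zeros(16)
W2, b2 = rng.normal(size=(2, 16)) * 0.3, np.zeros(2)

def net(t, x):
    h = np.tanh(W1 @ np.array([t, x]) + b1)
    out = W2 @ h + b2
    return out[0], np.exp(out[1])  # drift mu(t,x), diffusion sigma(t,x) > 0

def simulate(x0=1.0, T=1.0, n_steps=100):
    dt = T / n_steps
    x, path = x0, [x0]
    for i in range(n_steps):
        mu, sigma = net(i * dt, x)
        x = x + mu * dt + sigma * np.sqrt(dt) * rng.normal()
        path.append(x)
    return np.array(path)

print(simulate()[-1])  # terminal value of one sample path
```

In practice the weights W1, b1, W2, b2 would be trained, e.g. by matching model prices to market prices, which is the calibration problem such models address.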
Posted Content

Exploratory LQG Mean Field Games with Entropy Regularization.

TL;DR: It is demonstrated that the optimal set of action distributions yields an $\epsilon$-Nash equilibrium for the finite-population entropy-regularized MFG.
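Schematically, entropy regularization replaces actions with action distributions $\pi_t$ and rewards their differential entropy; the notation here is illustrative, not the paper's.

```latex
% Entropy-regularized (exploratory) control objective, schematically:
\min_{\pi}\; \mathbb{E}\!\left[ \int_0^T \left( \int_{\mathbb{R}} c(t, X_t, a)\,\pi_t(da)
  \;-\; \lambda\,\mathcal{H}(\pi_t) \right) dt + g(X_T) \right],
\qquad
\mathcal{H}(\pi) = -\int_{\mathbb{R}} \pi(a)\log \pi(a)\, da .
```

A profile of such action distributions is an $\epsilon$-Nash equilibrium when no single player among the $N$ can reduce its cost by more than $\epsilon$ by deviating unilaterally.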
Proceedings Article

Convergence of policy gradient for entropy regularized MDPs with neural network approximation in the mean-field regime

TL;DR: The results rely on a careful analysis of the nonlinear Fokker–Planck–Kolmogorov equation and extend the pioneering work of Mei et al. (2020) and Agarwal et al. (2020), which quantify the global convergence rate of policy gradient for entropy-regularized MDPs in the tabular setting.
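In the simplest (one-state, tabular softmax) instance of the problem these works analyze, the entropy-regularized policy gradient can be written down exactly. The following sketch uses made-up rewards, temperature, and step size:

```python
# Entropy-regularized policy gradient in the simplest tabular softmax
# setting (a single state); the cited works treat the general MDP and
# mean-field regimes. All constants are illustrative.
import numpy as np

r = np.array([1.0, 0.5, 0.0])   # rewards for 3 actions
tau = 0.1                        # entropy-regularization temperature
theta = np.zeros(3)              # softmax logits

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

for step in range(500):
    pi = softmax(theta)
    # Exact gradient of E_pi[r] + tau * H(pi) w.r.t. softmax logits:
    adv = r - tau * (np.log(pi) + 1.0)
    grad = pi * (adv - pi @ adv)
    theta += 0.5 * grad

print(softmax(theta))  # converges to the Gibbs policy, prop. to exp(r / tau)
```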
Posted Content

Ergodicity of the underdamped mean-field Langevin dynamics

TL;DR: The long time behavior of an underdamped mean-field Langevin (MFL) equation is studied, and a general convergence as well as an exponential convergence rate result under different conditions are provided.
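A minimal particle discretization of underdamped Langevin dynamics with a mean-field interaction, purely illustrative (the potential, interaction term, and constants are ours, not the paper's):

```python
# Particle sketch of underdamped (kinetic) mean-field Langevin dynamics:
# each particle carries position x and velocity v; the drift depends on
# the empirical measure through its mean. Everything here is a toy choice.
import numpy as np

rng = np.random.default_rng(1)
N, dt, gamma, beta = 500, 0.01, 1.0, 1.0  # particles, step, friction, inv. temp.

x = rng.normal(size=N)
v = np.zeros(N)

def force(x):
    # Confining potential V(x) = x^4/4 - x^2/2, plus a weak attraction
    # toward the empirical mean (the mean-field interaction).
    return -(x**3 - x) - 0.5 * (x - x.mean())

for _ in range(2000):
    # dx = v dt,  dv = (force - gamma * v) dt + sqrt(2 gamma / beta) dW
    v += (force(x) - gamma * v) * dt \
         + np.sqrt(2 * gamma / beta * dt) * rng.normal(size=N)
    x += v * dt

print(x.mean(), x.var())  # statistics of the (approximate) invariant law
```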
Journal ArticleDOI

A modified MSA for stochastic control problems

TL;DR: In this article, a modified MSA is shown to converge for general stochastic control problems with control in both the drift and diffusion coefficients, under some additional assumptions. The results are valid without restrictions on the time horizon of the control problem, in contrast to iterative methods based on the theory of forward-backward stochastic differential equations.
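The MSA alternates a forward state pass, a backward adjoint pass, and a Hamiltonian-minimization step for the control; the modification is, in spirit, a penalty keeping each new control close to the previous one. A deterministic, scalar toy version (all dynamics, costs, and constants are made up; the paper treats the stochastic case):

```python
# Schematic (modified) MSA iteration on a discretized scalar LQR problem:
# forward state pass, backward adjoint pass, then a proximally penalized
# Hamiltonian-minimization step. Purely illustrative.
import numpy as np

T, n = 1.0, 50
dt = T / n
rho = 1.0                                # proximal penalty weight

b = lambda x, a: a                        # drift
f = lambda x, a: 0.5 * (x**2 + a**2)      # running cost
g = lambda x: 0.5 * x**2                  # terminal cost

a = np.zeros(n)                           # initial control guess
for it in range(200):
    # Forward: state under the current control.
    x = np.zeros(n + 1)
    x[0] = 1.0
    for k in range(n):
        x[k + 1] = x[k] + b(x[k], a[k]) * dt
    # Backward: discrete adjoint p_k (here db/dx = 0, df/dx = x).
    p = np.zeros(n + 1)
    p[n] = x[n]                           # g'(x_T)
    for k in reversed(range(n)):
        p[k] = p[k + 1] + x[k] * dt
    # Control update: argmin_a  p*b + f + (rho/2)|a - a_old|^2,
    # i.e. solve p + a + rho*(a - a_old) = 0.
    a = (rho * a - p[:n]) / (1.0 + rho)

print(f(x[:n], a).sum() * dt + g(x[n]))   # approximate cost of the last iterate
```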
References
Book

Reinforcement Learning: An Introduction

TL;DR: This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning, ranging from the history of the field's intellectual foundations to the most recent developments and applications.
Book

Dynamic Programming

TL;DR: The more the authors study the information processing aspects of the mind, the more perplexed and impressed they become, and it will be a very long time before they understand these processes sufficiently to reproduce them.
Book

Dynamic Programming and Optimal Control

TL;DR: The leading and most up-to-date textbook on the far-ranging algorithmic methodology of dynamic programming, which can be used for optimal control, Markovian decision problems, planning and sequential decision making under uncertainty, and discrete/combinatorial optimization.
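As a reminder of the basic recursion the book develops in far greater generality, value iteration on a toy Markov decision problem (all numbers invented):

```python
# Value iteration on a 2-state, 2-action MDP: iterate the Bellman
# optimality operator to a fixed point. Transitions and rewards are made up.
import numpy as np

P = np.array([                 # P[a, s, s'] transition probabilities
    [[0.9, 0.1], [0.2, 0.8]],  # action 0
    [[0.5, 0.5], [0.6, 0.4]],  # action 1
])
R = np.array([[1.0, 0.0],      # R[a, s] expected reward
              [0.5, 2.0]])
gamma = 0.95

V = np.zeros(2)
for _ in range(1000):
    Q = R + gamma * P @ V      # Q[a, s] = R[a,s] + gamma * sum_s' P[a,s,s'] V[s']
    V_new = Q.max(axis=0)      # Bellman optimality operator
    if np.max(np.abs(V_new - V)) < 1e-10:
        break
    V = V_new

print(V, Q.argmax(axis=0))     # optimal values and a greedy policy
```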
Book

Brownian Motion and Stochastic Calculus

TL;DR: In this book, the authors present a characterization of continuous local martingales with respect to Brownian motion in terms of Markov properties, including the strong Markov property, and a generalized version of the Itô rule.
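The Itô rule mentioned above, in its basic one-dimensional form (the book proves considerably more general versions):

```latex
% For f in C^{1,2} and X solving dX_t = b_t dt + sigma_t dW_t:
df(t, X_t) \;=\; \partial_t f(t, X_t)\, dt \;+\; \partial_x f(t, X_t)\, dX_t
  \;+\; \tfrac{1}{2}\, \partial_{xx} f(t, X_t)\, \sigma_t^2\, dt .
```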
Book

Optimal Transport: Old and New

TL;DR: In this book, the authors provide a detailed description of the basic properties of optimal transport, including cyclical monotonicity and Kantorovich duality, and three examples of coupling techniques.
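The Kantorovich problem and its duality, as referenced in the summary (stated loosely; see the book for the precise assumptions on the cost $c$ and the underlying spaces):

```latex
% Primal problem over couplings Pi(mu, nu), and its dual over potentials:
\inf_{\pi \in \Pi(\mu, \nu)} \int c(x, y)\, d\pi(x, y)
  \;=\; \sup_{\varphi(x) + \psi(y) \,\le\, c(x, y)}
  \left( \int \varphi \, d\mu + \int \psi \, d\nu \right).
```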