Journal ArticleDOI

Stochastic optimal control via forward and backward stochastic differential equations and importance sampling

01 Jan 2018-Automatica (Pergamon)-Vol. 87, Iss: 87, pp 159-165
TL;DR: This scheme is shown to be capable of learning the optimal control without requiring an initial guess; to enhance its efficiency when treating more complex nonlinear systems, an iterative algorithm based on Girsanov's theorem on the change of measure is derived.
About: This article was published in Automatica on 2018-01-01 and is currently open access. It has received 59 citations to date. The article focuses on the topics: Stochastic partial differential equation & Stochastic control.
Citations
Proceedings Article
22 Jun 2019
TL;DR: A new methodology for decision-making under uncertainty using recent advancements in the areas of nonlinear stochastic optimal control theory, applied mathematics, and machine learning is proposed.
Abstract: In this paper we propose a new methodology for decision-making under uncertainty using recent advancements in the areas of nonlinear stochastic optimal control theory, applied mathematics, and machine learning. Grounded on the fundamental relation between certain nonlinear partial differential equations and forward-backward stochastic differential equations, we develop a control framework that is scalable and applicable to general classes of stochastic systems and decision-making problem formulations in robotics and autonomy. The proposed deep neural network architectures for stochastic control consist of recurrent and fully connected layers. The performance and scalability of the aforementioned algorithm are investigated in three non-linear systems in simulation with and without control constraints. We conclude with a discussion on future directions and their implications to robotics.

41 citations


Cites background or methods from "Stochastic optimal control via forw..."

  • ...Exarchos and Theodorou [14] developed an importance sampling based iterative scheme by approximating the conditional expectation at every time step using linear regression (see also [15] and [16])....


  • ...The resulting algorithms overcome limitations of previous work in [24] by exploiting Girsanov’s theorem as in [14] to enable efficient exploration and by utilizing the benefits of recurrent neural networks in learning temporal dependencies....


  • ...We refer the readers to proof of Theorem 1 in [14] for the full derivation of change of measure for FBSDEs....


  • ...This was the basis of the approximation scheme and corresponding algorithm introduced in [14]....


  • ...This problem was addressed in [14] through application of Girsanov’s theorem, which allows for the modification of the drift terms in the FBSDE system thereby facilitating efficient exploration through controlled forward dynamics....

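The Girsanov-based drift modification described in these excerpts can be illustrated in miniature: simulate under a shifted drift and correct each sample with the discrete Radon–Nikodym weight. The sketch below estimates a rare-event probability for scalar Brownian motion; the threshold `a` and the constant shift `u` are illustrative assumptions, and this shows only the reweighting mechanics, not the paper's FBSDE scheme.

```python
import math
import numpy as np

rng = np.random.default_rng(0)
T, N, M = 1.0, 100, 200_000
dt = T / N
a = 3.0        # rare-event threshold (illustrative)
u = a / T      # constant drift shift defining the new measure (illustrative)

# Target: p = P(W_T > a) for a standard Brownian motion W.
exact = 0.5 * math.erfc(a / math.sqrt(2.0 * T))

# Simulate under the shifted drift u (endpoint X_T = u*T + sum of increments),
# then reweight by the discrete Radon-Nikodym derivative
# dP/dQ = exp(-u * sum(dW) - 0.5 * u**2 * T).
dW = rng.normal(0.0, math.sqrt(dt), (M, N))
X_T = u * T + dW.sum(axis=1)
weights = np.exp(-u * dW.sum(axis=1) - 0.5 * u**2 * T)
is_est = np.mean((X_T > a) * weights)
print(f"importance-sampling estimate: {is_est:.3e}  (exact: {exact:.3e})")
```

With the drift shifted toward the rare set, most paths land where the indicator is nonzero, and the weights restore unbiasedness; a plain Monte Carlo estimate of the same probability would need far more samples for comparable accuracy.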

Journal ArticleDOI
TL;DR: A probabilistic representation of the solution to the nonlinear Hamilton–Jacobi–Bellman equation is obtained, expressed in the form of a system of decoupled FBSDEs, which can be solved by employing linear regression techniques.

32 citations

Posted Content
02 Sep 2020
TL;DR: This paper introduces a new formulation for stochastic optimal control and stochastic dynamic optimization that ensures safety with respect to state and control constraints, and is designed for safe trajectory optimization.
Abstract: This paper introduces a new formulation for stochastic optimal control and stochastic dynamic optimization that ensures safety with respect to state and control constraints. The proposed methodology brings together concepts such as Forward-Backward Stochastic Differential Equations, Stochastic Barrier Functions, Differentiable Convex Optimization and Deep Learning. Using the aforementioned concepts, a Neural Network architecture is designed for safe trajectory optimization in which learning can be performed in an end-to-end fashion. Simulations are performed on three systems to show the efficacy of the proposed methodology.

22 citations


Cites methods from "Stochastic optimal control via forw..."

  • ...Popular solution methods in literature include iLQG [12], Path-Integral Control [13], and the Forward-Backward Stochastic Differential Equations (FBSDEs) framework [14], which tackle the HJB through locally optimal solutions....


Journal ArticleDOI
TL;DR: A sampling-based algorithm designed to solve various classes of stochastic differential games is presented; in light of the nonlinear version of the Feynman–Kac lemma, probabilistic representations of solutions to the nonlinear Hamilton–Jacobi–Isaacs equations that arise for each class are obtained.
Abstract: The aim of this work is to present a sampling-based algorithm designed to solve various classes of stochastic differential games. The foundation of the proposed approach lies in the formulation of the game solution in terms of a decoupled pair of forward and backward stochastic differential equations (FBSDEs). In light of the nonlinear version of the Feynman–Kac lemma, probabilistic representations of solutions to the nonlinear Hamilton–Jacobi–Isaacs equations that arise for each class are obtained. These representations are in form of decoupled systems of FBSDEs, which may be solved numerically.

18 citations


Cites background or methods from "Stochastic optimal control via forw..."

  • ...In this paper, we employ a scheme proposed in previous work by the authors [17], which capitalizes on the regularity present whenever systems of FBSDEs are linked to PDEs....


  • ...In previous work by the authors [17], a scheme involving a drift term modification has been constructed through Girsanov’s theorem on the change of measure [30,48]....


  • ...The formal proof which involves Girsanov’s theorem on the change of measure can be found in [17]....


Journal ArticleDOI
12 Jun 2019-Chaos
TL;DR: An adaptive importance sampling scheme for the simulation of rare events when the underlying dynamics is given by diffusion is proposed, based on a Gibbs variational principle that is used to determine the optimal change of measure.
Abstract: We propose an adaptive importance sampling scheme for the simulation of rare events when the underlying dynamics is given by diffusion. The scheme is based on a Gibbs variational principle that is used to determine the optimal (i.e., zero-variance) change of measure and exploits the fact that the latter can be rephrased as a stochastic optimal control problem. The control problem can be solved by a stochastic approximation algorithm, using the Feynman–Kac representation of the associated dynamic programming equations, and we discuss numerical aspects for high-dimensional problems along with simple toy examples.

14 citations

References
Book
01 Jan 1987
TL;DR: In this book, the authors develop Brownian motion and stochastic calculus, including a characterization of continuous local martingales as stochastic integrals with respect to Brownian motion, the strong Markov property, the Girsanov theorem, and a generalized version of the Ito rule.
Abstract: 1 Martingales, Stopping Times, and Filtrations.- 1.1. Stochastic Processes and ?-Fields.- 1.2. Stopping Times.- 1.3. Continuous-Time Martingales.- A. Fundamental inequalities.- B. Convergence results.- C. The optional sampling theorem.- 1.4. The Doob-Meyer Decomposition.- 1.5. Continuous, Square-Integrable Martingales.- 1.6. Solutions to Selected Problems.- 1.7. Notes.- 2 Brownian Motion.- 2.1. Introduction.- 2.2. First Construction of Brownian Motion.- A. The consistency theorem.- B. The Kolmogorov-?entsov theorem.- 2.3. Second Construction of Brownian Motion.- 2.4. The SpaceC[0, ?), Weak Convergence, and Wiener Measure.- A. Weak convergence.- B. Tightness.- C. Convergence of finite-dimensional distributions.- D. The invariance principle and the Wiener measure.- 2.5. The Markov Property.- A. Brownian motion in several dimensions.- B. Markov processes and Markov families.- C. Equivalent formulations of the Markov property.- 2.6. The Strong Markov Property and the Reflection Principle.- A. The reflection principle.- B. Strong Markov processes and families.- C. The strong Markov property for Brownian motion.- 2.7. Brownian Filtrations.- A. Right-continuity of the augmented filtration for a strong Markov process.- B. A "universal" filtration.- C. The Blumenthal zero-one law.- 2.8. Computations Based on Passage Times.- A. Brownian motion and its running maximum.- B. Brownian motion on a half-line.- C. Brownian motion on a finite interval.- D. Distributions involving last exit times.- 2.9. The Brownian Sample Paths.- A. Elementary properties.- B. The zero set and the quadratic variation.- C. Local maxima and points of increase.- D. Nowhere differentiability.- E. Law of the iterated logarithm.- F. Modulus of continuity.- 2.10. Solutions to Selected Problems.- 2.11. Notes.- 3 Stochastic Integration.- 3.1. Introduction.- 3.2. Construction of the Stochastic Integral.- A. Simple processes and approximations.- B. 
Construction and elementary properties of the integral.- C. A characterization of the integral.- D. Integration with respect to continuous, local martingales.- 3.3. The Change-of-Variable Formula.- A. The Ito rule.- B. Martingale characterization of Brownian motion.- C. Bessel processes, questions of recurrence.- D. Martingale moment inequalities.- E. Supplementary exercises.- 3.4. Representations of Continuous Martingales in Terms of Brownian Motion.- A. Continuous local martingales as stochastic integrals with respect to Brownian motion.- B. Continuous local martingales as time-changed Brownian motions.- C. A theorem of F. B. Knight.- D. Brownian martingales as stochastic integrals.- E. Brownian functionals as stochastic integrals.- 3.5. The Girsanov Theorem.- A. The basic result.- B. Proof and ramifications.- C. Brownian motion with drift.- D. The Novikov condition.- 3.6. Local Time and a Generalized Ito Rule for Brownian Motion.- A. Definition of local time and the Tanaka formula.- B. The Trotter existence theorem.- C. Reflected Brownian motion and the Skorohod equation.- D. A generalized Ito rule for convex functions.- E. The Engelbert-Schmidt zero-one law.- 3.7. Local Time for Continuous Semimartingales.- 3.8. Solutions to Selected Problems.- 3.9. Notes.- 4 Brownian Motion and Partial Differential Equations.- 4.1. Introduction.- 4.2. Harmonic Functions and the Dirichlet Problem.- A. The mean-value property.- B. The Dirichlet problem.- C. Conditions for regularity.- D. Integral formulas of Poisson.- E. Supplementary exercises.- 4.3. The One-Dimensional Heat Equation.- A. The Tychonoff uniqueness theorem.- B. Nonnegative solutions of the heat equation.- C. Boundary crossing probabilities for Brownian motion.- D. Mixed initial/boundary value problems.- 4.4. The Formulas of Feynman and Kac.- A. The multidimensional formula.- B. The one-dimensional formula.- 4.5. Solutions to selected problems.- 4.6. Notes.- 5 Stochastic Differential Equations.- 5.1. 
Introduction.- 5.2. Strong Solutions.- A. Definitions.- B. The Ito theory.- C. Comparison results and other refinements.- D. Approximations of stochastic differential equations.- E. Supplementary exercises.- 5.3. Weak Solutions.- A. Two notions of uniqueness.- B. Weak solutions by means of the Girsanov theorem.- C. A digression on regular conditional probabilities.- D. Results of Yamada and Watanabe on weak and strong solutions.- 5.4. The Martingale Problem of Stroock and Varadhan.- A. Some fundamental martingales.- B. Weak solutions and martingale problems.- C. Well-posedness and the strong Markov property.- D. Questions of existence.- E. Questions of uniqueness.- F. Supplementary exercises.- 5.5. A Study of the One-Dimensional Case.- A. The method of time change.- B. The method of removal of drift.- C. Feller's test for explosions.- D. Supplementary exercises.- 5.6. Linear Equations.- A. Gauss-Markov processes.- B. Brownian bridge.- C. The general, one-dimensional, linear equation.- D. Supplementary exercises.- 5.7. Connections with Partial Differential Equations.- A. The Dirichlet problem.- B. The Cauchy problem and a Feynman-Kac representation.- C. Supplementary exercises.- 5.8. Applications to Economics.- A. Portfolio and consumption processes.- B. Option pricing.- C. Optimal consumption and investment (general theory).- D. Optimal consumption and investment (constant coefficients).- 5.9. Solutions to Selected Problems.- 5.10. Notes.- 6 P. Levy's Theory of Brownian Local Time.- 6.1. Introduction.- 6.2. Alternate Representations of Brownian Local Time.- A. The process of passage times.- B. Poisson random measures.- C. Subordinators.- D. The process of passage times revisited.- E. The excursion and downcrossing representations of local time.- 6.3. Two Independent Reflected Brownian Motions.- A. The positive and negative parts of a Brownian motion.- B. The first formula of D. Williams.- C. The joint density of (W(t), L(t), ? +(t)).- 6.4. 
Elastic Brownian Motion.- A. The Feynman-Kac formulas for elastic Brownian motion.- B. The Ray-Knight description of local time.- C. The second formula of D. Williams.- 6.5. An Application: Transition Probabilities of Brownian Motion with Two-Valued Drift.- 6.6. Solutions to Selected Problems.- 6.7. Notes.

8,639 citations

Book
01 Jun 1992
TL;DR: This book develops the numerical solution of stochastic differential equations via time-discrete approximations, covering strong and weak Taylor approximations derived from stochastic Taylor expansions.
Abstract: 1 Probability and Statistics- 2 Probability and Stochastic Processes- 3 Ito Stochastic Calculus- 4 Stochastic Differential Equations- 5 Stochastic Taylor Expansions- 6 Modelling with Stochastic Differential Equations- 7 Applications of Stochastic Differential Equations- 8 Time Discrete Approximation of Deterministic Differential Equations- 9 Introduction to Stochastic Time Discrete Approximation- 10 Strong Taylor Approximations- 11 Explicit Strong Approximations- 12 Implicit Strong Approximations- 13 Selected Applications of Strong Approximations- 14 Weak Taylor Approximations- 15 Explicit and Implicit Weak Approximations- 16 Variance Reduction Methods- 17 Selected Applications of Weak Approximations- Solutions of Exercises- Bibliographical Notes

6,284 citations


"Stochastic optimal control via forw..." refers background or methods in this paper

  • ...The simplest discretized scheme for the forward process is the Euler scheme, which is also called Euler–Maruyama scheme (Kloeden & Platen, 1999): Xi+1 ≈ Xi + b(ti, Xi)∆ti +Σ(ti, Xi)∆Wi, (18) for i = 0, . . . ,N − 1 and X0 = x....


  • ...Several alternative, higher order schemes exist that can be selected in lieu of the Euler scheme (Kloeden & Platen, 1999)....

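The Euler–Maruyama step quoted above translates directly into code. The sketch below is a minimal illustration of scheme (18) for a scalar SDE; the Ornstein–Uhlenbeck drift and the parameter values are hypothetical choices for the example, not taken from the paper.

```python
import numpy as np

def euler_maruyama(b, sigma, x0, T, N, rng):
    """Simulate one path of dX = b(t, X) dt + sigma(t, X) dW on [0, T]
    with the Euler-Maruyama scheme: X[i+1] = X[i] + b*dt + sigma*dW."""
    dt = T / N
    X = np.empty(N + 1)
    X[0] = x0
    for i in range(N):
        t = i * dt
        dW = rng.normal(0.0, np.sqrt(dt))   # Brownian increment, var = dt
        X[i + 1] = X[i] + b(t, X[i]) * dt + sigma(t, X[i]) * dW
    return X

# Example: Ornstein-Uhlenbeck process dX = -X dt + 0.3 dW, X0 = 1.
rng = np.random.default_rng(42)
path = euler_maruyama(lambda t, x: -x, lambda t, x: 0.3, 1.0, 1.0, 500, rng)
```

The scheme is only first-order accurate in the weak sense; as the second excerpt notes, higher-order schemes from Kloeden & Platen can be substituted where more accuracy per step is needed.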

Journal ArticleDOI
TL;DR: This book covers mathematical preliminaries, Ito integrals, the Ito formula and the martingale representation theorem, stochastic differential equations, diffusions, and applications to filtering, boundary value problems, optimal stopping, stochastic control, and mathematical finance.
Abstract: Some Mathematical Preliminaries.- Ito Integrals.- The Ito Formula and the Martingale Representation Theorem.- Stochastic Differential Equations.- The Filtering Problem.- Diffusions: Basic Properties.- Other Topics in Diffusion Theory.- Applications to Boundary Value Problems.- Application to Optimal Stopping.- Application to Stochastic Control.- Application to Mathematical Finance.

4,705 citations

Book
18 Dec 1992
TL;DR: In this book, an introduction to optimal stochastic control for continuous-time Markov processes and to the theory of viscosity solutions is given, as well as a concise introduction to two-controller, zero-sum differential games.
Abstract: This book is intended as an introduction to optimal stochastic control for continuous time Markov processes and to the theory of viscosity solutions. The authors approach stochastic control problems by the method of dynamic programming. The text provides an introduction to dynamic programming for deterministic optimal control problems, as well as to the corresponding theory of viscosity solutions. A new Chapter X gives an introduction to the role of stochastic optimal control in portfolio optimization and in pricing derivatives in incomplete markets. Chapter VI of the First Edition has been completely rewritten, to emphasize the relationships between logarithmic transformations and risk sensitivity. A new Chapter XI gives a concise introduction to two-controller, zero-sum differential games. Also covered are controlled Markov diffusions and viscosity solutions of Hamilton-Jacobi-Bellman equations. The authors have tried, through illustrative examples and selective material, to connect stochastic control theory with other mathematical areas (e.g. large deviations theory) and with applications to engineering, physics, management, and finance. In this Second Edition, new material on applications to mathematical finance has been added. Concise introductions to risk-sensitive control theory, nonlinear H-infinity control and differential games are also included.

3,885 citations


"Stochastic optimal control via forw..." refers methods in this paper

  • ...Specifically, by applying the stochastic version of Bellman’s principle of optimality, it is shown (Fleming & Soner, 2006; Yong & Zhou, 1999) that if the Value function is in C1,2([0, T ]×Rn), then it is a solution to the following terminal value problem of a second-order partial differential…...


  • ...Following the same procedure as in Section 2, and under Assumption 1, the resulting HJB PDE is (Fleming & Soner, 2006)
    v_t + (1/2) tr(v_xx Σ(t, x) Σ⊤(t, x)) + v_x⊤ b(t, x) + h(t, x, Σ⊤(t, x) v_x) = 0,  (t, x) ∈ [0, T) × G,
    v(T, x) = g(x),  x ∈ G,
    v(t, x) = ψ(t, x),  (t, x) ∈ [0, T) × ∂G,  (15) in…...


Journal ArticleDOI
TL;DR: In this paper, a new approach for approximating the value of American options by simulation is presented, using least squares to estimate the conditional expected payoff to the optionholder from continuation.
Abstract: This article presents a simple yet powerful new approach for approximating the value of American options by simulation. The key to this approach is the use of least squares to estimate the conditional expected payoff to the optionholder from continuation. This makes this approach readily applicable in path-dependent and multifactor situations where traditional finite difference techniques cannot be used. We illustrate this technique with several realistic examples including valuing an option when the underlying asset follows a jump-diffusion process and valuing an American swaption in a 20-factor string model of the term structure.

2,612 citations
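The least-squares regression of the conditional continuation value, which the Automatica paper adapts to its FBSDE scheme, can be sketched in the original option-pricing setting of this reference. Everything below (a Bermudan put under geometric Brownian motion, a cubic polynomial basis, the parameter values) is an illustrative assumption, not code from the reference.

```python
import numpy as np

# Minimal Longstaff-Schwartz sketch: at each exercise date, regress the
# discounted continuation value on a polynomial basis of the asset price,
# then exercise where the immediate payoff exceeds the fitted continuation.
rng = np.random.default_rng(1)
S0, K, r, sigma, T = 1.0, 1.0, 0.06, 0.2, 1.0   # illustrative parameters
N, M = 50, 100_000                               # exercise dates, paths
dt = T / N
disc = np.exp(-r * dt)

# Simulate geometric Brownian motion paths (exact one-step scheme).
Z = rng.normal(size=(M, N))
logret = (r - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * Z
S = np.hstack([np.full((M, 1), S0), S0 * np.exp(np.cumsum(logret, axis=1))])

payoff = lambda s: np.maximum(K - s, 0.0)
V = payoff(S[:, -1])                      # cashflow at maturity
for i in range(N - 1, 0, -1):
    V *= disc                             # discount cashflows back one step
    itm = payoff(S[:, i]) > 0             # regress on in-the-money paths only
    if itm.any():
        coef = np.polyfit(S[itm, i], V[itm], 3)      # cubic basis (assumption)
        cont = np.polyval(coef, S[itm, i])           # fitted continuation value
        exercise = payoff(S[itm, i]) > cont
        V[itm] = np.where(exercise, payoff(S[itm, i]), V[itm])
price = disc * V.mean()
print(f"Bermudan put value (LSM sketch): {price:.4f}")
```

Restricting the regression to in-the-money paths, as the article recommends, improves the fit where the exercise decision actually matters; the same regression-based approximation of a conditional expectation is what the FBSDE scheme reuses at every time step.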