Stochastic optimal control via forward and backward stochastic differential equations and importance sampling
Citations
41 citations
Cites background or methods from "Stochastic optimal control via forw..."
...Exarchos and Theodorou [14] developed an importance sampling based iterative scheme by approximating the conditional expectation at every time step using linear regression (see also [15] and [16])....
[...]
...The resulting algorithms overcome limitations of previous work in [24] by exploiting Girsanov’s theorem as in [14] to enable efficient exploration and by utilizing the benefits of recurrent neural networks in learning temporal dependencies....
[...]
...We refer the readers to proof of Theorem 1 in [14] for the full derivation of change of measure for FBSDEs....
[...]
...This was the basis of the approximation scheme and corresponding algorithm introduced in [14]....
[...]
...This problem was addressed in [14] through application of Girsanov’s theorem, which allows for the modification of the drift terms in the FBSDE system thereby facilitating efficient exploration through controlled forward dynamics....
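The drift modification described in this excerpt can be illustrated on a toy problem. The sketch below is not the FBSDE scheme of [14]; it assumes a scalar Brownian motion started at 0, a constant drift shift `K`, and a tail probability as the quantity of interest. The function name and all parameters are hypothetical. Paths are simulated under the shifted drift, and each one is reweighted by the Girsanov likelihood ratio exp(-Σ K ΔW - ½ Σ K² Δt) so that the estimate still refers to the original, unshifted measure.

```python
import math
import random


def tail_prob_importance_sampling(level, K, T=1.0, N=20, n_paths=20000, seed=0):
    """Estimate P(W_T > level) for a standard Brownian motion W (illustrative only).

    Paths are sampled with an added constant drift K and reweighted by the
    Girsanov likelihood ratio, so K = 0 reduces to plain Monte Carlo.
    """
    rng = random.Random(seed)
    dt = T / N
    sqrt_dt = math.sqrt(dt)
    total = 0.0
    for _ in range(n_paths):
        x, log_weight = 0.0, 0.0
        for _ in range(N):
            dW = rng.gauss(0.0, sqrt_dt)  # Brownian increment under the sampling measure
            x += K * dt + dW              # forward step with the modified drift
            log_weight += -K * dW - 0.5 * K * K * dt
        if x > level:
            total += math.exp(log_weight)
    return total / n_paths


# Closed-form reference value for the toy problem: P(N(0, 1) > 3).
exact = 0.5 * math.erfc(3.0 / math.sqrt(2.0))
estimate = tail_prob_importance_sampling(level=3.0, K=3.0)
print(exact, estimate)
```

With `K = 0` almost no sampled path reaches the level, which is the exploration problem the excerpt refers to; shifting the drift toward the region of interest places many more samples there while the weights keep the estimator unbiased.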
[...]
32 citations
22 citations
Cites methods from "Stochastic optimal control via forw..."
...Popular solution methods in literature include iLQG [12], Path-Integral Control [13], and the Forward-Backward Stochastic Differential Equations (FBSDEs) framework [14], which tackle the HJB through locally optimal solutions....
[...]
18 citations
Cites background or methods from "Stochastic optimal control via forw..."
...In this paper, we employ a scheme proposed in previous work by the authors [17], which capitalizes on the regularity present whenever systems of FBSDEs are linked to PDEs....
[...]
...In previous work by the authors [17], a scheme involving a drift term modification has been constructed through Girsanov’s theorem on the change of measure [30,48]....
[...]
...The formal proof which involves Girsanov’s theorem on the change of measure can be found in [17]....
[...]
14 citations
References
8,639 citations
6,284 citations
"Stochastic optimal control via forw..." refers to background or methods in this paper
...The simplest discretized scheme for the forward process is the Euler scheme, which is also called Euler–Maruyama scheme (Kloeden & Platen, 1999): $X_{i+1} \approx X_i + b(t_i, X_i)\,\Delta t_i + \Sigma(t_i, X_i)\,\Delta W_i$, (18) for $i = 0, \ldots, N-1$ and $X_0 = x$....
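A minimal sketch of the Euler–Maruyama update quoted above, restricted to the scalar case on a uniform time grid. The function name, the example drift and diffusion, and the uniform step size are assumptions made for illustration; the cited paper works with vector-valued states and a general partition.

```python
import math
import random


def euler_maruyama(b, sigma, x0, T, N, rng=None):
    """Simulate one scalar path X_0, ..., X_N over [0, T] with N uniform steps."""
    rng = rng or random.Random()
    dt = T / N
    path = [x0]
    for i in range(N):
        t_i, x_i = i * dt, path[-1]
        dW = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment, N(0, dt)
        # X_{i+1} = X_i + b(t_i, X_i) * dt + sigma(t_i, X_i) * dW
        path.append(x_i + b(t_i, x_i) * dt + sigma(t_i, x_i) * dW)
    return path


# Hypothetical example: mean-reverting drift with constant diffusion.
path = euler_maruyama(lambda t, x: -x, lambda t, x: 0.3,
                      x0=1.0, T=1.0, N=200, rng=random.Random(0))
print(len(path), path[-1])
```

Setting the diffusion to zero reduces the loop to the explicit Euler method for an ordinary differential equation, which gives a quick sanity check on the drift term.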
[...]
...Several alternative, higher order schemes exist that can be selected in lieu of the Euler scheme (Kloeden & Platen, 1999)....
[...]
4,705 citations
3,885 citations
"Stochastic optimal control via forw..." refers to methods in this paper
...Specifically, by applying the stochastic version of Bellman’s principle of optimality, it is shown (Fleming & Soner, 2006; Yong & Zhou, 1999) that if the Value function is in $C^{1,2}([0, T] \times \mathbb{R}^n)$, then it is a solution to the following terminal value problem of a second-order partial differential…...
[...]
...Following the same procedure as in Section 2, and under Assumption 1, the resulting HJB PDE is (Fleming & Soner, 2006) $$\begin{cases} v_t + \tfrac{1}{2}\operatorname{tr}\big(v_{xx}\Sigma(t,x)\Sigma^\top(t,x)\big) + v_x^\top b(t,x) + h\big(t,x,\Sigma^\top(t,x)v_x\big) = 0, & (t,x) \in [0,T) \times G, \\ v(T,x) = g(x), & x \in G, \\ v(t,x) = \psi(t,x), & (t,x) \in [0,T) \times \partial G \end{cases} \qquad (15)$$ in…...
[...]
2,612 citations