Weak Dynamic Programming Principle
for Viscosity Solutions
Bruno Bouchard
and Nizar Touzi
February 2009
Abstract
We prove a weak version of the dynamic programming principle for standard
stochastic control problems and mixed control-stopping problems, which avoids the
technical difficulties related to the measurable selection argument. In the Markov
case, our result is tailor-made for the derivation of the dynamic programming equation
in the sense of viscosity solutions.
Key words: Optimal control, Dynamic programming, discontinuous viscosity solutions.
AMS 1991 subject classifications: Primary 49L25, 60J60; secondary 49L20, 35K55.
1 Introduction
Consider the standard class of stochastic control problems in the Mayer form
\[
V(t,x) := \sup_{\nu \in \mathcal{U}} \mathbb{E}\left[ f(X^\nu_T) \,\middle|\, X^\nu_t = x \right],
\]
where $\mathcal{U}$ is the set of controls, $X^\nu$ is the controlled process, $f$ is some given function, $0 < T \le \infty$ is a given time horizon, $t \in [0,T)$ is the time origin, and $x \in \mathbb{R}^d$ is some given initial condition.
This framework includes the general class of stochastic control problems under the so-called
Bolza formulation, the corresponding singular versions, and optimal stopping problems.
The authors are grateful to Nicole El Karoui for fruitful comments. This research is part of the Chair
Financial Risks of the Risk Foundation sponsored by Société Générale, the Chair Derivatives of the Future
sponsored by the Fédération Bancaire Française, the Chair Finance and Sustainable Development sponsored
by EDF and Calyon, and the Chair Les particuliers face au risque sponsored by Groupama.
CEREMADE, Université Paris Dauphine and CREST-ENSAE, bouchard@ceremade.dauphine.fr
Ecole Polytechnique Paris, Centre de Mathématiques Appliquées, touzi@cmap.polytechnique.fr

A key tool for the analysis of such problems is the so-called dynamic programming principle
(DPP), which relates the time-$t$ value function $V(t,\cdot)$ to any later time-$\tau$ value $V(\tau,\cdot)$ for
any stopping time $\tau \in [t,T)$ a.s. A formal statement of the DPP is:
\[
V(t,x) = v(t,x) := \sup_{\nu \in \mathcal{U}} \mathbb{E}\left[ V(\tau, X^\nu_\tau) \,\middle|\, X^\nu_t = x \right]. \tag{1.1}
\]
In particular, this result is routinely used in the case of controlled Markov jump-diffusions in
order to derive the corresponding dynamic programming equation in the sense of viscosity
solutions, see Lions [6, 7], Fleming and Soner [5], and Touzi [9].
The statement (1.1) of the DPP is very intuitive and can be easily proved in the deterministic
framework, or in discrete time with a finite probability space. However, its proof is
in general not trivial, and requires, as a first step, that $V$ be measurable.
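In the discrete-time, finite-probability setting just mentioned, the DPP can indeed be checked directly by enumeration. The following sketch (the toy dynamics, horizon, and reward are our own illustrative choices, not from the paper) computes the value function by backward induction, which is exactly the DPP recursion, and confirms that it coincides with a brute-force maximization over all feedback policies:

```python
import itertools

# Toy problem: X_{t+1} = X_t + a_t + xi_t, controls a in {-1, +1},
# noise xi in {-1, +1} with probability 1/2 each, horizon T = 2,
# terminal reward f(x) = -(x - 2)^2. All choices are illustrative.
T, ACTIONS, NOISE, P = 2, (-1, 1), (-1, 1), 0.5
f = lambda x: -(x - 2) ** 2

def dp_value(t, x):
    """Backward induction: V(t,x) = max_a E[V(t+1, x+a+xi)] -- the DPP itself."""
    if t == T:
        return f(x)
    return max(sum(P * dp_value(t + 1, x + a + xi) for xi in NOISE)
               for a in ACTIONS)

# Brute force: enumerate every feedback policy pi(t, x) on the reachable states
# and every noise scenario, and average the terminal reward.
reach = {0: {0}, 1: {a + xi for a in ACTIONS for xi in NOISE}}
keys = [(t, x) for t in (0, 1) for x in sorted(reach[t])]

def policy_value(pi):
    total = 0.0
    for path in itertools.product(NOISE, repeat=T):  # all noise scenarios
        x = 0
        for t, xi in enumerate(path):
            x = x + pi[(t, x)] + xi
        total += f(x) * P ** T
    return total

best_bf = max(policy_value(dict(zip(keys, acts)))
              for acts in itertools.product(ACTIONS, repeat=len(keys)))
print(dp_value(0, 0), best_bf)  # the two values agree
```

Since the state is fully observed and the dynamics are Markov, the supremum over adapted controls is attained by a feedback policy, so the two computations return the same value.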
The inequality $V \le v$ is the easy one, but it still requires that $V$ be measurable. Our
weak formulation avoids this issue. Namely, under fairly general conditions on the set of controls
and the controlled process, it follows from an easy application of the tower property of
conditional expectations that
\[
V(t,x) \le \sup_{\nu \in \mathcal{U}} \mathbb{E}\left[ V^*(\tau, X^\nu_\tau) \,\middle|\, X^\nu_t = x \right],
\]
where $V^*$ is the upper semicontinuous envelope of the function $V$.
The proof of the converse inequality $V \ge v$ in a general probability space turns out to
be difficult when the function $V$ is not known a priori to satisfy some continuity condition.
See e.g. Bertsekas and Shreve [1], Borkar [2], and El Karoui [4].
Our weak version of the DPP avoids the non-trivial measurable selection argument needed
to prove the inequality $V \ge v$ in (1.1). Namely, in the context of a general control problem
presented in Section 2, we show in Section 3 that:
\[
V(t,x) \ge \sup_{\nu \in \mathcal{U}} \mathbb{E}\left[ \varphi(\tau, X^\nu_\tau) \,\middle|\, X_t = x \right]
\]
for every upper-semicontinuous minorant $\varphi$ of $V$.
We also show that an easy consequence of this result is that
\[
V(t,x) \ge \sup_{\nu \in \mathcal{U}} \mathbb{E}\left[ V_*\left( \tau^\nu_n, X^\nu_{\tau^\nu_n} \right) \,\middle|\, X_t = x \right],
\]
where $\tau^\nu_n := \tau \wedge \inf\{ s > t : |X^\nu_s - x| > n \}$, and $V_*$ is the lower semicontinuous envelope of
$V$.
This result is weaker than the classical DPP (1.1). However, in the controlled Markov
jump-diffusions case, it turns out to be tailor-made for the derivation of the dynamic programming
equation in the sense of viscosity solutions. Section 5 reports this derivation in the context
of controlled diffusions.
Finally, Section 4 provides an extension of our argument in order to obtain a weak dynamic
programming principle for mixed control-stopping problems.

2 The stochastic control problem
Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space, $T > 0$ a finite time horizon, and $\mathbb{F} := \{\mathcal{F}_t, 0 \le t \le T\}$
a given filtration of $\mathcal{F}$, satisfying the usual assumptions. For every $t \ge 0$, we denote by
$\mathbb{F}^t = (\mathcal{F}^t_s)_{s \ge 0}$ the right-continuous filtration generated by the $\mathcal{F}$-measurable processes that are
independent of $\mathcal{F}_t$.
We denote by $\mathcal{T}$ the collection of all $\mathbb{F}$-stopping times. For $\tau_1, \tau_2 \in \mathcal{T}$ with $\tau_1 \le \tau_2$ a.s.,
the subset $\mathcal{T}_{[\tau_1, \tau_2]}$ is the collection of all $\tau \in \mathcal{T}$ such that $\tau \in [\tau_1, \tau_2]$ a.s. When $\tau_1 = 0$, we
simply write $\mathcal{T}_{\tau_2}$. We use the notations $\mathcal{T}^t_{[\tau_1, \tau_2]}$ and $\mathcal{T}^t_{\tau_2}$ to denote the corresponding sets of
stopping times that are independent of $\mathcal{F}_t$.
For $\tau \in \mathcal{T}$ and a subset $A$ of a finite dimensional space, we denote by $L^0_\tau(A)$ the collection
of all $\mathcal{F}_\tau$-measurable random variables with values in $A$. $H^0(A)$ is the collection of all
$\mathbb{F}$-progressively measurable processes with values in $A$, and $H^0_{\mathrm{rcll}}(A)$ is the subset of all
processes in $H^0(A)$ which are right-continuous with finite left limits.
In the following, we denote by $B_r(z)$ (resp. $\partial B_r(z)$) the open ball (resp. its boundary) of
radius $r > 0$ and center $z \in \mathbb{R}^\ell$, $\ell \in \mathbb{N}$.
Throughout this note, we fix an integer $d \in \mathbb{N}$, and we introduce the sets:
\[
\mathbf{S} := [0,T] \times \mathbb{R}^d \quad \text{and} \quad \mathbf{S}_0 := \left\{ (\tau, \xi) : \tau \in \mathcal{T}_T \text{ and } \xi \in L^0_\tau(\mathbb{R}^d) \right\}.
\]
We also denote by $\mathrm{USC}(\mathbf{S})$ (resp. $\mathrm{LSC}(\mathbf{S})$) the collection of all upper-semicontinuous (resp.
lower-semicontinuous) functions from $\mathbf{S}$ to $\mathbb{R}$.
The set of control processes is a given subset $\mathcal{U}_0$ of $H^0(\mathbb{R}^k)$, for some integer $k \ge 1$, such that
the controlled state process, defined as the mapping
\[
(\tau, \xi; \nu) \in \tilde{\mathbf{S}} \times \mathcal{U}_0 \longmapsto X^\nu_{\tau,\xi} \in H^0_{\mathrm{rcll}}(\mathbb{R}^d) \quad \text{for some } \tilde{\mathbf{S}} \text{ with } \mathbf{S} \subset \tilde{\mathbf{S}} \subset \mathbf{S}_0,
\]
is well-defined and satisfies:
\[
\left( \theta, X^\nu_{\tau,\xi}(\theta) \right) \in \tilde{\mathbf{S}} \quad \text{for all } (\tau, \xi) \in \tilde{\mathbf{S}} \text{ and } \theta \in \mathcal{T}_{[\tau,T]}.
\]
Given a Borel function $f : \mathbb{R}^d \to \mathbb{R}$ and $(t,x) \in \mathbf{S}$, we introduce the reward function
$J : \mathbf{S} \times \mathcal{U} \to \mathbb{R}$:
\[
J(t,x;\nu) := \mathbb{E}\left[ f\left( X^\nu_{t,x}(T) \right) \right] \tag{2.1}
\]
which is well-defined for controls $\nu$ in
\[
\mathcal{U} := \left\{ \nu \in \mathcal{U}_0 : \mathbb{E}\left| f\left( X^\nu_{t,x}(T) \right) \right| < \infty \ \text{for all } (t,x) \in \mathbf{S} \right\}. \tag{2.2}
\]
We say that a control $\nu \in \mathcal{U}$ is $t$-admissible if it is independent of $\mathcal{F}_t$, and we denote by $\mathcal{U}_t$
the collection of such processes. The stochastic control problem is defined by:
\[
V(t,x) := \sup_{\nu \in \mathcal{U}_t} J(t,x;\nu) \quad \text{for } (t,x) \in \mathbf{S}. \tag{2.3}
\]
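For a concrete, if simplistic, instance of (2.1)–(2.3), one can take $X^\nu$ to be a drift-controlled Brownian motion and restrict attention to constant controls. The following sketch (the dynamics, the reward $f(x) = -x^2$, and the finite control grid are our own illustrative assumptions, not from the paper) estimates $J(t,x;\nu)$ by Monte Carlo and compares the resulting approximation of $V(t,x)$ with the closed-form value $\mathbb{E}[f(X_T)] = -\big( (x + \nu(T-t))^2 + \sigma^2 (T-t) \big)$:

```python
import math
import random

random.seed(0)

# Toy controlled dynamics: X_T = x + nu*(T-t) + sigma*sqrt(T-t)*Z, Z ~ N(0,1),
# i.e. a constant-drift control nu; reward f(x) = -x^2. Illustrative only.
T, sigma = 1.0, 1.0
f = lambda x: -x * x

def J(t, x, nu, n=100_000):
    """Monte Carlo estimate of J(t, x; nu) = E[f(X^nu_{t,x}(T))], cf. (2.1)."""
    h = T - t
    return sum(f(x + nu * h + sigma * math.sqrt(h) * random.gauss(0.0, 1.0))
               for _ in range(n)) / n

def J_exact(t, x, nu):
    h = T - t
    return -((x + nu * h) ** 2 + sigma ** 2 * h)

# Approximate V(t, x) = sup_nu J(t, x; nu) over a small grid of constant
# controls, as in (2.3) restricted to this toy control set.
t, x = 0.0, 0.5
controls = [-1.0, -0.5, 0.0, 0.5, 1.0]
V_mc = max(J(t, x, nu) for nu in controls)
V_ex = max(J_exact(t, x, nu) for nu in controls)
print(V_mc, V_ex)  # close for large n; nu = -0.5 drives the mean to 0
```

The optimal constant control here steers the terminal mean to zero, so the grid maximum is attained at $\nu = -0.5$ with value $-1$; the Monte Carlo estimate agrees up to sampling error.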

3 Dynamic programming for stochastic control problems
For the purpose of our weak dynamic programming principle, the following assumptions are
crucial.
Assumption A For all $(t,x) \in \mathbf{S}$ and $\nu \in \mathcal{U}_t$, the controlled state process satisfies:
A1 (Independence) The process $X^\nu_{t,x}$ is independent of $\mathcal{F}_t$.
A2 (Causality) For $\tilde{\nu} \in \mathcal{U}_t$ and $A \in \mathcal{F}$, if $\nu = \tilde{\nu}$ on $A$, then $X^\nu_{t,x} = X^{\tilde{\nu}}_{t,x}$ on $A$.
A3 (Stability under concatenation) For every $\tilde{\nu} \in \mathcal{U}_t$ and $\theta \in \mathcal{T}^t_{[t,T]}$:
\[
\nu \mathbf{1}_{[0,\theta]} + \tilde{\nu} \mathbf{1}_{(\theta,T]} \in \mathcal{U}_t.
\]
A4 (Consistency with deterministic initial data) For all $\theta \in \mathcal{T}^t_{[t,T]}$, we have:
a. For $\mathbb{P}$-a.e. $\omega \in \Omega$, there exists $\tilde{\nu}_\omega \in \mathcal{U}_{\theta(\omega)}$ such that
\[
\mathbb{E}\left[ f\left( X^\nu_{t,x}(T) \right) \,\middle|\, \mathcal{F}_\theta \right](\omega) \le J(\theta(\omega), X^\nu_{t,x}(\theta)(\omega); \tilde{\nu}_\omega).
\]
b. For $t \le s \le T$, $\theta \in \mathcal{T}^t_{[t,s]}$, $\tilde{\nu} \in \mathcal{U}_s$, and $\bar{\nu} := \nu \mathbf{1}_{[0,\theta]} + \tilde{\nu} \mathbf{1}_{(\theta,T]}$, we have:
\[
\mathbb{E}\left[ f\left( X^{\bar{\nu}}_{t,x}(T) \right) \,\middle|\, \mathcal{F}_\theta \right](\omega) = J(\theta(\omega), X^\nu_{t,x}(\theta)(\omega); \tilde{\nu}) \quad \text{for } \mathbb{P}\text{-a.e. } \omega \in \Omega.
\]
Remark 3.1 Assumption A3 above implies the following property of the set of controls, which
will be needed later:
A5 (Stability under bifurcation) For $\nu_1, \nu_2 \in \mathcal{U}_t$, $\tau \in \mathcal{T}^t_{[t,T]}$ and $A \in \mathcal{F}^t_\tau$, we have:
\[
\bar{\nu} := \nu_1 \mathbf{1}_{[0,\tau]} + \left( \nu_1 \mathbf{1}_A + \nu_2 \mathbf{1}_{A^c} \right) \mathbf{1}_{(\tau,T]} \in \mathcal{U}_t.
\]
To see this, observe that $\tau_A := T \mathbf{1}_A + \tau \mathbf{1}_{A^c}$ is a stopping time in $\mathcal{T}^t_{[t,T]}$, and
$\bar{\nu} = \nu_1 \mathbf{1}_{[0,\tau_A)} + \nu_2 \mathbf{1}_{[\tau_A,T]}$ is the concatenation of $\nu_1$ and $\nu_2$ at the stopping time $\tau_A$.
Iterating the above property, we see that for $0 \le t \le s \le T$ and $\tau \in \mathcal{T}^t_{[t,T]}$, we have the
following extension: for a finite sequence $(\nu_1, \ldots, \nu_n)$ of controls in $\mathcal{U}_t$ with $\nu_i = \nu_1$ on $[0,\tau)$,
and for a partition $(A_i)_{1 \le i \le n}$ of $\Omega$ with $A_i \in \mathcal{F}^t_\tau$ for every $i \le n$:
\[
\bar{\nu} := \nu_1 \mathbf{1}_{[0,\tau)} + \mathbf{1}_{[\tau,T]} \sum_{i=1}^n \nu_i \mathbf{1}_{A_i} \in \mathcal{U}_t.
\]
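The bifurcation construction of A5 can be made concrete in a discrete-time setting where controls are maps indexed by (scenario, time). The sketch below (the scenarios, horizon, stopping time, and event $A$ are invented for illustration; it is not part of the paper's framework) builds $\bar{\nu}$ from $\nu_1$, $\nu_2$ and checks both its defining property and its identification with the concatenation of $\nu_1$ and $\nu_2$ at $\tau_A$ away from the single switch instant (a $dt$-null set in continuous time):

```python
# Discrete-time illustration of A5 (stability under bifurcation).
# Controls are maps (scenario, time) -> value; scenarios stand in for omega.
T = 5
scenarios = range(4)
tau = {0: 1, 1: 2, 2: 3, 3: 1}   # a scenario-dependent stopping time tau(omega)
A = {0, 2}                        # the event A (assumed to lie in F^t_tau)
nu1 = {(w, s): w + s for w in scenarios for s in range(T + 1)}
nu2 = {(w, s): -(w + s) - 1 for w in scenarios for s in range(T + 1)}

# nu_bar := nu1 1_{[0,tau]} + (nu1 1_A + nu2 1_{A^c}) 1_{(tau,T]}
nu_bar = {(w, s): nu1[w, s] if (s <= tau[w] or w in A) else nu2[w, s]
          for w in scenarios for s in range(T + 1)}

# Defining property: follow nu1 up to tau, then branch according to A.
for w in scenarios:
    for s in range(T + 1):
        if s <= tau[w]:
            assert nu_bar[w, s] == nu1[w, s]
        else:
            assert nu_bar[w, s] == (nu1[w, s] if w in A else nu2[w, s])

# With tau_A := T 1_A + tau 1_{A^c}, nu_bar coincides (off the single switch
# instant s = tau_A(omega)) with the concatenation of nu1 and nu2 at tau_A.
tau_A = {w: (T if w in A else tau[w]) for w in scenarios}
for w in scenarios:
    for s in range(T + 1):
        if s < tau_A[w]:
            assert nu_bar[w, s] == nu1[w, s]
        elif s > tau_A[w]:
            assert nu_bar[w, s] == nu2[w, s]
print("A5 bifurcation checks passed")
```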
Our main result is the following weak version of the dynamic programming principle, which
uses the following notation:
\[
V_*(t,x) := \liminf_{(t',x') \to (t,x)} V(t',x'), \qquad V^*(t,x) := \limsup_{(t',x') \to (t,x)} V(t',x'), \qquad (t,x) \in \mathbf{S}.
\]

Theorem 3.1 Let Assumption A hold true. Then, for every $(t,x) \in \mathbf{S}$ and for every family
of stopping times $\{\theta^\nu, \nu \in \mathcal{U}_t\} \subset \mathcal{T}^t_{[t,T]}$,
\[
V(t,x) \le \sup_{\nu \in \mathcal{U}_t} \mathbb{E}\left[ V^*\left( \theta^\nu, X^\nu_{t,x}(\theta^\nu) \right) \right]. \tag{3.1}
\]
Assume further that $J(\cdot;\nu) \in \mathrm{LSC}(\mathbf{S})$ for every $\nu \in \mathcal{U}_0$. Then, for any function $\varphi : \mathbf{S} \to \mathbb{R}$:
\[
\varphi \in \mathrm{USC}(\mathbf{S}) \ \text{and} \ V \ge \varphi \ \Longrightarrow \ V(t,x) \ge \sup_{\nu \in \mathcal{U}^\varphi_t} \mathbb{E}\left[ \varphi\left( \theta^\nu, X^\nu_{t,x}(\theta^\nu) \right) \right], \tag{3.2}
\]
where $\mathcal{U}^\varphi_t = \left\{ \nu \in \mathcal{U}_t : \mathbb{E}\left[ \varphi\left( \theta^\nu, X^\nu_{t,x}(\theta^\nu) \right)^+ \right] < \infty \ \text{or} \ \mathbb{E}\left[ \varphi\left( \theta^\nu, X^\nu_{t,x}(\theta^\nu) \right)^- \right] < \infty \right\}$.
Before proceeding to the proof of this result, we report the following consequence.
Corollary 3.1 Let the conditions of Theorem 3.1 hold. For $(t,x) \in \mathbf{S}$, let $\{\theta^\nu, \nu \in \mathcal{U}_t\} \subset \mathcal{T}^t_{[t,T]}$ be a family of stopping times such that $X^\nu_{t,x} \mathbf{1}_{[t,\theta^\nu]}$ is $L^\infty$-bounded for all $\nu \in \mathcal{U}_t$. Then,
\[
\sup_{\nu \in \mathcal{U}_t} \mathbb{E}\left[ V_*\left( \theta^\nu, X^\nu_{t,x}(\theta^\nu) \right) \right] \le V(t,x) \le \sup_{\nu \in \mathcal{U}_t} \mathbb{E}\left[ V^*\left( \theta^\nu, X^\nu_{t,x}(\theta^\nu) \right) \right]. \tag{3.3}
\]
Proof The right-hand side inequality is already provided in Theorem 3.1. It follows from
standard arguments, see e.g. Lemma 3.5 in [8], that we can find a sequence of continuous
functions $(\varphi_n)_n$ such that $\varphi_n \le V_* \le V$ for all $n \ge 1$ and such that $\varphi_n$ converges pointwise
to $V_*$ on $[0,T] \times B_r(0)$. Set $\phi^N := \min_{n \ge N} \varphi_n$ for $N \ge 1$ and observe that the sequence
$(\phi^N)_N$ is non-decreasing and converges pointwise to $V_*$ on $[0,T] \times B_r(0)$. Applying (3.2) of
Theorem 3.1 and using the monotone convergence theorem, we then obtain:
\[
V(t,x) \ge \lim_{N \to \infty} \mathbb{E}\left[ \phi^N\left( \theta^\nu, X^\nu_{t,x}(\theta^\nu) \right) \right] = \mathbb{E}\left[ V_*\left( \theta^\nu, X^\nu_{t,x}(\theta^\nu) \right) \right].
\]
$\Box$
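The approximating sequence invoked in this proof can be produced explicitly by inf-convolution: for a bounded lower-semicontinuous $w$, the functions $\phi_n(x) := \inf_y \{ w(y) + n|x-y| \}$ are $n$-Lipschitz (hence continuous), lie below $w$, and increase pointwise to $w$. The grid-based sketch below illustrates this (the particular LSC function $w = \mathbf{1}_{\{x > 0\}}$ and the grid are our own choices; Lemma 3.5 of [8] is the reference for the general statement):

```python
# Inf-convolution approximation of an LSC function from below by Lipschitz
# (hence continuous) functions, on a one-dimensional grid.
# w(x) = 1_{x > 0} is lower semicontinuous; phi_n(x) = inf_y [w(y) + n|x - y|].
grid = [i / 100.0 for i in range(-200, 201)]   # grid on [-2, 2]
w = lambda x: 1.0 if x > 0 else 0.0

def phi(n, x):
    """Discretized inf-convolution: n-Lipschitz minorant of w."""
    return min(w(y) + n * abs(x - y) for y in grid)

# phi_n <= phi_{n+1} <= w pointwise: a monotone approximation from below.
for x in grid[::20]:
    assert phi(1, x) <= phi(2, x) + 1e-12 and phi(2, x) <= phi(4, x) + 1e-12
    assert phi(4, x) <= w(x) + 1e-12
print(phi(1, 0.5), phi(10, 0.5), phi(100, 0.5))  # -> 0.5 1.0 1.0, increasing to w(0.5) = 1
```

At the discontinuity point $x = 0$ every $\phi_n$ equals $w(0) = 0$, while at each $x > 0$ the values $\min(nx, 1)$ climb to $1$: pointwise, not uniform, convergence, which is exactly what the monotone convergence argument above needs.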
Proof of Theorem 3.1 1. Let $\nu \in \mathcal{U}_t$ be arbitrary and set $\theta := \theta^\nu$. The first assertion
is a direct consequence of Assumption A4-a. Indeed, it implies that, for $\mathbb{P}$-almost all $\omega \in \Omega$,
there exists $\tilde{\nu}_\omega \in \mathcal{U}_{\theta(\omega)}$ such that
\[
\mathbb{E}\left[ f\left( X^\nu_{t,x}(T) \right) \,\middle|\, \mathcal{F}_\theta \right](\omega) \le J(\theta(\omega), X^\nu_{t,x}(\theta)(\omega); \tilde{\nu}_\omega).
\]
Since, by definition, $J(\theta(\omega), X^\nu_{t,x}(\theta)(\omega); \tilde{\nu}_\omega) \le V^*(\theta(\omega), X^\nu_{t,x}(\theta)(\omega))$, it follows from the tower
property of conditional expectations that
\[
\mathbb{E}\left[ f\left( X^\nu_{t,x}(T) \right) \right] = \mathbb{E}\left[ \mathbb{E}\left[ f\left( X^\nu_{t,x}(T) \right) \,\middle|\, \mathcal{F}_\theta \right] \right] \le \mathbb{E}\left[ V^*\left( \theta, X^\nu_{t,x}(\theta) \right) \right].
\]
2. Let $\{(t_i, x_i), i \ge 1\} := \mathbb{Q}^{d+1} \cap \mathbf{S}$, and let $\varepsilon > 0$ be given. Then there is a sequence
$(\nu^{i,\varepsilon})_{i \ge 1} \subset \mathcal{U}_0$ such that:
\[
\nu^{i,\varepsilon} \in \mathcal{U}_{t_i} \ \text{ and } \ J(t_i, x_i; \nu^{i,\varepsilon}) \ge V(t_i, x_i) - \varepsilon, \quad \text{for every } i \ge 1. \tag{3.4}
\]

References
M. G. Crandall, H. Ishii and P.-L. Lions, User's guide to viscosity solutions of second order partial differential equations.
W. H. Fleming and H. M. Soner, Controlled Markov Processes and Viscosity Solutions.
J. Yong and X. Y. Zhou, Stochastic Controls: Hamiltonian Systems and HJB Equations.
D. P. Bertsekas and S. E. Shreve, Stochastic Optimal Control: The Discrete Time Case.
B. Øksendal and A. Sulem, Applied Stochastic Control of Jump Diffusions.