1826 IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 52, NO. 10, OCTOBER 2007

Constrained Stochastic LQC: A Tractable Approach
Dimitris Bertsimas and David B. Brown

Manuscript received September 15, 2004; revised August 3, 2005 and August 14, 2006. Recommended by Associate Editor C. D. Charalambous. D. Bertsimas is with the Sloan School of Management and Operations Research Center, Massachusetts Institute of Technology, Cambridge, MA 02139 USA (e-mail: dbertsim@mit.edu). D. B. Brown is with the Fuqua School of Business, Duke University, Durham, NC 27705 USA (e-mail: dbbrown@duke.edu). Digital Object Identifier 10.1109/TAC.2007.906182
Abstract—Despite the celebrated success of dynamic programming for optimizing quadratic cost functions over linear systems, such an approach is limited by its inability to tractably deal with even simple constraints. In this paper, we present an alternative approach based on results from robust optimization to solve the stochastic linear-quadratic control (SLQC) problem. In the unconstrained case, the problem may be formulated as a semidefinite optimization problem (SDP). We show that we can reduce this SDP to optimization of a convex function over a scalar variable followed by matrix multiplication in the current state, thus yielding an approach that is amenable to closed-loop control and analogous to the Riccati equation in our framework. We also consider a tight, second-order cone (SOCP) approximation to the SDP that can be solved much more efficiently when the problem has additional constraints. Both the SDP and SOCP are tractable in the presence of control and state space constraints; moreover, compared to the Riccati approach, they provide much greater control over the stochastic behavior of the cost function when the noise in the system is distributed normally.

Index Terms—Control with constraints, linear-quadratic control, robust optimization, semidefinite optimization.
I. INTRODUCTION

The theory of dynamic programming, while conceptually elegant, is computationally impractical for all but a few special cases of system dynamics and cost functions. One of the notable triumphs of dynamic programming is its success with stochastic linear systems and quadratic cost functions (stochastic linear-quadratic control—SLQC). It is easily shown (e.g., [4]) in this case that the cost-to-go functions are quadratic in the state, and therefore the resulting optimal controls are linear in the current state. As a result, solving Bellman's equation in this case is tantamount to finding appropriate gain matrices, and these gain matrices are described by the well-known Riccati equation [19].
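As a concrete reference point, here is a minimal numpy sketch of the backward Riccati recursion just described (our illustration, not from the paper; the time-invariant matrices and horizon are placeholder assumptions):

import numpy as np

def riccati_gains(A, B, Q, R, N):
    """Backward recursion for the finite-horizon LQ gains.

    Returns gains L_t such that the optimal control is u_t = L_t @ x_t.
    A, B, Q, R are assumed time-invariant here for brevity.
    """
    K = Q.copy()                     # K_N = Q (terminal cost-to-go matrix)
    gains = [None] * N
    for t in reversed(range(N)):
        # L_t = -(R + B'K_{t+1}B)^{-1} B'K_{t+1}A
        L = -np.linalg.solve(R + B.T @ K @ B, B.T @ K @ A)
        gains[t] = L
        # K_t = Q + A'K_{t+1}(A + B L_t)
        K = Q + A.T @ K @ (A + B @ L)
    return gains

# Example: double integrator, horizon 20
A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.0], [1.0]])
gains = riccati_gains(A, B, np.eye(2), np.eye(1), N=20)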
This success, however, has some limitations. In particular, Bellman's equation in the SLQC case has tractability issues with even the simplest of constraints on either the control or state vectors. It is not difficult to find applications that demand constraints on the controls or state. Bertsimas and Lo [5] describe the dynamics of an optimal share-purchasing policy for stockholders. The unconstrained policy based on the Riccati equation requires the investor to purchase and sell shares, which is clearly absurd. This can be mitigated by a nonnegativity constraint on the control, which causes the cost-to-go function to become piecewise quadratic with an exponential number of pieces. Thus, a very simple constraint destroys the tractability of this approach. Much of the current literature (e.g., [17] and [18]) derives necessary conditions for optimality for simple control constraints but does not explicitly describe solution methods.
A further drawback of the Riccati approach from dynamic programming is that it deals only with the expected value of the resulting cost. In many cases, we may wish to know more information about the distribution of the cost function (e.g., cases in which we want to provide a probabilistic level of protection guaranteeing some system performance).
In this paper, we propose an alternative approach to the SLQC problem. Rather than attempting to solve Bellman's equation, we exploit relatively new results from robust optimization to propose an alternative solution technique for SLQC. Our approach has the following advantages over the traditional dynamic programming approach.
1) It can tractably handle a variety of constraints on both the control and state vectors.
2) It admits a probabilistic description of the resulting cost, allowing us to understand and control the system cost distribution.
3) In the unconstrained case, its complexity is not much more than the complexity of linear feedback (i.e., the Riccati approach). In particular, optimal policies in this case may be computed by optimizing a convex function over a scalar, then multiplying the initial state by appropriate matrices.
Our approach is based on techniques from robust optimization. Although the use of convex optimization techniques is common in the control literature (see, e.g., [7], [9], [13], and [14]), we believe our methodology is a new one. Chen and Zhou [8] provide an elegant solution to the SLQC problem with conic control constraints, but their solution is limited to a scalar-valued state variable and homogeneous system dynamics. Our approach here is more general. We emphasize that we are not proposing a solution for robust control (see, e.g., [21] for a start to the vast literature on the subject); rather, we are proposing an approach to the SLQC with the conceptual framework of robust optimization as a guide.
The structure of this paper is as follows. In Section II, we present a description of the SLQC problem, as well as the currently known results from dynamic programming and a conceptual description of our methodology. In addition, we provide background for the robust optimization results we will later use. In Section III, we develop our approach for the unconstrained SLQC problem. This approach is based on semidefinite programming (SDP) and robust quadratic programming results from Ben-Tal and Nemirovski [3]. We further show that this SDP has a very special structure that allows us to derive a closed-loop control law suitable for real-time applications. Unfortunately, in the presence of constraints, this simplification no longer applies, and the complexity of solving the SDP is impractical in a closed-loop setting. This motivates us to simplify the SDP, which we do in Section IV. Here we use recent results from robust conic optimization developed by Bertsimas and Sim [6] to develop a tight SOCP approximation that is far easier to solve. We then show in Section V how this approach admits various constraints and performance guarantees. These constraints may be deterministic constraints on the control or probabilistic guarantees on the state and objective function. In Section VI, we show that a particular model for imperfect state information fits into the framework already developed, and in Section VII, we provide computational results. Section VIII concludes this paper.
II. PROBLEM STATEMENT AND PRELIMINARIES
Throughout this paper, we will work with discrete-time stochastic linear systems of the form

x_{t+1} = A_t x_t + B_t u_t + w_t,  t = 0, ..., N-1   (1)

where x_t ∈ R^n is a state vector, u_t ∈ R^m is a control vector, and w_t ∈ R^n is a disturbance vector (an unknown quantity). We assume throughout that the matrices A_t and B_t, t = 0, ..., N-1, are known exactly.

It is desired to control the system in question in a way that keeps the cost function

J(x_0, u, w) = x_N' Q_N x_N + sum_{t=0}^{N-1} ( x_t' Q_t x_t + u_t' R_t u_t )   (2)

as small as possible. Here we will assume Q_t ⪰ 0, R_t ≻ 0, and, again, that the data Q_t and R_t are known exactly. We are also using the shorthand u and w to denote the entire vector of controls and disturbances, i.e.,

u = (u_0', u_1', ..., u_{N-1}')'   (3)
w = (w_0', w_1', ..., w_{N-1}')'.   (4)

Finally, our convention will be for the system to be in some initial state x_0. Unless otherwise stated, we assume this initial state is also known exactly.
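As a quick illustration of the model (1)-(4), here is a small simulation sketch (ours, not the authors'; the dimensions, matrices, and horizon are placeholder assumptions):

import numpy as np

rng = np.random.default_rng(0)
n, m, N = 2, 1, 10                      # state/control dimensions, horizon (assumed)
A = np.array([[1.0, 1.0], [0.0, 1.0]])  # A_t = A for all t (illustrative)
B = np.array([[0.0], [1.0]])
Q, R = np.eye(n), np.eye(m)

def rollout(x0, u, w):
    """Simulate (1) and evaluate the cost (2) for stacked u, w as in (3), (4)."""
    x, cost = x0, 0.0
    for t in range(N):
        ut, wt = u[t*m:(t+1)*m], w[t*n:(t+1)*n]
        cost += x @ Q @ x + ut @ R @ ut
        x = A @ x + B @ ut + wt
    return cost + x @ Q @ x             # terminal term x_N' Q_N x_N (Q_N = Q here)

u = np.zeros(N*m)
w = 0.1 * rng.standard_normal(N*n)
print(rollout(np.array([1.0, 0.0]), u, w))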
Note that (2) is an uncertain quantity, as it depends on the realization of w, which is unknown. Most approaches assume w is a random variable possessing some distributional properties and proceed to minimize (2) in an expected value sense. We now survey the traditional approach to this problem.
A. The Traditional Approach: Bellman's Recursion

The dynamic programming approach requires a few distributional assumptions on the disturbance vectors. Typically, it is assumed that the w_t are independent of one another, and independent of both x_t and u_t. Moreover, we have E[w_t] = 0, and each w_t has finite second moment. For this derivation, we will assume A_t = A, B_t = B, Q_t = Q, and R_t = R for ease of notation, but the result holds more generally after some simple manipulations. Modifications of some of the distributional assumptions (such as nonzero mean, correlations) are also possible, but we do not detail them here.

The literature on this subject is vast, and the problem is well understood. The main result is that the expected cost-to-go functions J_t defined by

J_N(x_N) = x_N' Q x_N,
J_t(x_t) = min_{u_t} E[ x_t' Q x_t + u_t' R u_t + J_{t+1}(A x_t + B u_t + w_t) ]   (5)

are quadratic in the state x_t. Thus, it follows that the optimal policy is linear in the current state. In particular, one can show (see, e.g., [4]) that the optimal control is given by u_t* = L_t x_t, where L_t = -(B' K_{t+1} B + R)^{-1} B' K_{t+1} A and the K_t are symmetric, positive semidefinite matrices computed recursively. The fact that the recursion given in (5) works so well (from a complexity standpoint) is quite particular to the case of linear systems and quadratic costs. For more arbitrary systems or cost functions such an approach is, in general, intractable.

A more troubling difficulty, however, is that even with the same system and cost function, this approach explodes computationally with ostensibly simple constraints, such as u_t ≥ 0. For instance, the cost-to-go function (5) in this case becomes piecewise quadratic with an exponential (in N) number of pieces.
Of course, one way to suboptimally handle this issue is to apply Lagrangian duality techniques to the constraints. For example, in the case of quadratic constraints on the control vectors, say u_t' M_t u_t ≤ γ_t, one may relax the constraints and then maximize over a dual vector λ ≥ 0. In particular, the cost-to-go functions now have the form

J_t(x_t) = max_{λ ≥ 0} q_t(x_t, λ)   (6)

where, for each fixed λ, the dual functions q_t(·, λ) are quadratic in the state. From here, one approach to solving (6) suboptimally is to select a priori a dual vector λ and then apply the Riccati equation as usual. An optimal solution, however, relies on computation of the optimal dual vector λ*, which, in general, is difficult and destroys the quadratic form of the cost-to-go functions.
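A minimal sketch of the fix-λ-then-Riccati shortcut just described, for the special case of a relaxed constraint u_t'u_t ≤ γ, where a scalar multiplier simply augments the control cost to R + λI (our simplification of the general quadratic-constraint case, not the paper's formulation):

import numpy as np

def riccati_with_fixed_dual(A, B, Q, R, N, lam):
    """Relax u_t'u_t <= gamma with a fixed multiplier lam >= 0.

    The relaxed stage cost is u'(R + lam*I)u, so the usual Riccati
    recursion applies with R replaced by R + lam*I.
    """
    R_lam = R + lam * np.eye(R.shape[0])
    K = Q.copy()
    gains = []
    for _ in range(N):
        L = -np.linalg.solve(R_lam + B.T @ K @ B, B.T @ K @ A)
        K = Q + A.T @ K @ (A + B @ L)
        gains.append(L)
    return gains[::-1]   # gains[t] corresponds to stage t

Larger λ penalizes control effort more heavily and pushes the unconstrained optimum toward feasibility; finding the best λ is exactly the difficult step noted above.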
Thus, the traditional, dynamic programming approach can be solved very rapidly with linear feedback in the unconstrained case but becomes, for large-scale problems, impossible to solve optimally when constraints are included. This is a very unfavorable property of the DP approach, and it is in direct contrast to the field of convex optimization, whose problem instances are quite robust (in terms of complexity) to perturbations in the constraint structure. Our approach, which we now detail, will leverage this useful property of convex optimization.

B. A Tractable Approach: Overview

The traditional approach above is not amenable to problem changes such as the addition of constraints for two primary reasons.
1) Complexity of distributional calculations. Computing the expectation in (5), except for very special cases, is computationally cumbersome.
2) Intractability of Bellman's recursion. The recursion in (5) requires us, when computing the current control, to have advance knowledge of all future controls for all possible future states, even states that are extraordinarily improbable. While this recursion is an elegant idea conceptually, it is not well suited to computation because the number of possible future states grows so rapidly with problem size.
We propose the following approach, which circumvents these difficulties.
a) Given our current state x_0 and problem data, we consider the entire control and disturbance vectors u and w, respectively, as in (3) and (4).
b) We do not assume a particular distribution for w. Assume only that w belongs within some reasonable uncertainty set. In particular, assume w belongs to some norm-bounded set

W_Γ = { w : ||w|| ≤ Γ }   (7)

parameterized by Γ ≥ 0.¹
c) Discard the notion of Bellman's recursion. Instead, do the best we can for all possible disturbances within W_Γ. That is, rather than computing controls for every possible state realization, we simply choose a control vector for the remaining stages that performs best for the most pessimistic disturbance within this reasonable uncertainty set. Specifically, we search for an optimal control u* to the problem

min_u max_{w ∈ W_Γ} J(x_0, u, w).   (8)
Of course, this brings up the issue of open-loop versus closed-loop control. At first glance, this approach appears to be an open-loop method only. We can, however, compute a solution to (8), take the first m components (the current-stage control u_0), and apply this as the current control. After a new state observation, we can repeat the calculation in (8) with the updated problem data (most of this updating can be done offline). The only issue is that the routine for solving (8) must be computationally simple enough for the application at hand. The complexity of these solution procedures will indeed be a central issue for much of the remaining discussion.
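The receding-horizon loop just described is easy to sketch. In the stub below, the robust subproblem (8) is replaced by a nominal (w = 0) least-squares solve purely as a placeholder, since the robust solvers are developed only in later sections; all data are illustrative assumptions:

import numpy as np

rng = np.random.default_rng(1)
n, m, N = 2, 1, 8
A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.0], [1.0]])

def solve_nominal(x0):
    """Placeholder for the robust solve of (8): minimize the w = 0 cost.

    Builds the stacked map from u to (x_1, ..., x_N) and solves a
    regularized least-squares problem (Q = I, R = I assumed).
    """
    Phi = np.vstack([np.linalg.matrix_power(A, t) for t in range(1, N + 1)])
    G = np.zeros((N * n, N * m))
    for t in range(1, N + 1):                # x_t = A^t x0 + sum_s A^{t-1-s} B u_s
        for s in range(t):
            G[(t-1)*n:t*n, s*m:(s+1)*m] = np.linalg.matrix_power(A, t-1-s) @ B
    # min ||G u + Phi x0||^2 + ||u||^2  =>  normal equations
    return np.linalg.solve(G.T @ G + np.eye(N * m), -G.T @ Phi @ x0)

x = np.array([1.0, 0.0])
for step in range(5):                        # closed loop: re-solve, apply u_0
    u = solve_nominal(x)
    x = A @ x + B @ u[:m] + 0.05 * rng.standard_normal(n)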
Note that the model in (8) is similar in spirit to the approach of H-infinity control (e.g., [2]) in that it is worst case over a deterministic uncertainty set. In contrast to H-infinity control, however, our methodology explicitly relies on new results in robust optimization. In particular, our approach has the following properties.

¹If we wish instead to have w ∈ { w : w' Σ w ≤ Γ² }, where Σ ≻ 0, then we may rescale coordinates and obtain a problem of the same form. Note that the statistical appropriateness of ellipsoids and their explicit construction is not the subject of this paper, but the interested reader may see Paganini [11] for uncertainty set modelling for the case of white noise.
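A short illustration of the rescaling in the footnote (our sketch; Σ is an arbitrary positive definite example):

import numpy as np

Sigma = np.array([[2.0, 0.5], [0.5, 1.0]])   # Sigma > 0 (example)
S_half = np.linalg.cholesky(Sigma)           # Sigma = S_half @ S_half.T

# If w' Sigma w <= Gamma^2, then v = S_half.T @ w satisfies ||v|| <= Gamma,
# so the ellipsoidal set maps to the norm-bounded set (7) in v-coordinates.
w = np.array([0.3, -0.2])
v = S_half.T @ w
assert np.isclose(w @ Sigma @ w, v @ v)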
• It is tractable, even in the presence of control and state-space constraints.
• We solve a deterministic problem (8) to compute an optimal solution u*. Thus far, we have not discussed probability in any way. Nonetheless, our approach is amenable to examining how good u* is when the disturbances w, rather than being chosen in an adversarial manner from an ellipsoid, instead obey a probability distribution. We show in Theorem 8 that under normality for w, the solution u* satisfies very strong probabilistic guarantees. In other words, when nature gives rise to disturbances that are bounded, we solve the problem optimally. When, on the other hand, nature gives rise to disturbances that do not satisfy ||w|| ≤ Γ, we still show strong probabilistic guarantees on the performance of u*.
• In the unconstrained case, it yields an efficient control law that is linear in the current state after a simple, scalar optimization procedure. In addition, for Γ = 0, we recover the traditional (Riccati) solution, whereas for Γ > 0 we have a family of increasingly conservative approaches.
To solve (8), we will utilize a number of results from robust
optimization, which we now describe.
C. Results From Robust Quadratic Optimization Over Ellipsoids

We will leverage some robust quadratic programming results popularized by Ben-Tal and Nemirovski [3]. In particular, they consider the conic quadratic constraint

||Ax + b|| ≤ c'x + d

when the data (A, b, c, d) are uncertain and known only to belong to some bounded uncertainty set U. The goal of robust quadratic programming is to optimize over the set of all x such that the constraint holds for all possible values of the data within the set U. In other words, we desire to find x such that

||Ax + b|| ≤ c'x + d   for all (A, b, c, d) ∈ U.

Ben-Tal and Nemirovski show that in the case of an ellipsoidal uncertainty set, the problem of optimizing over an uncertain conic quadratic inequality may be solved tractably using semidefinite programming. This turns out also to be the case for (8). To this end, we will need the following two classical results, proofs of which may be found in [3], among others. First, we have the Schur complement lemma.

Lemma 1: Let

X = [ A   B
      B'  C ]

where C ≻ 0. Then X is positive (semi)definite if and only if the matrix A - B C^{-1} B' is positive (semi)definite.
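A quick numerical sanity check of Lemma 1 (our sketch with arbitrary example matrices):

import numpy as np

rng = np.random.default_rng(2)
A = np.array([[4.0, 1.0], [1.0, 3.0]])
Bm = rng.standard_normal((2, 2))
C = np.array([[2.0, 0.0], [0.0, 1.0]])          # C > 0

X = np.block([[A, Bm], [Bm.T, C]])
schur = A - Bm @ np.linalg.solve(C, Bm.T)        # A - B C^{-1} B'

# Lemma 1: X is PSD iff the Schur complement is PSD (given C > 0).
psd = lambda M: np.all(np.linalg.eigvalsh(M) >= -1e-9)
assert psd(X) == psd(schur)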
In addition, we have the S-lemma.

Lemma 2: Let P, Q be symmetric matrices and assume that the quadratic inequality

x' P x ≥ 0

is strictly feasible (i.e., x̄' P x̄ > 0 for some x̄). Then the minimum value of the problem

minimize   x' Q x
subject to x' P x ≥ 0

is nonnegative if and only if there exists a λ ≥ 0 such that Q ⪰ λP.
D. Results From Robust Conic Optimization Over Norm-Bounded Sets

To improve the complexity of solving (8) when we have constraints, we will utilize recent results on robust conic optimization due to Bertsimas and Sim [6]. This approach is a relaxation of the exact min-max approach but is computationally less complex and leads to a unified probability bound across a variety of conic optimization problems. We survey the main ideas and developments here.

Bertsimas and Sim use the following model for data uncertainty:

D = D0 + sum_{j=1}^{M} dD^j z_j

where D0 is the nominal data value and the dD^j are data perturbations. The z_j are random variables with mean zero and independent, identical distributions. The goal is to find a policy x such that a given constraint is robust feasible, i.e.,

f(x, D) ≤ 0   for all z ∈ N   (9)

where

N = { z : ||z|| ≤ Ω }.   (10)

For our purposes, we typically use the Euclidean norm on z, as it is self-dual, but many other choices for the norm may be tractably used [6]. We operate under some restrictions on the function f.²
Assumption 1: The function f satisfies the following.
a) f(x, D) is convex in D for all x.
b) f(x, kD) = k f(x, D) for all k ≥ 0 and all x, D.

One of the central ideas of [6] is to linearize the model of robustness as follows:

f(x, D0) + Ω || ( t_1, ..., t_M ) ||* ≤ 0   (11)

where t_j = max{ f(x, dD^j), f(x, -dD^j) } and ||·||* denotes the dual of the norm in (10).

²In [6], the authors assume the function is concave in the data. For our purposes, convexity is more convenient. All results follow up to sign changes, and we report them accordingly.

In the framework developed thus far, (11) turns out to be a relaxation of (9), i.e., we have the following.

Proposition 1 (Bertsimas-Sim):
a) If f(x, D) is linear in D, then x satisfies (11) if and only if x satisfies (9).
b) Under Assumption 1, if x is feasible in (11), then x is feasible in (9).

Finally, (11) is tractable due to the following.

Theorem 1 (Bertsimas-Sim): Under Assumption 1, we have the following.
a) Constraint (11) is equivalent to

f(x, D0) + Ω ||t||* ≤ 0,  f(x, dD^j) ≤ t_j,  f(x, -dD^j) ≤ t_j,  j = 1, ..., M   (12)

where t ∈ R^M is an auxiliary decision vector.
b) Equation (12) can in turn be written as (13), an explicit system of convex constraints in (x, t); see [6] for the precise form.
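To make the linearization concrete, here is a small cvxpy sketch of (11) for a constraint that is linear in the data, the case covered by Proposition 1a; all problem data below are arbitrary assumptions:

import cvxpy as cp
import numpy as np

rng = np.random.default_rng(3)
n, M, Omega = 4, 6, 2.0
a0 = rng.standard_normal(n)                              # nominal data D0
dA = [0.1 * rng.standard_normal(n) for _ in range(M)]    # perturbations dD^j
b = 1.0

x = cp.Variable(n)
# f(x, D) = a'x - b with a = a0 + sum_j dA[j] * z_j, ||z||_2 <= Omega.
# Linearized robust counterpart (11): a0'x + Omega * ||(dA_j'x)_j||_2 <= b.
robust = a0 @ x + Omega * cp.norm(cp.hstack([d @ x for d in dA]), 2) <= b
prob = cp.Problem(cp.Maximize(cp.sum(x)), [robust, cp.norm(x, 2) <= 10])
prob.solve()
print(x.value)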
Finally, Bertsimas and Sim derive a probability of constraint violation.

Theorem 2 (Bertsimas-Sim): In the model of uncertainty in (10), when we use the l2-norm, i.e., ||z|| = ||z||_2, and under the assumption that the z_j are normally distributed, we have a probability bound on constraint violation that decays exponentially in Ω², with a constant in the exponent that depends on the problem class: the bound is tightest for linear programs (LPs), weaker for SOCPs, and weaker still for SDPs, where the constant depends on the dimension of the matrix in the SDP.
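A Monte Carlo sketch of what such a bound means operationally, applied to the linear example above; the exp(-Ω²/2) expression used for comparison is our assumed form of the bound for the linear case:

import numpy as np

rng = np.random.default_rng(4)
n, M, Omega, trials = 4, 6, 2.0, 200_000
a0 = rng.standard_normal(n)
dA = 0.1 * rng.standard_normal((M, n))   # rows are the perturbations dD^j
b = 1.0

# Place x on the boundary of the linearized robust constraint (11):
# a0'x + Omega * ||dA x||_2 = b.
x0 = np.ones(n)
k = a0 @ x0 + Omega * np.linalg.norm(dA @ x0)
x = (b / k) * x0 if k > 0 else x0

# Empirical violation frequency under z_j ~ N(0, 1), versus the assumed
# exponential bound for the linear case.
Z = rng.standard_normal((trials, M))
viol = np.mean((a0 + Z @ dA) @ x > b)
print(viol, np.exp(-Omega**2 / 2))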
III. AN EXACT APPROACH USING SDP

In this section, we apply the robust quadratic optimization results to formulate (8) as an SDP. We then show that we can compute optimal solutions to this SDP with a very simple control law. First, exploiting the linearity of the system, we have the following, straightforward result.

Proposition 2: The cost function (2) for the system (1) can be written in the form

J(x_0, u, w) = u' H u + 2 u' F w + w' G w + 2 f' u + 2 g' w + c   (14)

for appropriate vectors f, g and matrices H, F, G, where H ≻ 0, G ⪰ 0, and where f, g, and the constant c depend on the initial state x_0.
Proof: Since the system is linear, we can write the state at any instant t as

x_t = Phi_{t,0} x_0 + sum_{s=0}^{t-1} Phi_{t,s+1} ( B_s u_s + w_s )

where Phi_{t,s} = A_{t-1} A_{t-2} ··· A_s (with Phi_{t,t} = I). Now the cost of any state term is written

x_t' Q_t x_t = ( Phi_{t,0} x_0 + sum_{s=0}^{t-1} Phi_{t,s+1} ( B_s u_s + w_s ) )' Q_t ( Phi_{t,0} x_0 + sum_{s=0}^{t-1} Phi_{t,s+1} ( B_s u_s + w_s ) ).

Thus, the overall cost is clearly written in the form stated above, with H, F, G, f, g, and c obtained by collecting terms; in particular, the control-cost terms contribute diag(R_0, ..., R_{N-1}) to H. Finally, positive (semi)definiteness of H and G follows from positive (semi)definiteness of the R_t and Q_t.
Next, for ease of notation, we will transform the coordinates of the control space.

Proposition 3: To minimize the cost function J(x_0, u, w) in Proposition 2 over all u ∈ R^{Nm}, it is sufficient instead to optimize over all v ∈ R^{Nm} the cost function

J~(x_0, v, w) = v'v + 2 v' H^{-1/2} ( F w + f ) + w' G w + 2 g' w + c   (15)

with v = H^{1/2} u.

Proof: The proof is immediate from the fact that H^{-1/2} exists since H ≻ 0, and then using the transformation v = H^{1/2} u.
By Proposition 3, then, (8) is equivalent to the problem

min_v max_{w ∈ W_Γ} J~(x_0, v, w).   (16)

This problem may be solved using SDP, as we now show.

Theorem 3: Problem (16) may be solved by the following SDP:

minimize   τ
subject to
[ λI - G                 -(F'H^{-1/2}v + g)             0  ]
[ -(F'H^{-1/2}v + g)'    τ - c - 2f'H^{-1/2}v - λΓ²     v' ]  ⪰ 0,   λ ≥ 0   (17)
[ 0                      v                              I  ]

in decision variables v, τ, and λ.
Proof: We first rewrite the problem as

minimize   τ
subject to τ - J~(x_0, v, w) ≥ 0 for all w : ||w|| ≤ Γ.   (18)

We may homogenize the system and rewrite this equivalently as

minimize   τ
subject to s²τ - J~_h(x_0, v, w, s) ≥ 0 for all (w, s) : w'w ≤ Γ²s²   (19)

where J~_h(x_0, v, w, s) = s²v'v + 2sv'H^{-1/2}(Fw + fs) + w'Gw + 2sg'w + cs² is the homogenization of J~, i.e., J~_h(x_0, v, w, 1) = J~(x_0, v, w). Clearly, feasibility of (v, τ) in (19) implies feasibility of (v, τ) in (18) (by setting s = 1). For the other direction, assume (v, τ) is feasible in (18) and consider any (w, s) with w'w ≤ Γ²s², where s ≠ 0. This implies ||w/s|| ≤ Γ, and

s²τ - J~_h(x_0, v, w, s) = s²( τ - J~(x_0, v, w/s) ) ≥ 0

where the inequality follows by (18); for s = 0, the constraint forces w = 0, and the expression vanishes. Thus, the claim is true. But now we wish to check whether a homogeneous quadratic form in (w, s) is nonnegative over all (w, s) satisfying another homogeneous quadratic form. Invoking Lemma 2, we know the constraint holds if and only if there exists a λ ≥ 0 such that the corresponding matrix inequality holds; applying the Schur complement lemma (Lemma 1) to the quadratic term in v then yields the linear matrix inequality in (17).

References

A. Ben-Tal and A. Nemirovski, Lectures on Modern Convex Optimization: Analysis, Algorithms, and Engineering Applications. Philadelphia, PA: SIAM, 2001.

D. P. Bertsekas, Dynamic Programming and Optimal Control. Belmont, MA: Athena Scientific.

S. Boyd, L. El Ghaoui, E. Feron, and V. Balakrishnan, Linear Matrix Inequalities in System and Control Theory. Philadelphia, PA: SIAM, 1994.