1826 IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 52, NO. 10, OCTOBER 2007

Constrained Stochastic LQC: A Tractable Approach
Dimitris Bertsimas and David B. Brown

Manuscript received September 15, 2004; revised August 3, 2005 and August 14, 2006. Recommended by Associate Editor C. D. Charalambous. D. Bertsimas is with the Sloan School of Management and Operations Research Center, Massachusetts Institute of Technology, Cambridge, MA 02139 USA (e-mail: dbertsim@mit.edu). D. B. Brown is with the Fuqua School of Business, Duke University, Durham, NC 27705 USA (e-mail: dbbrown@duke.edu). Digital Object Identifier 10.1109/TAC.2007.906182
Abstract—Despite the celebrated success of dynamic programming for optimizing quadratic cost functions over linear systems, such an approach is limited by its inability to tractably deal with even simple constraints. In this paper, we present an alternative approach based on results from robust optimization to solve the stochastic linear-quadratic control (SLQC) problem. In the unconstrained case, the problem may be formulated as a semidefinite optimization problem (SDP). We show that we can reduce this SDP to optimization of a convex function over a scalar variable followed by matrix multiplication in the current state, thus yielding an approach that is amenable to closed-loop control and analogous to the Riccati equation in our framework. We also consider a tight, second-order cone (SOCP) approximation to the SDP that can be solved much more efficiently when the problem has additional constraints. Both the SDP and SOCP are tractable in the presence of control and state space constraints; moreover, compared to the Riccati approach, they provide much greater control over the stochastic behavior of the cost function when the noise in the system is distributed normally.

Index Terms—Control with constraints, linear-quadratic control, robust optimization, semidefinite optimization.
I. INTRODUCTION

The theory of dynamic programming, while conceptually elegant, is computationally impractical for all but a few special cases of system dynamics and cost functions. One of the notable triumphs of dynamic programming is its success with stochastic linear systems and quadratic cost functions (stochastic linear-quadratic control—SLQC). It is easily shown (e.g., [4]) in this case that the cost-to-go functions are quadratic in the state, and therefore the resulting optimal controls are linear in the current state. As a result, solving Bellman's equation in this case is tantamount to finding appropriate gain matrices, and these gain matrices are described by the well-known Riccati equation [19].
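As a concrete reference point, here is a minimal numpy sketch of the backward Riccati recursion just described (our illustration, not from the paper; the time-invariant matrices and horizon are placeholder assumptions):

import numpy as np

def riccati_gains(A, B, Q, R, N):
    """Backward recursion for the finite-horizon LQ gains.

    Returns gains L_t such that the optimal control is u_t = L_t @ x_t.
    A, B, Q, R are assumed time-invariant here for brevity.
    """
    K = Q.copy()                     # K_N = Q (terminal cost-to-go matrix)
    gains = [None] * N
    for t in reversed(range(N)):
        # L_t = -(R + B'K_{t+1}B)^{-1} B'K_{t+1}A
        L = -np.linalg.solve(R + B.T @ K @ B, B.T @ K @ A)
        gains[t] = L
        # K_t = Q + A'K_{t+1}(A + B L_t)
        K = Q + A.T @ K @ (A + B @ L)
    return gains

# Example: double integrator, horizon 20
A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.0], [1.0]])
gains = riccati_gains(A, B, np.eye(2), np.eye(1), N=20)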
This success, however, has some limitations. In particular, Bellman's equation in the SLQC case has tractability issues with even the simplest of constraints on either the control or state vectors. It is not difficult to find applications that demand constraints on the controls or state. Bertsimas and Lo [5] describe the dynamics of an optimal share-purchasing policy for stockholders. The unconstrained policy based on the Riccati equation requires the investor to purchase and sell shares, which is clearly absurd. This can be mitigated by a nonnegativity constraint on the control, which causes the cost-to-go function to become piecewise quadratic with an exponential number of pieces. Thus, a very simple constraint destroys the tractability of this approach. Much of the current literature (e.g., [17] and [18]) derives necessary conditions for optimality for simple control constraints but does not explicitly describe solution methods.
A further drawback of the Riccati approach from dynamic programming is that it deals only with the expected value of the resulting cost. In many cases, we may wish to know more information about the distribution of the cost function (e.g., cases in which we want to provide a probabilistic level of protection guaranteeing some system performance).
In this paper, we propose an alternative approach to the SLQC problem. Rather than attempting to solve Bellman's equation, we exploit relatively new results from robust optimization to propose an alternative solution technique for SLQC. Our approach has the following advantages over the traditional dynamic programming approach.
1) It can tractably handle a variety of constraints on both the control and state vectors.
2) It admits a probabilistic description of the resulting cost, allowing us to understand and control the system cost distribution.
3) In the unconstrained case, its complexity is not much more than the complexity of linear feedback (i.e., the Riccati approach). In particular, optimal policies in this case may be computed by optimizing a convex function over a scalar, then multiplying the initial state by appropriate matrices.
Our approach is based on techniques from robust optimization. Although the use of convex optimization techniques is common in the control literature (see, e.g., [7], [9], [13], and [14]), we believe our methodology is a new one. Chen and Zhou [8] provide an elegant solution to the SLQC problem with conic control constraints, but their solution is limited to a scalar-valued state variable and homogeneous system dynamics. Our approach here is more general. We emphasize that we are not proposing a solution for robust control (see, e.g., [21] for a start to the vast literature on the subject); rather, we are proposing an approach to the SLQC with the conceptual framework of robust optimization as a guide.
The structure of this paper is as follows. In Section II, we present a description of the SLQC problem, as well as the currently known results from dynamic programming and a conceptual description of our methodology. In addition, we provide background for the robust optimization results we will later use. In Section III, we develop our approach for the unconstrained SLQC problem. This approach is based on semidefinite programming (SDP) and robust quadratic programming results from Ben-Tal and Nemirovski [3]. We further show that this SDP has a very special structure that allows us to derive a closed-loop control law suitable for real-time applications. Unfortunately, in the presence of constraints, this simplification no longer applies, and the complexity of solving the SDP is impractical in a closed-loop setting. This motivates us to simplify the SDP, which we do in Section IV. Here we use recent results from robust conic optimization developed by Bertsimas and Sim [6] to develop a tight SOCP approximation that is far easier to solve. We then show in Section V how this approach admits various constraints and performance guarantees. These constraints may be deterministic constraints on the control or probabilistic guarantees on the state and objective function. In Section VI, we show that a particular model for imperfect state information fits into the framework already developed, and in Section VII, we provide computational results. Section VIII concludes this paper.
II. PROBLEM STATEMENT AND PRELIMINARIES
Throughout this paper, we will work with discrete-time stochastic linear systems of the form

x_{t+1} = A_t x_t + B_t u_t + w_t,  t = 0, ..., N-1   (1)

where x_t ∈ R^n is a state vector, u_t ∈ R^m is a control vector, and w_t ∈ R^n is a disturbance vector (an unknown quantity). We assume throughout that the matrices A_t and B_t, t = 0, ..., N-1, are known exactly.

It is desired to control the system in question in a way that keeps the cost function

J(x_0, u, w) = x_N' Q_N x_N + sum_{t=0}^{N-1} ( x_t' Q_t x_t + u_t' R_t u_t )   (2)

as small as possible. Here we will assume Q_t ⪰ 0, R_t ≻ 0, and, again, that the data Q_t and R_t are known exactly. We are also using the shorthand u and w to denote the entire vector of controls and disturbances, i.e.,

u = (u_0', u_1', ..., u_{N-1}')'   (3)
w = (w_0', w_1', ..., w_{N-1}')'.   (4)

Finally, our convention will be for the system to be in some initial state x_0. Unless otherwise stated, we assume this initial state is also known exactly.
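As a quick illustration of the model (1)-(4), here is a small simulation sketch (ours, not the authors'; the dimensions, matrices, and horizon are placeholder assumptions):

import numpy as np

rng = np.random.default_rng(0)
n, m, N = 2, 1, 10                      # state/control dimensions, horizon (assumed)
A = np.array([[1.0, 1.0], [0.0, 1.0]])  # A_t = A for all t (illustrative)
B = np.array([[0.0], [1.0]])
Q, R = np.eye(n), np.eye(m)

def rollout(x0, u, w):
    """Simulate (1) and evaluate the cost (2) for stacked u, w as in (3), (4)."""
    x, cost = x0, 0.0
    for t in range(N):
        ut, wt = u[t*m:(t+1)*m], w[t*n:(t+1)*n]
        cost += x @ Q @ x + ut @ R @ ut
        x = A @ x + B @ ut + wt
    return cost + x @ Q @ x             # terminal term x_N' Q_N x_N (Q_N = Q here)

u = np.zeros(N*m)
w = 0.1 * rng.standard_normal(N*n)
print(rollout(np.array([1.0, 0.0]), u, w))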
Note that (2) is an uncertain quantity, as it depends on the realization of w, which is unknown. Most approaches assume w is a random variable possessing some distributional properties and proceed to minimize (2) in an expected value sense. We now survey the traditional approach to this problem.
A. The Traditional Approach: Bellman's Recursion

The dynamic programming approach requires a few distributional assumptions on the disturbance vectors. Typically, it is assumed that the w_t are independent of one another, and independent of both x_t and u_t. Moreover, we have E[w_t] = 0, and each w_t has finite second moment. For this derivation, we will assume A_t = A, B_t = B, Q_t = Q, and R_t = R for ease of notation, but the result holds more generally after some simple manipulations. Modifications of some of the distributional assumptions (such as nonzero mean, correlations) are also possible, but we do not detail them here.

The literature on this subject is vast, and the problem is well understood. The main result is that the expected cost-to-go functions J_t defined by

J_N(x_N) = x_N' Q x_N,
J_t(x_t) = min_{u_t} E[ x_t' Q x_t + u_t' R u_t + J_{t+1}(A x_t + B u_t + w_t) ]   (5)

are quadratic in the state x_t. Thus, it follows that the optimal policy is linear in the current state. In particular, one can show (see, e.g., [4]) that the optimal control is given by u_t* = L_t x_t, where L_t = -(B' K_{t+1} B + R)^{-1} B' K_{t+1} A and the K_t are symmetric, positive semidefinite matrices computed recursively. The fact that the recursion given in (5) works so well (from a complexity standpoint) is quite particular to the case of linear systems and quadratic costs. For more arbitrary systems or cost functions such an approach is, in general, intractable.

A more troubling difficulty, however, is that even with the same system and cost function, this approach explodes computationally with ostensibly simple constraints, such as u_t ≥ 0. For instance, the cost-to-go function (5) in this case becomes piecewise quadratic with an exponential (in N) number of pieces.
Of course, one way to suboptimally handle this issue is to apply Lagrangian duality techniques to the constraints. For example, in the case of quadratic constraints on the control vectors, say u_t' M_t u_t ≤ γ_t, one may relax the constraints and then maximize over a dual vector λ ≥ 0. In particular, the cost-to-go functions now have the form

J_t(x_t) = max_{λ ≥ 0} q_t(x_t, λ)   (6)

where, for each fixed λ, the dual functions q_t(·, λ) are quadratic in the state. From here, one approach to solving (6) suboptimally is to select a priori a dual vector λ and then apply the Riccati equation as usual. An optimal solution, however, relies on computation of the optimal dual vector λ*, which, in general, is difficult and destroys the quadratic form of the cost-to-go functions.
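A minimal sketch of the fix-λ-then-Riccati shortcut just described, for the special case of a relaxed constraint u_t'u_t ≤ γ, where a scalar multiplier simply augments the control cost to R + λI (our simplification of the general quadratic-constraint case, not the paper's formulation):

import numpy as np

def riccati_with_fixed_dual(A, B, Q, R, N, lam):
    """Relax u_t'u_t <= gamma with a fixed multiplier lam >= 0.

    The relaxed stage cost is u'(R + lam*I)u, so the usual Riccati
    recursion applies with R replaced by R + lam*I.
    """
    R_lam = R + lam * np.eye(R.shape[0])
    K = Q.copy()
    gains = []
    for _ in range(N):
        L = -np.linalg.solve(R_lam + B.T @ K @ B, B.T @ K @ A)
        K = Q + A.T @ K @ (A + B @ L)
        gains.append(L)
    return gains[::-1]   # gains[t] corresponds to stage t

Larger λ penalizes control effort more heavily and pushes the unconstrained optimum toward feasibility; finding the best λ is exactly the difficult step noted above.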
Thus, the traditional, dynamic programming approach can be solved very rapidly with linear feedback in the unconstrained case but becomes, for large-scale problems, impossible to solve optimally when constraints are included. This is a very unfavorable property of the DP approach, and it is in direct contrast to the field of convex optimization, whose problem instances are quite robust (in terms of complexity) to perturbations in the constraint structure. Our approach, which we now detail, will leverage this useful property of convex optimization.

B. A Tractable Approach: Overview

The traditional approach above is not amenable to problem changes such as the addition of constraints for two primary reasons.
1) Complexity of distributional calculations. Computing the expectation in (5), except for very special cases, is computationally cumbersome.
2) Intractability of Bellman's recursion. The recursion in (5) requires us, when computing the current control, to have advance knowledge of all future controls for all possible future states, even states that are extraordinarily improbable. While this recursion is an elegant idea conceptually, it is not well suited to computation because the number of possible future states grows so rapidly with problem size.
We propose the following approach, which circumvents these difficulties.
a) Given our current state x_0 and problem data, we consider the entire control and disturbance vectors u and w, respectively, as in (3) and (4).
b) We do not assume a particular distribution for w. Assume only that w belongs within some reasonable uncertainty set. In particular, assume w belongs to some norm-bounded set

W_Γ = { w : ||w|| ≤ Γ }   (7)

parameterized by Γ ≥ 0.¹
c) Discard the notion of Bellman's recursion. Instead, do the best we can for all possible disturbances within W_Γ. That is, rather than computing controls for every possible state realization, we simply choose a control vector for the remaining stages that performs best for the most pessimistic disturbance within this reasonable uncertainty set. Specifically, we search for an optimal control u* to the problem

min_u max_{w ∈ W_Γ} J(x_0, u, w).   (8)
Of course, this brings up the issue of open-loop versus closed-loop control. At first glance, this approach appears to be an open-loop method only. We can, however, compute a solution to (8), take the first m components (the current-stage control u_0), and apply this as the current control. After a new state observation, we can repeat the calculation in (8) with the updated problem data (most of this updating can be done offline). The only issue is that the routine for solving (8) must be computationally simple enough for the application at hand. The complexity of these solution procedures will indeed be a central issue for much of the remaining discussion.
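The receding-horizon loop just described is easy to sketch. In the stub below, the robust subproblem (8) is replaced by a nominal (w = 0) least-squares solve purely as a placeholder, since the robust solvers are developed only in later sections; all data are illustrative assumptions:

import numpy as np

rng = np.random.default_rng(1)
n, m, N = 2, 1, 8
A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.0], [1.0]])

def solve_nominal(x0):
    """Placeholder for the robust solve of (8): minimize the w = 0 cost.

    Builds the stacked map from u to (x_1, ..., x_N) and solves a
    regularized least-squares problem (Q = I, R = I assumed).
    """
    Phi = np.vstack([np.linalg.matrix_power(A, t) for t in range(1, N + 1)])
    G = np.zeros((N * n, N * m))
    for t in range(1, N + 1):                # x_t = A^t x0 + sum_s A^{t-1-s} B u_s
        for s in range(t):
            G[(t-1)*n:t*n, s*m:(s+1)*m] = np.linalg.matrix_power(A, t-1-s) @ B
    # min ||G u + Phi x0||^2 + ||u||^2  =>  normal equations
    return np.linalg.solve(G.T @ G + np.eye(N * m), -G.T @ Phi @ x0)

x = np.array([1.0, 0.0])
for step in range(5):                        # closed loop: re-solve, apply u_0
    u = solve_nominal(x)
    x = A @ x + B @ u[:m] + 0.05 * rng.standard_normal(n)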
Note that the model in (8) is similar in spirit to the approach of H-infinity control (e.g., [2]) in that it is worst case over a deterministic uncertainty set. In contrast to H-infinity control, however, our methodology explicitly relies on new results in robust optimization. In particular, our approach has the following properties.

¹If we wish instead to have w ∈ { w : w' Σ w ≤ Γ² }, where Σ ≻ 0, then we may rescale coordinates and obtain a problem of the same form. Note that the statistical appropriateness of ellipsoids and their explicit construction is not the subject of this paper, but the interested reader may see Paganini [11] for uncertainty set modelling for the case of white noise.
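A short illustration of the rescaling in the footnote (our sketch; Σ is an arbitrary positive definite example):

import numpy as np

Sigma = np.array([[2.0, 0.5], [0.5, 1.0]])   # Sigma > 0 (example)
S_half = np.linalg.cholesky(Sigma)           # Sigma = S_half @ S_half.T

# If w' Sigma w <= Gamma^2, then v = S_half.T @ w satisfies ||v|| <= Gamma,
# so the ellipsoidal set maps to the norm-bounded set (7) in v-coordinates.
w = np.array([0.3, -0.2])
v = S_half.T @ w
assert np.isclose(w @ Sigma @ w, v @ v)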
• It is tractable, even in the presence of control and state-space constraints.
• We solve a deterministic problem (8) to compute an optimal solution u*. Thus far, we have not discussed probability in any way. Nonetheless, our approach is amenable to examining how good u* is when the disturbances w, rather than being chosen in an adversarial manner from an ellipsoid, instead obey a probability distribution. We show in Theorem 8 that under normality for w, the solution u* satisfies very strong probabilistic guarantees. In other words, when nature gives rise to disturbances that are bounded, we solve the problem optimally. When, on the other hand, nature gives rise to disturbances that do not satisfy ||w|| ≤ Γ, we still show strong probabilistic guarantees on the performance of u*.
• In the unconstrained case, it yields an efficient control law that is linear in the current state after a simple, scalar optimization procedure. In addition, for Γ = 0, we recover the traditional (Riccati) solution, whereas for Γ > 0 we have a family of increasingly conservative approaches.
To solve (8), we will utilize a number of results from robust
optimization, which we now describe.
C. Results From Robust Quadratic Optimization Over Ellipsoids

We will leverage some robust quadratic programming results popularized by Ben-Tal and Nemirovski [3]. In particular, they consider the conic quadratic constraint

||Ax + b|| ≤ c'x + d

when the data (A, b, c, d) are uncertain and known only to belong to some bounded uncertainty set U. The goal of robust quadratic programming is to optimize over the set of all x such that the constraint holds for all possible values of the data within the set U. In other words, we desire to find x such that

||Ax + b|| ≤ c'x + d   for all (A, b, c, d) ∈ U.

Ben-Tal and Nemirovski show that in the case of an ellipsoidal uncertainty set, the problem of optimizing over an uncertain conic quadratic inequality may be solved tractably using semidefinite programming. This turns out also to be the case for (8). To this end, we will need the following two classical results, proofs of which may be found in [3], among others. First, we have the Schur complement lemma.

Lemma 1: Let

X = [ A   B
      B'  C ]

where C ≻ 0. Then X is positive (semi)definite if and only if the matrix A - B C^{-1} B' is positive (semi)definite.
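A quick numerical sanity check of Lemma 1 (our sketch with arbitrary example matrices):

import numpy as np

rng = np.random.default_rng(2)
A = np.array([[4.0, 1.0], [1.0, 3.0]])
Bm = rng.standard_normal((2, 2))
C = np.array([[2.0, 0.0], [0.0, 1.0]])          # C > 0

X = np.block([[A, Bm], [Bm.T, C]])
schur = A - Bm @ np.linalg.solve(C, Bm.T)        # A - B C^{-1} B'

# Lemma 1: X is PSD iff the Schur complement is PSD (given C > 0).
psd = lambda M: np.all(np.linalg.eigvalsh(M) >= -1e-9)
assert psd(X) == psd(schur)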
In addition, we have the S-lemma.

Lemma 2: Let P, Q be symmetric matrices and assume that the quadratic inequality

x' P x ≥ 0

is strictly feasible (i.e., x̄' P x̄ > 0 for some x̄). Then the minimum value of the problem

minimize   x' Q x
subject to x' P x ≥ 0

is nonnegative if and only if there exists a λ ≥ 0 such that Q ⪰ λP.
D. Results From Robust Conic Optimization Over Norm-Bounded Sets

To improve the complexity of solving (8) when we have constraints, we will utilize recent results on robust conic optimization due to Bertsimas and Sim [6]. This approach is a relaxation of the exact min-max approach but is computationally less complex and leads to a unified probability bound across a variety of conic optimization problems. We survey the main ideas and developments here.

Bertsimas and Sim use the following model for data uncertainty:

D = D0 + sum_{j=1}^{M} dD^j z_j

where D0 is the nominal data value and the dD^j are data perturbations. The z_j are random variables with mean zero and independent, identical distributions. The goal is to find a policy x such that a given constraint is robust feasible, i.e.,

f(x, D) ≤ 0   for all z ∈ N   (9)

where

N = { z : ||z|| ≤ Ω }.   (10)

For our purposes, we typically use the Euclidean norm on z, as it is self-dual, but many other choices for the norm may be tractably used [6]. We operate under some restrictions on the function f.²
Assumption 1: The function f satisfies the following.
a) f(x, D) is convex in D for all x.
b) f(x, kD) = k f(x, D) for all k ≥ 0 and all x, D.

One of the central ideas of [6] is to linearize the model of robustness as follows:

f(x, D0) + Ω || ( t_1, ..., t_M ) ||* ≤ 0   (11)

where t_j = max{ f(x, dD^j), f(x, -dD^j) } and ||·||* denotes the dual of the norm in (10).

²In [6], the authors assume the function is concave in the data. For our purposes, convexity is more convenient. All results follow up to sign changes, and we report them accordingly.

In the framework developed thus far, (11) turns out to be a relaxation of (9), i.e., we have the following.

Proposition 1 (Bertsimas-Sim):
a) If f(x, D) is linear in D, then x satisfies (11) if and only if x satisfies (9).
b) Under Assumption 1, if x is feasible in (11), then x is feasible in (9).

Finally, (11) is tractable due to the following.

Theorem 1 (Bertsimas-Sim): Under Assumption 1, we have the following.
a) Constraint (11) is equivalent to

f(x, D0) + Ω ||t||* ≤ 0,  f(x, dD^j) ≤ t_j,  f(x, -dD^j) ≤ t_j,  j = 1, ..., M   (12)

where t ∈ R^M is an auxiliary decision vector.
b) Equation (12) can in turn be written as (13), an explicit system of convex constraints in (x, t); see [6] for the precise form.
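To make the linearization concrete, here is a small cvxpy sketch of (11) for a constraint that is linear in the data, the case covered by Proposition 1a; all problem data below are arbitrary assumptions:

import cvxpy as cp
import numpy as np

rng = np.random.default_rng(3)
n, M, Omega = 4, 6, 2.0
a0 = rng.standard_normal(n)                              # nominal data D0
dA = [0.1 * rng.standard_normal(n) for _ in range(M)]    # perturbations dD^j
b = 1.0

x = cp.Variable(n)
# f(x, D) = a'x - b with a = a0 + sum_j dA[j] * z_j, ||z||_2 <= Omega.
# Linearized robust counterpart (11): a0'x + Omega * ||(dA_j'x)_j||_2 <= b.
robust = a0 @ x + Omega * cp.norm(cp.hstack([d @ x for d in dA]), 2) <= b
prob = cp.Problem(cp.Maximize(cp.sum(x)), [robust, cp.norm(x, 2) <= 10])
prob.solve()
print(x.value)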
Finally, Bertsimas and Sim derive a probability of constraint violation.

Theorem 2 (Bertsimas-Sim): In the model of uncertainty in (10), when we use the l2-norm, i.e., ||z|| = ||z||_2, and under the assumption that the z_j are normally distributed, we have a probability bound on constraint violation that decays exponentially in Ω², with a constant in the exponent that depends on the problem class: the bound is tightest for linear programs (LPs), weaker for SOCPs, and weaker still for SDPs, where the constant depends on the dimension of the matrix in the SDP.
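A Monte Carlo sketch of what such a bound means operationally, applied to the linear example above; the exp(-Ω²/2) expression used for comparison is our assumed form of the bound for the linear case:

import numpy as np

rng = np.random.default_rng(4)
n, M, Omega, trials = 4, 6, 2.0, 200_000
a0 = rng.standard_normal(n)
dA = 0.1 * rng.standard_normal((M, n))   # rows are the perturbations dD^j
b = 1.0

# Place x on the boundary of the linearized robust constraint (11):
# a0'x + Omega * ||dA x||_2 = b.
x0 = np.ones(n)
k = a0 @ x0 + Omega * np.linalg.norm(dA @ x0)
x = (b / k) * x0 if k > 0 else x0

# Empirical violation frequency under z_j ~ N(0, 1), versus the assumed
# exponential bound for the linear case.
Z = rng.standard_normal((trials, M))
viol = np.mean((a0 + Z @ dA) @ x > b)
print(viol, np.exp(-Omega**2 / 2))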
III. AN EXACT APPROACH USING SDP

In this section, we apply the robust quadratic optimization results to formulate (8) as an SDP. We then show that we can compute optimal solutions to this SDP with a very simple control law. First, exploiting the linearity of the system, we have the following, straightforward result.

Proposition 2: The cost function (2) for the system (1) can be written in the form

J(x_0, u, w) = u' H u + 2 u' F w + w' G w + 2 f' u + 2 g' w + c   (14)

for appropriate vectors f, g and matrices H, F, G, where H ≻ 0, G ⪰ 0, and where f, g, and the constant c depend on the initial state x_0.
Proof: Since the system is linear, we can write the state at any instant t as

x_t = Phi_{t,0} x_0 + sum_{s=0}^{t-1} Phi_{t,s+1} ( B_s u_s + w_s )

where Phi_{t,s} = A_{t-1} A_{t-2} ··· A_s (with Phi_{t,t} = I). Now the cost of any state term is written

x_t' Q_t x_t = ( Phi_{t,0} x_0 + sum_{s=0}^{t-1} Phi_{t,s+1} ( B_s u_s + w_s ) )' Q_t ( Phi_{t,0} x_0 + sum_{s=0}^{t-1} Phi_{t,s+1} ( B_s u_s + w_s ) ).

Thus, the overall cost is clearly written in the form stated above, with H, F, G, f, g, and c obtained by collecting terms; in particular, the control-cost terms contribute diag(R_0, ..., R_{N-1}) to H. Finally, positive (semi)definiteness of H and G follows from positive (semi)definiteness of the R_t and Q_t.
Next, for ease of notation, we will transform the coordinates of the control space.

Proposition 3: To minimize the cost function J(x_0, u, w) in Proposition 2 over all u ∈ R^{Nm}, it is sufficient instead to optimize over all v ∈ R^{Nm} the cost function

J~(x_0, v, w) = v'v + 2 v' H^{-1/2} ( F w + f ) + w' G w + 2 g' w + c   (15)

with v = H^{1/2} u.

Proof: The proof is immediate from the fact that H^{-1/2} exists since H ≻ 0, and then using the transformation v = H^{1/2} u.
By Proposition 3, then, (8) is equivalent to the problem

min_v max_{w ∈ W_Γ} J~(x_0, v, w).   (16)

This problem may be solved using SDP, as we now show.

Theorem 3: Problem (16) may be solved by the following SDP:

minimize   τ
subject to
[ λI - G                 -(F'H^{-1/2}v + g)             0  ]
[ -(F'H^{-1/2}v + g)'    τ - c - 2f'H^{-1/2}v - λΓ²     v' ]  ⪰ 0,   λ ≥ 0   (17)
[ 0                      v                              I  ]

in decision variables v, τ, and λ.
Proof: We first rewrite the problem as

minimize   τ
subject to τ - J~(x_0, v, w) ≥ 0 for all w : ||w|| ≤ Γ.   (18)

We may homogenize the system and rewrite this equivalently as

minimize   τ
subject to s²τ - J~_h(x_0, v, w, s) ≥ 0 for all (w, s) : w'w ≤ Γ²s²   (19)

where J~_h(x_0, v, w, s) = s²v'v + 2sv'H^{-1/2}(Fw + fs) + w'Gw + 2sg'w + cs² is the homogenization of J~, i.e., J~_h(x_0, v, w, 1) = J~(x_0, v, w). Clearly, feasibility of (v, τ) in (19) implies feasibility of (v, τ) in (18) (by setting s = 1). For the other direction, assume (v, τ) is feasible in (18) and consider any (w, s) with w'w ≤ Γ²s², where s ≠ 0. This implies ||w/s|| ≤ Γ, and

s²τ - J~_h(x_0, v, w, s) = s²( τ - J~(x_0, v, w/s) ) ≥ 0

where the inequality follows by (18); for s = 0, the constraint forces w = 0, and the expression vanishes. Thus, the claim is true. But now we wish to check whether a homogeneous quadratic form in (w, s) is nonnegative over all (w, s) satisfying another homogeneous quadratic form. Invoking Lemma 2, we know the constraint holds if and only if there exists a λ ≥ 0 such that the corresponding matrix inequality holds; applying the Schur complement lemma (Lemma 1) to the quadratic term in v then yields the linear matrix inequality in (17).

References

A. Ben-Tal and A. Nemirovski, Lectures on Modern Convex Optimization: Analysis, Algorithms, and Engineering Applications. Philadelphia, PA: SIAM, 2001.

D. P. Bertsekas, Dynamic Programming and Optimal Control. Belmont, MA: Athena Scientific.

S. Boyd, L. El Ghaoui, E. Feron, and V. Balakrishnan, Linear Matrix Inequalities in System and Control Theory. Philadelphia, PA: SIAM, 1994.