SIAM REVIEW
c
2003 Society for Industrial and Applied Mathematics
Vol. 45, No. 3, pp. 485–503
Constrained Nonlinear
Programming for Volatility
Estimation with GARCH Models
∗
Aslıhan Altay-Salih
†
Mustafa C¸ . Pınar
‡
Sven Leyffer
§
Abstract. This paper proposes a constrained nonlinear programming view of generalized autore-
gressive conditional heteroskedasticity (GARCH) volatility estimation models in financial
econometrics. These models are usually presented to the reader as unconstrained opti-
mization models with recursive terms in the literature, whereas they actually fall into the
domain of nonconvex nonlinear programming. Our results demonstrate that constrained
nonlinear programming is a worthwhile exercise for GARCH models, especially for the
bivariate and trivariate cases, as they offer a significant improvement in the quality of the
solution of the optimization problem over the diagonal VECH and the BEKK representa-
tions of the multivariate GARCH model.
Key words. time series econometrics, constrained nonlinear programming, multivariate GARCH,
volatility estimation, maximum likelihood estimation
AMS subject classifications. 90C30, 90C90, 62M10, 91B84
DOI. 10.1137/S0036144501400115
1. Introduction. Volatility plays an important role in several areas of current
finance literature. It is central to portfolio selection models as efficient portfolios are
formed by computing the maximum return for a given level of volatility. Equilib-
rium models like the capital asset pricing model (CAPM) require the estimation of
market variance as well as the covariance of risky assets with the market portfolio.
Prices of options are also expressed as functions of volatility. As a result, volatility
and covariance estimation is an important research area for both academicians and
practitioners.
ARCH (autoregressive conditional heteroskedasticity; Engle (1982)) and GARCH
(generalized ARCH; Bollerslev (1986)) volatility forecasting models have been the
major tool for characterizing volatility, by using past unpredictable changes in the
returns of an asset to predict the future time-varying second-order moments. Volatility
clustering phenomena (Mandelbrot (1963), Fama (1965)) are the driving force for the
∗
Received by the editors December 21, 2001; accepted for publication (in revised form) November
11, 2002; published electronically August 11, 2003.
http://www.siam.org/journals/sirev/45-3/40011.html
†
Faculty of Management, Bilkent University, 06533 Ankara, Turkey (asalih@bilkent.edu.tr).
‡
Department of Industrial Engineering, Bilkent University, 06800 Ankara, Turkey (mustafap@
bilkent.edu.tr). Part of this research was conducted while this author was visiting the Department of
Mathematics, University of Dundee, in July 2000 with support from a grant from the Royal Society.
§
MCS Division, Argonne National Laboratory, 5700 South Cass Avenue, Argonne, IL 60439-4844
(leyffer@mcs.anl.gov).
485
Downloaded 02/11/19 to 139.179.72.105. Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.php
486 ASLIHAN ALTAY-SALIH, MUSTAFA C¸ . PINAR, AND SVEN LEYFFER
GARCH family of models. The success of these models in the univariate case for
volatility estimation has inspired an interest in covariance estimation, which is a
harder problem, and has led to the development and application of the multivariate
extensions.
1
The major difficulty in the multivariate case stems from the highly
nonlinear and nonconvex nature of the resulting optimization problem.
The first attempt to solve the multivariate GARCH model was the diagonal VECH
model of Bollerslev, Engle, and Wooldridge (1988), who assumed constant covariances
for solvability. This extension can be thought of as a trade-off between estimation
intractability and practical applicability. Later, statistical tests were developed to
check the empirical validity of the assumption of constant covariances; see Bera and
Kim (1996) and Tse (2000). Their results for national stock markets show that the
covariances are in fact time varying. Therefore, other solutions that can deal with the
complexity of the multivariate estimation problem need to be developed.
The factor ARCH model of Engle, Ng, and Rothschild (1990) and the BEKK
model of Baba, Engle, Kraft, and Kroner (1989) were attempts to solve the same
problem by ensuring positive definiteness of the variance-covariance matrices in the
process of optimization, which is an important constraint in multivariate GARCH
models. All of these specifications impose very different restrictions on the variance-
covariance matrix for computational tractability. For example, Schoenberg (1998), in
his GAUSS-based commercial software FANPAC, claims to impose constraints on the
eigenvalues of the variance-covariance matrices, although the details are not revealed.
The purpose of the present paper is to solve the multivariate GARCH optimiza-
tion problem in which we follow a more general approach by taking a constrained
nonlinear programming view of GARCH volatility estimation models without impos-
ing artificial restrictions for tractability. This is made possible by recent advances
in numerical optimization algorithms and software. ARCH and GARCH models are
usually presented to the reader as unconstrained optimization models with recursive
terms in econometrics and finance texts (see, e.g., Hamilton (1987) and Gourieroux
(1997)), whereas they actually fall into the domain of nonconvex, nonlinearly con-
strained nonlinear programming. They are usually solved by extensions of Newton
or quasi-Newton methods that take into account the recursive nature of terms defin-
ing the objective function. Against this background a major goal of this paper is to
test the practical solvability (i.e., computing a Karush–Kuhn–Tucker point) of these
models as nonlinearly constrained nonconvex programs using the AMPL modeling
language (Fourer, Gay, and Kernighan (1993)) and the state-of-the-art optimization
packages available through the recently developed NEOS
2
interface at the Argonne
National Laboratory.
We believe this research effort is a worthwhile undertaking, as the current fi-
nancial econometrics literature does not use these valuable sources of optimization
software, to the best of our knowledge. Second, we establish through our computa-
tional results that the bivariate and trivariate GARCH volatility estimation models
for which relatively few software systems exist in the market are solved very effectively
by our approach, thus contributing a new tool to the econometric finance literature.
1
See Engle (1987); Bollerslev, Engle, and Wooldridge (1988); Giovannini and Jorion (1989);
Engle, Ng, and Rothschild (1990); Bollerslev (1990); Ng, Engle, and Rothschild (1991); Conrad,
G¨ultekin, and Kaul (1991); Kroner and Claesens (1991); Kroner and Sultan (1993); Lien and Luo
(1994); Karolyi (1995); Park and Switzer (1995); Tse (2000).
2
http://www-neos.mcs.anl.gov; see Cyzik, Mesnier, and Mor´e (1998)
Downloaded 02/11/19 to 139.179.72.105. Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.php
NONLINEAR PROGRAMMING FOR GARCH ESTIMATION 487
Furthermore, our empirical results for the DAX, FTSE, and S & P 500 indices demon-
strate that this approach tracks the variability in realized volatility better than both
the diagonal VECH and the BEKK representations. However, we should stress that
the major contribution of this paper lies in the proposed general approach and its
documented superior solution quality from an optimization point of view. Although
a visual inspection of the results and mean-square errors of the trivariate application
is promising, a thorough empirical investigation of the forecasting accuracy is a topic
for further research.
We organize the rest of this paper as follows. In section 2, we review the univariate
GARCH model. Section 3 is devoted to a review and discussion of the multivariate
and, in particular, the bivariate and trivariate GARCH models on which we concen-
trate. In section 4, we illustrate our approach by applying it to daily returns of the
S & P 500, FTSE 100, and DAX indices, report our results, and compare them with
the diagonal VECH and BEKK representations. Section 5 concludes the paper.
2. Univariate GARCH Model. The analysis of time series dynamics of economic
data is usually based on observations of relevant processes, e.g., the behavior of short-
and long-term interest rates, rate of inflation, stock prices, etc. In general terms, an
observed time series is viewed as a realization of a stochastic process, i.e., a sequence of
random variables that are defined on some state space Ω. These random variables may
be unidimensional, leading to univariate econometric models, or multidimensional, in
which case multivariate models are appropriate. Furthermore, the random variables
are indexed by time, where we assume that observations are recorded at regularly
spaced intervals, which allows one to consider time indices taking only integer values.
The stochastic process is denoted by
Y =(Y
t
,t∈T),
where the index set T is the set of nonnegative integers or the set of natural numbers.
In the present paper we consider the following autoregressive process for stock index
returns, which explains the behavior of the random variable in terms of its past values
as
Y
t
= φ
1
Y
t−1
+ φ
2
Y
t−2
+ ···+ φ
m
Y
t−m
+ ε
t
,
where ε =(ε
t
) is a weak white noise satisfying the martingale difference sequence
condition
E(ε
t
|ε
t−1
)=0,
where the notation E(.) denotes mathematical expectation and ε
t−1
= {ε
t−1
,ε
t−2
,...}
represents the vector of past values. It is important to model the level of financial time
series {Y
t
}, but sometimes it might be even more important to model the volatility
of the series to quantify the risks involved in a specific trading strategy, especially
when the empirical evidence suggests that the level process {Y
t
} shows no particular
time dependence, whereas the volatility process exhibits a certain time dependence.
Instead of assuming that the conditional variance of the noise, i.e., E(ε
2
t
|ε
t−1
), is
time independent, we allow for time dependence through an autoregressive equation
for the squared error terms (innovations) as follows:
E(ε
2
t
|ε
t−1
) ≡ h
t
= c +
q
i=1
α
i
ε
2
t−i
+
p
j=1
β
j
h
t−j
.(2.1)
Downloaded 02/11/19 to 139.179.72.105. Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.php
488 ASLIHAN ALTAY-SALIH, MUSTAFA C¸ . PINAR, AND SVEN LEYFFER
The above model is referred to as GARCH(p,q).
3
In the case p = 0, we have the
ARCH(q) model:
E(ε
2
t
|ε
t−1
) ≡ h
t
= c +
q
i=1
α
i
ε
2
t−i
.(2.2)
An important consideration in the study of time series is the stationarity properties
of the time series in the interest of forecasting ability. Imposing stationarity is a
vital part of modeling. In particular, if {Y
t
} is stationary, the mean, variance, and
autocorrelations can usually be well approximated by sufficiently long time averages.
Formally, a stochastic process with a finite mean and variance is called covariance (or
second-order) stationary if for all t, t − s,
E(Y
t
)=E(Y
t−s
)=µ,(2.3)
E[(Y
t
− µ)
2
]=E[(Y
t−s
− µ)
2
]=σ
2
Y
,(2.4)
E[(Y
t
− µ)(Y
t−s
− µ)] = E[(Y
t−j
− µ)(Y
t−j−s
− µ)] = γ
s
,(2.5)
where µ, σ
2
Y
, and γ
s
are all constants. Simply put, a time series is covariance
stationary if its mean and all auto-covariances are unaffected by a change of time
origin. In the above models, φ ∈
m
, α ∈
q
++
, β ∈
p
++
(the notation
q
++
and
q
++
represent the space of q - and p-dimensional real vectors with strictly positive
components, respectively), c is a positive scalar, and
q
i=1
α
i
+
p
i=1
β
i
< 1(2.6)
is sufficient to ensure second-order stationarity asymptotically. For further details the
reader is referred to Property 3.19 of Gourieroux (1997).
An important tool in the estimation of the above parameters is the technique of
maximum likelihood estimation. Assuming a normal distribution for Y
t
given the
past observations, application of the maximum likelihood technique in the case of
GARCH(p,q) leads to the following optimization problem:
max −
T
2
log 2π −
1
2
T
t=1
log h
t
−
1
2
T
t=1
ε
2
t
h
t
(2.7)
subject to the stationarity condition (2.6), the specification of conditional variances
h
t
given by (2.1), and the nonnegativity condition on c, α, β .
Therefore, for the GARCH(p,q) case we can formulate the following optimization
problem:
max −
1
2
T
t=1
log h
t
−
1
2
T
t=1
ε
2
t
h
t
s.t. c +
q
i=1
α
i
ε
2
t−i
+
p
j=1
β
j
h
t−j
= h
t
∀t =1,...,T,
3
Excellent references are available on this important topic. The interested reader is referred to
Droesbeke, Fichet, and Tassi (1994); Gourieroux (1997); and Hamilton (1987) for details.
Downloaded 02/11/19 to 139.179.72.105. Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.php
NONLINEAR PROGRAMMING FOR GARCH ESTIMATION 489
m
i=1
φ
i
Y
t−i
+ ε
t
= Y
t
∀t =1,...,T,
q
i=1
α
i
+
p
i=1
β
i
≤ 1,
h
t
≥ 0 ∀ t =1,...,T,
c ≥ 0,
α
i
≥ 0 ∀ i =1,...,q,
β
i
≥ 0 ∀ i =1,...,p,
where we have replaced the strict inequality in (2.6) with a nonstrict inequality in the
interest of computational tractability. This modification did not create any problems,
as this constraint turned out to be inactive (satisfied as a strict inequality) at the
reported solution in our computational tests (see values of α
1
and β
1
in Table 1,
section 4).
Regarding issues of convexity in the above model, we notice that the function
log h
t
+
ε
2
t
h
t
is a quasi-convex function in (ε
t
,h
t
). Unfortunately, the sum of quasi-
convex functions is not necessarily quasi-convex. Therefore, we do not expect to
detect hidden convexity in the objective function of the above model. The constraints
are also of a polynomial nature and obviously nonconvex. These observations imply
that any attempt at numerical solution of the above model is bound to yield at best
a Karush–Kuhn–Tucker point (not necessarily a local maximum).
3. Multivariate Model. When the error term ε
t
is a multivariate process of
dimension n, we can introduce the same formulation as in the univariate case for
all the components of the conditional variance-covariance matrix. Now, for all t =
1,...,T we have Y
t
∈
n
and ε
t
∈
n
with components Y
lt
and ε
lt
,l =1,...,n,
respectively. We denote the components of the n ×n conditional variance-covariance
matrix H
t
= E(ε
t
ε
T
t
|ε
t−1
)byh
klt
. The log-likelihood function to be maximized in
the multivariate case is given as
−
1
2
T
t=1
(log det H
t
+ ε
T
t
H
−1
t
ε
t
).
Following Kraft and Engle (1982) and Bollerslev, Engle, and Wooldridge (1988),
a multivariate extension of univariate GARCH (2.1) is as follows:
vech(H
t
) = vech(C)+
q
i=1
A
i
vech(ε
t−i
ε
T
t−i
)+
p
j=1
B
j
vech(H
t−j
),(3.1)
where vech is the operator that consists in stacking up the lower triangular and the
diagonal portions of the columns of a symmetric matrix into a vector, the matrices
A
i
and B
j
are of size
n(n+1)
2
×
n(n+1)
2
, and C is a symmetric matrix of size n × n.
This general formulation is termed the VECH model by Engle and Kroner (1995).
Now, we consider the following estimation problem that we refer to as the con-
strained nonlinear programming (NLP) formulation:
Downloaded 02/11/19 to 139.179.72.105. Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.php