What is the simplest way to analyze the posterior?

Use as the importance-sampling density a t-distribution centered at B with Hessian J, whose degrees of freedom are chosen to ensure fatter tails than those of the marginal posterior $\pi_{T,\mathrm{marg}}(B)$: choose a positive value below the upper bound, which involves $T$, $l$ and $ml$, preferably close to that bound.
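As a minimal one-dimensional sketch of this idea (an illustration, not the paper's code), the Python snippet below draws from a Student-t proposal whose tails are fatter than those of a stand-in target and forms self-normalized importance weights. The target `log_target`, the proposal settings `df`, `loc`, `scale` and the number of draws are all assumptions made for the example.

```python
import math
import numpy as np

def student_t_logpdf(x, df, loc, scale):
    """Log density of a location-scale Student-t distribution."""
    z = (x - loc) / scale
    return (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
            - 0.5 * math.log(df * math.pi) - math.log(scale)
            - (df + 1) / 2 * np.log1p(z ** 2 / df))

def log_target(x):
    """Hypothetical unnormalized log posterior: a N(1, 0.5^2) kernel."""
    return -0.5 * ((x - 1.0) / 0.5) ** 2

rng = np.random.default_rng(0)
df, loc, scale = 5.0, 1.0, 0.5          # assumed proposal settings
draws = loc + scale * rng.standard_t(df, size=20_000)

# Importance weights: target kernel divided by proposal density.
log_w = log_target(draws) - student_t_logpdf(draws, df, loc, scale)
weights = np.exp(log_w - log_w.max())   # subtract the max for numerical stability
weights /= weights.sum()

posterior_mean = float(np.sum(weights * draws))
effective_sample_size = float(1.0 / np.sum(weights ** 2))
```

Because the proposal has heavier tails than the target, the weights stay bounded, which is what the choice of few degrees of freedom is meant to secure.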

What is the simplest way to calculate the Bayesian posterior?

This paper introduced Bayesian vector autoregressions with stochastic volatility, deriving in closed form the Bayesian posterior, when the error precision matrix is stochastically time-varying.

What is the simplest way to solve the q-value problem?

Note that $N_t^{-1}$ can be computed numerically cheaply via $N_t^{-1} = \left( N_{t-1}^{-1} - N_{t-1}^{-1} X_t X_t' N_{t-1}^{-1} / (X_t' N_{t-1}^{-1} X_t + \lambda) \right) / \lambda$, as can be verified directly or with rule (T8), p. 324 in Leamer (1978).
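The rank-one update can be checked numerically. The sketch below assumes the recursion $N_t = \lambda N_{t-1} + X_t X_t'$ (the symbol $\lambda$ is a reading of the damaged source) and compares the cheap update of the inverse with a direct inversion; the helper name `update_inverse` and the test matrices are hypothetical.

```python
import numpy as np

def update_inverse(N_prev_inv, x, lam):
    """Return inv(lam * N_prev + x x') from inv(N_prev) without a new inversion."""
    Nx = N_prev_inv @ x                      # N_{t-1}^{-1} X_t
    return (N_prev_inv - np.outer(Nx, Nx) / (x @ Nx + lam)) / lam

rng = np.random.default_rng(1)
l, lam = 4, 0.95                             # assumed dimension and discount
A = rng.normal(size=(l, l))
N_prev = A @ A.T + l * np.eye(l)             # symmetric positive definite
x = rng.normal(size=l)

fast = update_inverse(np.linalg.inv(N_prev), x, lam)
direct = np.linalg.inv(lam * N_prev + np.outer(x, x))
max_abs_error = float(np.max(np.abs(fast - direct)))
```

The update costs a few matrix-vector products per period instead of a full inversion, which matters when the recursion runs over all $t = 1, \ldots, T$ for many importance-sampling draws.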

What is the generalization of the multiplication of two real numbers in equation (2)?

Equation (9) is one of two rather natural generalizations of the multiplication of two real numbers in equation (2) in order to guarantee the symmetry of the resulting matrix $H_{t+1}$.

What is the posterior of the HT+1?

Numerical methods are needed, since the posterior (11) is proportional to a Normal-Wishart distribution scaled with the function $g_T(B)$.

How many draws are heavily weighted?

The weights can differ by orders of magnitude: examining the raw numbers shows that the draw with the largest weight receives 5.3% of the sum of all weights, the 109 most heavily weighted draws constitute 50% of the mass and the 741 draws with the highest weights make up 90%.
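Diagnostics of this kind are easy to compute from any vector of importance weights. The following sketch uses a hypothetical helper `weight_concentration` and a toy weight vector; it is not the computation behind the figures quoted above.

```python
import numpy as np

def weight_concentration(weights, mass_levels=(0.5, 0.9)):
    """Share of the largest weight and number of top draws needed per mass level."""
    w = np.sort(np.asarray(weights, dtype=float))[::-1]   # descending order
    w = w / w.sum()
    cumulative = np.cumsum(w)
    counts = {m: int(np.searchsorted(cumulative, m) + 1) for m in mass_levels}
    return float(w[0]), counts

# Toy weights: normalized they are 0.5, 0.25, 0.125, 0.0625, 0.0625.
largest_share, counts = weight_concentration([8.0, 4.0, 2.0, 1.0, 1.0])
```

Here the single largest draw already carries half of the mass, and four draws are needed to reach 90%.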

What is the gamma function in Muirhead (1982)?

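For each $t = 1, \ldots, T$ calculate $e_t = Y_t - B_{t-1} X_t$ and

$$N_t = \lambda N_{t-1} + X_t X_t' \qquad (12)$$

$$B_t = (\lambda B_{t-1} N_{t-1} + Y_t X_t') N_t^{-1} \qquad (13)$$

$$S_t = \lambda S_{t-1} + \frac{\lambda}{\nu}\, e_t (1 - X_t' N_t^{-1} X_t)\, e_t' \qquad (14)$$

$$g_t(B) = g_{t-1}(B)\, \left| (B - B_t) N_t (B - B_t)' + \frac{\nu}{\lambda} S_t \right|^{-1/2} \qquad (15)$$

Equation (16) updates a normalizing term recursively in $t$; it involves the multivariate gamma functions $\Gamma_m((\nu + l + 1)/2)$ and $\Gamma_m((\nu + l)/2)$ and a term with exponent $m(l + \nu)/2$. In equation (16), $\Gamma_m(\cdot)$ is the multivariate gamma function, defined in Muirhead (1982), Definition 2.1.10.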
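A minimal sketch of how such a filtering pass might look in Python follows. It assumes the forms $N_t = \lambda N_{t-1} + X_t X_t'$, $B_t = (\lambda B_{t-1} N_{t-1} + Y_t X_t') N_t^{-1}$, $S_t = \lambda S_{t-1} + (\lambda/\nu)\, e_t (1 - X_t' N_t^{-1} X_t) e_t'$ with $e_t = Y_t - B_{t-1} X_t$, and $g_t(B) = g_{t-1}(B)\, |(B - B_t) N_t (B - B_t)' + (\nu/\lambda) S_t|^{-1/2}$; the placement of $\lambda$ and $\nu$ is a reading of the damaged source. The function names, the toy data and the starting values are hypothetical.

```python
import numpy as np

def filter_step(B_prev, N_prev, S_prev, y, x, lam, nu):
    """One pass of the assumed recursions (12)-(14) for a single observation."""
    e = y - B_prev @ x                                     # one-step-ahead error
    N = lam * N_prev + np.outer(x, x)                      # assumed (12)
    N_inv = np.linalg.inv(N)
    B = (lam * B_prev @ N_prev + np.outer(y, x)) @ N_inv   # assumed (13)
    S = lam * S_prev + (lam / nu) * (1.0 - x @ N_inv @ x) * np.outer(e, e)  # assumed (14)
    return B, N, S

def log_g_factor(B_arg, B_t, N_t, S_t, lam, nu):
    """Log of the factor multiplying g_{t-1}(B) in the assumed form of (15)."""
    D = B_arg - B_t
    _, logdet = np.linalg.slogdet(D @ N_t @ D.T + (nu / lam) * S_t)
    return -0.5 * logdet

# Hypothetical toy data: m = 2 variables, l = 3 regressors, T = 50 periods.
rng = np.random.default_rng(2)
m, l, T, lam, nu = 2, 3, 50, 1.0, 10.0
X = rng.normal(size=(T, l))
Y = X @ rng.normal(size=(l, m)) + rng.normal(size=(T, m))

B0, N0, S0 = np.zeros((m, l)), np.eye(l), np.eye(m)
B, N, S = B0, N0, S0
B_eval = np.zeros((m, l))      # point at which log g_T(B) is accumulated
log_g = 0.0
for y, x in zip(Y, X):
    B, N, S = filter_step(B, N, S, y, x, lam, nu)
    log_g += log_g_factor(B_eval, B, N, S, lam, nu)
```

With `lam = 1.0` there is no discounting, so `N` and `B` reduce to the usual cumulated cross-products, which gives a simple sanity check on the recursion.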

(Open Access) Bayesian vector autoregressions with stochastic volatility (1997) | Harald Uhlig

Q: What are the contributions in this paper?

This paper proposes a Bayesian approach to a vector autoregression with stochastic volatility, where the multiplicative evolution of the precision matrix is driven by a multivariate beta variate.

Q: What is the model used in this paper?

The stochastic volatility model used here is similar to Shephard (1994), whose model is a univariate, non-Bayesian and non-autoregressive special case of the model proposed here.

Tilburg University

Bayesian Vector Autoregressions with Stochastic Volatility

Uhlig, H.F.H.V.S.

Publication date: 1996

Citation for published version (APA):

Uhlig, H. F. H. V. S. (1996). Bayesian Vector Autoregressions with Stochastic Volatility. (CentER Discussion Paper; Vol. 1996-09). Macroeconomics.

BAYESIAN VECTOR AUTOREGRESSIONS WITH STOCHASTIC VOLATILITY

By Harald Uhlig

February 2, 1996

Abstract

This paper proposes a Bayesian approach to a vector autoregression with stochastic volatility, where the multiplicative evolution of the precision matrix is driven by a multivariate beta variate. Exact updating formulas are given for the nonlinear filtering of the precision matrix. Estimation of the autoregressive parameters requires numerical methods: an importance-sampling based approach is explained here.

1 Introduction

This paper introduces Bayesian vector autoregressions with stochastic volatility. In contrast to multivariate autoregressive conditional heteroskedasticity (ARCH), the stochastic volatility setup here models the error precision matrix as an unobserved component with shocks drawn from a multivariate beta distribution. This allows the interpretation of a sudden large movement in the data as the result of a draw from a distribution with a randomly increased but unobserved variance. Exploiting a conjugacy between Wishart distributions and multivariate singular beta distributions, the integration over the unobserved shock to the precision matrix can be performed in closed form, leading to a generalization of the standard Kalman filter formulas to the nonlinear filtering problem at hand. Estimating the autoregressive parameters requires numerical methods, however. The paper focuses on an importance-sampling based approach.

Bayesian vector autoregressions have been studied and popularized by, e.g., Litterman (1979), Doan, Litterman and Sims (1984) and Doan's RATS Manual (1990). ARCH models were introduced by Engle (1982); see the review in Bollerslev, Chou and Kroner (1992). Stochastic volatility models provide an alternative approach to modeling time variation in the size of fluctuations. The stochastic volatility model used here is similar to Shephard (1994), whose model is a univariate, non-Bayesian and non-autoregressive special case of the model proposed here. In contrast to other Bayesian approaches to stochastic volatility, see Jacquier, Polson and Rossi (1994), the method here results in exact updating formulas for the posterior in the sense that the integration over the unobserved shocks to the precision matrices is done in closed form. The conjugacy result needed for this step is established in Uhlig (1994b).

For simplicity, the main ideas are explained in section 2 for the univariate case, with the general case presented in section 3. Section 4 discusses how to analyze the posterior numerically. Section 5 concludes. Appendix A lists some of the distributions used and fixes notation. Appendix B contains the proofs and one additional theorem. Appendix C proposes a prior.

2 A Simple Case

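Consider the following simple version of the model studied in this paper:

$$y_t = \beta y_{t-1} + h_t^{-1/2} \epsilon_t, \quad \text{with } \epsilon_t \sim N(0, 1), \qquad (1)$$

$$h_{t+1} = h_t \vartheta_t / \lambda, \quad \text{with } \vartheta_t \sim B_1((\nu + 1)/2,\, 1/2), \qquad (2)$$

where all $\vartheta_t$'s and $\epsilon_t$'s are drawn independently, where $t = 1, \ldots, T$ denotes time, $y_t$, $t = 0, \ldots, T$, is data and observable, $\lambda > 0$ and $\nu > 0$ are parameters and $B_1(p, q)$ denotes the (one-dimensional) beta distribution on the interval [0,1].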
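To get a feel for the model, one can simulate it. The sketch below assumes the reading $y_t = \beta y_{t-1} + h_t^{-1/2} \epsilon_t$ and $h_{t+1} = h_t \vartheta_t / \lambda$ with $\vartheta_t$ beta distributed with parameters $(\nu + 1)/2$ and $1/2$; the function name `simulate` and all parameter values are hypothetical choices for illustration.

```python
import numpy as np

def simulate(T, beta, lam, nu, h1=1.0, y0=0.0, seed=0):
    """Simulate y_0..y_T and precisions h_1..h_{T+1} from the assumed model."""
    rng = np.random.default_rng(seed)
    y = np.zeros(T + 1)          # y[t] holds y_t for t = 0..T
    h = np.zeros(T + 2)          # h[t] holds h_t for t = 1..T+1 (h[0] unused)
    y[0], h[1] = y0, h1
    for t in range(1, T + 1):
        # Observation equation: innovation has variance 1 / h_t.
        y[t] = beta * y[t - 1] + rng.standard_normal() / np.sqrt(h[t])
        # Precision evolves multiplicatively through a beta shock.
        theta = rng.beta((nu + 1) / 2, 0.5)
        h[t + 1] = h[t] * theta / lam
    return y, h[1:]

nu = 20.0
y, h = simulate(T=200, beta=0.5, lam=(nu + 1) / (nu + 2), nu=nu)
```

Plotting `y` against `1 / np.sqrt(h[:-1])` would show calm stretches interrupted by bursts of volatility whenever a small beta draw pushes the precision down.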

Equation (2) sp ecies the unobserved precision

of the innovation





to b e

stochastic. The model thus belongs to the family of sto chastic volatility mo dels, see e.g.

Jacquier-Polson and Rossie (1994). The model captures auto correlated heteroskedastic-

ity, a feature often found especially in nancial data series. Another popular specica-

tion whichdoessoistheARCH-family of models. A GARCH(1,1) mo del, for example,

replaces



in (1) with



and replaces (2) with











;

(3)

where





and



are parameters. It thus ties the innovation in the variance to the

size of the current innovation





. Given





and







,anunusually large

innovation in (2) can result from a randomly decreased

as well as a large



, whereas

the GARCH-model (3) only allows for an unusually large draw



To analyze the system (1) to (2) in a Bayesian fashion, one needs to choose a prior density $\pi_0(\beta, h_1)$ for $\beta$ and $h_1$, given $y_0$. The goal is to find the posterior density $\pi_T(\beta, h_{T+1})$ given data $y_0, \ldots, y_T$. We restrict the choice of priors to be of the following form. Fix $\nu > 0$ and $\lambda > 0$ (for a more general treatment, see section 3). Choose $b_0$, $n_0 > 0$, $s_0 > 0$ and a function $g_0(\beta) \geq 0$ to describe a prior density proportional to

$$\pi_0(\beta, h_1) \propto g_0(\beta)\, f_{NG}(\beta, h_1 \mid b_0, \lambda n_0, s_0, \nu),$$

where $f_{NG}$ denotes the Normal-gamma density, see appendix A. The form of the prior allows for a flexible treatment of a root near or above unity via the function $g_0(\beta)$, see Uhlig (1994a).

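Adapting the Bayesian updating formulas (12), (13), (14) and (15) derived below in section 3 to the simple model above results in

$$n_t = \lambda n_{t-1} + y_{t-1}^2 \qquad (4)$$

$$b_t = (\lambda b_{t-1} n_{t-1} + y_t y_{t-1}) / n_t \qquad (5)$$

$$s_t = \lambda s_{t-1} + \frac{\lambda}{\nu}\, e_t^2 \left( 1 - \frac{y_{t-1}^2}{n_t} \right), \qquad (6)$$

where $e_t = y_t - b_{t-1} y_{t-1}$, and

$$g_t(\beta) = g_{t-1}(\beta) \left( (\beta - b_t)^2 n_t + \frac{\nu}{\lambda} s_t \right)^{-1/2} \qquad (7)$$

for $t = 1, \ldots, T$. These deliver the posterior density

$$\pi_T(\beta, h_{T+1}) \propto g_T(\beta)\, f_{NG}(\beta, h_{T+1} \mid b_T, \lambda n_T, s_T, \nu).$$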

Equations (4) and (5) are the recursion formulas or Kalman filter formulas for geometrically weighted least squares. Different observations receive different weights according to the size of the factor given in equation (7). Equation (6) prescribes to find the "estimate" $s_t$ essentially via a geometric lag on past squared residuals. Notice the formal similarity to GARCH: ignoring the term $(1 - y_{t-1}^2 / n_t)$, equation (6) resembles equation (3) rewritten in terms of observables, setting the constant term of (3) to zero and matching the remaining coefficients.

The key for proving the validity of these updating formulas here or in the next section is theorem 2 and its proof (see appendix B): as the unobserved shock $\vartheta_t$ occurs, one needs to do a "change of variable" from $(h_t, \vartheta_t)$ to $h_{t+1}$ and a suitably defined auxiliary variable. Thanks to the conjugacy between the beta and the gamma distribution, integration over that auxiliary variable can be performed in closed form, resulting in an integration constant depending on $\beta$ and the data. This constant is captured by the function $g_t(\beta)$.

Shephard (1994) finds similar formulas with a classical interpretation for (1) to (2) without the autoregressive term $\beta y_{t-1}$. To include autoregressive terms, Shephard (1994) suggests approximate filtering formulas. In contrast, the Bayesian formulas here are exact. They do, however, require numerical techniques such as importance sampling for the estimation of $\beta$. There is no treatment of the multivariate case in Shephard (1994).

For $\lambda = \nu / (\nu + 1)$ we have $\lambda / \nu = 1 - \lambda$ in equation (6). For $\lambda = (\nu + 1) / (\nu + 2)$ the precision $h_t$ is a martingale, $E[h_{t+1} \mid h_t] = h_t$, on the positive part of the real axis. Shephard (1994) suggests setting $\lambda = \exp(E[\log \vartheta_t])$. This avoids the problem that otherwise $h_t \to \infty$ a.s. or $h_t \to 0$ a.s. (see Nelson (1990)) and makes $\log h_t$ a random walk. For [...] one obtains a model where [...] is known a priori, and where [...]. In other words, the smaller the parameter $\nu$, the greater the time variation in the precision that the model allows.
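The two choices of the scaling parameter discussed here can be derived in a few lines. The derivation assumes that (2) reads $h_{t+1} = h_t \vartheta_t / \lambda$ with $\vartheta_t$ beta distributed with parameters $p = (\nu + 1)/2$ and $q = 1/2$; it uses only the standard mean $p/(p+q)$ of a beta distribution.

```latex
\begin{align*}
E[\vartheta_t] &= \frac{p}{p+q}
  = \frac{(\nu+1)/2}{(\nu+1)/2 + 1/2} = \frac{\nu+1}{\nu+2}, \\
E[h_{t+1} \mid h_t] &= \frac{h_t\, E[\vartheta_t]}{\lambda} = h_t
  \iff \lambda = \frac{\nu+1}{\nu+2}, \\
\log h_{t+1} &= \log h_t + \log \vartheta_t - \log \lambda, \\
E[\log h_{t+1} - \log h_t] &= 0
  \iff \lambda = \exp\!\big(E[\log \vartheta_t]\big).
\end{align*}
```

The first condition makes the precision itself driftless, the second makes its logarithm driftless; by Jensen's inequality the two values of $\lambda$ differ, so only one of the two properties can hold at a time.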



Figure 1 shows parts of the densities for $\lambda / \vartheta_t$, which are the multiplicative disturbances of the variance $h_t^{-1}$. It shows that (2) typically leads to a slight decrease in the innovation variance except for occasional and potentially large increases.
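The qualitative pattern can be reproduced by sampling the multiplier directly. This sketch assumes that the variance is multiplied each period by $\lambda / \vartheta_t$ with $\vartheta_t$ beta distributed with parameters $(\nu + 1)/2$ and $1/2$; the value of `nu`, the martingale choice of `lam` and the number of draws are hypothetical and need not match those behind Figure 1.

```python
import numpy as np

nu = 20.0                                   # hypothetical parameter value
lam = (nu + 1) / (nu + 2)                   # assumed martingale choice of lambda
rng = np.random.default_rng(3)

theta = rng.beta((nu + 1) / 2, 0.5, size=100_000)
multiplier = lam / theta                    # next variance divided by current variance

share_decrease = float(np.mean(multiplier < 1.0))   # fraction of periods with a decrease
median_multiplier = float(np.median(multiplier))
largest_multiplier = float(multiplier.max())
```

Most draws fall slightly below one, while the right tail of the multiplier is long, which matches the description of small typical decreases and occasional large increases.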

3 The General Model

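Consider the VAR(k)-model with time-varying error precision matrices

$$Y_t = B_{(0)} C_t + B_{(1)} Y_{t-1} + B_{(2)} Y_{t-2} + \ldots + B_{(k)} Y_{t-k} + \mathcal{U}(H_t^{-1})' \epsilon_t, \quad \text{with } \epsilon_t \sim N(0, I_m), \qquad (8)$$

$$H_{t+1} = \mathcal{U}(H_t)'\, \Theta_t\, \mathcal{U}(H_t) / \lambda, \quad \text{with } \Theta_t \sim B_m((\nu + l)/2,\, 1/2), \qquad (9)$$

where $t = 1, \ldots, T$ denotes time, $Y_t$, $t = 1 - k, \ldots, T$, size $m \times 1$, is observable data, and $C_t$, size $c \times 1$, denotes deterministic regressors such as a constant and a time trend. The coefficient matrix $B_{(0)}$ is of size $m \times c$, the coefficient matrices $B_{(i)}$, $i = 1, \ldots, k$, are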

References

Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation

Aspects of multivariate statistical theory

ARCH modeling in finance: A review of the theory and empirical evidence

Bayesian inference in statistical analysis
