Determining the Number of Factors in Approximate Factor Models

Jushan Bai, Serena Ng
01 Jan 2002, Vol. 70, Iss. 1, pp. 191-221
DETERMINING THE NUMBER OF FACTORS IN APPROXIMATE
FACTOR MODELS
Jushan Bai
Serena Ng
Department of Economics
Boston College
Chestnut Hill
MA 02467
December 2000
Abstract
In this paper we develop some econometric theory for factor models of large dimensions. The focus is the determination of the number of factors ($r$), which is an unresolved issue in the rapidly growing literature on multifactor models. We first establish the convergence rate for the factor estimates that will allow for consistent estimation of $r$. We then propose some panel $C_p$ criteria and show that the number of factors can be consistently estimated using the criteria. The theory is developed under the framework of large cross-sections ($N$) and large time dimensions ($T$). No restriction is imposed on the relation between $N$ and $T$. Simulations show that the proposed criteria have good finite sample properties in many configurations of the panel data encountered in practice.
JEL Classification: C13, C33, C43
Keywords: Factor analysis, asset pricing, principal components, model selection.
Email: Jushan.Bai@bc.edu Phone: 617-552-3689
Email: Serena.Ng@bc.edu Phone: 617-552-2182
We thank two referees for their very constructive comments, which led to a much improved presentation. The
first author acknowledges financial support from the National Science Foundation under grant SBR-9709508.
We would like to thank participants in the econometrics seminars at Harvard-MIT, Cornell University, the
University of Rochester, and the University of Pennsylvania for helpful suggestions and comments. Remaining
errors are our own.

1 Introduction
The idea that variations in a large number of economic variables can be modeled by a small number of reference variables is appealing and is used in many economic analyses. For example, asset returns are often modeled as a function of a small number of factors. Stock and Watson (1989) used one reference variable to model the comovements of four main macroeconomic aggregates. Cross-country variations are also found to have common components; see Gregory and Head (1999) and Forni, Hallin, Lippi and Reichlin (2000). More recently, Stock and Watson (1999) showed that the forecast mean squared error of a large number of macroeconomic variables can be reduced by including diffusion indexes, or factors, in structural as well as non-structural forecasting models. In demand analysis, Engel curves can be expressed in terms of a finite number of factors. Lewbel (1991) showed that if a demand system has one common factor, budget shares should be independent of the level of income. In such a case, the number of factors is an object of economic interest since if more than one factor is found, homothetic preferences can be rejected. Factor analysis also provides a convenient way to study the aggregate implications of microeconomic behavior, as shown in Forni and Lippi (1997).
Central to both the theoretical and the empirical validity of factor models is the correct specification of the number of factors. To date, this crucial parameter is often assumed rather than determined by the data.¹ This paper develops a formal statistical procedure that can consistently estimate the number of factors from observed data. We demonstrate that the penalty for overfitting must be a function of both $N$ and $T$ (the cross-section dimension and the time dimension, respectively) in order to consistently estimate the number of factors. Consequently, the usual AIC and BIC, which are functions of $N$ or $T$ alone, do not work when both dimensions of the panel are large. Our theory is developed under the assumption that both $N$ and $T$ converge to infinity. This flexibility is of empirical relevance because the time dimension of datasets relevant to factor analysis, although small relative to the cross-section dimension, is too large to justify the assumption of a fixed $T$.
A small number of papers in the literature have also considered the problem of determining the number of factors, but the present analysis differs from these works in important ways. Lewbel (1991) and Donald (1997) used the rank of a matrix to test for the number of factors, but these theories assume either $N$ or $T$ is fixed. Cragg and Donald (1997) considered the use of information criteria when the factors are functions of a set of observable explanatory variables, but the data still have a fixed dimension. For large dimensional panels, Connor and Korajczyk (1993) developed a test for the number of factors in asset returns, but their test is derived under sequential limit asymptotics, i.e., $N$ converges to infinity with a fixed $T$ and then $T$ converges to infinity. Furthermore, because their test is based on a comparison of variances over different time periods, covariance stationarity and homoskedasticity are not only technical assumptions, but are crucial for the validity of their test. Under the assumption that $N \to \infty$ for fixed $T$, Forni and Reichlin (1998) suggested a graphical approach to identify the number of factors, but no theory is available. Assuming $N, T \to \infty$ with $\sqrt{N}/T \to \infty$, Stock and Watson (1998) showed that a modification to the BIC can be used to select the number of factors optimal for forecasting a single series. Their criterion is restrictive not only because it requires $N \gg T$, but also because there can be factors that are pervasive for a set of data and yet have no predictive ability for an individual data series. Thus, their rule may not be appropriate outside of the forecasting framework. Forni, Hallin, Lippi and Reichlin (1999) suggested a multivariate variant of the AIC, but neither the theoretical nor the empirical properties of the criterion are known.

¹ Lehmann and Modest (1988), for example, tested the APT for 5, 10 and 15 factors. Stock and Watson (1989) assumed there is one factor underlying the coincident index. Ghysels and Ng (1998) tested the affine term structure model assuming two factors.
We set up the determination of factors as a model selection problem. In consequence, the proposed criteria depend on the usual trade-off between good fit and parsimony. However, the problem is non-standard not only because account needs to be taken of the sample size in both the cross-section and the time series dimensions, but also because the factors are not observed. The theory we develop does not rely on sequential limits, nor does it impose any restrictions on the relation between $N$ and $T$. The results hold under heteroskedasticity in both the time and the cross-section dimensions. The results also hold under weak serial dependence and cross-section dependence. Simulations show that the criteria have good finite sample properties.
The rest of the paper is organized as follows. Section 2 sets up the preliminaries and
introduces notation and assumptions. Estimation of the factors is considered in Section 3
and the estimation of the number of factors is studied in Section 4. Specific criteria are
considered in Section 5 and their finite sample properties are considered in Section 6, along
with an empirical application to asset returns. Concluding remarks are provided in Section
7. All the proofs are given in the Appendix.

2 Factor Models
Let $X_{it}$ be the observed data for the $i$th cross-section unit at time $t$, for $i = 1, \ldots, N$ and $t = 1, \ldots, T$. Consider the following model:

$$X_{it} = \lambda_i' F_t + e_{it}, \tag{1}$$

where $F_t$ is a vector of common factors, $\lambda_i$ is a vector of factor loadings associated with $F_t$, and $e_{it}$ is the idiosyncratic component of $X_{it}$. The product $\lambda_i' F_t$ is called the common component of $X_{it}$. Equation (1) is then the factor representation of the data. Note that the factors, their loadings, as well as the idiosyncratic errors are not observable.
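To make the structure of equation (1) concrete, the following minimal sketch simulates a small panel in which each observation is the sum of a common component and an idiosyncratic error. The function name `simulate_factor_panel`, the Gaussian distributions, and the panel sizes are illustrative assumptions, not part of the paper; only the additive decomposition comes from equation (1).

```python
import random


def simulate_factor_panel(n_units, n_periods, n_factors, seed=0):
    """Draw X[i][t] = lambda_i' F_t + e_it with Gaussian components.

    The distributions are illustrative assumptions; equation (1) only
    fixes the additive structure.
    """
    rng = random.Random(seed)
    # loadings[i] is the r-vector lambda_i; factors[t] is the r-vector F_t.
    loadings = [[rng.gauss(0, 1) for _ in range(n_factors)]
                for _ in range(n_units)]
    factors = [[rng.gauss(0, 1) for _ in range(n_factors)]
               for _ in range(n_periods)]
    errors = [[rng.gauss(0, 1) for _ in range(n_periods)]
              for _ in range(n_units)]
    # Common component lambda_i' F_t for every (i, t) pair.
    common = [[sum(l * f for l, f in zip(loadings[i], factors[t]))
               for t in range(n_periods)] for i in range(n_units)]
    data = [[common[i][t] + errors[i][t] for t in range(n_periods)]
            for i in range(n_units)]
    return data, common, errors, loadings, factors


data, common, errors, loadings, factors = simulate_factor_panel(5, 8, 2)
print(len(data), len(data[0]))  # prints: 5 8
```

In a simulation all three pieces are available, whereas in practice only `data` is observed, which is why the number of factors has to be inferred rather than read off.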
Factor analysis allows for dimension reduction and is a useful statistical tool. Many
economic analyses fit naturally into the framework given by (1).
1. Arbitrage pricing theory. In the finance literature, the arbitrage pricing theory (APT) of Ross (1976) assumes that a small number of factors can be used to explain a large number of asset returns. In this case, $X_{it}$ represents the return of asset $i$ at time $t$, $F_t$ represents the vector of factor returns and $e_{it}$ is the idiosyncratic component of returns. Although analytical convenience makes it appealing to assume one factor, there is growing evidence against the adequacy of a single factor in explaining asset returns.² The shifting interest towards use of multifactor models inevitably calls for a formal procedure to determine the number of factors. The analysis to follow allows the number of factors to be determined even when $N$ and $T$ are both large. This is especially suited for financial applications where data are widely available for a large number of assets over an increasingly long span. Once the number of factors is determined, the factor returns $F_t$ can also be consistently estimated (up to an invertible transformation).
2. The rank of a demand system. Let $p$ be a price vector for $J$ goods and services, and let $e_h$ be total spending on the $J$ goods by household $h$. Consumer theory postulates that Marshallian demand for good $j$ by consumer $h$ is $X_{jh} = g_j(p, e_h)$. Let $w_{jh} = X_{jh}/e_h$ be the budget share for household $h$ on the $j$th good. The rank of a demand system holding prices fixed is the smallest integer $r$ such that $w_j(e) = \lambda_{j1} G_1(e) + \ldots + \lambda_{jr} G_r(e)$. Demand systems are of the form (1) where the $r$ factors, common across goods, are $F_h = [G_1(e_h) \ldots G_r(e_h)]'$. When the number of households, $H$, converges to infinity with a fixed $J$, $G_1(e), \ldots, G_r(e)$ can be estimated simultaneously, such as by non-parametric methods developed in Donald (1997). This approach will not work when the number of goods, $J$, also converges to infinity. However, when $J$ is large, the theory developed in this paper still provides a consistent estimate of the rank of the demand system, without the need for nonparametric estimation of the $G(\cdot)$ functions. This flexibility can be useful since some datasets have detailed information on a large number of consumption goods. Once the rank of the demand system is determined, the nonparametric functions evaluated at $e_h$ allow $F_h$ to be consistently estimated (up to a transformation). The functions $G_1(e), \ldots, G_r(e)$ may then be recovered (also up to a matrix transformation) from $\hat{F}_h$ ($h = 1, \ldots, H$) via nonparametric estimation.

² Cochrane (1999) stressed that financial economists now recognize that there are multiple sources of risk, or factors, that give rise to high returns. Backus, Forsei, Mozumdar and Wu (1997) made similar conclusions in the context of the market for foreign assets.
3. Forecasting with diffusion indices. Stock and Watson (1999) considered forecasting inflation with diffusion indices (“factors”) constructed from a large number of macroeconomic series. The underlying premise is that the movement of a large number of macroeconomic series may be driven by a small number of unobservable factors. Consider the forecasting equation for a scalar series

$$y_{t+1} = \alpha' F_t + \beta' W_t + \epsilon_t.$$

The variables $W_t$ are observable. Although we do not observe $F_t$, we observe $X_{it}$, $i = 1, \ldots, N$. Suppose $X_{it}$ is related to $F_t$ as in (1). In the present context, we interpret (1) as the reduced-form representation of $X_{it}$ in terms of the unobservable factors. We can first estimate $F_t$ from (1). Denote it by $\hat{F}_t$. We can then regress $y_t$ on $\hat{F}_{t-1}$ and $W_{t-1}$ to obtain the coefficients $\hat{\alpha}$ and $\hat{\beta}$, from which a forecast

$$\hat{y}_{T+1|T} = \hat{\alpha}' \hat{F}_T + \hat{\beta}' W_T$$

can be formed. Stock and Watson (1998, 1999) showed that this approach to forecasting outperforms many competing forecasting methods. But as pointed out earlier, the dimension of $F$ in Stock and Watson (1998) was determined using a criterion that minimizes the mean squared forecast errors of $y$. This may not be the same as the number of factors underlying $X_{it}$, which is the focus of this paper.
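The two-step forecast described in item 3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses a single factor, takes $W_t = 1$ (an intercept), estimates the factor by the first principal component via power iteration, and works on simulated data. The helper names `first_principal_factor` and `ols_two_regressors` and all parameter values are hypothetical.

```python
import math
import random


def first_principal_factor(X, n_iter=100):
    """Estimate one factor from a T x N panel by power iteration on X X'.

    The estimate is scaled so that sum_t f_t^2 / T = 1, which pins it
    down only up to sign.
    """
    T, N = len(X), len(X[0])
    f = [1.0 + 0.01 * t for t in range(T)]  # arbitrary non-zero start
    for _ in range(n_iter):
        # g = X' f (length N), then f = X g (length T).
        g = [sum(X[t][i] * f[t] for t in range(T)) for i in range(N)]
        f = [sum(X[t][i] * g[i] for i in range(N)) for t in range(T)]
        scale = math.sqrt(sum(v * v for v in f) / T)
        f = [v / scale for v in f]
    return f


def ols_two_regressors(y, a, b):
    """Solve the 2 x 2 normal equations for y = alpha * a + beta * b + error."""
    saa = sum(u * u for u in a)
    sbb = sum(v * v for v in b)
    sab = sum(u * v for u, v in zip(a, b))
    say = sum(u * w for u, w in zip(a, y))
    sby = sum(v * w for v, w in zip(b, y))
    det = saa * sbb - sab * sab
    return (say * sbb - sby * sab) / det, (sby * saa - say * sab) / det


# Simulated data: one factor and W_t = 1; every value here is hypothetical.
rng = random.Random(0)
T, N = 60, 30
F = [rng.gauss(0, 1) for _ in range(T)]
lam = [rng.gauss(0, 1) for _ in range(N)]
X = [[lam[i] * F[t] + rng.gauss(0, 1) for i in range(N)] for t in range(T)]
W = [1.0] * T
# y_{t+1} = 0.8 F_t + 0.5 W_t + noise; y[0] is an unused placeholder.
y = [0.0] + [0.8 * F[t] + 0.5 * W[t] + 0.1 * rng.gauss(0, 1)
             for t in range(T - 1)]

# Step 1: estimate the factor from the panel alone.
F_hat = first_principal_factor(X)
# Step 2: regress y_t on F_hat_{t-1} and W_{t-1}, then forecast one step ahead.
alpha_hat, beta_hat = ols_two_regressors(y[1:], F_hat[:-1], W[:-1])
forecast = alpha_hat * F_hat[-1] + beta_hat * W[-1]
print(round(forecast, 3))
```

Because the estimated factor is identified only up to sign and scale, `alpha_hat` absorbs that normalization and the forecast itself is unaffected. The sketch also takes the number of factors as known, which is exactly the quantity the paper sets out to estimate.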
2.1 Notation and Preliminaries
Let $F_t^0$, $\lambda_i^0$ and $r$ denote the true common factors, the factor loadings, and the true number of factors, respectively. Note that $F_t^0$ is $r$-dimensional. We assume that $r$ does not depend on $N$. At a given $t$, we have

$$\underset{(N \times 1)}{X_t} = \underset{(N \times r)}{\Lambda^0} \; \underset{(r \times 1)}{F_t^0} + \underset{(N \times 1)}{e_t}. \tag{2}$$
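Equation (2) can be read as equation (1), written at the true parameter values, stacked over the cross-section at a fixed $t$. The stacking convention below is an assumption inferred from the dimensions given in (2); it is not spelled out in this excerpt.

```latex
\begin{aligned}
X_t &= (X_{1t}, \ldots, X_{Nt})', \quad
\Lambda^0 = (\lambda_1^0, \ldots, \lambda_N^0)', \quad
e_t = (e_{1t}, \ldots, e_{Nt})', \\
X_{it} &= \lambda_i^{0\prime} F_t^0 + e_{it} \quad (i = 1, \ldots, N)
\;\Longrightarrow\;
X_t = \Lambda^0 F_t^0 + e_t .
\end{aligned}
```

Each row of $\Lambda^0$ is the transposed loading vector of one cross-section unit, so the $i$th row of $\Lambda^0 F_t^0$ is the common component of $X_{it}$.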
References
Anderson, T. W. An Introduction to Multivariate Statistical Analysis. Book.
Campbell, J. Y., Lo, A. W., and MacKinlay, A. C. The Econometrics of Financial Markets. Book.
Forni, M., Hallin, M., Lippi, M., and Reichlin, L. (2000). The Generalized Dynamic-Factor Model: Identification and Estimation. Journal article.
Mallows, C. L. (1973). Some comments on C_p. Journal article.
Ross, S. A. (1976). The arbitrage theory of capital asset pricing. Journal article.