What are the contributions mentioned in the paper "Determining the number of factors in approximate factor models" ?

In this paper the authors develop some econometric theory for factor models of large dimensions. The authors then propose some panel Cp criteria and show that the number of factors can be consistently estimated using the criteria.

What future works have the authors mentioned in the paper "Determining the number of factors in approximate factor models" ?

Many issues in factor analysis await further research. But using Theorem 1, it may be possible to obtain these limiting distributions. It can be shown that ŷT+1|T is not only a consistent but a √T-consistent estimator of yT+1, conditional on the information up to time T (provided that N is of no smaller order of magnitude than T). Stock and Watson (1998) suggest how dynamics can be introduced into factor models when both N and T are large, although their empirical applications assume a static factor structure.

What is the drawback of the approach?

The drawback of the approach is that, because the number of parameters increases with N, computational difficulties make it necessary to abandon information on many series even though they are available.

What is the main advantage of these three panel information criteria?

The main advantage of these three panel information criteria (ICp) is that they do not depend on the choice of kmax through σ̂², which could be desirable in practice.

What is the significance of the likelihood ratio test?

A likelihood ratio test can also, in theory, be used to select the number of factors if, in addition, normality of et is assumed.

What is the way to determine the number of factors in a series?

Assuming N, T → ∞ with √N/T → ∞, Stock and Watson (1998) showed that a modification to the BIC can be used to select the number of factors optimal for forecasting a single series.

What is the idiosyncratic component of returns?

In this case, Xit represents the return of asset i at time t, Ft represents the vector of factor returns and eit is the idiosyncratic component of returns.

What is the implication of the proof of Theorem 2?

However, the proof of Theorem 2 mainly uses the fact that F̂t satisfies Theorem 1, and does not rely on the principal components per se.

What is the way to determine the number of factors in a dynamic setting?

Stock and Watson (1998) suggest how dynamics can be introduced into factor models when both N and T are large, although their empirical applications assume a static factor structure.

(Open Access) Determining the Number of Factors in Approximate Factor Models (2002) | Jushan Bai

Q: How can the forecast mean squared error of a large number of macroeconomic variables be reduced?

More recently, Stock and Watson (1999) showed that the forecast mean squared error of a large number of macroeconomic variables can be reduced by including diffusion indexes, or factors, in structural as well as non-structural forecasting models.

DETERMINING THE NUMBER OF FACTORS IN APPROXIMATE FACTOR MODELS

Jushan Bai∗ and Serena Ng†

Department of Economics, Boston College, Chestnut Hill, MA 02467

December 2000

Abstract

In this paper we develop some econometric theory for factor models of large dimensions. The focus is the determination of the number of factors (r), which is an unresolved issue in the rapidly growing literature on multifactor models. We first establish the convergence rate for the factor estimates that will allow for consistent estimation of r. We then propose some panel Cp criteria and show that the number of factors can be consistently estimated using the criteria. The theory is developed under the framework of large cross-sections (N) and large time dimensions (T). No restriction is imposed on the relation between N and T. Simulations show that the proposed criteria have good finite sample properties in many configurations of the panel data encountered in practice.

JEL Classiﬁcation: C13, C33, C43

Keywords: Factor analysis, asset pricing, principal components, model selection.

∗ Email: Jushan.Bai@bc.edu Phone: 617-552-3689

† Email: Serena.Ng@bc.edu Phone: 617-552-2182

We thank two referees for their very constructive comments, which led to a much improved presentation. The first author acknowledges financial support from the National Science Foundation under grant SBR-9709508. We would like to thank participants in the econometrics seminars at Harvard-MIT, Cornell University, the University of Rochester, and the University of Pennsylvania for helpful suggestions and comments. Remaining errors are our own.

1 Introduction

The idea that variations in a large number of economic variables can be modeled by a small number of reference variables is appealing and is used in many economic analyses. For example, asset returns are often modeled as a function of a small number of factors. Stock and Watson (1989) used one reference variable to model the comovements of four main macroeconomic aggregates. Cross-country variations are also found to have common components, see Gregory and Head (1999) and Forni, Hallin, Lippi and Reichlin (2000). More recently, Stock and Watson (1999) showed that the forecast mean squared error of a large number of macroeconomic variables can be reduced by including diffusion indexes, or factors, in structural as well as non-structural forecasting models. In demand analysis, Engel curves can be expressed in terms of a finite number of factors. Lewbel (1991) showed that if a demand system has one common factor, budget shares should be independent of the level of income. In such a case, the number of factors is an object of economic interest since if more than one factor is found, homothetic preferences can be rejected. Factor analysis also provides a convenient way to study the aggregate implications of microeconomic behavior, as shown in Forni and Lippi (1997).

Central to both the theoretical and the empirical validity of factor models is the correct specification of the number of factors. To date, this crucial parameter is often assumed rather than determined by the data.¹ This paper develops a formal statistical procedure that can consistently estimate the number of factors from observed data. We demonstrate that the penalty for overfitting must be a function of both N and T (the cross-section dimension and the time dimension, respectively) in order to consistently estimate the number of factors. Consequently the usual AIC and BIC, which are functions of N or T alone, do not work when both dimensions of the panel are large. Our theory is developed under the assumption that both N and T converge to infinity. This flexibility is of empirical relevance because the time dimension of datasets relevant to factor analysis, although small relative to the cross-section dimension, is too large to justify the assumption of a fixed T.

¹ Lehmann and Modest (1988), for example, tested the APT for 5, 10 and 15 factors. Stock and Watson (1989) assumed there is one factor underlying the coincident index. Ghysels and Ng (1998) tested the affine term structure model assuming two factors.

A small number of papers in the literature have also considered the problem of determining the number of factors, but the present analysis differs from these works in important ways. Lewbel (1991) and Donald (1997) used the rank of a matrix to test for the number of factors, but these theories assume either N or T is fixed. Cragg and Donald (1997) considered the use of information criteria when the factors are functions of a set of observable explanatory variables, but the data still have a fixed dimension. For large dimensional panels, Connor and Korajczyk (1993) developed a test for the number of factors in asset returns, but their test is derived under sequential limit asymptotics, i.e., N converges to infinity with a fixed T and then T converges to infinity. Furthermore, because their test is based on a comparison of variances over different time periods, covariance stationarity and homoskedasticity are not only technical assumptions, but are crucial for the validity of their test. Under the assumption that N → ∞ for fixed T, Forni and Reichlin (1998) suggested a graphical approach to identify the number of factors, but no theory is available. Assuming N, T → ∞ with √N/T → ∞, Stock and Watson (1998) showed that a modification to the BIC can be used to select the number of factors optimal for forecasting a single series. Their criterion is restrictive not only because it requires N ≫ T, but also because there can be factors that are pervasive for a set of data and yet have no predictive ability for an individual data series. Thus, their rule may not be appropriate outside of the forecasting framework. Forni, Hallin, Lippi and Reichlin (1999) suggested a multivariate variant of the AIC but neither the theoretical nor the empirical properties of the criterion are known.

We set up the determination of factors as a model selection problem. In consequence, the proposed criteria depend on the usual trade-off between good fit and parsimony. However, the problem is non-standard not only because account needs to be taken of the sample size in both the cross-section and the time series dimensions, but also because the factors are not observed. The theory we develop does not rely on sequential limits, nor does it impose any restrictions between N and T. The results hold under heteroskedasticity in both the time and the cross-section dimensions. The results also hold under weak serial dependence and cross-section dependence. Simulations show that the criteria have good finite sample properties.
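The trade-off between fit and parsimony described above can be sketched numerically. The following is a minimal illustration, not the authors' code. It assumes the factors are estimated by principal components, so that the residual variance V(k) with k factors is the sum of the squared singular values beyond the k-th, divided by NT. The penalty k·((N+T)/(NT))·ln(NT/(N+T)) is the IC_p1 form from the published version of the paper; it is not stated in this excerpt, so it is an assumption here, as are the function name `ic_p1` and the simulated panel.

```python
import numpy as np

def ic_p1(X, kmax):
    """Return (k_hat, ic_values) for a T x N panel X.

    V(k) is the average squared residual left after removing the first k
    principal components. The penalty is an assumed IC_p1-style term that
    depends on both N and T.
    """
    T, N = X.shape
    s = np.linalg.svd(X, compute_uv=False)        # singular values of X
    total = np.sum(s ** 2)
    penalty = (N + T) / (N * T) * np.log(N * T / (N + T))
    ic = []
    for k in range(kmax + 1):
        # Residual variance with k factors (k = 0 means no factors removed).
        V_k = (total - np.sum(s[:k] ** 2)) / (N * T)
        ic.append(np.log(V_k) + k * penalty)
    return int(np.argmin(ic)), np.array(ic)

# Toy panel with three factors; every value here is illustrative.
rng = np.random.default_rng(0)
T, N, r = 100, 100, 3
X = rng.standard_normal((T, r)) @ rng.standard_normal((r, N)) \
    + rng.standard_normal((T, N))
k_hat, ic_values = ic_p1(X, kmax=8)
```

Because the penalty shrinks as both N and T grow, while each spurious factor reduces ln V(k) by less than the penalty, the minimiser does not drift towards kmax in this toy setting.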

The rest of the paper is organized as follows. Section 2 sets up the preliminaries and introduces notation and assumptions. Estimation of the factors is considered in Section 3 and the estimation of the number of factors is studied in Section 4. Specific criteria are considered in Section 5 and their finite sample properties are considered in Section 6, along with an empirical application to asset returns. Concluding remarks are provided in Section 7. All the proofs are given in the Appendix.

2 Factor Models

Let $X_{it}$ be the observed data for the $i$th cross-section unit at time $t$, for $i = 1, \ldots, N$ and $t = 1, \ldots, T$. Consider the following model

$$X_{it} = \lambda_i' F_t + e_{it}, \tag{1}$$

where $F_t$ is a vector of common factors, $\lambda_i$ is a vector of factor loadings associated with $F_t$, and $e_{it}$ is the idiosyncratic component of $X_{it}$. The product $\lambda_i' F_t$ is called the common component of $X_{it}$. Equation (1) is then the factor representation of the data. Note that the factors, their loadings, as well as the idiosyncratic errors are not observable.
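To make the representation in (1) concrete, the sketch below simulates a small panel. It is an illustration under assumed settings: the standard normal draws, the dimensions, and the function name `simulate_factor_panel` are all hypothetical choices, not taken from the paper.

```python
import numpy as np

def simulate_factor_panel(N, T, r, seed=0):
    """Draw a T x N panel with X[t, i] = lam[i] @ F[t] + e[t, i]."""
    rng = np.random.default_rng(seed)
    F = rng.standard_normal((T, r))     # common factors F_t (not observed in practice)
    lam = rng.standard_normal((N, r))   # factor loadings lambda_i (not observed)
    e = rng.standard_normal((T, N))     # idiosyncratic errors e_it (not observed)
    X = F @ lam.T + e                   # only X is available to the econometrician
    return X, F, lam, e

X, F, lam, e = simulate_factor_panel(N=50, T=40, r=2)
common = F @ lam.T                      # common component lambda_i' F_t, rank r
```

The common component has rank r even though X itself has full rank, which is what makes the number of factors a meaningful quantity to estimate.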

Factor analysis allows for dimension reduction and is a useful statistical tool. Many

economic analyses ﬁt naturally into the framework given by (1).

1. Arbitrage pricing theory. In the finance literature, the arbitrage pricing theory (APT) of Ross (1976) assumes that a small number of factors can be used to explain a large number of asset returns. In this case, $X_{it}$ represents the return of asset $i$ at time $t$, $F_t$ represents the vector of factor returns and $e_{it}$ is the idiosyncratic component of returns. Although analytical convenience makes it appealing to assume one factor, there is growing evidence against the adequacy of a single factor in explaining asset returns.² The shifting interest towards use of multifactor models inevitably calls for a formal procedure to determine the number of factors. The analysis to follow allows the number of factors to be determined even when $N$ and $T$ are both large. This is especially suited for financial applications when data are widely available for a large number of assets over an increasingly long span. Once the number of factors is determined, the factor returns $F_t$ can also be consistently estimated (up to an invertible transformation).

² Cochrane (1999) stressed that financial economists now recognize that there are multiple sources of risk, or factors, that give rise to high returns. Backus, Foresi, Mozumdar and Wu (1997) made similar conclusions in the context of the market for foreign assets.

2. The rank of a demand system. Let $p$ be a price vector for $J$ goods and services, and $e_h$ be total spending on the $J$ goods by household $h$. Consumer theory postulates that Marshallian demand for good $j$ by consumer $h$ is $X_{jh} = g_j(p, e_h)$. Let $w_{jh} = X_{jh}/e_h$ be the budget share for household $h$ on the $j$th good. The rank of a demand system holding prices fixed is the smallest integer $r$ such that $w_j(e) = \lambda_{j1} G_1(e) + \ldots + \lambda_{jr} G_r(e)$. Demand systems are of the form (1) where the $r$ factors, common across goods, are $F_h = [G_1(e_h) \ldots G_r(e_h)]'$. When the number of households, $H$, converges to infinity with a fixed $J$, $G_1(e) \ldots G_r(e)$ can be estimated simultaneously, such as by non-parametric methods developed in Donald (1997). This approach will not work when the number of goods, $J$, also converges to infinity. However, when $J$ is large, the theory developed in this paper still provides a consistent estimation of the rank of the demand system, without the need for nonparametric estimation of the $G(\cdot)$ functions. This flexibility can be useful since some datasets have detailed information on a large number of consumption goods. Once the rank of the demand system is determined, the nonparametric functions evaluated at $e_h$ allow $F_h$ to be consistently estimable (up to a transformation). The functions $G_1(e) \ldots G_r(e)$ may then be recovered (also up to a matrix transformation) from $\hat F_h$ ($h = 1, \ldots, H$) via nonparametric estimation.

3. Forecasting with diffusion indices. Stock and Watson (1999) considered forecasting inflation with diffusion indices ("factors") constructed from a large number of macroeconomic series. The underlying premise is that the movement of a large number of macroeconomic series may be driven by a small number of unobservable factors. Consider the forecasting equation for a scalar series

$$y_{t+1} = \alpha' F_t + \beta' W_t + \epsilon_t.$$

The variables $W_t$ are observable. Although we do not observe $F_t$, we observe $X_{it}$, $i = 1, \ldots, N$. Suppose $X_{it}$ bears relation with $F_t$ as in (1). In the present context, we interpret (1) as the reduced-form representation of $X_{it}$ in terms of the unobservable factors. We can first estimate $F_t$ from (1). Denote it by $\hat F_t$. We can then regress $y_t$ on $\hat F_{t-1}$ and $W_{t-1}$ to obtain the coefficients $\hat\alpha$ and $\hat\beta$, from which a forecast

$$\hat y_{T+1|T} = \hat\alpha' \hat F_T + \hat\beta' W_T$$

can be formed. Stock and Watson (1998, 1999) showed that this approach of forecasting outperforms many competing forecasting methods. But as pointed out earlier, the dimension of $F$ in Stock and Watson (1998) was determined using a criterion that minimizes the mean squared forecast errors of $y$. This may not be the same as the number of factors underlying $X_{it}$, which is the focus of this paper.
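The two-step procedure just described (estimate the factors, then regress) can be sketched as follows. This is a minimal illustration that assumes the factors are estimated by principal components via a singular value decomposition and that the number of factors k is supplied by the user; the function name `diffusion_index_forecast` and the simulated data are hypothetical.

```python
import numpy as np

def diffusion_index_forecast(y, X, W, k):
    """Forecast y_{T+1} from y (T,), X (T, N), W (T, m) using k estimated factors."""
    T = X.shape[0]
    # Step 1: principal-components estimate of the factors,
    # normalised so that F_hat' F_hat / T is the identity.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    F_hat = np.sqrt(T) * U[:, :k]
    # Step 2: regress y_t on F_hat_{t-1} and W_{t-1} by least squares.
    Z = np.hstack([F_hat, W])
    coef, *_ = np.linalg.lstsq(Z[:-1], y[1:], rcond=None)
    # Forecast: alpha_hat' F_hat_T + beta_hat' W_T.
    return Z[-1] @ coef

# Illustrative data: two factors drive both the panel X and the target y.
rng = np.random.default_rng(0)
T, N, k = 80, 60, 2
F = rng.standard_normal((T, k))
X = F @ rng.standard_normal((k, N)) + rng.standard_normal((T, N))
W = rng.standard_normal((T, 1))
y = np.zeros(T)
y[1:] = F[:-1] @ np.array([0.5, -1.0]) + 2.0 * W[:-1, 0] \
    + 0.1 * rng.standard_normal(T - 1)
forecast = diffusion_index_forecast(y, X, W, k)
```

The estimated factors are identified only up to an invertible transformation, but the fitted forecast is unaffected because the regression coefficients absorb that transformation.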

2.1 Notation and Preliminaries

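Let $F_t^0$, $\lambda_i^0$ and $r$ denote the true common factors, the factor loadings, and the true number of factors, respectively. Note that $F_t^0$ is $r$ dimensional. We assume that $r$ does not depend on $N$ or $T$. At a given $t$, we have

$$\underset{(N \times 1)}{X_t} = \underset{(N \times r)}{\Lambda^0}\,\underset{(r \times 1)}{F_t^0} + \underset{(N \times 1)}{e_t} \tag{2}$$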


References

An Introduction to Multivariate Statistical Analysis

The econometrics of financial markets

The arbitrage theory of capital asset pricing

Some comments on C_p

The Generalized Dynamic-Factor Model: Identification and Estimation
