©INTERNATIONAL REGIONAL SCIENCE REVIEW 20, 1 & 2: 103–111 (1997)
ESTIMATION OF SPATIAL REGRESSION MODELS WITH AUTOREGRESSIVE ERRORS BY TWO-STAGE LEAST SQUARES PROCEDURES: A SERIOUS PROBLEM
HARRY H. KELEJIAN
Department of Economics, University of Maryland, College Park, MD 20742 USA
(kelejian@econ.umd.edu)
INGMAR R. PRUCHA
Department of Economics, University of Maryland, College Park, MD 20742 USA
(prucha@econ.umd.edu)
Time series regression models that have autoregressive errors are often estimated by two-stage procedures which are based on the Cochrane-Orcutt (1949) transformation. It seems natural to also attempt the estimation of spatial regression models whose error terms are autoregressive in terms of an analogous transformation. Various two-stage least squares procedures suggest themselves in this context, including an analog to Durbin’s (1960) procedure. Indeed, these procedures are so suggestive and computationally convenient that they are quite “tempting.” Unfortunately, however, as shown in this paper, these two-stage least squares procedures are generally, in a typical cross-sectional spatial context, not consistent and therefore should not be used.

Luc Anselin and Serge Rey provided helpful comments.
Received January 1997; revised April 1997.
INTRODUCTION
The spatial autoregressive model studied by Cliff and Ord (1973, 1981), which is a variant of the model considered by Whittle (1954), is widely used to describe the properties of the error terms in spatial regressions. As typically specified, the error terms of a spatial autoregressive model depend on two unknown parameters. One is an autoregressive parameter, say ρ, and the other is a variance, say σ². Interest often focuses on ρ as a measure of spatial dependence, and also because it is a component of the generalized least squares estimator of the regression parameters. However, consistent estimation of both ρ and σ² is important for making inferences based on the regression model.
Based on an analogy with the Cochrane-Orcutt (1949) transformation in a linear time series model with autocorrelated error terms, one might think that, in a spatial context, the parameter ρ can be estimated consistently by two-stage least squares (2SLS) procedures. In particular, one might consider the estimation of the parameter ρ by a procedure that is analogous to that suggested by Durbin (1960) for linear time series models, referred to in the spatial literature as the spatial Durbin procedure. Unfortunately, however, as shown below, under typical assumptions these procedures are, in general, not consistent. This point is important, especially since these 2SLS procedures are computationally convenient and therefore their use is “tempting.”

In this paper, the basic model is first specified, then results concerning the inconsistency of the 2SLS procedures are presented, and finally some concluding remarks are given in the last section. Technical details are relegated to the Appendix.
THE MODEL
In this section, the regression model is specified, along with its assumptions. Those assumptions are then discussed. The following concept will be needed for the discussion. Let a_{ij} denote the (i, j)-th element of an n by n matrix A. Then, the row and column sums of A are said to be uniformly bounded in absolute value if

Σ_{j=1}^{n} |a_{ij}| ≤ c_a for all i = 1, …, n, n ≥ 1, and Σ_{i=1}^{n} |a_{ij}| ≤ c_a for all j = 1, …, n, n ≥ 1,

where c_a is a finite constant.¹
The model considered is

y = Xβ + ε,   (1)
ε = ρWε + u,   (2)

where y is the n by 1 vector of observations on the dependent variable, X is the n by k matrix of observations on k exogenous regressors, β is the k by 1 vector of regression parameters, ε is the n by 1 vector of regression disturbances, ρ is the scalar autoregressive parameter, W is an n by n weights matrix, and u is an n by 1 vector of innovation error terms.
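As a concrete illustration (not part of the paper), the following Python sketch simulates one draw from (1) and (2); the circular, row-normalized weights matrix and the values of ρ, β, and σ are assumptions made only for this example.

```python
import numpy as np

def simulate(n, rho=0.4, beta=(1.0, 0.5), sigma=1.0, seed=0):
    """One draw from y = X*beta + eps with eps = rho*W*eps + u,
    i.e. eps = (I - rho*W)^{-1} u.  W, rho, beta, and sigma are illustrative."""
    rng = np.random.default_rng(seed)
    # Row-normalized weights: each unit's two nearest neighbors on a circle.
    W = 0.5 * (np.roll(np.eye(n), 1, axis=1) + np.roll(np.eye(n), -1, axis=1))
    X = np.column_stack([np.ones(n), rng.normal(size=n)])  # intercept + one regressor
    u = rng.normal(scale=sigma, size=n)                    # i.i.d. innovations (Assumption 1)
    eps = np.linalg.solve(np.eye(n) - rho * W, u)          # eps = (I - rho*W)^{-1} u
    y = X @ np.asarray(beta) + eps
    return y, X, W, u, eps

y, X, W, u, eps = simulate(n=200)
```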
Let u_i be the i-th element of u, let Z be an n by q, q ≥ k, matrix of instruments, and let P = (Z, WZ). Then, assume the following:

ASSUMPTION 1: The u_i’s are i.i.d. with mean 0 and finite variance σ².

ASSUMPTION 2: The elements of the weights matrix W are known constants, and rank(I − ρW) = n for all |ρ| < 1.
¹ It can be shown that if two matrices, say A and B, are conformable for multiplication and their row and column sums are uniformly bounded in absolute value, then the row and column sums of the product matrix AB are also uniformly bounded in absolute value (see, e.g., Kelejian and Prucha 1995). Of course, if the row or column sums of a matrix are uniformly bounded in absolute value, then this is also the case for each element.
ASSUMPTION 3: The row and column sums of W and (I − ρW)^{-1}(I − ρW′)^{-1} are uniformly bounded in absolute value.

ASSUMPTION 4: The elements of the regressor matrix X are nonstochastic, and X has full column rank.

ASSUMPTION 5: The elements of the instrument matrix Z are nonstochastic and bounded in absolute value, and Z has full column rank.

ASSUMPTION 6: lim_{n→∞} n^{-1}X′X = Q_x and lim_{n→∞} n^{-1}P′P = Q_p, where Q_x and Q_p are finite and nonsingular. Furthermore, lim_{n→∞} n^{-1}Z′X and lim_{n→∞} n^{-1}Z′WX are finite.
Assumptions 1 and 2 imply that ε = (I − ρW)^{-1}u and furthermore that

E(εε′) = σ²(I − ρW)^{-1}(I − ρW′)^{-1}.   (3)

These two assumptions are typical in spatial autoregressive models unless special complications are considered² (e.g., Cliff and Ord 1981: 198–9). Assumption 3 is reasonable and should hold for most weights matrix specifications. For example, the row and column sums of W will be uniformly bounded if W becomes a sufficiently sparse matrix as n → ∞. Another example where this condition is satisfied is the case in which the elements of W are row normalized and the maximum number of nonzero elements in any given column remains bounded as n → ∞.

Next observe from (3) that, except for the scale factor σ², (I − ρW)^{-1}(I − ρW′)^{-1} is the variance-covariance matrix of ε. The assumption that the row and column sums of this matrix are uniformly bounded therefore restricts the extent of correlations relating to the elements of ε. In particular, the assumption implies, as is easily seen, that there exists some finite constant, say c_ω, such that

n^{-1} Σ_{i=1}^{n} Σ_{j=1}^{n} |corr(ε_i, ε_j)| ≤ c_ω for all n ≥ 1,

where corr(ε_i, ε_j) denotes the correlation between ε_i and ε_j. Virtually all large sample analyses restrict the extent of correlations in some way (see, e.g., Amemiya 1985, Ch. 3, 4; Pötscher and Prucha 1997, Ch. 5, 6; Anselin and Kelejian 1997). Assumption 4 is a standard condition in the context of the general linear regression model. Essentially, Assumption 4 rules out perfect multicollinearity. Assumption 5 maintains that the instruments are nonstochastic. One interpretation of this assumption is that the instruments are exogenous variables, and that the analysis is conditional upon their realized values. Assumption 6 relates to second order sample moments and is similar to those typically made in large sample analyses involving instrumental variable estimators (e.g., Judge et al. 1985: 167–9).
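As an illustration of this restriction (a sketch under the same assumed circular weights matrix and an assumed ρ; it is not taken from the paper), the following computes the largest absolute row sum of (I − ρW)^{-1}(I − ρW′)^{-1} and the quantity n^{-1}ΣΣ|corr(ε_i, ε_j)| for increasing n. For this particular W both quantities stay bounded, in line with Assumption 3 and the displayed inequality.

```python
import numpy as np

def correlation_summaries(n, rho=0.4):
    """Max absolute row sum of Omega = (I - rho*W)^{-1}(I - rho*W')^{-1} and
    n^{-1} * sum_{i,j} |corr(eps_i, eps_j)| for an illustrative circular W.
    sigma^2 is omitted: it cancels in the correlations and only rescales Omega."""
    W = 0.5 * (np.roll(np.eye(n), 1, axis=1) + np.roll(np.eye(n), -1, axis=1))
    A = np.linalg.inv(np.eye(n) - rho * W)
    Omega = A @ A.T
    d = np.sqrt(np.diag(Omega))
    corr = Omega / np.outer(d, d)
    return np.abs(Omega).sum(axis=1).max(), np.abs(corr).sum() / n

for n in (50, 100, 200, 400):
    max_rowsum, avg_abs_corr = correlation_summaries(n)
    print(f"n={n:4d}  max abs row sum = {max_rowsum:6.3f}  n^-1 sum|corr| = {avg_abs_corr:6.3f}")
```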
² Among other things, these complications could relate to heteroskedasticity concerning the innovation error terms, more general patterns of spatial correlation, and parametric specifications of the weights matrix (see, e.g., Case 1991; Anselin 1990; Dubin 1988).
TWO-STAGE LEAST SQUARES PROCEDURES
Applying the analog of a Cochrane-Orcutt (1949) transformation to (1) and (2) and rearranging terms in analogy to Durbin’s (1960) approach yields

y = ρWy + (X − ρWX)β + u,   (4)

which can also be written in an over-parameterized form as

y = ρWy + Xβ + WXγ + u,   (5)

where the restriction γ = −ρβ is not considered. Note that the model formulations (4) and (5) have been called the spatial Durbin model (see, e.g., Anselin 1988).³
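To see the transformation at work, the following sketch (using the same assumed simulation design as above; it is not from the paper) verifies numerically that (4) holds exactly: premultiplying (1) by I − ρW and substituting (2) leaves only the innovation vector u on the right-hand side.

```python
import numpy as np

rng = np.random.default_rng(5)
n, rho, beta, sigma = 300, 0.4, np.array([1.0, 0.5]), 1.0
W = 0.5 * (np.roll(np.eye(n), 1, axis=1) + np.roll(np.eye(n), -1, axis=1))
X = np.column_stack([np.ones(n), rng.normal(size=n)])
u = rng.normal(scale=sigma, size=n)
eps = np.linalg.solve(np.eye(n) - rho * W, u)        # eps = (I - rho*W)^{-1} u
y = X @ beta + eps                                   # model (1)-(2)

# Spatial Cochrane-Orcutt / Durbin rearrangement, equation (4):
lhs = y - rho * (W @ y) - (X - rho * (W @ X)) @ beta
print(np.allclose(lhs, u))   # True: y - rho*W*y - (X - rho*W*X)*beta equals u
```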
The model in (1) implies that Wy = WXβ + Wε. It then follows from (2) and Assumptions 1 and 2 that

E[(Wy)u′] = σ²W(I − ρW)^{-1} ≠ 0.

Therefore, as noted in Anselin (1988: 58), the spatially lagged regressor, Wy, is correlated with the error term, u. One implication of this is that the parameters of (5) cannot be consistently estimated by ordinary least squares, nor can the parameters of (4) be consistently estimated by nonlinear least squares.
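A small Monte Carlo sketch (again under the assumed illustrative design, not from the paper) makes the point numerically: the sample moment n^{-1}(Wy)′u does not center on zero, and ordinary least squares applied to the over-parameterized form (5) is correspondingly biased for ρ.

```python
import numpy as np

def one_draw(n, rho, beta, sigma, rng):
    W = 0.5 * (np.roll(np.eye(n), 1, axis=1) + np.roll(np.eye(n), -1, axis=1))
    X = np.column_stack([np.ones(n), rng.normal(size=n)])
    u = rng.normal(scale=sigma, size=n)
    eps = np.linalg.solve(np.eye(n) - rho * W, u)
    return X @ np.asarray(beta) + eps, X, W, u

rng = np.random.default_rng(1)
n, rho, beta, sigma, reps = 400, 0.4, (1.0, 0.5), 1.0, 500
moment, rho_ols = [], []
for _ in range(reps):
    y, X, W, u = one_draw(n, rho, beta, sigma, rng)
    Wy = W @ y
    moment.append(Wy @ u / n)                      # n^{-1} (Wy)'u
    D = np.column_stack([Wy, X, W @ X[:, 1:]])     # regressors of (5); WX keeps only the
                                                   # non-constant column (W maps the
                                                   # intercept column into itself here)
    coef = np.linalg.lstsq(D, y, rcond=None)[0]    # OLS on the over-parameterized form
    rho_ols.append(coef[0])

print("mean of n^-1 (Wy)'u      :", np.mean(moment))    # bounded away from 0
print("mean OLS estimate of rho :", np.mean(rho_ols), "(true rho =", rho, ")")
```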
In light of the correlation between Wy and u, one might think of estimating (4) by nonlinear 2SLS, or (5) by (linear) 2SLS. However, as will be demonstrated, these procedures are, in general, not consistent. For this discussion, it proves convenient to denote with θ = (ρ, β′)′ the stacked vector of the true model parameters in (4). Furthermore, let θ̃ = (ρ̃, β̃′)′ denote some arbitrary a priori permissible parameter vector (of corresponding dimensions). Rewrite (4) as y = f(θ) + u with

f(θ̃) = ρ̃Wy + (X − ρ̃WX)β̃.   (6)

The function f(θ̃) is often referred to as the response function. The nonlinear 2SLS estimator of θ = (ρ, β′)′, say θ̂ = (ρ̂, β̂′)′, based on the instruments Z is now defined as the minimizer of

R_n(θ̃) = n^{-1}[y − f(θ̃)]′Z(Z′Z)^{-1}Z′[y − f(θ̃)].   (7)
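The objective function in (7) is straightforward to code. The sketch below (illustrative only; the instrument set Z = [X, WX] and the data-generating values are assumptions, not choices made in the paper) evaluates R_n(ρ̃, β̃) over several values of ρ̃ with β̃ held at the true β, which previews the flatness problem discussed below.

```python
import numpy as np

def R_n(rho_t, beta_t, y, X, W, Z):
    """Nonlinear 2SLS objective (7): n^{-1} [y - f]' Z (Z'Z)^{-1} Z' [y - f]
    with f = rho_t*W*y + (X - rho_t*W*X)*beta_t."""
    resid = y - rho_t * (W @ y) - (X - rho_t * (W @ X)) @ beta_t
    Zr = Z.T @ resid
    return (Zr @ np.linalg.solve(Z.T @ Z, Zr)) / y.shape[0]

# Illustrative data, same simple design as in the earlier sketches.
rng = np.random.default_rng(2)
n, rho, beta, sigma = 400, 0.4, np.array([1.0, 0.5]), 1.0
W = 0.5 * (np.roll(np.eye(n), 1, axis=1) + np.roll(np.eye(n), -1, axis=1))
X = np.column_stack([np.ones(n), rng.normal(size=n)])
eps = np.linalg.solve(np.eye(n) - rho * W, rng.normal(scale=sigma, size=n))
y = X @ beta + eps
Z = np.column_stack([X, W @ X[:, 1:]])            # assumed instrument set: X and WX

# With beta_t at the true beta, every value is close to zero: the objective
# carries almost no information about rho_t.
for rho_t in (-0.5, 0.0, 0.4, 0.8):
    print(f"rho_t = {rho_t:+.1f}   R_n = {R_n(rho_t, beta, y, X, W, Z):.6f}")
```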
³ These model formulations have also been considered by Burridge (1981) and Blommestein (1983) and have been referred to in the spatial literature as the spatial common factor model.
Amemiya (1985: 246) gives conditions under which the nonlinear 2SLS estimator is consistent. In terms of the model presented in this paper, one of Amemiya’s conditions for the consistency of θ̂ = (ρ̂, β̂′)′ is that the matrix

H = plim_{n→∞} n^{-1}Z′[∂f(θ̃)/∂θ̃′]|_{θ̃=θ}   (8)

has full column rank. For purposes of interpretation, if a model were linear in the parameters, then the derivative of the response function with respect to the parameters would be the regressor matrix, say S. In this case, H would then correspond to the probability limit of n^{-1}Z′S.⁴ From (6),

[∂f(θ̃)/∂θ̃′]|_{θ̃=θ} = [W(y − Xβ), (X − ρWX)] = [Wε, (X − ρWX)].   (9)

Note that the expected value of the first column of the n by k+1 matrix in (9) is a vector of zeros. Given this and the maintained assumptions, it is shown in the appendix that the first column of H is also a vector of zeros. It follows that H does not have full column rank. The violation of Amemiya’s rank condition implies that his proof of consistency does not apply to the nonlinear 2SLS estimator corresponding to (4). It also suggests that there may be a fundamental “identification problem” in the sense that the objective function R_n(θ̃) = R_n(ρ̃, β̃) becomes flat in the direction of ρ̃ as n tends toward infinity. That is, it suggests that in the limit the minimum of R_n(ρ̃, β̃) is not associated with a unique value of ρ̃. That this is indeed the case for β̃ = β is now demonstrated.
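The rank failure can be illustrated numerically (a sketch under the assumed design used above; it is not the paper’s appendix argument). The first column of n^{-1}Z′[Wε, X − ρWX] is n^{-1}Z′Wε; its entries shrink toward zero as n grows, while the remaining columns settle near nonzero values.

```python
import numpy as np

def H_hat(n, rho=0.4, seed=3):
    """n^{-1} Z' [W*eps, X - rho*W*X] for an illustrative design; the first
    column estimates the first column of H in (8)-(9)."""
    rng = np.random.default_rng(seed)
    W = 0.5 * (np.roll(np.eye(n), 1, axis=1) + np.roll(np.eye(n), -1, axis=1))
    X = np.column_stack([np.ones(n), rng.normal(size=n)])
    eps = np.linalg.solve(np.eye(n) - rho * W, rng.normal(size=n))
    Z = np.column_stack([X, W @ X[:, 1:]])               # assumed instruments: X and WX
    D = np.column_stack([W @ eps, X - rho * (W @ X)])    # derivative matrix in (9)
    return Z.T @ D / n

for n in (200, 800, 3200):
    H = H_hat(n)
    print(f"n={n:5d}  max |first column| = {np.abs(H[:, 0]).max():.4f}  "
          f"max |other columns| = {np.abs(H[:, 1:]).max():.4f}")
```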
The nonlinear 2SLS estimator can be viewed as a special case of an M-estimator. A basic condition maintained in the general literature on M-estimators is that the parameters be identifiably unique (see, e.g., Gallant and White 1988; Pötscher and Prucha 1991, 1997). For the problem at hand, this translates into the requirement that the limiting objective function R̄(ρ̃, β̃) = plim_{n→∞} R_n(ρ̃, β̃) has a unique minimum at the true parameter value, i.e., R̄(ρ̃, β̃) > R̄(ρ, β) for all (ρ̃, β̃) ≠ (ρ, β). Now observe that, for any given value of ρ̃,
4
In somewhat more detail, consider for a moment the classical case of a linear model, say
with , where S is the regressor matrix. In this case, the minimizer of (7),
i.e., the 2SLS estimator, can be expressed explicitly (in terms of the usual formula) as
. Furthermore, observe that in this case, .
Thus, in the linear case, Amemiya’s condition reduces to the standard requirement that
has full column rank.
θ
ˆ
ρ
ˆ
β
ˆ
,()
=
H p n
1
Z
f
θ()
θ
------------
n
lim
θθ
=
=
n
1
Z
S
yf
θ
()
u+=
f
θ
()
S
θ
=
θ
ˆ
SZZZ()
1
ZS[]
1
SZZZ()
1
Z
y=
f
θ()θ S
=
plim
n
n
1
ZS
f θ()
θ
------------
θθ=
WyXβ()X ρWX(),[]=
W
ε X ρ
WX
(),[]
.=
R
n
θ()R
n
ρβ,()
=
ρ
R
n
ρβ,()
ρ
ββ
=
R ρβ,()p R
n
ρβ,()
n
lim=
R ρβ,()R ρβ,()>
ρβ,()ρβ,()
ρ
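For completeness, the closed-form linear 2SLS estimator described in footnote 4 can be coded directly; the check at the end (on made-up data) uses the fact that with Z = S the formula reduces to ordinary least squares.

```python
import numpy as np

def linear_2sls(S, Z, y):
    """theta_hat = [S'Z(Z'Z)^{-1}Z'S]^{-1} S'Z(Z'Z)^{-1}Z'y  (footnote 4)."""
    PZ_S = Z @ np.linalg.solve(Z.T @ Z, Z.T @ S)      # P_Z S with P_Z = Z(Z'Z)^{-1}Z'
    return np.linalg.solve(S.T @ PZ_S, PZ_S.T @ y)

# Illustrative check: with Z = S the 2SLS formula collapses to OLS.
rng = np.random.default_rng(4)
n = 500
S = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = S @ np.array([1.0, 0.5, -0.3]) + rng.normal(size=n)
ols = np.linalg.lstsq(S, y, rcond=None)[0]
print(np.allclose(linear_2sls(S, S, y), ols))         # True
```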

REFERENCES

Anselin, L. 1988. Spatial Econometrics: Methods and Models.
Billingsley, P. Probability and Measure.
Serfling, R. J. Approximation Theorems of Mathematical Statistics.
Judge, G. G., W. E. Griffiths, R. C. Hill, H. Lütkepohl, and T.-C. Lee. 1985. The Theory and Practice of Econometrics.
Whittle, P. 1954. On stationary processes in the plane.