Consistent Estimation of Models Defined by Conditional Moment Restrictions

Manuel A. Domínguez, +1 more
- 01 Sep 2004 - 
- Vol. 72, Iss: 5, pp 1601-1615
TLDR
In this article, a simple and consistent estimation procedure for conditional moment restrictions is proposed, which is directly based on the definition of the conditional moments and does not require the selection of any user-chosen number.
Abstract
In econometrics, models stated as conditional moment restrictions are typically estimated by means of the generalized method of moments (GMM). The GMM estimation procedure can render inconsistent estimates since the number of arbitrarily chosen instruments is finite. In fact, consistency of the GMM estimators relies on additional assumptions that imply unclear restrictions on the data generating process. This article introduces a new, simple and consistent estimation procedure for these models that is directly based on the definition of the conditional moments. The main feature of our procedure is its simplicity, since its implementation does not require the selection of any user-chosen number, and statistical inference is straightforward since the proposed estimator is asymptotically normal. In addition, we suggest an asymptotically efficient estimator constructed by carrying out one Newton–Raphson step in the direction of the efficient GMM estimator.


Econometrica, Vol. 72, No. 5 (September, 2004), 1601–1615
CONSISTENT ESTIMATION OF MODELS DEFINED BY CONDITIONAL MOMENT RESTRICTIONS
BY MANUEL A. DOMÍNGUEZ AND IGNACIO N. LOBATO¹
KEYWORDS: Generalized method of moments, identification, unconditional moments, marked empirical process, integrated regression function, efficiency bound.
1. INTRODUCTION
IN MANY AREAS OF ECONOMETRICS such as panel data, discrete choice, macroeconomics, and finance, there exist models that are defined in terms of conditional moment restrictions. That is, the models establish that certain parametric functions have zero conditional mean when evaluated at the true parameter value. Note that these conditional restrictions imply that the expectation of the parametric functions evaluated at the true parameter value times any function that depends on the conditioning variables is equal to zero. Therefore, when the conditioning variables have a support with infinite cardinality, the conditional moment restrictions imply an infinite number of unconditional moment restrictions. This fact underlies the generalized method of moments (GMM), which is the method commonly employed to estimate these models. Basically, this method consists of the following two stages. First, choose a finite number of unconditional moment restrictions out of the infinite number implied by the conditional moment restrictions. Second, define the estimator as the parameter value that makes the empirical analogs of the selected unconditional moments closest to 0. In linear models, any subset of linearly independent unconditional moment restrictions identifies globally the parameters of interest as long as the dimension of this subset equals the dimension of the parameter vector. Hence, the GMM procedure provides consistent estimators in linear models. However, in nonlinear models the selected unconditional moment restrictions may hold for several parameter values even if the conditional restrictions just hold for a single value. This means that the GMM objective function may have several global minima. In these cases the arbitrarily chosen unconditional moment restrictions do not identify globally the parameters of interest, and hence, the GMM estimators are inconsistent. The next two examples illustrate this idea.

¹ We thank a co-editor and two referees for very useful suggestions and comments and M. Arellano, M. Carrasco, S. Donald, Y. Kitamura, H. Koul, W. Newey, and W. Stute for useful conversations. We also thank participants at the 2003 Joint Statistical Meeting and at the 2003 Latin American Meeting of the Econometric Society for interesting comments. Both authors acknowledge financial support from Asociación Mexicana de Cultura and from the Mexican Consejo Nacional de Ciencia y Tecnología (CONACYT) under Project Grant J38276D (Domínguez) and Project Grant 41893-S (Lobato).
EXAMPLE 1: Assume that the random variable $Y$ satisfies $E(Y \mid X) = X^{\theta_0}$ where $\theta_0 = 4$, and $X$ is a random variable that is symmetric around zero and whose fourth and sixth moments are equal, such as an $N(0, 1/5)$. Assume that the researcher specifies correctly the model $E(Y \mid X) = X^{\theta}$ where $\theta \in \Theta = [2, \infty)$, and sets out to estimate $\theta_0$. The model implies that $E[(Y - X^{\theta_0}) g(X)] = 0$ for any function $g$ provided that $E|(Y - X^{\theta_0}) g(X)| < \infty$. Since there is only one parameter, the researcher needs to select at least one function $g(X)$. Let us assume that she selects the functions (typically called instruments) $1$ and $X$. The problem is that these two instruments do not identify the parameter value $\theta_0 = 4$ since the system of equations $E(Y - X^{\theta}) = E((Y - X^{\theta}) X) = 0$ also holds for the value $\theta = 6$ at least. Of course, more arbitrary instruments could be added, but it would be simple to find a particular distribution for $X$, such that $\theta_0$ and additional values for $\theta$ satisfy the new set of orthogonality conditions.
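The moment arithmetic behind Example 1 can be checked exactly. The following is a minimal sketch, not part of the original paper: the helper name `normal_even_moment` is hypothetical, and it relies on the standard formula $E[X^{2k}] = \sigma^{2k}(2k-1)!!$ for a centered normal variable.

```python
from fractions import Fraction

def normal_even_moment(k, variance):
    """E[X^(2k)] for X ~ N(0, variance), i.e. variance^k * (2k - 1)!!."""
    double_factorial = 1
    for odd in range(1, 2 * k, 2):
        double_factorial *= odd
    return variance**k * double_factorial

variance = Fraction(1, 5)
m4 = normal_even_moment(2, variance)   # E[X^4]
m6 = normal_even_moment(3, variance)   # E[X^6]

# With theta_0 = 4 we have E(Y) = E[X^4] and E(YX) = E[X^5].
# Odd moments of a distribution symmetric around zero vanish, so E[X^5] = E[X^7] = 0.
moment_with_instrument_1 = m4 - m6                        # E(Y - X^6)
moment_with_instrument_x = Fraction(0) - Fraction(0)      # E[(Y - X^6) X] = E[X^5] - E[X^7]

print(m4, m6, moment_with_instrument_1, moment_with_instrument_x)
```

Both unconditional moments are zero at $\theta = 6$ as well as at $\theta_0 = 4$, which is why the instruments $1$ and $X$ fail to identify $\theta_0$ under this distribution.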
EXAMPLE 2: Assume that the random variable $Y$ satisfies the simple nonlinear model $E(Y \mid X) = \theta_0^2 X + \theta_0 X^2$. Suppose that $\theta_0 = 5/4$ and that $V(Y \mid X)$ is constant. Assume that the researcher properly specifies the model and, instead of an arbitrary instrument, she chooses the optimal instrument, given by $W_0 = 2\theta_0 X + X^2$; see Amemiya (1974) and Chamberlain (1987). In this case, the parameter $\theta_0$ is not identified, since the equation $E[(Y - \theta^2 X - \theta X^2) W_0] = 0$ is also satisfied for $\theta = -5/4$ when $X$ follows an $N(-1, 1)$ random variable. Moreover, $W_0$ is an unfeasible instrument because $\theta_0$ is unknown. Hence, in practice the researcher just knows the form of the optimal instrument, given by $W = 2\theta X + X^2$. In this case the parameter $\theta_0$ is not identified again, since the equation $E[(Y - \theta^2 X - \theta X^2) W] = 0$ is also satisfied for $\theta = -5/4$ and for $\theta = -3$ when $X$ follows an $N(1, 1)$ random variable.
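The feasible-instrument case of Example 2 can be verified with exact arithmetic. This is an illustrative sketch rather than anything from the paper: the function names are hypothetical. By iterated expectations the moment equals $(\theta_0^2 - \theta^2)(2\theta E X^2 + E X^3) + (\theta_0 - \theta)(2\theta E X^3 + E X^4)$, and the raw moments of a normal variable follow the recursion $E X^j = \mu E X^{j-1} + (j-1)\sigma^2 E X^{j-2}$.

```python
from fractions import Fraction

def normal_raw_moments(mean, variance, max_order):
    """Raw moments E[X^j], j = 0..max_order, of N(mean, variance),
    via E[X^j] = mean * E[X^(j-1)] + (j - 1) * variance * E[X^(j-2)]."""
    m = [Fraction(1), Fraction(mean)]
    for j in range(2, max_order + 1):
        m.append(mean * m[j - 1] + (j - 1) * variance * m[j - 2])
    return m

def feasible_moment(theta, theta0, m):
    """E[(Y - theta^2 X - theta X^2) W] with W = 2 theta X + X^2 and
    E(Y|X) = theta0^2 X + theta0 X^2, expressed through m[j] = E[X^j]."""
    return ((theta0**2 - theta**2) * (2 * theta * m[2] + m[3])
            + (theta0 - theta) * (2 * theta * m[3] + m[4]))

theta0 = Fraction(5, 4)
m = normal_raw_moments(1, 1, 4)   # X ~ N(1, 1) gives E[X^2] = 2, E[X^3] = 4, E[X^4] = 10

for theta in (Fraction(5, 4), Fraction(-5, 4), Fraction(-3), Fraction(0)):
    print(theta, feasible_moment(theta, theta0, m))
```

The moment vanishes at $\theta = 5/4$, $-5/4$, and $-3$ but not at, say, $\theta = 0$, so the single unconditional restriction has three roots in this design.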
These simple examples illustrate that the procedure based on selecting an arbitrary finite number of instruments (even the optimal ones) can lead to inconsistent estimation since it does not guarantee that the parameters of interest are globally identified. Hence, GMM typically introduces the additional assumption that the selected unconditional restrictions identify globally the parameters of interest. As we have seen in the examples, this additional assumption depends on the selected instruments and on the unknown true value of the parameters, and in fact, it restricts the marginal distribution of the conditioning variables. Thus, the introduction of this additional assumption leads to the following paradox: while the distribution of the conditioning variables should be irrelevant for the consistent estimation of conditional models, it turns out that this distribution is crucial for GMM estimators because it guarantees global identification of the parameters of interest.

In this article we propose an alternative estimation procedure where the identification problem does not arise, since the method is directly based on the conditional moment restrictions that define the parameters of interest. Implementing our procedure is very simple since no additional user-chosen objects (such as a smoothing number) are needed. As far as we know, ours is the first estimator proposed in the literature that is consistent and does not require the introduction of additional user-chosen objects. Carrying out statistical inference with our estimator is very simple since its asymptotic distribution is normal. In addition, by carrying out a single Newton–Raphson step in the direction of the efficient GMM estimator, an asymptotically efficient estimator can be constructed.

The paper is organized as follows. Section 2 introduces the framework and our estimator, Section 3 establishes the asymptotic theory, Section 4 considers efficient estimation, Section 5 examines a brief Monte Carlo exercise, and Section 6 concludes. The proofs are contained in the Appendix.
2. NOTATION AND FRAMEWORK
Let $Z_t$ be a time series vector and for all $t$ let $\{Y_t, X_t\}$ be two subvectors of $Z_t$ (that could have common coordinates). We consider $Y_t$ as a $k$-dimensional time series vector that may contain endogenous and exogenous variables and a finite number of these variables lagged and $X_t$ as a $d$-dimensional time series vector that contains the exogenous variables (again, a finite number of these variables lagged can be included). The coordinates of $Z_t$ are related by an econometric model that establishes that the true distribution of the data satisfies the following conditional moment restrictions:

$$E[h(Y_t, \theta_0) \mid X_t] = 0 \quad \text{a.s.} \tag{1}$$

for a unique value $\theta_0 \in \Theta$ where $\Theta \subset \mathbb{R}^m$. Equation (1) defines the parameter value of interest $\theta_0$, which is unknown to the econometrician. The function $h$ that maps $\mathbb{R}^k \times \Theta$ into $\mathbb{R}^l$ is supposed to be known. In general, $h(Y_t, \theta_0)$ can be understood as the errors in a multivariate nonlinear dynamic regression model. In this paper for simplicity we will consider the case where $l = 1$.
This model has been repeatedly considered in the econometrics literature and several estimators have been proposed; see among others, Amemiya (1974, 1977), Hansen (1982), Newey (1990, 1993), and Robinson (1987, 1991). However, none of these references address the identification problem discussed above. For instance, Newey (1990) considers a similar model (see his equation (2.1) on p. 810) in a more restrictive framework (he considers independent and identically distributed data with homoskedasticity) and focuses on the optimality properties of a selected estimator. Note that he assumes that the parameter vector is globally identified by the selected unconditional moment restrictions; see his assumption 3.3(a) on p. 817.

Recently, Donald, Imbens, and Newey (2003) have addressed the identification problem in a different setting. They consider efficient estimation of conditional moment restrictions models. Their analysis is different from ours. They need to introduce a sequence of approximating functions such as splines or power or Fourier series and the researcher needs to select the number of terms of these series to be considered in the analysis. This number is a smoothing or bandwidth number that compared to the sample size has to verify certain rate restrictions in order to achieve efficient estimation. Although this bandwidth number allows their estimators to be root-n asymptotically normal and efficient, statistical inference with this estimator can be sensitive to the selection of the bandwidth number. Furthermore, their procedure is restricted to the independent and identically distributed setting, and for most of their results, the conditioning variables should have a compact support and their joint density has to be bounded away from zero. In the same spirit as Donald, Imbens, and Newey (2003), Newey and Powell (2003) have provided consistent estimators of semiparametric models defined by conditional moment restrictions. Contrary to this approach, our procedure is very simple, does not require the introduction of an arbitrary user-chosen number to achieve an asymptotically normal distribution, allows for instruments with unbounded support, and can be used for time series data.

Kitamura, Tripathi, and Ahn (2000) have also analyzed the problem of efficient estimation in conditional moment restrictions models. By employing a localized empirical likelihood, they propose an estimator that also achieves the semiparametric efficiency bound without estimating the optimal instrument. Similarly to Donald, Imbens, and Newey (2003), Kitamura, Tripathi, and Ahn (2000) also need to introduce a bandwidth number and restrict to the independent and identically distributed setting.

Another related reference is Carrasco and Florens (2000). They consider optimal GMM estimation for the case where there is a continuum of moment conditions in an independent and identically distributed framework. Our estimator is similar to theirs in spirit. However, our estimator cannot be written in their framework, as we will see below, because our norm in the objective function is random and changes with the sample size, whereas their norm is deterministic and does not change with the sample size. Carrasco and Florens' estimator is efficient, but efficiency is achieved at the cost of introducing a user-chosen smoothing number that permits inversion of the covariance operator. As in the case of Donald, Imbens, and Newey (2003) the sensitivity of the estimator to that number is unknown.
Next, we introduce our estimator. As discussed in the previous section, the typical estimation procedure based on selecting some orthogonality conditions does not guarantee global identification of the parameters of interest. In this paper we propose an alternative estimation procedure that uses the whole information about $\theta_0$ contained in expression (1). From Billingsley (1995, Theorem 16.10iii), note that

$$E[h(Y_t, \theta_0) \mid X_t] = 0 \ \text{a.s.} \iff H(\theta_0, x) = 0 \ \text{for almost all } x \in \mathbb{R}^d, \tag{2}$$

where $H(\theta, x) = E(h(Y_t, \theta) I(X_t \le x))$ is the integrated regression function (Brunk (1970)) and the indicator function $I(X_t \le x)$ equals 1 when each component in $X_t$ is less than or equal to the corresponding component in $x$, and equals 0 otherwise. In addition, from (1), it follows that $P(E(h(Y_t, \theta) \mid X_t) = 0) < 1$ when $\theta \ne \theta_0$, so that $H(\theta, x) \ne 0$ in a nonnull set of the sample space of $X_t$. Therefore, denoting by $P_{X_t}$ the probability distribution function of the random vector $X_t$, $\int H(\theta_0, x)^2 \, dP_{X_t}(x) = 0$ but $\int H(\theta, x)^2 \, dP_{X_t}(x) > 0$ $\forall \theta \ne \theta_0$. Hence, we can write

$$\theta_0 = \arg\min_{\theta \in \Theta} \int H(\theta, x)^2 \, dP_{X_t}(x) \tag{3}$$

and $\theta_0$ is the unique value that satisfies (3). Denote the sample integrated regression function by $H_n(\theta, x) = n^{-1} \sum_{t=1}^{n} h(Y_t, \theta) I(X_t \le x)$ where $n$ is the sample size. For any $g$ the sample analog of $\int g^2(x) \, dP_{X_t}(x)$ is $n^{-1} \sum_{\ell=1}^{n} g^2(X_\ell)$. Then, we propose estimating $\theta_0$ by the sample analog of (3), that is,

$$\hat\theta = \arg\min_{\theta \in \Theta} \frac{1}{n^3} \sum_{\ell=1}^{n} \left( \sum_{t=1}^{n} h(Y_t, \theta) I(X_t \le X_\ell) \right)^2.$$
This estimator is a minimum distance estimator; see Ch. 5 in Koul (2002). From a computational point of view, the previous objective function has an additional summation of $n$ terms compared to the standard GMM objective function. However, it does not involve either matrix inversion or nonparametric estimation, which are computationally more demanding procedures.
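To make the computation concrete, here is a minimal sketch of the sample objective $n^{-3} \sum_{\ell} \big( \sum_t h(Y_t, \theta) I(X_t \le X_\ell) \big)^2$ and a crude minimization. Everything beyond the objective itself is an assumption made for illustration: the function names, the simulated design (Example 2 with unit-variance normal errors), and the grid search over $[-4, 4]$ standing in for a proper optimizer over $\Theta$.

```python
import numpy as np

def indicator_matrix(x):
    """ind[t, l] = 1 if every component of x[t] is <= the same component of x[l]."""
    x = np.asarray(x, dtype=float).reshape(len(x), -1)       # shape (n, d)
    return np.all(x[:, None, :] <= x[None, :, :], axis=2).astype(float)

def objective(residuals, ind):
    """(1 / n^3) * sum_l ( sum_t residuals[t] * ind[t, l] )^2."""
    n = len(residuals)
    partial_sums = residuals @ ind                            # one partial sum per X_l
    return np.sum(partial_sums**2) / n**3

# Simulated data following Example 2 (illustrative design, not from the paper).
rng = np.random.default_rng(0)
n, theta0 = 400, 1.25
x = rng.normal(1.0, 1.0, size=n)
y = theta0**2 * x + theta0 * x**2 + rng.normal(size=n)

def h(theta):
    """Residual function h(Y_t, theta) for the model E(Y|X) = theta^2 X + theta X^2."""
    return y - theta**2 * x - theta * x**2

ind = indicator_matrix(x)                                     # computed once, reused for every theta
grid = np.linspace(-4.0, 4.0, 801)                            # crude search over the parameter space
values = np.array([objective(h(theta), ind) for theta in grid])
theta_hat = grid[np.argmin(values)]
print(theta_hat)
```

The indicator matrix depends only on the conditioning variables, so it is built once and the cost per evaluation is one matrix-vector product. In this design the grid contains $-5/4$ and $-3$, the spurious roots of the feasible optimal-instrument moment, yet the objective is bounded away from zero there in large samples.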
3. ASYMPTOTIC THEORY
We start by enumerating the assumptions for the consistency of our estimator. Let $|\cdot|$ denote the Euclidean norm in the corresponding Euclidean space, and assume that all the considered functions are Borel measurable.

ASSUMPTION 1: $h(y, \cdot)$ is continuous in $\Theta$ for each $y$ in $\mathbb{R}^k$, $|h(Y_t, \theta)| < k(Y_t)$ with $E k(Y_t) < \infty$, and $E(h(Y_t, \theta) \mid X_t) = 0$ a.s. if and only if $\theta = \theta_0$.

ASSUMPTION 2: $Z_t$ is ergodic and strictly stationary.

ASSUMPTION 3: $\Theta \subset \mathbb{R}^m$ is compact.

Assumptions 1–3 are standard in the GMM literature. Assumption 1 defines the model and identifies globally $\theta_0$. It also establishes that the function $h$ is smooth in $\Theta$, but this smoothness condition is weaker than the Lipschitz condition in Assumption 3 in Donald, Imbens, and Newey (2003). Notice that the assumptions concerning the existence of a bounding function $k$ and the compactness of $\Theta$ can be replaced by other assumptions imposing that for all $\theta \in \Theta$ there exists $\rho_\theta > 0$ such that $E[\sup_{\{\theta' : |\theta' - \theta| < \rho_\theta\} \cap \Theta} |h(Y_t, \theta) - h(Y_t, \theta')|] < \infty$ and that $\lim_{|\theta| \to \infty} E|h(Y_t, \theta) - h(Y_t, \theta_0)| > 0$. This first condition is a smoothness assumption that is still weaker than the condition in Donald, Imbens, and Newey (2003), whereas the second condition rules out redescending functions. In contrast to standard GMM, all our assumptions refer to the unconditional or to the conditional distribution of $h$, and nothing is imposed on the marginal distribution of $X_t$, except for Assumption 2, which just restricts dependence and heterogeneity of the data. Next, we state the consistency theorem whose proof is in the Appendix.

THEOREM 1: Under Assumptions 1–3, $\hat\theta \to_{a.s.} \theta_0$.

In order to obtain asymptotic normality, some additional assumptions are required.

ASSUMPTION 4: $h(y, \cdot)$ is once continuously differentiable in a neighborhood of $\theta_0$ and satisfies $E[\sup_{\theta \in \aleph_0} |\dot{h}(Y_t, \theta)|] < \infty$, where $\aleph_0$ denotes a neighborhood of $\theta_0$ and $\dot{h}(Y_t, \theta) = \partial h(Y_t, \theta) / \partial \theta$.

ASSUMPTION 5: $h(Y_t, \theta_0)$ is a martingale difference sequence with respect to $\{Z_s, s \le t\}$.

ASSUMPTION 6: $\theta_0 \in \operatorname{int}(\Theta)$.

ASSUMPTION 7: $E[h^4(Y_t, \theta_0) \, |X_t|^{1+\delta}] < \infty$

References
Journal ArticleDOI

Time series analysis

James D. Hamilton
- 01 Feb 1997 - 
TL;DR: A ordered sequence of events or observations having a time component is called as a time series, and some good examples are daily opening and closing stock prices, daily humidity, temperature, pressure, annual gross domestic product of a country and so on.
Book ChapterDOI

Time Series Analysis

TL;DR: This paper provides a concise overview of time series analysis in the time and frequency domains with lots of references for further reading.
Book

Probability and Measure

TL;DR: In this paper, the convergence of distributions is considered in the context of conditional probability, i.e., random variables and expected values, and the probability of a given distribution converging to a certain value.
Book

Approximation Theorems of Mathematical Statistics

TL;DR: In this paper, the basic sample statistics are used for Parametric Inference, and the Asymptotic Theory in Parametric Induction (ATIP) is used to estimate the relative efficiency of given statistics.
Frequently Asked Questions (13)
Q1. What have the authors contributed in "Consistent estimation of models defined by conditional moment restrictions by manuel" ?

This article introduces a new, simple and consistent estimation procedure for these models that is directly based on the definition of the conditional moments. In addition, the authors suggest an asymptotically efficient estimator constructed by carrying out one Newton–Raphson step in the direction of the efficient GMM estimator. 

The authors finish with a suggestion on further research. 

The main advantage of introducing this smoothing number is that it allows the derivation of estimators that are asymptotically efficient. 


For the $N(0, 1)$ case, for $n = 50$, both $\hat\theta$ and $\hat\theta_E$ perform better than $\tilde\theta$, although for $n = 200$ the RMSE of $\hat\theta$ is larger than that of $\tilde\theta$ and $\hat\theta_E$. In Table II the authors report the coverage percentages for 90%, 95%, and 99% confidence intervals for the three estimators.

For the $N(1, 1)$ case, the coverage probabilities of the efficient GMM estimator present substantial distortions that do not vanish by increasing the sample size.
