What have the authors stated for future works in "Consistent estimation of models defined by conditional moment restrictions by manuel" ?

The authors finish with a suggestion on further research.

What is the main advantage of introducing this smoothing number?

The main advantage of introducing this smoothing number is that it allows the derivation of estimators that are asymptotically efficient.

What is the simplest way to estimate a time series?

The authors consider $Y_t$ as a $k$-dimensional time series vector that may contain endogenous and exogenous variables and a finite number of these variables lagged, and $X_t$ as a $d$-dimensional time series vector that contains the exogenous variables (again, a finite number of these variables lagged can be included).

What is the definition of conditional moment restrictions?

IN MANY AREAS OF ECONOMETRICS such as panel data, discrete choice, macroeconomics, and finance, there exist models that are defined in terms of conditional moment restrictions.

What is the value of the parameter of interest?

The coordinates of $Z_t$ are related by an econometric model that establishes that the true distribution of the data satisfies the following conditional moment restrictions: $E(h(Y_t, \theta_0) \mid X_t) = 0$ a.s. (1) for a unique value $\theta_0 \in \Theta$, where $\Theta \subset \mathbb{R}^m$. Equation (1) defines the parameter value of interest $\theta_0$, which is unknown to the econometrician.

What is the way to estimate the probability distribution of a random vector?

Denoting by $P_{X_t}$ the probability distribution function of the random vector $X_t$, $\int H(\theta_0, x)^2 \, dP_{X_t}(x) = 0$ but $\int H(\theta, x)^2 \, dP_{X_t}(x) > 0$ for all $\theta \neq \theta_0$.

What is the coverage percentage for the three estimators?

For the $N(0, 1)$ case, for $n = 50$, both $\hat\theta$ and $\hat\theta_E$ perform better than $\tilde\theta$, although for $n = 200$ the RMSE of $\hat\theta$ is larger than that of $\tilde\theta$ and $\hat\theta_E$. In Table II the authors report the coverage percentages for 90%, 95%, and 99% confidence intervals for the three estimators.

What is the coverage probability of the efficient GMM estimator?

For the $N(1, 1)$ case, the coverage probabilities of the efficient GMM estimator present substantial distortions that do not vanish by increasing the sample size.

(Open Access) Consistent Estimation of Models Defined by Conditional Moment Restrictions (2004) | Manuel A. Domínguez

Q: What is the author's financial support for the project?

Both authors acknowledge financial support from Asociación Mexicana de Cultura and from the Mexican Consejo Nacional de Ciencia y Tecnología (CONACYT) under Project Grant J38276D (Domínguez) and Project Grant 41893-S (Lobato).

Econometrica, Vol. 72, No. 5 (September, 2004), 1601–1615

CONSISTENT ESTIMATION OF MODELS DEFINED BY CONDITIONAL MOMENT RESTRICTIONS

BY MANUEL A. DOMÍNGUEZ AND IGNACIO N. LOBATO

In econometrics, models stated as conditional moment restrictions are typically estimated by means of the generalized method of moments (GMM). The GMM estimation procedure can render inconsistent estimates since the number of arbitrarily chosen instruments is finite. In fact, consistency of the GMM estimators relies on additional assumptions that imply unclear restrictions on the data generating process. This article introduces a new, simple and consistent estimation procedure for these models that is directly based on the definition of the conditional moments. The main feature of our procedure is its simplicity, since its implementation does not require the selection of any user-chosen number, and statistical inference is straightforward since the proposed estimator is asymptotically normal. In addition, we suggest an asymptotically efficient estimator constructed by carrying out one Newton–Raphson step in the direction of the efficient GMM estimator.

KEYWORDS: Generalized method of moments, identification, unconditional moments, marked empirical process, integrated regression function, efficiency bound.

1. INTRODUCTION

IN MANY AREAS OF ECONOMETRICS such as panel data, discrete choice, macroeconomics, and finance, there exist models that are defined in terms of conditional moment restrictions. That is, the models establish that certain parametric functions have zero conditional mean when evaluated at the true parameter value. Note that these conditional restrictions imply that the expectation of the parametric functions evaluated at the true parameter value times any function that depends on the conditioning variables is equal to zero. Therefore, when the conditioning variables have a support with infinite cardinality, the conditional moment restrictions imply an infinite number of unconditional moment restrictions. This fact underlies the generalized method of moments (GMM), which is the method commonly employed to estimate these models. Basically, this method consists of the following two stages. First, choose a finite number of unconditional moment restrictions out of the infinite number implied by the conditional moment restrictions. Second, define the estimator as the parameter value that makes the empirical analogs of the selected unconditional moments closest to 0. In linear models, any subset of linearly independent unconditional moment restrictions identifies globally the parameters of interest as long as the dimension of this subset equals the dimension of the parameter vector. Hence, the GMM procedure provides consistent estimators in linear models. However, in nonlinear models the selected unconditional moment restrictions may hold for several parameter values even if the conditional restrictions just hold for a single value. This means that the GMM objective function may have several global minima. In these cases the arbitrarily chosen unconditional moment restrictions do not identify globally the parameters of interest, and hence, the GMM estimators are inconsistent. The next two examples illustrate this idea.

We thank a co-editor and two referees for very useful suggestions and comments and M. Arellano, M. Carrasco, S. Donald, Y. Kitamura, H. Koul, W. Newey, and W. Stute for useful conversations. We also thank participants at the 2003 Joint Statistical Meeting and at the 2003 Latin American Meeting of the Econometric Society for interesting comments. Both authors acknowledge financial support from Asociación Mexicana de Cultura and from the Mexican Consejo Nacional de Ciencia y Tecnología (CONACYT) under Project Grant J38276D (Domínguez) and Project Grant 41893-S (Lobato).

EXAMPLE 1: Assume that the random variable $Y$ satisfies $E(Y \mid X) = X^{\theta_0}$, where $\theta_0 = 4$, and $X$ is a random variable that is symmetric around zero and whose fourth and sixth moments are equal, such as an $N(0, 1/5)$. Assume that the researcher specifies correctly the model $E(Y \mid X) = X^{\theta}$, where $\theta \in \Theta = [2, \infty)$, and sets out to estimate $\theta_0$. The model implies that $E[(Y - X^{\theta_0}) g(X)] = 0$ for any function $g$ provided that $E|(Y - X^{\theta_0}) g(X)| < \infty$. Since there is only one parameter, the researcher needs to select at least one function $g(X)$. Let us assume that she selects the functions (typically called instruments) $1$ and $X$. The problem is that these two instruments do not identify the parameter value $\theta_0 = 4$ since the system of equations $E(Y - X^{\theta}) = E((Y - X^{\theta}) X) = 0$ also holds for the value $\theta = 6$ at least. Of course, more arbitrary instruments could be added, but it would be simple to find a particular distribution for $X$, such that $\theta_0$ and additional values for $\theta$ satisfy the new set of orthogonality conditions.
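The identification failure in Example 1 can be checked with exact arithmetic. The following is a minimal sketch, not part of the original article: the helper names `normal_moment` and `moment_conditions` are hypothetical, the check is restricted to integer values of $\theta$, and it relies on the standard formula $E[X^{2k}] = \sigma^{2k}(2k-1)!!$ for a centered normal variable together with the law of iterated expectations.

```python
from fractions import Fraction

def normal_moment(order, variance):
    """Exact E[X**order] for X ~ N(0, variance), order a nonnegative integer."""
    if order % 2 == 1:
        return Fraction(0)  # odd moments vanish by symmetry around zero
    double_factorial = 1
    for j in range(order - 1, 0, -2):  # (order - 1)!!
        double_factorial *= j
    return Fraction(variance) ** (order // 2) * double_factorial

def moment_conditions(theta, theta0=4, variance=Fraction(1, 5)):
    """Return (E[Y - X^theta], E[(Y - X^theta) X]) when E(Y|X) = X^theta0.

    By iterated expectations, E[Y g(X)] = E[X^theta0 g(X)], so only moments
    of X are needed. Integer theta only (an assumption of this sketch).
    """
    m1 = normal_moment(theta0, variance) - normal_moment(theta, variance)
    m2 = normal_moment(theta0 + 1, variance) - normal_moment(theta + 1, variance)
    return m1, m2

if __name__ == "__main__":
    for theta in (2, 4, 6, 8):
        print(theta, moment_conditions(theta))
```

Under these assumptions both conditions vanish at $\theta = 4$ and at $\theta = 6$, because the fourth and sixth moments of an $N(0, 1/5)$ variable both equal $3/25$, while the first condition is nonzero at $\theta = 2$ and $\theta = 8$.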

EXAMPLE 2: Assume that the random variable $Y$ satisfies the simple nonlinear model $E(Y \mid X) = \theta_0^2 X + \theta_0 X^2$. Suppose that $\theta_0 = 5/4$ and that $V(Y \mid X)$ is constant. Assume that the researcher properly specifies the model and, instead of an arbitrary instrument, she chooses the optimal instrument, given by $W_0 = 2\theta_0 X + X^2$; see Amemiya (1974) and Chamberlain (1987). In this case, the parameter $\theta_0$ is not identified, since the equation $E[(Y - \theta^2 X - \theta X^2) W_0] = 0$ is also satisfied for $\theta = -5/4$ when $X$ follows an $N(-1, 1)$ random variable. Moreover, $W_0$ is an unfeasible instrument because $\theta_0$ is unknown. Hence, in practice the researcher just knows the form of the optimal instrument, given by $W = 2\theta X + X^2$. In this case the parameter $\theta_0$ is not identified again, since the equation $E[(Y - \theta^2 X - \theta X^2) W] = 0$ is also satisfied for $\theta = -5/4$ and for $\theta = -3$ when $X$ follows an $N(1, 1)$ random variable.
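The spurious roots in Example 2 can also be verified exactly. This is a sketch under stated assumptions rather than anything taken from the article: the function names are hypothetical, and the expectation is expanded by hand using $E(Y \mid X) = \theta_0^2 X + \theta_0 X^2$ and the raw moments of an $N(\mu, 1)$ variable, namely $E X^2 = \mu^2 + 1$, $E X^3 = \mu^3 + 3\mu$, and $E X^4 = \mu^4 + 6\mu^2 + 3$.

```python
from fractions import Fraction

def normal_raw_moments(mu):
    """Raw moments (E[X^2], E[X^3], E[X^4]) for X ~ N(mu, 1)."""
    mu = Fraction(mu)
    return mu**2 + 1, mu**3 + 3 * mu, mu**4 + 6 * mu**2 + 3

def instrument_moment(theta, mu, theta0=Fraction(5, 4), feasible=True):
    """E[(Y - theta^2 X - theta X^2)(a X + X^2)] for X ~ N(mu, 1).

    a = 2*theta for the feasible instrument W, a = 2*theta0 for the
    infeasible optimal instrument W_0. Y is replaced by its conditional
    mean theta0^2 X + theta0 X^2 via iterated expectations.
    """
    theta = Fraction(theta)
    m2, m3, m4 = normal_raw_moments(mu)
    a = 2 * theta if feasible else 2 * theta0
    return (theta0**2 - theta**2) * (a * m2 + m3) + (theta0 - theta) * (a * m3 + m4)

if __name__ == "__main__":
    print(instrument_moment(Fraction(-5, 4), mu=-1, feasible=False))
    for theta in (Fraction(5, 4), Fraction(-5, 4), -3, 1):
        print(theta, instrument_moment(theta, mu=1))
```

With $X \sim N(1, 1)$ and the feasible instrument, the moment factors as $(\theta_0 - \theta)(4\theta + 5)(\theta + 3)$, which makes the three roots $5/4$, $-5/4$, and $-3$ visible directly.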

These simple examples illustrate that the procedure based on selecting an arbitrary finite number of instruments (even the optimal ones) can lead to inconsistent estimation since it does not guarantee that the parameters of interest are globally identified. Hence, GMM typically introduces the additional assumption that the selected unconditional restrictions identify globally the parameters of interest. As we have seen in the examples, this additional assumption depends on the selected instruments and on the unknown true value of the parameters, and in fact, it restricts the marginal distribution of the conditioning variables. Thus, the introduction of this additional assumption leads to the following paradox: while the distribution of the conditioning variables should be irrelevant for the consistent estimation of conditional models, it turns out that this distribution is crucial for GMM estimators because it guarantees global identification of the parameters of interest.

In this article we propose an alternative estimation procedure where the identification problem does not arise, since the method is directly based on the conditional moment restrictions that define the parameters of interest. Implementing our procedure is very simple since no additional user-chosen objects (such as a smoothing number) are needed. As far as we know, ours is the first estimator proposed in the literature that is consistent and does not require the introduction of additional user-chosen objects. Carrying out statistical inference with our estimator is very simple since its asymptotic distribution is normal. In addition, by carrying out a single Newton–Raphson step in the direction of the efficient GMM estimator, an asymptotically efficient estimator can be constructed.

The paper is organized as follows. Section 2 introduces the framework and our estimator, Section 3 establishes the asymptotic theory, Section 4 considers efficient estimation, Section 5 examines a brief Monte Carlo exercise, and Section 6 concludes. The proofs are contained in the Appendix.

2. NOTATION AND FRAMEWORK

Let $Z_t$ be a time series vector and for all $t$ let $\{Y_t, X_t\}$ be two subvectors of $Z_t$ (that could have common coordinates). We consider $Y_t$ as a $k$-dimensional time series vector that may contain endogenous and exogenous variables and a finite number of these variables lagged and $X_t$ as a $d$-dimensional time series vector that contains the exogenous variables (again, a finite number of these variables lagged can be included). The coordinates of $Z_t$ are related by an econometric model that establishes that the true distribution of the data satisfies the following conditional moment restrictions:

$$E(h(Y_t, \theta_0) \mid X_t) = 0 \quad \text{a.s.} \tag{1}$$

for a unique value $\theta_0 \in \Theta$, where $\Theta \subset \mathbb{R}^m$. Equation (1) defines the parameter value of interest $\theta_0$, which is unknown to the econometrician. The function $h$ that maps $\mathbb{R}^k \times \Theta$ into $\mathbb{R}^l$ is supposed to be known. In general, $h(Y_t, \theta_0)$ can be understood as the errors in a multivariate nonlinear dynamic regression model. In this paper for simplicity we will consider the case where $l = 1$.

This model has been repeatedly considered in the econometrics literature and several estimators have been proposed; see among others, Amemiya (1974, 1977), Hansen (1982), Newey (1990, 1993), and Robinson (1987, 1991). However, none of these references address the identification problem commented above. For instance, Newey (1990) considers a similar model (see his equation (2.1) on p. 810) in a more restrictive framework (he considers independent and identically distributed data with homoskedasticity) and focuses on the optimality properties of a selected estimator. Note that he assumes that the parameter vector is globally identified by the selected unconditional moment restrictions; see his assumption 3.3(a) on p. 817.

Recently, Donald, Imbens, and Newey (2003) have addressed the identification problem in a different setting. They consider efficient estimation of conditional moment restrictions models. Their analysis is different from ours. They need to introduce a sequence of approximating functions such as splines or power or Fourier series and the researcher needs to select the number of terms of these series to be considered in the analysis. This number is a smoothing or bandwidth number that compared to the sample size has to verify certain rate restrictions in order to achieve efficient estimation. Although this bandwidth number allows their estimators to be root-n asymptotically normal and efficient, statistical inference with this estimator can be sensitive to the selection of the bandwidth number. Furthermore, their procedure is restricted to the independent and identically distributed setting, and for most of their results, the conditioning variables should have a compact support and their joint density has to be bounded away from zero. In the same spirit as Donald, Imbens, and Newey (2003), Newey and Powell (2003) have provided consistent estimators of semiparametric models defined by conditional moment restrictions. Contrary to this approach, our procedure is very simple, does not require the introduction of an arbitrary user-chosen number to achieve an asymptotically normal distribution, allows for instruments with unbounded support, and can be used for time series data.

Kitamura, Tripathi, and Ahn (2000) have also analyzed the problem of efficient estimation in conditional moment restrictions models. By employing a localized empirical likelihood, they propose an estimator that also achieves the semiparametric efficiency bound without estimating the optimal instrument. Similarly to Donald, Imbens, and Newey (2003), Kitamura, Tripathi, and Ahn (2000) also need to introduce a bandwidth number and restrict to the independent and identically distributed setting.

Another related reference is Carrasco and Florens (2000). They consider optimal GMM estimation for the case where there is a continuum of moment conditions in an independent and identically distributed framework. Our estimator is similar to theirs in spirit. However, our estimator cannot be written in their framework, as we will see below, because our norm in the objective function is random and changes with the sample size, whereas their norm is deterministic and does not change with the sample size. Carrasco and Florens' estimator is efficient, but efficiency is achieved at the cost of introducing a user-chosen smoothing number that permits inversion of the covariance operator. As in the case of Donald, Imbens, and Newey (2003) the sensitivity of the estimator to that number is unknown.

Next, we introduce our estimator. As discussed in the previous section, the typical estimation procedure based on selecting some orthogonality conditions does not guarantee global identification of the parameters of interest. In this paper we propose an alternative estimation procedure that uses the whole information about $\theta_0$ contained in expression (1). From Billingsley (1995, Theorem 16.10iii), note that

$$E(h(Y_t, \theta_0) \mid X_t) = 0 \ \text{a.s.} \iff H(\theta_0, x) = 0 \ \text{for almost all } x \in \mathbb{R}^d, \tag{2}$$

where $H(\theta, x) = E(h(Y_t, \theta) I(X_t \leq x))$ is the integrated regression function (Brunk (1970)) and the indicator function $I(X_t \leq x)$ equals 1 when each component in $X_t$ is less than or equal to the corresponding component in $x$, and equals 0 otherwise. In addition, from (1), it follows that $P(E(h(Y_t, \theta) \mid X_t) = 0) < 1$ when $\theta \neq \theta_0$, so that $H(\theta, x) \neq 0$ in a nonnull set of the sample space of $X_t$. Therefore, denoting by $P_{X_t}$ the probability distribution function of the random vector $X_t$, $\int H(\theta_0, x)^2 \, dP_{X_t}(x) = 0$ but $\int H(\theta, x)^2 \, dP_{X_t}(x) > 0$ for all $\theta \neq \theta_0$. Hence, we can write

$$\theta_0 = \arg\min_{\theta \in \Theta} \int H(\theta, x)^2 \, dP_{X_t}(x), \tag{3}$$

and $\theta_0$ is the unique value that satisfies (3). Denote the sample integrated regression function by $H_n(\theta, x) = n^{-1} \sum_{t=1}^{n} h(Y_t, \theta) I(X_t \leq x)$, where $n$ is the sample size. For any $g$ the sample analog of $\int g(x) \, dP_{X_t}(x)$ is $n^{-1} \sum_{\ell=1}^{n} g(X_\ell)$. Then, we propose estimating $\theta_0$ by the sample analog of (3), that is,

$$\hat\theta = \arg\min_{\theta \in \Theta} \frac{1}{n^3} \sum_{\ell=1}^{n} \left( \sum_{t=1}^{n} h(Y_t, \theta) I(X_t \leq X_\ell) \right)^2.$$


This estimator is a minimum distance estimator; see Ch. 5 in Koul (2002). From a computational point of view, the previous objective function has an additional summation of $n$ terms compared to the standard GMM objective function. However, it does not involve either matrix inversion or nonparametric estimation, which are computationally more demanding procedures.
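To make the computation concrete, the sample criterion described above can be sketched in a few lines. This is an illustrative sketch only, not the authors' implementation: it assumes a scalar conditioning variable, uses a plain grid search in place of a numerical optimizer, and the names `dl_objective`, `dl_estimate`, `h_example2`, and `simulate_example2` are hypothetical. The simulated design (unit-variance normal errors, $X_t \sim N(1, 1)$, $\theta_0 = 5/4$) is borrowed from Example 2 purely for illustration.

```python
import random

def dl_objective(theta, ys, xs, h):
    """n^-3 * sum_l ( sum_t h(Y_t, theta) * I(X_t <= X_l) )^2 for scalar X_t."""
    n = len(xs)
    residuals = [h(y, theta) for y in ys]
    total = 0.0
    for x_l in xs:
        # Inner sum equals n times the sample integrated regression function at X_l.
        partial = sum(r for r, x_t in zip(residuals, xs) if x_t <= x_l)
        total += partial ** 2
    return total / n ** 3

def dl_estimate(ys, xs, h, grid):
    """Grid-search minimizer of the criterion (a crude stand-in for an optimizer)."""
    return min(grid, key=lambda theta: dl_objective(theta, ys, xs, h))

def h_example2(y_vec, theta):
    """Residual function of Example 2; y_vec = (y, x) because Y_t may contain X_t."""
    y, x = y_vec
    return y - theta ** 2 * x - theta * x ** 2

def simulate_example2(n, theta0=1.25, seed=0):
    """Draw X ~ N(1, 1) and Y = theta0^2 X + theta0 X^2 + N(0, 1) noise (assumed design)."""
    rng = random.Random(seed)
    xs = [rng.gauss(1.0, 1.0) for _ in range(n)]
    ys = [(theta0 ** 2 * x + theta0 * x ** 2 + rng.gauss(0.0, 1.0), x) for x in xs]
    return ys, xs

if __name__ == "__main__":
    ys, xs = simulate_example2(n=100)
    grid = [-4.0 + 0.05 * i for i in range(161)]
    print(dl_estimate(ys, xs, h_example2, grid))
```

The double loop costs $O(n^2)$ per evaluation, which mirrors the extra summation of $n$ terms mentioned in the text; for scalar $X_t$ a sort plus cumulative sums would reduce this, at the price of handling ties explicitly.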

3. ASYMPTOTIC THEORY

We start by enumerating the assumptions for the consistency of our estimator. Let $|\cdot|$ denote the Euclidean norm in the corresponding Euclidean space, and assume that all the considered functions are Borel measurable.

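ASSUMPTION 1: $h(y, \cdot)$ is continuous in $\Theta$ for each $y$ in $\mathbb{R}^k$, $|h(Y_t, \theta)| < k(Y_t)$ with $E k(Y_t) < \infty$, and $E(h(Y_t, \theta) \mid X_t) = 0$ a.s. if and only if $\theta = \theta_0$.

ASSUMPTION 2: $Z_t$ is ergodic and strictly stationary.

ASSUMPTION 3: $\Theta \subset \mathbb{R}^m$ is compact.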

Assumptions 1–3 are standard in the GMM literature. Assumption 1 defines the model and identifies globally $\theta_0$. It also establishes that the function $h$ is smooth in $\Theta$, but this smoothness condition is weaker than the Lipschitz condition in Assumption 3 in Donald, Imbens, and Newey (2003). Notice that the assumptions concerning the existence of a bounding function $k$ and the compactness of $\Theta$ can be replaced by other assumptions imposing that for all $\theta \in \Theta$ there exists $\rho_\theta > 0$ such that $E[\sup_{\{\theta' : |\theta - \theta'| < \rho_\theta\} \cap \Theta} |h(Y_t, \theta) - h(Y_t, \theta')|] < \infty$ and that $\lim_{|\theta| \to \infty} E|h(Y_t, \theta) - h(Y_t, \theta_0)| > 0$. This first condition is a smoothness assumption that is still weaker than the condition in Donald, Imbens, and Newey (2003), whereas the second condition rules out redescending functions. In contrast to standard GMM, all our assumptions refer to the unconditional or to the conditional distribution of $h$, and nothing is imposed on the marginal distribution of $X_t$, except for Assumption 2, which just restricts dependence and heterogeneity of the data. Next, we state the consistency theorem whose proof is in the Appendix.

THEOREM 1: Under Assumptions 1–3, $\hat\theta \to \theta_0$ a.s.

In order to obtain asymptotic normality, some additional assumptions are required.

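ASSUMPTION 4: $h(y, \cdot)$ is once continuously differentiable in a neighborhood of $\theta_0$ and satisfies $E[\sup_{\theta \in \aleph_0} |\dot h(Y_t, \theta)|] < \infty$, where $\aleph_0$ denotes a neighborhood of $\theta_0$ and $\dot h(Y_t, \theta) = \partial h(Y_t, \theta)/\partial \theta$.

ASSUMPTION 5: $h(Y_t, \theta_0)$ is a martingale difference sequence with respect to $\{Z_s, s \leq t\}$.

ASSUMPTION 6: $\theta_0 \in \operatorname{int}(\Theta)$.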

ASSUMPTION 7: E[h

θ

)X



1+δ

]< ∞
