Author

Victor Chernozhukov

Other affiliations: Amazon.com, New Economic School, Stanford University
Bio: Victor Chernozhukov is an academic researcher from Massachusetts Institute of Technology. The author has contributed to research in topics: Estimator & Quantile. The author has an h-index of 73 and has co-authored 370 publications receiving 20,588 citations. Previous affiliations of Victor Chernozhukov include Amazon.com & New Economic School.


Papers
ReportDOI
TL;DR: In this article, the authors show that the impact of regularization bias and overfitting on estimation of the parameter of interest θ0 can be removed by using two simple, yet critical, ingredients: (1) using Neyman-orthogonal moments/scores that have reduced sensitivity with respect to nuisance parameters, and (2) making use of cross-fitting, which provides an efficient form of data-splitting.
Abstract: Summary We revisit the classic semi-parametric problem of inference on a low-dimensional parameter θ0 in the presence of high-dimensional nuisance parameters η0. We depart from the classical setting by allowing for η0 to be so high-dimensional that the traditional assumptions (e.g. Donsker properties) that limit complexity of the parameter space for this object break down. To estimate η0, we consider the use of statistical or machine learning (ML) methods, which are particularly well suited to estimation in modern, very high-dimensional cases. ML methods perform well by employing regularization to reduce variance and trading off regularization bias with overfitting in practice. However, both regularization bias and overfitting in estimating η0 cause a heavy bias in estimators of θ0 that are obtained by naively plugging ML estimators of η0 into estimating equations for θ0. This bias results in the naive estimator failing to be N−1/2 consistent, where N is the sample size. We show that the impact of regularization bias and overfitting on estimation of the parameter of interest θ0 can be removed by using two simple, yet critical, ingredients: (1) using Neyman-orthogonal moments/scores that have reduced sensitivity with respect to nuisance parameters to estimate θ0; (2) making use of cross-fitting, which provides an efficient form of data-splitting. We call the resulting set of methods double or debiased ML (DML). We verify that DML delivers point estimators that concentrate in an N−1/2-neighbourhood of the true parameter values and are approximately unbiased and normally distributed, which allows construction of valid confidence statements. The generic statistical theory of DML is elementary and simultaneously relies on only weak theoretical requirements, which will admit the use of a broad array of modern ML methods for estimating the nuisance parameters, such as random forests, lasso, ridge, deep neural nets, boosted trees, and various hybrids and ensembles of these methods. We illustrate the general theory by applying it to provide theoretical properties of the following: DML applied to learn the main regression parameter in a partially linear regression model; DML applied to learn the coefficient on an endogenous variable in a partially linear instrumental variables model; DML applied to learn the average treatment effect and the average treatment effect on the treated under unconfoundedness; DML applied to learn the local average treatment effect in an instrumental variables setting. In addition to these theoretical applications, we also illustrate the use of DML in three empirical examples.
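A minimal sketch of the cross-fitted DML recipe for the partially linear model described above, written in Python. The learner choice (scikit-learn random forests), the helper name dml_plr, and the simple residual-on-residual moment are illustrative assumptions, not the paper's own code:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold

def dml_plr(y, d, x, n_folds=5, seed=0):
    """Cross-fitted DML for the partially linear model y = theta*d + g(x) + u."""
    kf = KFold(n_splits=n_folds, shuffle=True, random_state=seed)
    y_res, d_res = np.zeros(len(y)), np.zeros(len(d))
    for train, test in kf.split(x):
        # Fit the nuisance functions E[y|x] and E[d|x] on the complement fold ...
        m_y = RandomForestRegressor(random_state=seed).fit(x[train], y[train])
        m_d = RandomForestRegressor(random_state=seed).fit(x[train], d[train])
        # ... and residualize the held-out fold (cross-fitting).
        y_res[test] = y[test] - m_y.predict(x[test])
        d_res[test] = d[test] - m_d.predict(x[test])
    # Neyman-orthogonal (residual-on-residual) moment for theta.
    theta = np.sum(d_res * y_res) / np.sum(d_res ** 2)
    psi = (y_res - d_res * theta) * d_res          # influence-function term
    se = np.sqrt(np.mean(psi ** 2) / np.mean(d_res ** 2) ** 2 / len(y))
    return theta, se
```

Because the nuisance fits and the residualized observations never share data, the regularization bias of the ML learners enters the moment condition only through products of estimation errors, which is what restores the N−1/2 behaviour described in the abstract.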

1,204 citations

Journal ArticleDOI
TL;DR: In this article, the authors developed a model of quantile treatment effects (QTE) in the presence of endogeneity and obtained conditions for identification of the QTE without functional form assumptions.
Abstract: The ability of quantile regression models to characterize the heterogeneous impact of variables on different points of an outcome distribution makes them appealing in many economic applications. However, in observational studies, the variables of interest (e.g., education, prices) are often endogenous, making conventional quantile regression inconsistent and hence inappropriate for recovering the causal effects of these variables on the quantiles of economic outcomes. In order to address this problem, we develop a model of quantile treatment effects (QTE) in the presence of endogeneity and obtain conditions for identification of the QTE without functional form assumptions. The principal feature of the model is the imposition of conditions that restrict the evolution of ranks across treatment states. This feature allows us to overcome the endogeneity problem and recover the true QTE through the use of instrumental variables. The proposed model can also be equivalently viewed as a structural simultaneous equation model with nonadditive errors, where QTE can be interpreted as the structural quantile effects (SQE).
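One common estimation strategy built on this model is an inverse-quantile-regression grid search: for each candidate treatment coefficient, residualize the outcome and keep the value under which the instrument becomes irrelevant in an ordinary quantile regression. The sketch below, with the hypothetical helper ivqr_grid and a single scalar treatment and instrument, only illustrates that logic using statsmodels and is not the paper's own procedure:

```python
import numpy as np
import statsmodels.api as sm

def ivqr_grid(y, d, z, x, tau, alpha_grid):
    """Grid search for the tau-th quantile treatment effect: for each candidate
    coefficient alpha, run a quantile regression of y - d*alpha on the controls
    and the instrument, and keep the alpha under which the instrument's
    coefficient is closest to zero."""
    regressors = sm.add_constant(np.column_stack([x, z]))
    z_col = regressors.shape[1] - 1          # instrument enters as the last column
    best_alpha, best_stat = None, np.inf
    for alpha in alpha_grid:
        fit = sm.QuantReg(y - d * alpha, regressors).fit(q=tau)
        stat = abs(fit.params[z_col])
        if stat < best_stat:
            best_alpha, best_stat = alpha, stat
    return best_alpha
```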

902 citations

Journal ArticleDOI
TL;DR: The authors proposed robust methods for inference about the effect of a treatment variable on a scalar outcome in the presence of very many regressors in a model with possibly non-Gaussian and heteroscedastic disturbances.
Abstract: We propose robust methods for inference about the effect of a treatment variable on a scalar outcome in the presence of very many regressors in a model with possibly non-Gaussian and heteroscedastic disturbances. We allow for the number of regressors to be larger than the sample size. To make informative inference feasible, we require the model to be approximately sparse; that is, we require that the effect of confounding factors can be controlled for up to a small approximation error by including a relatively small number of variables whose identities are unknown. The latter condition makes it possible to estimate the treatment effect by selecting approximately the right set of regressors. We develop a novel estimation and uniformly valid inference method for the treatment effect in this setting, called the “post-double-selection” method. The main attractive feature of our method is that it allows for imperfect selection of the controls and provides confidence intervals that are valid uniformly across a large class of models. In contrast, standard post-model selection estimators fail to provide uniform inference even in simple cases with a small, fixed number of controls. Thus, our method resolves the problem of uniform inference after model selection for a large, interesting class of models. We also present a generalization of our method to a fully heterogeneous model with a binary treatment variable. We illustrate the use of the developed methods with numerical simulations and an application that considers the effect of abortion on crime rates.
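A minimal sketch of the post-double-selection recipe described above. The paper uses lasso with a data-driven plug-in penalty designed for heteroscedastic errors; cross-validated lasso and the helper name post_double_selection below are stand-in assumptions for illustration:

```python
import numpy as np
import statsmodels.api as sm
from sklearn.linear_model import LassoCV

def post_double_selection(y, d, X):
    """Step 1: select controls that predict the outcome.
    Step 2: select controls that predict the treatment.
    Step 3: OLS of y on d and the union of the two selected sets,
    with heteroscedasticity-robust standard errors."""
    sel_y = np.flatnonzero(LassoCV(cv=5).fit(X, y).coef_)
    sel_d = np.flatnonzero(LassoCV(cv=5).fit(X, d).coef_)
    keep = np.union1d(sel_y, sel_d)
    W = sm.add_constant(np.column_stack([d, X[:, keep]]))
    fit = sm.OLS(y, W).fit(cov_type="HC1")
    return fit.params[1], fit.bse[1]   # treatment coefficient and its robust SE
```

Taking the union of the two selected sets is what gives robustness to imperfect selection: a control that matters for either the outcome or the treatment equation is retained even if one of the two lasso steps misses it.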

825 citations

Journal ArticleDOI
TL;DR: Preliminary results of this paper were first presented at Chernozhukov's invited Cowles Foundation lecture at the North American meetings of the Econometric Society in June 2009.
Abstract: Date: first version June 2009; this version October 28, 2010. Preliminary results of this paper were first presented at Chernozhukov's invited Cowles Foundation lecture at the North American meetings of the Econometric Society in June 2009. We thank seminar participants at Brown, Columbia, Harvard-MIT, the Dutch Econometric Study Group, Fuqua School of Business, and NYU for helpful comments. We also thank Denis Chetverikov, JB Doyle, and Joonhwan Lee for a thorough reading of this paper and helpful feedback.

690 citations

Journal ArticleDOI
TL;DR: The authors developed a framework for performing estimation and inference in econometric models with partial identification, focusing particularly on models characterized by moment inequalities and equalities, and developed methods for analyzing the asymptotic properties of sample criterion functions under set identification.
Abstract: This paper develops a framework for performing estimation and inference in econometric models with partial identification, focusing particularly on models characterized by moment inequalities and equalities. Applications of this framework include the analysis of game-theoretic models, revealed preference restrictions, regressions with missing and corrupted data, auction models, structural quantile regressions, and asset pricing models. Specifically, we provide estimators and confidence regions for the set of minimizers Θ I of an econometric criterion function Q(θ). In applications, the criterion function embodies testable restrictions on economic models. A parameter value θ that describes an economic model satisfies these restrictions if Q(θ) attains its minimum at this value. Interest therefore focuses on the set of minimizers, called the identified set. We use the inversion of the sample analog, Q n (θ), of the population criterion, Q(θ), to construct estimators and confidence regions for the identified set, and develop consistency, rates of convergence, and inference results for these estimators and regions. To derive these results, we develop methods for analyzing the asymptotic properties of sample criterion functions under set identification.
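The construction in the abstract, inverting the sample criterion to obtain a set, reduces on a finite grid to keeping every parameter value whose criterion value lies below a cutoff. A minimal sketch follows; the cutoff is taken as given here, whereas deriving valid, data-dependent cutoffs is part of the paper's contribution:

```python
import numpy as np

def criterion_level_set(Q_n, theta_grid, cutoff):
    """Return the grid points whose sample criterion value is within `cutoff`
    of the minimum; this level set estimates the identified set Theta_I."""
    values = np.array([Q_n(theta) for theta in theta_grid])
    return [theta for theta, v in zip(theta_grid, values) if v <= values.min() + cutoff]
```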

632 citations


Cited by
Journal ArticleDOI
TL;DR: This entry is Billingsley's classic 1968 monograph on the weak convergence of probability measures.
Abstract: Convergence of Probability Measures. By P. Billingsley. Chichester, Sussex, Wiley, 1968. xii, 253 p. 9 1/4“. 117s.

5,689 citations

Journal ArticleDOI
TL;DR: This work considers statistical inference for regression when data are grouped into clusters, with regression errors independent across clusters but correlated within clusters; in such settings default standard errors can greatly overstate estimator precision, and when the number of clusters is large inference should instead be based on cluster-robust standard errors.
Abstract: We consider statistical inference for regression when data are grouped into clusters, with regression model errors independent across clusters but correlated within clusters. Examples include data on individuals with clustering on village or region or other category such as industry, and state-year differences-in-differences studies with clustering on state. In such settings default standard errors can greatly overstate estimator precision. Instead, if the number of clusters is large, statistical inference after OLS should be based on cluster-robust standard errors. We outline the basic method as well as many complications that can arise in practice. These include cluster-specific fixed effects, few clusters, multi-way clustering, and estimators other than OLS.
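For reference, cluster-robust OLS inference of this kind is available off the shelf. The sketch below uses statsmodels with a small synthetic panel; the variable names and data-generating process are purely illustrative:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Toy data: 50 clusters (e.g. states) of 20 observations with a shared cluster shock,
# so errors are correlated within clusters but independent across clusters.
rng = np.random.default_rng(0)
df = pd.DataFrame({"state": np.repeat(np.arange(50), 20),
                   "x": rng.normal(size=1000)})
df["y"] = 1.0 + 0.5 * df["x"] + np.repeat(rng.normal(size=50), 20) + rng.normal(size=1000)

X = sm.add_constant(df[["x"]])
fit = sm.OLS(df["y"], X).fit(cov_type="cluster", cov_kwds={"groups": df["state"]})
print(fit.bse)   # standard errors robust to arbitrary within-cluster correlation
```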

3,236 citations

Journal ArticleDOI
TL;DR: This review surveys two decades of econometric and statistical research on the causal effects of programs or policies, a literature that has reached a level of maturity making it an important tool in labor economics, public finance, development economics, industrial organization, and other areas of empirical microeconomics.
Abstract: Many empirical questions in economics and other social sciences depend on causal effects of programs or policies. In the last two decades, much research has been done on the econometric and statistical analysis of such causal effects. This recent theoretical literature has built on, and combined features of, earlier work in both the statistics and econometrics literatures. It has by now reached a level of maturity that makes it an important tool in many areas of empirical research in economics, including labor economics, public finance, development economics, industrial organization, and other areas of empirical microeconomics. In this review, we discuss some of the recent developments. We focus primarily on practical issues for empirical researchers, as well as provide a historical overview of the area and give references to more technical research.

3,175 citations