
Bootstrap-Based Improvements for Inference with Clustered Errors


NBER TECHNICAL WORKING PAPER SERIES
BOOTSTRAP-BASED IMPROVEMENTS FOR INFERENCE WITH CLUSTERED
ERRORS
A. Colin Cameron
Jonah B. Gelbach
Douglas L. Miller
Technical Working Paper 344
http://www.nber.org/papers/t0344
NATIONAL BUREAU OF ECONOMIC RESEARCH
1050 Massachusetts Avenue
Cambridge, MA 02138
September 2007
We thank an anonymous referee and participants at The Australian National University, UC Berkeley,
UC Riverside, Dartmouth College, Florida State University, Indiana University, and MIT for useful
comments. Miller acknowledges funding from the National Institute on Aging, through Grant Number
T32-AG00186 to the NBER. The views expressed herein are those of the author(s) and do not necessarily
reflect the views of the National Bureau of Economic Research.
© 2007 by A. Colin Cameron, Jonah B. Gelbach, and Douglas L. Miller. All rights reserved. Short
sections of text, not to exceed two paragraphs, may be quoted without explicit permission provided
that full credit, including © notice, is given to the source.

Bootstrap-Based Improvements for Inference with Clustered Errors
A. Colin Cameron, Jonah B. Gelbach, and Douglas L. Miller
NBER Technical Working Paper No. 344
September 2007
JEL No. C12, C15, C21
ABSTRACT
Researchers have increasingly realized the need to account for within-group dependence in estimating
standard errors of regression parameter estimates. The usual solution is to calculate cluster-robust
standard errors that permit heteroskedasticity and within-cluster error correlation, but presume that
the number of clusters is large. Standard asymptotic tests can over-reject, however, with few (5-30)
clusters. We investigate inference using cluster bootstrap-t procedures that provide asymptotic refinement.
These procedures are evaluated using Monte Carlos, including the example of Bertrand, Duflo and
Mullainathan (2004). Rejection rates of ten percent using standard methods can be reduced to the nominal
size of five percent using our methods.
A. Colin Cameron
Department of Economics
UC Davis
One Shields Avenue
Davis, CA 95616
accameron@ucdavis.edu
Jonah B. Gelbach
Department of Economics
University of Arizona
1130 E. Helen Street
Tucson, AZ 85721-0108
gelbach@email.arizona.edu
Douglas L. Miller
Department of Economics
UC Davis
One Shields Avenue
Davis, CA 95616-8578
and NBER
dlmiller@ucdavis.edu

1 Introduction
Microeconometrics researchers have increasingly realized the essential need to account for any within-group dependence in estimating standard errors of regression parameter estimates. In many settings the default OLS standard errors that ignore such clustering can greatly underestimate the true OLS standard errors, as emphasized by Moulton (1986, 1990).
A common correction is to compute cluster-robust standard errors that generalize the White (1980) heteroskedastic-consistent estimate of OLS standard errors to the clustered setting. This permits both error heteroskedasticity and quite flexible error correlation within cluster, unlike a much more restrictive random effects or error components model. In econometrics this adjustment was proposed by White (1984) and Arellano (1987), and it is implemented in STATA, for example, using the cluster option. In the statistics literature these are called sandwich standard errors, proposed by Liang and Zeger (1986) for generalized estimating equations, and they are implemented in SAS, for example, within the GENMOD procedure. A recent brief survey is given in Wooldridge (2003).
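As a concrete illustration of the sandwich correction described above, here is a minimal Python sketch (function and variable names are hypothetical, not from the paper), including a finite-sample correction of the kind applied by common software implementations:

```python
import numpy as np

def cluster_robust_vcov(X, y, cluster):
    """OLS with cluster-robust (sandwich) variance, a minimal sketch.

    Permits heteroskedasticity and arbitrary error correlation within
    each cluster; assumes independence across clusters.
    """
    X = np.asarray(X, float)
    y = np.asarray(y, float)
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    # "Meat" of the sandwich: sum over clusters of X_g' u_g u_g' X_g
    meat = np.zeros((k, k))
    clusters = np.unique(cluster)
    G = len(clusters)
    for g in clusters:
        idx = cluster == g
        s = X[idx].T @ resid[idx]  # k-vector of within-cluster score sums
        meat += np.outer(s, s)
    # A common finite-sample correction (as used by e.g. STATA's cluster option)
    c = (G / (G - 1)) * ((n - 1) / (n - k))
    return beta, c * XtX_inv @ meat @ XtX_inv
```

The returned matrix replaces the default OLS variance estimate; its diagonal gives cluster-robust squared standard errors.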
Not all empirical studies use appropriate corrections for clustering. In particular, for fixed effects panel models the errors are usually correlated even after control for fixed effects, yet many studies either provide no control for serial correlation or erroneously cluster at too fine a level. Kézdi (2004) demonstrated the usefulness of cluster-robust standard errors in this setting and contrasted these with other standard errors based on stronger distributional assumptions. Bertrand, Duflo, and Mullainathan (2004), henceforth BDM (2004), focused on implications for difference-in-differences (DID) studies using variation across states and years. Then the regressor of interest is an indicator variable that is highly correlated within cluster (state), so there is great need to correct standard errors for clustering. The clustering should be on state, rather than on state-year.
A practical limitation of inference with cluster-robust standard errors is that the asymptotic justification assumes that the number of clusters goes to infinity. Yet in some applications there may be few clusters. For example, this happens if clustering is on region and there are few regions. With a small number of clusters the cluster-robust standard errors are downwards biased. Bias corrections have been proposed in the statistics literature; see Kauermann and Carroll (2001), Mancl and DeRouen (2001), and Bell and McCaffrey (2002). Angrist and Lavy (2002) in an applied study find that bias adjustment of cluster-robust standard errors can make quite a difference. But even after appropriate bias correction, with few clusters the usual Wald statistics for hypothesis testing with asymptotic standard normal or chi-square critical values over-reject. BDM (2004) demonstrate through a Monte Carlo experiment that the Wald test based on (unadjusted) cluster-robust standard errors over-rejects if standard normal critical values are used. Donald and Lang (2007) also demonstrate this and propose, for DID studies with policy invariant within state, an alternative two-step GLS estimator that leads to T-distributed Wald tests in some special circumstances. Ibragimov and Müller (2007) propose an alternate approach based on separate estimation within each group. They separate the data into independent groups, estimate the model within each group, average the separate estimates and divide by the sample standard deviation of these estimates, and then compare against critical values from a T-distribution. This approach holds promise for settings with few groups and where model identification and a central limit theorem holds within each group. Our proposed method does not require the latter two conditions, can be used to test multiple hypotheses, and is based on the parameter estimator commonly used in practice.

In this paper we investigate whether bootstrapping to obtain asymptotic refinement leads to improved inference for OLS estimation with cluster-

Citations
Journal ArticleDOI

A Practitioner’s Guide to Cluster-Robust Inference

TL;DR: This work considers statistical inference for regression when data are grouped into clusters, with regression model errors independent across clusters but correlated within clusters; in such settings default standard errors can greatly overstate estimator precision.
Journal ArticleDOI

Robust Inference with Multi-way Clustering

TL;DR: The authors proposed a variance estimator for the OLS estimator as well as for nonlinear estimators such as logit, probit, and GMM that enables cluster-robust inference when there is two-way or multiway clustering that is nonnested.
Book

Mostly harmless econometrics

TL;DR: The core methods in today's econometric toolkit are linear regression for statistical control, instrumental variables methods for the analysis of natural experiments, and differences-in-differences methods that exploit policy changes.
Journal ArticleDOI

Teacher training, teacher quality and student achievement

TL;DR: This article studies the effects of various types of education and training on the productivity of teachers in promoting student achievement. The authors did not find a consistent relationship between formal professional development training and teacher productivity, and found no evidence that teachers' pre-service training or college entrance exam scores are related to productivity.
BookDOI

Does Management Matter? Evidence from India

TL;DR: In this article, the authors run a management field experiment on large Indian textile firms, providing free consulting on modern management practices to a randomly chosen set of treatment plants and compared their performance to the control plants.
References
Book

An introduction to the bootstrap

TL;DR: This article presents bootstrap methods for estimation, using simple arguments, with Minitab macros for implementing these methods, as well as some examples of how these methods could be used for estimation purposes.
Journal ArticleDOI

A Heteroskedasticity-Consistent Covariance Matrix Estimator and a Direct Test for Heteroskedasticity

Halbert White
- 01 May 1980 - 
TL;DR: In this article, a parameter covariance matrix estimator which is consistent even when the disturbances of a linear regression model are heteroskedastic is presented; the estimator does not depend on a formal model of the structure of the heteroskedasticity.
Journal ArticleDOI

Longitudinal data analysis using generalized linear models

TL;DR: In this article, an extension of generalized linear models to the analysis of longitudinal data is proposed, which gives consistent estimates of the regression parameters and of their variance under mild assumptions about the time dependence.
Journal ArticleDOI

Bootstrap Methods: Another Look at the Jackknife

TL;DR: In this article, the authors discuss the problem of estimating the sampling distribution of a pre-specified random variable R(X, F) on the basis of the observed data x.
Journal ArticleDOI

How Much Should We Trust Differences-In-Differences Estimates?

TL;DR: In this article, the authors randomly generate placebo laws in state-level data on female wages from the Current Population Survey and use OLS to compute the DD estimate of its "effect" as well as the standard error of this estimate.
Frequently Asked Questions (19)
Q1. What are the contributions in this paper?

The primary contribution of this paper is to use bootstrap procedures to obtain more accurate cluster-robust inference when there are few clusters. 

The variation the authors use is one that applies weights of +1 and -1 with equal probability, and uses residuals from OLS estimation that imposes the null hypothesis. 

The standard method for resampling that preserves the within-cluster features of the error is a pairs cluster bootstrap that resamples at the cluster level, so that if the gth cluster is selected then all data (dependent and regressor variables) in that cluster appear in the resample. 
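A minimal Python sketch of one such pairs-cluster resample (hypothetical names, not the authors' code; equal-sized clusters are assumed so the resample has the original sample size):

```python
import numpy as np

def pairs_cluster_resample(X, y, cluster, rng):
    """One pairs-cluster bootstrap draw: resample whole clusters with
    replacement, keeping each cluster's (y, X) block intact."""
    ids = np.unique(cluster)
    draw = rng.choice(ids, size=len(ids), replace=True)
    # Stack the row indices of every drawn cluster, in draw order
    rows = np.concatenate([np.flatnonzero(cluster == g) for g in draw])
    return X[rows], y[rows]
```

Repeating this B times and re-estimating on each pseudo-sample gives the pairs cluster bootstrap distribution.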

The data are clustered into G independent groups, so the resampling method should be one that assumes independence across clusters but preserves correlation within clusters. 

The obvious method is a residual cluster bootstrap that resamples with replacement from the original sample residual vectors to give residuals {û*₁, ..., û*_G} and hence pseudo-sample {(ŷ*₁, X₁), ..., (ŷ*_G, X_G)} where ŷ*_g = X′_g β̂ + û*_g. 
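The construction ŷ*_g = X′_g β̂ + û*_g can be sketched as follows (a hypothetical helper, not from the paper; equal-sized clusters are assumed so that residual vectors can be swapped across clusters):

```python
import numpy as np

def residual_cluster_resample(X, beta_hat, resid, cluster, rng):
    """One residual-cluster bootstrap draw: hold X fixed, resample whole
    residual vectors across clusters with replacement, and rebuild
    y*_g = X_g' beta_hat + u*_g."""
    ids = np.unique(cluster)
    draw = rng.choice(ids, size=len(ids), replace=True)
    y_star = X @ beta_hat
    # Give cluster g the residual vector of the drawn source cluster
    for g, src in zip(ids, draw):
        y_star[cluster == g] += resid[cluster == src]
    return y_star
```

Each pseudo-sample keeps the regressors constant and varies only the dependent variable through the resampled residual vectors.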

A practical limitation of inference with cluster-robust standard errors is that the asymptotic justification assumes that the number of clusters goes to infinity. 

One important conclusion of BDM (2004) is that for few (six) clusters the cluster-robust estimator performs poorly, and for moderate (ten and twenty) numbers of clusters their bootstrap-based method also does poorly. 

In particular, one can hold regressors X constant throughout the pseudo-samples, while resampling the residuals which can be then used to construct new values of the dependent variable y. 

A common correction is to compute cluster-robust standard errors that generalize the White (1980) heteroskedastic-consistent estimate of OLS standard errors to the clustered setting. 

They find that (1) default standard errors do poorly; (2) cluster-robust standard errors do well for all but G = 6; and (3) their bootstrap, which the authors discuss in their section 3.1, does poorly for low numbers of clusters, with actual rejection rates 0.44, 0.23 and 0.13 for G = 6, 10 and 20, respectively. 

The authors use three different cluster bootstrap resampling methods: the pairs cluster bootstrap, the residual cluster bootstrap with H0 imposed, and the wild cluster bootstrap with H0 imposed. 
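To illustrate the third of these methods, here is a compact, assumption-laden sketch (not the authors' code) of a wild cluster bootstrap-t test of H0: β_j = 0, with the null imposed and Rademacher (±1) weights drawn per cluster:

```python
import numpy as np

def wild_cluster_bootstrap_t(X, y, cluster, j=1, B=399, seed=0):
    """Wild cluster bootstrap-t p-value for H0: beta_j = 0 (a sketch)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, float)
    y = np.asarray(y, float)
    n, k = X.shape
    ids = np.unique(cluster)
    G = len(ids)

    def fit(ym):
        # OLS plus cluster-robust ("sandwich") variance on the full model
        XtX_inv = np.linalg.inv(X.T @ X)
        b = XtX_inv @ X.T @ ym
        u = ym - X @ b
        meat = np.zeros((k, k))
        for g in ids:
            s = X[cluster == g].T @ u[cluster == g]
            meat += np.outer(s, s)
        V = (G / (G - 1)) * XtX_inv @ meat @ XtX_inv
        return b, V

    b, V = fit(y)
    w = b[j] / np.sqrt(V[j, j])  # Wald statistic in the original sample

    # Restricted fit with beta_j = 0 imposed, as the text recommends
    Xr = np.delete(X, j, axis=1)
    br = np.linalg.lstsq(Xr, y, rcond=None)[0]
    ur = y - Xr @ br
    fitted = Xr @ br

    count = 0
    pos = np.searchsorted(ids, cluster)  # cluster index of each observation
    for _ in range(B):
        sign = rng.choice([-1.0, 1.0], size=G)  # Rademacher weight per cluster
        y_star = fitted + ur * sign[pos]
        b_s, V_s = fit(y_star)
        w_star = b_s[j] / np.sqrt(V_s[j, j])  # null is true in each resample
        if abs(w_star) >= abs(w):
            count += 1
    return (count + 1) / (B + 1)
```

Because each resample is generated with β_j = 0, the bootstrap distribution of w* approximates the null distribution of the Wald statistic, and the returned p-value is the symmetric two-sided rejection frequency.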

The remaining bootstrap-t methods all yield rejection rates less than 0.08, with the residual cluster bootstrap-t and wild cluster bootstrap-t doing best. 

If the authors instead bootstrap this Wald statistic with B = 999 replications, the pairs cluster bootstrap-t yields p = 0.209, the residual cluster bootstrap-t gives p = 0.112, and the wild cluster bootstrap-t gives a p-value of 0.070. 

The authors believe that the p-value for the pairs cluster bootstrap is implausibly large, for reasons discussed in the BDM replication, while the other two bootstraps lead to plausible p-values that, as expected, are larger than those obtained by using asymptotic normal critical values. 

Donald and Lang (2007) also demonstrate this and propose, for DID studies with policy invariant within state, an alternative two-step GLS estimator that leads to T-distributed Wald tests in some special circumstances. 

An alternative method with asymptotic refinement is the bias-corrected accelerated (BCA) procedure, defined in Efron (1987), Hall (1992, pp. 128-141), and Cameron, Gelbach, and Miller (2006). 

The bootstrap-t procedure directly bootstraps w which is asymptotically pivotal since the standard normal has no unknown parameters. 

In contrast to the bootstrap-t procedure, it does not offer asymptotic refinement, and so may perform worse. (Alternative names used in the literature include cluster bootstrap, case bootstrap, nonparametric bootstrap, and nonoverlapping block bootstrap.) 
