Posted Content

What Are We Weighting For?

01 Feb 2013 - Research Papers in Economics (National Bureau of Economic Research, Inc)
TL;DR: Three distinct weighting motives are discussed: to achieve precise estimates by correcting for heteroskedasticity; to achieve consistent estimates by correcting for endogenous sampling; and to identify average partial effects in the presence of unmodeled heterogeneity of effects.
Abstract: The purpose of this paper is to help empirical economists think through when and how to weight the data used in estimation. We start by distinguishing two purposes of estimation: to estimate population descriptive statistics and to estimate causal effects. In the former type of research, weighting is called for when it is needed to make the analysis sample representative of the target population. In the latter type, the weighting issue is more nuanced. We discuss three distinct potential motives for weighting when estimating causal effects: (1) to achieve precise estimates by correcting for heteroskedasticity, (2) to achieve consistent estimates by correcting for endogenous sampling, and (3) to identify average partial effects in the presence of unmodeled heterogeneity of effects. In each case, we find that the motive sometimes does not apply in situations where practitioners often assume it does. We recommend diagnostics for assessing the advisability of weighting, and we suggest methods for appropriate inference.
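The first motive above is the classic weighted least squares argument: if each observation's error variance were known, reweighting by its inverse would yield a more precise estimator than OLS. A minimal numpy sketch, with a made-up variance function that the example assumes is known:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000
x = rng.uniform(0, 10, n)
X = np.column_stack([np.ones(n), x])
# Heteroskedastic errors: variance 0.5 + x (an assumed, known form)
y = 1.0 + 2.0 * x + rng.normal(0, np.sqrt(0.5 + x))

# OLS: (X'X)^{-1} X'y -- unbiased but not efficient here
beta_ols = np.linalg.solve(X.T @ X, X.T @ y)

# WLS: weight each observation by its inverse error variance
w = 1.0 / (0.5 + x)
Xw = X * w[:, None]                        # rows of X scaled by w_i
beta_wls = np.linalg.solve(Xw.T @ X, Xw.T @ y)

# Both estimates should be near the true (1, 2); WLS has the
# smaller sampling variance when the variance model is correct
print(beta_ols, beta_wls)
```

The paper's caution applies here: the efficiency gain depends on the assumed variance function being right, which is exactly the kind of assumption practitioners should diagnose rather than take for granted.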
Citations
Journal ArticleDOI
TL;DR: This work considers statistical inference for regression when data are grouped into clusters, with regression model errors independent across clusters but correlated within clusters, when the number of clusters is large and default standard errors can greatly overstate estimator precision.
Abstract: We consider statistical inference for regression when data are grouped into clusters, with regression model errors independent across clusters but correlated within clusters. Examples include data on individuals with clustering on village or region or other category such as industry, and state-year differences-in-differences studies with clustering on state. In such settings default standard errors can greatly overstate estimator precision. Instead, if the number of clusters is large, statistical inference after OLS should be based on cluster-robust standard errors. We outline the basic method as well as many complications that can arise in practice. These include cluster-specific fixed effects, few clusters, multi-way clustering, and estimators other than OLS.

3,236 citations
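The cluster-robust ("sandwich") covariance the abstract describes can be sketched in a few lines of numpy; the data-generating process and cluster sizes below are illustrative assumptions:

```python
import numpy as np

def cluster_robust_cov(X, resid, clusters):
    """Liang-Zeger cluster-robust OLS covariance:
    (X'X)^{-1} [sum_g X_g' u_g u_g' X_g] (X'X)^{-1}."""
    XtX_inv = np.linalg.inv(X.T @ X)
    k = X.shape[1]
    meat = np.zeros((k, k))
    for g in np.unique(clusters):
        idx = clusters == g
        s = X[idx].T @ resid[idx]      # score summed within cluster g
        meat += np.outer(s, s)
    return XtX_inv @ meat @ XtX_inv

rng = np.random.default_rng(1)
G, m = 50, 20                          # 50 clusters of 20 observations
clusters = np.repeat(np.arange(G), m)
x = rng.normal(size=G * m)
# Errors share a cluster-level component, so they correlate within clusters
u = rng.normal(size=G)[clusters] + rng.normal(size=G * m)
y = 0.5 * x + u

X = np.column_stack([np.ones(G * m), x])
beta = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ beta
se = np.sqrt(np.diag(cluster_robust_cov(X, resid, clusters)))
```

This is the large-G case the paper treats as the baseline; the few-clusters complications it discusses require further adjustments not shown here.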

Journal ArticleDOI
TL;DR: This paper showed that the two-way fixed effects estimator equals a weighted average of all possible two-group/two-period DD estimators in the data, decomposed the difference between two specifications, and provided a new analysis of models that include time-varying controls.

1,414 citations

Journal ArticleDOI
TL;DR: Key features of DID designs are reviewed with an emphasis on public health policy research and it is noted that combining elements from multiple quasi-experimental techniques may be important in the next wave of innovations to the DID approach.
Abstract: The difference-in-differences (DID) design is a quasi-experimental research design that researchers often use to study causal relationships in public health settings where randomized controlled trials (RCTs) are infeasible or unethical. However, causal inference poses many challenges in DID designs. In this article, we review key features of DID designs with an emphasis on public health policy research. Contemporary researchers should take an active approach to the design of DID studies, seeking to construct comparison groups, sensitivity analyses, and robustness checks that help validate the method's assumptions. We explain the key assumptions of the design and discuss analytic tactics, supplementary analysis, and approaches to statistical inference that are often important in applied research. The DID design is not a perfect substitute for randomized experiments, but it often represents a feasible way to learn about causal relationships. We conclude by noting that combining elements from multiple quasi-experimental techniques may be important in the next wave of innovations to the DID approach.

789 citations
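The contrast at the heart of these designs is the basic 2x2 DID: the treated group's change minus the control group's change. The four cell means below are hypothetical:

```python
# Hypothetical 2x2 panel: treated/control groups, pre/post periods
means = {
    ("treated", "pre"): 10.0, ("treated", "post"): 14.0,
    ("control", "pre"):  9.0, ("control", "post"): 11.0,
}

did = ((means[("treated", "post")] - means[("treated", "pre")])
       - (means[("control", "post")] - means[("control", "pre")]))
print(did)  # 2.0: the treatment effect under the parallel-trends assumption
```

The control group's change (2.0) stands in for the trend the treated group would have followed without treatment; everything beyond it (4.0 - 2.0) is attributed to the policy, which is exactly why the parallel-trends assumption deserves the sensitivity checks the abstract calls for.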

Journal ArticleDOI
TL;DR: In this paper, the authors estimate the effect of minimum wages on low-wage jobs using 138 prominent state-level minimum wage changes between 1979 and 2016 in the U.S. using a difference-in-differences approach.
Abstract: We estimate the effect of minimum wages on low-wage jobs using 138 prominent state-level minimum wage changes between 1979 and 2016 in the U.S. using a difference-in-differences approach. We first estimate the effect of the minimum wage increase on employment changes by wage bins throughout the hourly wage distribution. We then focus on the bottom part of the wage distribution and compare the number of excess jobs paying at or slightly above the new minimum wage to the missing jobs paying below it to infer the employment effect. We find that the overall number of low-wage jobs remained essentially unchanged over the five years following the increase. At the same time, the direct effect of the minimum wage on average earnings was amplified by modest wage spillovers at the bottom of the wage distribution. Our estimates by detailed demographic groups show that the lack of job loss is not explained by labor-labor substitution at the bottom of the wage distribution. We also find no evidence of disemployment when we consider higher levels of minimum wages. However, we do find some evidence of reduced employment in tradable sectors. We also show how decomposing the overall employment effect by wage bins allows a transparent way of assessing the plausibility of estimates.

449 citations

Journal ArticleDOI
TL;DR: MMLs have no discernible impact on drinking behavior for those aged 12-20, or on the use of other psychoactive substances in either age group, but they increase the probability of current marijuana use, regular marijuana use, and marijuana abuse/dependence among those aged 21 or above.

313 citations


Cites background from "What Are We Weighting For"

  • …age 21 stratification, which, under the NSDUH sampling design, would suppress the state-clustering adjustment. When considering the choice between the two, Solon et al. (2013) noted that theoretically “neither strictly dominates the other (in identifying the population average effect)” (Solon et al., 2013, p. 21).


References
Journal ArticleDOI
TL;DR: This paper presents a parameter covariance matrix estimator that is consistent even when the disturbances of a linear regression model are heteroskedastic, and that does not depend on a formal model of the structure of the heteroskedasticity.
Abstract: This paper presents a parameter covariance matrix estimator which is consistent even when the disturbances of a linear regression model are heteroskedastic. This estimator does not depend on a formal model of the structure of the heteroskedasticity. By comparing the elements of the new estimator to those of the usual covariance estimator, one obtains a direct test for heteroskedasticity, since in the absence of heteroskedasticity, the two estimators will be approximately equal, but will generally diverge otherwise. The test has an appealing least squares interpretation.

25,689 citations
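White's estimator replaces the classical σ²(X'X)⁻¹ with a sandwich whose middle term uses each observation's squared residual. A small numpy sketch with simulated data; the variance function is an assumption of the example:

```python
import numpy as np

def hc0_cov(X, resid):
    """White's heteroskedasticity-consistent (HC0) covariance:
    (X'X)^{-1} [X' diag(u_i^2) X] (X'X)^{-1}."""
    XtX_inv = np.linalg.inv(X.T @ X)
    meat = (X * resid[:, None] ** 2).T @ X
    return XtX_inv @ meat @ XtX_inv

rng = np.random.default_rng(2)
n = 500
x = rng.uniform(1, 5, n)
X = np.column_stack([np.ones(n), x])
y = 2.0 + 1.0 * x + rng.normal(0, x)       # error s.d. grows with x

beta = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ beta

sigma2 = resid @ resid / (n - 2)
V_default = sigma2 * np.linalg.inv(X.T @ X)  # classical estimator
V_hc0 = hc0_cov(X, resid)
```

Comparing the elements of `V_hc0` and `V_default` is the idea behind the direct test the abstract mentions: under homoskedasticity the two are approximately equal, and they diverge otherwise.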

MonographDOI
31 Jul 1997
TL;DR: Deaton reviews the analysis of household survey data, including the construction of household surveys, the econometric tools useful for such analysis, and a range of problems in development policy to which this survey analysis can be applied.
Abstract: Two decades after its original publication, The Analysis of Household Surveys is reissued with a new preface by its author, Sir Angus Deaton, recipient of the 2015 Nobel Prize in Economic Sciences. This classic work remains relevant to anyone with a serious interest in using household survey data to shed light on policy issues. This book reviews the analysis of household survey data, including the construction of household surveys, the econometric tools useful for such analysis, and a range of problems in development policy for which this survey analysis can be applied. The author's approach remains close to the data, using transparent econometric and graphical techniques to present data in a way that can clearly inform policy and academic debates. Chapter 1 describes the features of survey design that need to be understood in order to undertake appropriate analysis. Chapter 2 discusses the general econometric and statistical issues that arise when using survey data for estimation and inference. Chapter 3 covers the use of survey data to measure welfare, poverty, and distribution. Chapter 4 focuses on the use of household budget data to explore patterns of household demand. Chapter 5 discusses price reform, its effects on equity and efficiency, and how to measure them. Chapter 6 addresses the role of household consumption and saving in economic development. The book includes an appendix providing code and programs using STATA, which can serve as a template for the users' own analysis.

4,835 citations

Journal ArticleDOI
TL;DR: In this paper, a new class of semiparametric estimators, based on inverse probability weighted estimating equations, is proposed for the parameter vector α0 of the conditional mean model when the data are missing at random in the sense of Rubin and the missingness probabilities are either known or can be parametrically modeled.
Abstract: In applied problems it is common to specify a model for the conditional mean of a response given a set of regressors. A subset of the regressors may be missing for some study subjects either by design or happenstance. In this article we propose a new class of semiparametric estimators, based on inverse probability weighted estimating equations, that are consistent for parameter vector α0 of the conditional mean model when the data are missing at random in the sense of Rubin and the missingness probabilities are either known or can be parametrically modeled. We show that the asymptotic variance of the optimal estimator in our class attains the semiparametric variance bound for the model by first showing that our estimation problem is a special case of the general problem of parameter estimation in an arbitrary semiparametric model in which the data are missing at random and the probability of observing complete data is bounded away from 0, and then deriving a representation for the efficient score...

2,638 citations
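A stripped-down version of the inverse-probability-weighting idea, for the simplest case of estimating a mean when data are missing at random given an always-observed covariate; the missingness probabilities here are assumed known, as in one of the paper's cases:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 10_000
x = rng.binomial(1, 0.5, n)                # always-observed covariate
y = 2.0 + 3.0 * x + rng.normal(size=n)     # true population mean = 3.5

# Missing at random: observation probability depends only on x
p_obs = np.where(x == 1, 0.9, 0.4)
observed = rng.uniform(size=n) < p_obs

# Complete-case mean is biased: the x = 1 group is over-represented
naive = y[observed].mean()

# Inverse probability weighting undoes the selection
ipw = np.sum(observed * y / p_obs) / np.sum(observed / p_obs)
print(naive, ipw)  # naive overshoots 3.5; IPW is close to it
```

Each observed unit is up-weighted by 1/p_obs to stand in for the similar units that went unobserved, which is the same logic the paper generalizes to regression parameters via weighted estimating equations.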

Posted Content
TL;DR: The authors examined the effect of technological change and other factors on the relative demand for workers with different education levels and on the recent growth of U.S. educational wage differentials and found that the increase in demand shifts for more-skilled workers in the 1970s and 1980s relative to the 1960s is entirely accounted for by an increase in within-industry changes in skill utilization rather than between-industry employment shifts.
Abstract: This paper examines the effect of technological change and other factors on the relative demand for workers with different education levels and on the recent growth of U.S. educational wage differentials. A simple supply-demand framework is used to interpret changes in the relative quantities, wages, and wage bill shares of workers by education in the aggregate U.S. labor market in each decade since 1940 and over the 1990 to 1995 period. The results suggest that the relative demand for college graduates grew more rapidly on average during the past 25 years (1970-95) than during the previous three decades (1940-70). The increased rate of growth of relative demand for college graduates beginning in the 1970s did not lead to an increase in the college/high school wage differential until the 1980s because the growth in the supply of college graduates increased even more sharply in the 1970s before returning to historical levels in the 1980s. The acceleration in demand shifts for more-skilled workers in the 1970s and 1980s relative to the 1960s is entirely accounted for by an increase in within- industry changes in skill utilization rather than between-industry employment shifts. Industries with large increases in the rate of skill upgrading in the 1970s and 1980s versus the 1960s are those with greater growth in employee computer usage, more computer capital per worker, and larger shares of computer investment as a share of total investment. The results suggest that the spread of computer technology may "explain" as much as 30 to 50 percent of the increase in the rate of growth of the relative demand for more-skilled workers since 1970.

1,943 citations

Journal ArticleDOI
TL;DR: In this paper, an estimator is proposed for the parameters of a probabilistic choice model when choices rather than decision makers are sampled, so that a sequence of chosen alternatives is drawn and the characteristics of the decision makers selecting those alternatives are observed.
Abstract: THE CONCERN of this paper is the estimation of the parameters of a probabilistic choice model when choices rather than decision makers are sampled. Existing estimation methods presuppose an exogenous sampling process, that is, one in which a sequence of decision makers are drawn and their choice behaviors observed. In contrast, in choice-based sampling processes, a sequence of chosen alternatives are drawn and the characteristics of the decision makers selecting those alternatives are observed. The problem of estimating a choice model from a choice-based sample has substantive interest because data collection costs for such processes are often considerably smaller than for exogenous sampling. Particular instances of this differential occur in the analysis of transportation behavior. For example, in studying choice of mode for work trips, it is often less expensive to survey transit users at the station and auto users at the parking lot than to interview commuters at their homes. Similarly, in examining choice of destination for shopping trips, surveys conducted at various shopping centers offer significant cost savings relative to home interviews. While interest in transportation applications provided the original motivation for our work, it has become apparent that choice-based sampling processes can be cost effective in the analysis of numerous decision problems. In particular, wherever decision makers are physically clustered according to the alternatives they select, choice-based sampling processes can achieve economies of scale not available with exogenous sampling. Some non-transportation decision problems in which decision makers do cluster as described include the schooling decisions of students, the job decisions of workers, the medical care decisions of patients, and the residential location decisions of households.
Realization of the sampling cost benefits of choice-based samples presupposes, of course, that the parameters of the underlying choice model can logically be inferred from such samples and that a tractable estimator with desirable statistical properties can be found. We shall, in this paper, confirm the logical supposition, develop a suitable estimator, and characterize the behavior of existing, exogenous sampling, estimators in the context of choice-based samples. An outline of the presentation and summary of major results follows.

1,304 citations
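Manski and Lerman's correction for choice-based sampling is commonly implemented as a weighted maximum likelihood estimator in which each observation is weighted by the ratio of its outcome's population share to its sample share. The sketch below applies that weighting to a logit fit by Newton's method; the population model and sampling fractions are invented for illustration:

```python
import numpy as np

def weighted_logit(X, y, w, iters=25):
    """Newton's method for logit MLE with observation weights w
    (for choice-based samples: w_i = population share / sample share
    of observation i's chosen outcome)."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (w * (y - p))
        H = (X * (w * p * (1 - p))[:, None]).T @ X
        beta += np.linalg.solve(H, grad)
    return beta

rng = np.random.default_rng(4)
N = 200_000
x = rng.normal(size=N)
p = 1.0 / (1.0 + np.exp(-(-1.8 + 1.0 * x)))    # population choice model
y = rng.uniform(size=N) < p
Q1 = y.mean()                                  # population share choosing y=1

# Choice-based sample: draw 50/50 on the outcome despite unequal
# population shares, then reweight by Q_j / H_j with H_j = 0.5
n = 5_000
idx = np.concatenate([
    rng.choice(np.flatnonzero(~y), n // 2, replace=False),
    rng.choice(np.flatnonzero(y), n // 2, replace=False),
])
Xs = np.column_stack([np.ones(n), x[idx]])
ys = y[idx].astype(float)
w = np.where(ys == 1, Q1 / 0.5, (1 - Q1) / 0.5)

beta = weighted_logit(Xs, ys, w)   # should be near the true (-1.8, 1.0)
```

An unweighted logit on this sample would get the intercept badly wrong; the reweighting restores consistency, at the cost of requiring the population shares (here taken from the simulated population) to be known or estimable.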

Trending Questions (1)
Why do we use a weighted mean in quantitative research papers?

Weighted mean is used in quantitative research papers to make the analysis sample representative of the target population and to achieve precise and consistent estimates.
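Concretely, with survey data the weighted mean scales each value by how much of the population it represents; the stratum values and counts below are hypothetical:

```python
import numpy as np

# Hypothetical stratified survey: one sampled value per stratum,
# weighted by the population count each stratum represents
values = np.array([50.0, 60.0, 80.0])
weights = np.array([700, 200, 100])    # population counts per stratum

weighted_mean = np.average(values, weights=weights)
print(weighted_mean)  # 55.0 = (50*700 + 60*200 + 80*100) / 1000
```

The unweighted mean of the three values is about 63.3, which over-represents the small strata; weighting recovers the population average, which is the "representativeness" motive described in the abstract above.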