Partial Identification in Triangular Systems of Equations With Binary Dependent Variables

Azeem M. Shaikh, Edward J. Vytlacil
01 May 2011, Vol. 79, Iss. 3, pp. 949-955
Abstract
This paper studies the special case of the triangular system of equations in Vytlacil and Yildiz (2007), where both dependent variables are binary but without imposing the restrictive support condition required by Vytlacil and Yildiz (2007) for identification of the average structural function (ASF) and the average treatment effect (ATE). Under weak regularity conditions, we derive upper and lower bounds on the ASF and the ATE. We show further that the bounds on the ASF and ATE are sharp under some further regularity conditions and an additional restriction on the support of the covariates and the instrument.

Partial Identification in Triangular Systems of Equations with
Binary Dependent Variables
Azeem M. Shaikh
Department of Economics
University of Chicago
amshaikh@uchicago.edu
Edward J. Vytlacil
Department of Economics
Yale University
edward.vytlacil@yale.edu
July 15, 2010
Abstract
This paper studies models for binary outcome variables that contain a binary endogenous
regressor. More specifically, we consider a nonparametric, triangular system of equations with
binary dependent variables. The main assumption we impose is a weak separability condition on
each equation, or, equivalently, a threshold crossing model on each equation. In this setting, we
construct upper and lower bounds on the Average Structural Function (ASF) and the Average
Treatment Effect (ATE) under weak regularity conditions. The resulting bounds are narrower
the greater the strength of the instrument and the greater the degree to which the exogenous
covariates that enter the outcome equation can compensate for variation in the endogenous
regressor. We show further that the bounds on the ASF and ATE are sharp under an additional
restriction on the support of the covariates and the instrument.
JEL Codes: C14, C35
KEYWORDS: Partial Identification, Simultaneous Equation Model, Binary Dependent Vari-
able, Endogeneity, Threshold Crossing Model, Weak Separability, Average Structural Function,
Average Treatment Effect
ACKNOWLEDGEMENTS: We would like to thank Hide Ichimura, Jim Heckman, Whitney
Newey, and Jim Powell for very helpful comments on this paper. This research was conducted
in part while Edward Vytlacil was in residence at Hitotsubashi University. This research was
supported by NSF SES-05-51089 and DMS-0820310.
An earlier version of this paper titled “Threshold Crossing Models and Bounds on Treatment Effects: A Non-
parametric Analysis” appeared in May 2005 as NBER Technical Working Paper 307.

1 Introduction
This paper studies models for binary outcome variables that contain a binary endogenous regres-
sor. More specifically, we consider a nonparametric, triangular system of equations with binary
dependent variables. The main assumption we impose is a weak separability condition on each
equation, or, equivalently, a threshold crossing model on each equation. This structure nests the
bivariate probit model with structural shift of Heckman (1978) as a special case. In this setting,
we consider the problem of partially identifying the Average Structural Function (ASF) and the
Average Treatment Effect (ATE), thereby extending the identification results of Vytlacil and Yildiz
(2007).
In order to define this structure precisely, let D denote the binary endogenous regressor and
let Y denote the outcome of interest. For example, D might denote receipt of job training and Y
later employment, or D might denote receipt of a medical intervention and Y later mortality. See
Bhattacharya et al. (2009) for an application of the methodology developed in this paper to the
evaluation of the impact of Swan-Ganz catheterization on patient mortality. Consider the following
triangular system of equations:
$$
\begin{aligned}
Y &= g_1(D, X, \epsilon_1) \\
D &= g_2(Z, \epsilon_2) .
\end{aligned}
\tag{1}
$$
Here, X and Z are observed random vectors that may share elements in common, and $\epsilon_1$ and
$\epsilon_2$ are unobserved random variables. Following Blundell and Powell (2004), our object of interest is
the Average Structural Function (ASF)
$$
G_1(d, x) = \int g_1(d, x, \epsilon_1) \, dF_{\epsilon_1} ,
$$
where (d, x) denotes a potential realization of the random vector (D, X). The ASF averages against
the unconditional distribution of $\epsilon_1$, not the distribution of $\epsilon_1$ conditional on the possibly
endogenous regressor D, and thus gives the expected outcome of Y if D were determined exogenously.
We also consider
$$
\Delta G_1(x) = G_1(1, x) - G_1(0, x) ,
$$
which is often referred to as the Average Treatment Effect (ATE) in the treatment effect literature.
The main assumption we impose is that $g_1$ and $g_2$ both satisfy weak separability of the observed
regressors from the unobserved error term. As will be further discussed in Section 2, for a binary
dependent variable, such an assumption is equivalent to assuming that the function is weakly
increasing in the error term, as in Chesher (2005), assuming the monotonicity restriction considered
by Imbens and Angrist (1994), or assuming that the model can be represented as a threshold crossing
model with an additively separable latent error, as in Heckman and Vytlacil (2005). For ease of
analysis, we will work with the threshold crossing representation of the model, i.e.,
$$
\begin{aligned}
Y &= I\{\nu_1(D, X) \ge \epsilon_1\} \\
D &= I\{\nu_2(Z) \ge \epsilon_2\} .
\end{aligned}
\tag{2}
$$
If one assumes that $\nu_1$ and $\nu_2$ are linear functions and that $(\epsilon_1, \epsilon_2)$ has a bivariate normal
distribution, then the above model reduces to the classical bivariate probit with structural shift considered
in Heckman (1978). We will not impose any such parametric functional form or parametric
distributional assumptions in this paper.
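To illustrate the distinction between the ASF and the observed conditional mean in this parametric special case, the following is a minimal simulation sketch. Everything in it is an assumption made for illustration rather than taken from the paper: the linear indices $\nu_1(d, x) = \alpha d + \beta x$ and $\nu_2(z) = \gamma z$, the parameter values, the binary covariate, the standard normal instrument, and the function names. Because $\epsilon_1$ is standard normal marginally in this setup, the ASF equals the normal CDF evaluated at $\nu_1(d, x)$.

```python
import math
import random


def normal_cdf(t):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def compare_asf_to_naive(d0, x0=0, n=200_000, rho=0.7,
                         alpha=0.5, beta=1.0, gamma=1.0, seed=0):
    """Return (ASF at (d0, x0), sample mean of Y given D = d0 and X = x0).

    Hypothetical design: nu_1(d, x) = alpha*d + beta*x, nu_2(z) = gamma*z,
    and (eps_1, eps_2) bivariate standard normal with correlation rho.
    """
    rng = random.Random(seed)
    hits = total = 0
    for _ in range(n):
        x = rng.randint(0, 1)                # binary covariate (assumption)
        z = rng.gauss(0.0, 1.0)              # instrument (assumption)
        e2 = rng.gauss(0.0, 1.0)
        e1 = rho * e2 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        d = int(gamma * z >= e2)             # D = I{nu_2(Z) >= eps_2}
        y = int(alpha * d + beta * x >= e1)  # Y = I{nu_1(D, X) >= eps_1}
        if d == d0 and x == x0:
            total += 1
            hits += y
    asf = normal_cdf(alpha * d0 + beta * x0)  # averages over unconditional eps_1
    return asf, hits / total


if __name__ == "__main__":
    for d0 in (0, 1):
        asf, naive = compare_asf_to_naive(d0)
        print(f"d={d0}: ASF={asf:.3f}, E[Y | D={d0}, X=0] approx {naive:.3f}")
```

With positively correlated errors, units with D = 1 tend to have low $\epsilon_1$, so in this design the conditional mean for D = 1 lies above the ASF and the conditional mean for D = 0 lies below it. The naive difference in means therefore overstates the ATE here.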
In addition to the weak separability assumption described above, we will require some mild
regularity of the distribution of $(\epsilon_1, \epsilon_2)$. We will also assume that X and Z are exogenous in the
sense that $(X, Z) \perp\!\!\!\perp (\epsilon_1, \epsilon_2)$. Note that D may still be endogenous in the Y equation due to
possible dependence between $\epsilon_1$ and $\epsilon_2$. For example, those who receive the job training might
have the worst human capital, or those who receive the medical intervention might have the worst
latent health. The resulting bounds on the ASF or ATE are substantially narrower than alternative
bounds that do not impose our weak separability restrictions. Under certain restrictions on the
distribution of (X, Z) and the functions $\nu_1$ and $\nu_2$ in (2), we show further that the bounds we derive
on the ASF and ATE are sharp in the sense that for any value lying between the upper and lower
bounds, there will exist a distribution of unobservable variables satisfying all of the assumptions
of our analysis that is consistent with both the distribution of the observed data and the proposed
value of the ASF or the ATE.
Identification of the ASF and ATE with this structure was previously considered by Vytlacil
and Yildiz (2007). They show that when the support of the distribution of X conditional on
Pr{D = 1|Z} is sufficiently rich it is possible to point identify the ASF and the ATE. Their support
condition will fail if, for example, X is a discrete random variable, and would be expected to fail near
the boundaries of the support of X if X has bounded support. In this paper, we investigate what
can be inferred about the ASF or the ATE without imposing this support restriction. To this end,
we first use a modified instrumental variable-like procedure to determine what variation in X over-
compensates or under-compensates for ceteris paribus variation in D, and then use this information
to construct bounds on the ASF or the ATE. The resulting bounds are smaller the greater the
variation there is in X conditional on Pr{D = 1|Z}, and collapse to point identification under the
Vytlacil and Yildiz (2007) condition of sufficient variation in X conditional on Pr{D = 1|Z}.
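The logic behind such a procedure can be sketched using only the threshold crossing model (2), exogeneity of (X, Z), and a strictly positive density for the errors. The labels $h$, $z_0$, $z_1$, $x_0$, and $x_1$ below are introduced for illustration and need not match the paper's notation. Take $z_0, z_1$ with $\nu_2(z_1) > \nu_2(z_0)$, which is equivalent to $\Pr\{D = 1 \mid Z = z_1\} > \Pr\{D = 1 \mid Z = z_0\}$, and suppose each conditioning event below has positive probability.

```latex
\begin{align*}
h(z_0, z_1, x_0, x_1)
&= \bigl[ \Pr\{Y = 1, D = 1 \mid X = x_1, Z = z_1\}
        - \Pr\{Y = 1, D = 1 \mid X = x_1, Z = z_0\} \bigr] \\
&\quad - \bigl[ \Pr\{Y = 1, D = 0 \mid X = x_0, Z = z_0\}
        - \Pr\{Y = 1, D = 0 \mid X = x_0, Z = z_1\} \bigr] \\
&= \Pr\{\epsilon_1 \le \nu_1(1, x_1),\ \nu_2(z_0) < \epsilon_2 \le \nu_2(z_1)\} \\
&\quad - \Pr\{\epsilon_1 \le \nu_1(0, x_0),\ \nu_2(z_0) < \epsilon_2 \le \nu_2(z_1)\} .
\end{align*}
```

Both probabilities in the last expression refer to the same band of $\epsilon_2$ values, so with a strictly positive error density the sign of $h$ equals the sign of $\nu_1(1, x_1) - \nu_1(0, x_0)$. Since $G_1(d, x) = F_{\epsilon_1}(\nu_1(d, x))$ under (2), a positive $h$ indicates $G_1(1, x_1) > G_1(0, x_0)$, a negative $h$ the reverse, and $h = 0$ equality. This is the sense in which observed variation in X can be ranked against a ceteris paribus change in D.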
As mentioned earlier, our weak separability restriction on the functions $g_1$ and $g_2$ is equivalent to
imposing that the functions are weakly increasing in the error terms $\epsilon_1$ and $\epsilon_2$, respectively. We do
not impose the stronger requirement that either function is strictly increasing in its error term, as
to do so would imply under our regularity conditions on the distribution of $(\epsilon_1, \epsilon_2)$ that Y or D must
be continuous. For this reason, we cannot follow the control variate approach used, e.g., in Altonji
and Matzkin (2005), Blundell and Powell (2004), Chesher (2003), and Imbens and Newey (2010),
which would require $g_2$ to be strictly increasing in $\epsilon_2$. Similarly, we cannot follow the quantile
instrumental variable approach used in Chernozhukov and Hansen (2005) and Chernozhukov et al.
(2007), which would require $g_1$ to be strictly increasing in $\epsilon_1$.
Our analysis is similar to Chesher (2005), who only assumes that $g_1$ and $g_2$ are weakly increasing
in $\epsilon_1$ and $\epsilon_2$, respectively. In his analysis, the object of interest is $g_1$ itself, while in this paper we
focus more modestly on the ASF and the ATE. More importantly, his analysis requires a rank
condition that cannot hold except in trivial cases when D is binary. When D is binary, the rank
condition under which he constructs bounds for $g_1(0, x, \tau)$ is that there exists some value $z_0$ such
that $\Pr\{D = 1 \mid Z = z_0\} \le \tau \le 0$ and for $g_1(1, x, \tau)$ that there exists some value $z_0$ such that
$1 \le \tau \le \Pr\{D = 1 \mid Z = z_0\}$. These conditions cannot hold for any value of $\tau$ except $\tau = 0$ or $\tau = 1$,
in which case the ASF is identified following arguments in Heckman and Vytlacil (2001). See Jun
et al. (2009) for extensions of his analysis and Chesher (2007) for related analysis that considers
partial identification of $g_1$ without imposing any restrictions on $g_2$.
The analysis of this paper has recently been extended in subsequent work by Chiburis (2009).
While we show that our bounds are sharp whenever the support of (X, Z) may be written as the
product of the support of X and the support of Z, Chiburis (2009) shows that our bounds may
not be sharp without this restriction. On the other hand, he presents numerical evidence that
suggests that our bounds will often be close to the sharp bounds even when this restriction fails.
Moreover, our bounds are much simpler to describe than the sharp bounds derived in Chiburis
(2009). Chiburis (2009) also considers restrictions beyond what we impose, such as linear latent
index restrictions and parametric distributional assumptions.
The remainder of the paper is organized as follows. In Section 2, we formally define our
assumptions and analyze the connection between our assumptions and the assumptions considered
in the previous literature. Our main results are contained in Section 3. We conclude with a
numerical example in Section 4.
2 Model and Assumptions
In addition to assuming that Y and D are determined by (2), we will make use of the following
assumptions in our analysis:
Assumption 2.1 $(X, Z) \perp\!\!\!\perp (\epsilon_1, \epsilon_2)$.
Assumption 2.2 The distribution of $(\epsilon_1, \epsilon_2)$ has strictly positive density w.r.t. Lebesgue measure
on $\mathbb{R}^2$.
Assumption 2.3 The support of the distribution of (X, Z), supp(X, Z), is compact.
Assumption 2.4 The functions $\nu_1(\cdot)$ and $\nu_2(\cdot)$ are continuous.
Assumption 2.5 The distribution of $\nu_2(Z) \mid X$ is nondegenerate.
In the derivation of our bounds, we will exploit the assumption that Y and D are determined by
(2) and Assumptions 2.1 - 2.2. Formally, our analysis will not require Assumption 2.5, but when it
fails our bounds will reduce to those of Manski (1989), who imposes no structure on the equations
determining Y and D. In this sense, though formally our results will not require a variable in Z
that is not in X, they will be nontrivial only when there is a variable in Z that is not contained in
X. When this is the case, any regressor in X that is not in Z will provide an additional source of
identifying power in our analysis. We will make use of Assumptions 2.3 and 2.4 only when arguing
that the bounds are sharp.
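For reference, the benchmark mentioned above can be made concrete. The following is a minimal sketch of worst-case bounds of the kind attributed to Manski (1989), for a binary outcome within a single covariate cell X = x: the ASF at D = d lies between $\Pr\{Y = 1, D = d\}$ and $\Pr\{Y = 1, D = d\} + \Pr\{D \ne d\}$. The function names and the list-based interface are illustrative assumptions, not part of the paper.

```python
def worst_case_asf_bounds(y, d, d0):
    """Worst-case bounds on the ASF at D = d0 for binary y, using one covariate cell.

    y and d are equal-length sequences of 0/1 observations (hypothetical inputs).
    """
    n = len(y)
    # Share of observations with Y = 1 and D = d0: the outcome is observed here.
    p_y1_d0 = sum(1 for yi, di in zip(y, d) if yi == 1 and di == d0) / n
    # Share with D != d0: the counterfactual outcome could be 0 or 1 for all of them.
    p_other = sum(1 for di in d if di != d0) / n
    return p_y1_d0, p_y1_d0 + p_other


def worst_case_ate_bounds(y, d):
    """Worst-case bounds on the ATE implied by the ASF bounds above."""
    lo1, hi1 = worst_case_asf_bounds(y, d, 1)
    lo0, hi0 = worst_case_asf_bounds(y, d, 0)
    return lo1 - hi0, hi1 - lo0


if __name__ == "__main__":
    y = [1, 0, 1, 1]
    d = [1, 1, 0, 0]
    print(worst_case_asf_bounds(y, d, 1))
    print(worst_case_ate_bounds(y, d))
```

The ATE interval produced this way always has width one, which is why additional structure such as the weak separability restrictions is needed to narrow it.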
As discussed in Vytlacil (2006), the existence of a threshold crossing representation with an
additive latent error as in (2) is equivalent to several other nonparametric monotonicity conditions
considered in the literature. In fact, by combining results from the previous literature, we have the
following lemma:
Lemma 2.1 For $f : W \times E \mapsto \{0, 1\}$, where $W \subseteq \mathbb{R}^{K_W}$, $E \subseteq \mathbb{R}^{K_E}$, the following statements are
equivalent:
(i) For any $w, \tilde{w} \in W$, $f(w, e^*) > f(\tilde{w}, e^*)$ for some $e^* \in E$ $\Rightarrow$ $f(w, e) \ge f(\tilde{w}, e)$ for all $e \in E$.
(ii) There exists a function $\nu : E \mapsto \mathbb{R}$ with range $R(\nu)$ and a function $g : W \times R(\nu) \mapsto \mathbb{R}$ with
$g(w, t)$ weakly increasing in $t$ such that $f(w, e) = g(w, \nu(e))$ for all $(w, e) \in W \times E$.
(iii) There exists a function $\nu : W \mapsto \mathbb{R}$ with range $R(\nu)$ and a function $g : R(\nu) \times E \mapsto \mathbb{R}$ with
$g(t, e)$ weakly increasing in $t$ such that $f(w, e) = g(\nu(w), e)$ for all $(w, e) \in W \times E$.
(iv) There exists a function $\nu : W \mapsto \mathbb{R}$ and a function $\lambda : E \mapsto \mathbb{R}$ such that
$f(w, e) = I\{\nu(w) \ge \lambda(e)\}$ for all $(w, e) \in W \times E$.
Proof: The equivalence between (i) and (iii) follows from Theorem C.1 of Vytlacil and Yildiz
(2007). The equivalences between (i) and (iv) and between (ii) and (iv) follow from straightforward
modifications to the proof of Theorem 1 of Vytlacil (2002).
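The link between (i) and (iv) can be illustrated on finite sets, where it can be checked by enumeration. The sketch below is not the construction used in the cited proofs. It is a hypothetical finite-case illustration in which f is a table of 0/1 values indexed by w (rows) and e (columns), $\nu(w)$ is taken to be the number of columns where row w equals one, and $\lambda(e)$ is the smallest such count among rows that equal one at column e. All function names are assumptions.

```python
from itertools import product


def satisfies_condition_i(f):
    """Check condition (i) for a finite table f[w][e] with entries in {0, 1}."""
    for row_w, row_v in product(f, repeat=2):
        strictly_above_somewhere = any(a > b for a, b in zip(row_w, row_v))
        below_somewhere = any(a < b for a, b in zip(row_w, row_v))
        if strictly_above_somewhere and below_somewhere:
            return False
    return True


def threshold_representation(f):
    """Build (nu, lam) intended to satisfy f[w][e] == I{nu[w] >= lam[e]}."""
    nu = [sum(row) for row in f]
    lam = [
        min((nu[w] for w in range(len(f)) if f[w][e] == 1), default=float("inf"))
        for e in range(len(f[0]))
    ]
    return nu, lam


def represents(f, nu, lam):
    """Verify the threshold crossing representation entry by entry."""
    return all(
        f[w][e] == int(nu[w] >= lam[e])
        for w in range(len(f))
        for e in range(len(f[0]))
    )


if __name__ == "__main__":
    monotone = [[1, 0, 0], [1, 1, 0], [1, 1, 1]]
    crossing = [[1, 0], [0, 1]]
    for table in (monotone, crossing):
        nu, lam = threshold_representation(table)
        print(satisfies_condition_i(table), represents(table, nu, lam))
```

When condition (i) holds, the rows of the table are ordered by inclusion, so the count-based index ranks them correctly and the representation reproduces f. For a table that violates (i), such as the second example, this particular construction fails to reproduce f.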
Restriction (i) in Lemma 2.1 is imposed on the model for D by Imbens and Angrist (1994).
Imbens and Angrist (1994) refer to this restriction as “monotonicity,” whereas Heckman and Vyt-
lacil (2005) refer to it as a uniformity condition. This restriction on D without the corresponding

References
ReportDOI

Dummy Endogenous Variables in a Simultaneous Equation System

James J. Heckman
- 01 Jul 1978 - 
TL;DR: In this article, the authors considered the formulation and estimation of simultaneous equation models with both discrete and continuous endogenous variables and proposed a statistical model that is sufficiently rich to encompass the classical simultaneous equation model for continuous endogenous variable and more recent models for purely discrete endogenous variables as special cases of a more general model.
Posted Content

Varieties of selection bias

Posted Content

Identification and Estimation of Local Average Treatment Effects

TL;DR: In this paper, the authors investigated conditions sufficient for identification of average treatment effects using instrumental variables and showed that the existence of valid instruments is not sufficient to identify any meaningful average treatment effect.
Journal ArticleDOI

An IV Model of Quantile Treatment Effects

TL;DR: In this article, the authors developed a model of quantile treatment effects (QTE) in the presence of endogeneity and obtained conditions for identification of the QTE without functional form assumptions.