Journal ArticleDOI

Constructing a Control Group Using Multivariate Matched Sampling Methods That Incorporate the Propensity Score

01 Feb 1985-The American Statistician (American Statistical Association)-Vol. 39, Iss: 1, pp 33-38
TL;DR: This article illustrates multivariate matching methods in an observational study of the effects of prenatal exposure to barbiturates on subsequent psychological development, using the propensity score as a distinct matching variable.
Abstract: Matched sampling is a method for selecting units from a large reservoir of potential controls to produce a control group of modest size that is similar to a treated group with respect to the distribution of observed covariates. We illustrate the use of multivariate matching methods in an observational study of the effects of prenatal exposure to barbiturates on subsequent psychological development. A key idea is the use of the propensity score as a distinct matching variable.
Citations
Journal ArticleDOI
TL;DR: The propensity score is a balancing score: conditional on the propensity score, the distribution of observed baseline covariates will be similar between treated and untreated subjects. The article also describes different causal average treatment effects and their relationship with propensity score analyses.
Abstract: The propensity score is the probability of treatment assignment conditional on observed baseline characteristics. The propensity score allows one to design and analyze an observational (nonrandomized) study so that it mimics some of the particular characteristics of a randomized controlled trial. In particular, the propensity score is a balancing score: conditional on the propensity score, the distribution of observed baseline covariates will be similar between treated and untreated subjects. I describe 4 different propensity score methods: matching on the propensity score, stratification on the propensity score, inverse probability of treatment weighting using the propensity score, and covariate adjustment using the propensity score. I describe balance diagnostics for examining whether the propensity score model has been adequately specified. Furthermore, I discuss differences between regression-based methods and propensity score-based methods for the analysis of observational data. I describe different causal average treatment effects and their relationship with propensity score analyses.
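The definition and the balancing property stated in this abstract can be written compactly. The symbols below are assumed notation, not taken from the abstract: Z is the binary treatment indicator (1 for treated), X is the vector of observed baseline covariates, and e(x) is the propensity score.

```latex
% Propensity score: probability of treatment given observed baseline covariates
e(x) = \Pr(Z = 1 \mid X = x)

% Balancing property: conditional on the propensity score, treatment status
% carries no further information about the observed covariates
X \perp Z \mid e(X)
```

The second line says only that observed covariates are balanced at a given value of e(X); it makes no claim about unobserved covariates.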

7,895 citations


Cites background or methods from "Constructing a Control Group Using ..."

  • ...There are two primary methods for this: nearest neighbor matching and nearest neighbor matching within a specified caliper distance (Rosenbaum & Rubin, 1985)....

    [...]

  • ...Building on the prior work of Cochran and Rubin on matching on a single normally distributed confounding variable, Rosenbaum and Rubin (1985) suggested that similar reduction in bias can be achieved by matching on the logit of the propensity score using caliper widths similar to those described by…...

    [...]

  • ...Propensity score matching entails forming matched sets of treated and untreated subjects who share a similar value of the propensity score (Rosenbaum & Rubin, 1983a, 1985)....

    [...]
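The excerpts above mention nearest neighbor matching within a caliper on the logit of the propensity score. The following is a minimal sketch of that idea, not the procedure of any of the cited papers: it assumes greedy 1:1 matching without replacement, processes treated units in the order given, and takes the caliper as a free parameter because the excerpt does not state a width. The function names `logit` and `caliper_match` and the example scores are hypothetical.

```python
import math


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def caliper_match(treated_ps, control_ps, caliper):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    treated_ps, control_ps: lists of estimated propensity scores in (0, 1).
    caliper: largest allowed absolute difference on the logit scale
             (hypothetical tuning parameter; no value is given in the text).
    Returns a list of (treated_index, control_index) pairs.
    """
    treated_logit = [logit(p) for p in treated_ps]
    control_logit = [logit(p) for p in control_ps]
    available = set(range(len(control_ps)))  # controls not yet used
    pairs = []
    for t, lt in enumerate(treated_logit):
        if not available:
            break
        # Closest unused control on the logit scale; ties go to the lower index.
        best = min(available, key=lambda c: (abs(control_logit[c] - lt), c))
        # Keep the pair only if it falls inside the caliper.
        if abs(control_logit[best] - lt) <= caliper:
            pairs.append((t, best))
            available.remove(best)
    return pairs


# Made-up scores: the third treated unit has no control within the caliper.
pairs = caliper_match([0.60, 0.30, 0.90],
                      [0.58, 0.10, 0.33, 0.65, 0.50],
                      caliper=0.2)
print(pairs)
```

Treated units left unmatched are simply dropped in this sketch, which changes the population the estimate refers to; a real analysis would report how many units were discarded.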

Journal ArticleDOI
TL;DR: Propensity score matching (PSM) has become a popular approach to estimating causal treatment effects. Each implementation step involves many decisions, and this paper discusses those implementation issues and gives guidance to researchers who want to use PSM for evaluation purposes.
Abstract: Propensity score matching (PSM) has become a popular approach to estimate causal treatment effects. It is widely applied when evaluating labour market policies, but empirical examples can be found in very diverse fields of study. Once the researcher has decided to use PSM, he is confronted with a lot of questions regarding its implementation. To begin with, a first decision has to be made concerning the estimation of the propensity score. Following that one has to decide which matching algorithm to choose and determine the region of common support. Subsequently, the matching quality has to be assessed and treatment effects and their standard errors have to be estimated. Furthermore, questions like 'what to do if there is choice-based sampling?' or 'when to measure effects?' can be important in empirical studies. Finally, one might also want to test the sensitivity of estimated treatment effects with respect to unobserved heterogeneity or failure of the common support condition. Each implementation step involves a lot of decisions and different approaches can be thought of. The aim of this paper is to discuss these implementation issues and give some guidance to researchers who want to use PSM for evaluation purposes.

5,510 citations

Journal ArticleDOI
TL;DR: This paper decomposes the conventional measure of programme evaluation bias into several components and finds that bias due to selection on unobservables, commonly called selection bias in econometrics, is empirically less important than other components, although it is still a sizeable fraction of the estimated programme impact.
Abstract: This paper considers whether it is possible to devise a nonexperimental procedure for evaluating a prototypical job training programme. Using rich nonexperimental data, we examine the performance of a two-stage evaluation methodology that (a) estimates the probability that a person participates in a programme and (b) uses the estimated probability in extensions of the classical method of matching. We decompose the conventional measure of programme evaluation bias into several components and find that bias due to selection on unobservables, commonly called selection bias in econometrics, is empirically less important than other components, although it is still a sizeable fraction of the estimated programme impact. Matching methods applied to comparison groups located in the same labour markets as participants and administered the same questionnaire eliminate much of the bias as conventionally measured, but the remaining bias is a considerable fraction of experimentally-determined programme impact estimates. We test and reject the identifying assumptions that justify the classical method of matching. We present a nonparametric conditional difference-in-differences extension of the method of matching that is consistent with the classical index-sufficient sample selection model and is not rejected by our tests of identifying assumptions. This estimator is effective in eliminating bias, especially when it is due to temporally-invariant omitted variables.

5,069 citations

Journal ArticleDOI
TL;DR: The propensity score, defined as the conditional probability of being treated given the covariates, can be used to balance the covariates in the two groups and therefore reduce bias.
Abstract: In observational studies, investigators have no control over the treatment assignment. The treated and non-treated (that is, control) groups may have large differences on their observed covariates, and these differences can lead to biased estimates of treatment effects. Even traditional covariance analysis adjustments may be inadequate to eliminate this bias. The propensity score, defined as the conditional probability of being treated given the covariates, can be used to balance the covariates in the two groups, and therefore reduce this bias. In order to estimate the propensity score, one must model the distribution of the treatment indicator variable given the observed covariates. Once estimated the propensity score can be used to reduce bias through matching, stratification (subclassification), regression adjustment, or some combination of all three. In this tutorial we discuss the uses of propensity score methods for bias reduction, give references to the literature and illustrate the uses through applied examples.

4,948 citations

Journal ArticleDOI
TL;DR: In this article, the authors use a particular model for causal inference (Holland and Rubin 1983; Rubin 1974) to critique the discussions of other writers on causation and causal inference.
Abstract: Problems involving causal inference have dogged at the heels of statistics since its earliest days. Correlation does not imply causation, and yet causal conclusions drawn from a carefully designed experiment are often valid. What can a statistical model say about causation? This question is addressed by using a particular model for causal inference (Holland and Rubin 1983; Rubin 1974) to critique the discussions of other writers on causation and causal inference. These include selected philosophers, medical researchers, statisticians, econometricians, and proponents of causal modeling.

4,845 citations

References
Journal ArticleDOI
TL;DR: This paper discusses the central role of propensity scores and balancing scores in the analysis of observational studies and shows that adjustment for the scalar propensity score is sufficient to remove bias due to all observed covariates.
Abstract: The results of observational studies are often disputed because of nonrandom treatment assignment. For example, patients at greater risk may be overrepresented in some treatment group. This paper discusses the central role of propensity scores and balancing scores in the analysis of observational studies. The propensity score is the (estimated) conditional probability of assignment to a particular treatment given a vector of observed covariates. Both large and small sample theory show that adjustment for the scalar propensity score is sufficient to remove bias due to all observed covariates. Applications include: matched sampling on the univariate propensity score which is equal percent bias reducing under more general conditions than required for discriminant matching, multivariate adjustment by subclassification on balancing scores where the same subclasses are used to estimate treatment effects for all outcome variables and in all subpopulations, and visual representation of multivariate adjustment by a two-dimensional plot.

23,744 citations

Journal ArticleDOI
David Cox
01 May 1971

4,635 citations

Journal ArticleDOI
TL;DR: In this article, five subclasses defined by the estimated propensity score are constructed that balance 74 covariates and thereby provide estimates of treatment effects using direct adjustment. These subclasses are applied within sub-populations, and model-based adjustments are then used to provide estimates of treatment effects within these sub-populations.
Abstract: The propensity score is the conditional probability of assignment to a particular treatment given a vector of observed covariates. Previous theoretical arguments have shown that subclassification on the propensity score will balance all observed covariates. Subclassification on an estimated propensity score is illustrated, using observational data on treatments for coronary artery disease. Five subclasses defined by the estimated propensity score are constructed that balance 74 covariates, and thereby provide estimates of treatment effects using direct adjustment. These subclasses are applied within sub-populations, and model-based adjustments are then used to provide estimates of treatment effects within these sub-populations. Two appendixes address theoretical issues related to the application: the effectiveness of subclassification on the propensity score in removing bias, and balancing properties of propensity scores with incomplete data.
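Subclassification with direct adjustment, as described in this abstract, can be sketched in a few lines. This is an illustration under stated assumptions rather than the paper's procedure: subclasses are formed as equal-sized groups by rank of the propensity score, each subclass is weighted by its total size, and subclasses lacking either group are skipped. The function names `subclassify` and `direct_adjustment` and the toy data are hypothetical.

```python
def subclassify(ps, n_classes=5):
    """Assign each unit to one of n_classes equal-sized subclasses by rank of ps."""
    order = sorted(range(len(ps)), key=lambda i: ps[i])
    strata = [0] * len(ps)
    for rank, i in enumerate(order):
        strata[i] = rank * n_classes // len(ps)
    return strata


def direct_adjustment(ps, treated, outcome, n_classes=5):
    """Size-weighted average of within-subclass treated-minus-control mean differences.

    Assumption: each subclass is weighted by its number of units, and a
    subclass with no treated or no control units is skipped.
    """
    strata = subclassify(ps, n_classes)
    total, weight = 0.0, 0
    for s in range(n_classes):
        idx = [i for i in range(len(ps)) if strata[i] == s]
        y1 = [outcome[i] for i in idx if treated[i] == 1]
        y0 = [outcome[i] for i in idx if treated[i] == 0]
        if not y1 or not y0:
            continue  # no within-subclass comparison is possible
        diff = sum(y1) / len(y1) - sum(y0) / len(y0)
        total += diff * len(idx)
        weight += len(idx)
    if weight == 0:
        raise ValueError("no subclass contains both treated and control units")
    return total / weight


# Toy data: the outcome baseline rises with the propensity score, and
# treatment adds exactly 2.0 within every subclass.
ps = [0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95]
treated = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
strata = subclassify(ps)
outcome = [10.0 * s + 2.0 * z for s, z in zip(strata, treated)]
effect = direct_adjustment(ps, treated, outcome)
print(strata, effect)
```

Comparing units only within a subclass keeps the rising baseline out of the estimate, which is the point of adjusting on the score.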

3,860 citations

Journal ArticleDOI
TL;DR: This paper proposes a simple technique for assessing the range of plausible causal conclusions from observational studies with a binary outcome and an observed categorical covariate. The technique assesses the sensitivity of conclusions to assumptions about an unobserved binary covariate relevant to both treatment assignment and response.
Abstract: This paper proposes a simple technique for assessing the range of plausible causal conclusions from observational studies with a binary outcome and an observed categorical covariate. The technique assesses the sensitivity of conclusions to assumptions about an unobserved binary covariate relevant to both treatment assignment and response. A medical study of coronary artery disease is used to illustrate the technique. Inevitably, the results of clinical studies are subject to dispute. In observational studies, one basis for dispute is obvious: since patients were not assigned to treatments at random, patients at greater risk may be over-represented in some treatment groups. This paper proposes a method for assessing the sensitivity of causal conclusions to an unmeasured patient characteristic relevant to both treatment assignment and response. Despite their limitations, observational studies will continue to be a valuable source of information, and therefore it is prudent to develop appropriate methods of analysis for them. Our sensitivity analysis consists of the estimation of the average effect of a treatment on a binary outcome variable after adjustment for observed categorical covariates and an unobserved binary covariate u, under several sets of assumptions about u. Both Cornfield et al. (1959) and Bross (1966) have proposed guidelines for determining whether an unmeasured binary covariate having specified properties could explain all of the apparent effect of a treatment, that is, whether the treatment effect, after adjustment for u, could be zero. Our method has two advantages: first, Cornfield et al. (1959) and Bross (1966) adjust only for the unmeasured binary covariate u, whereas we adjust for measured covariates in addition to the unmeasured covariate u. Second, Cornfield et al. (1959) and Bross (1966, 1967) only judge whether the effect of the treatment could be zero having adjusted for u, where Cornfield et al. (1959) employ an implicit yet extreme assumption about u. In contrast, we provide actual estimates of the treatment effect adjusted for both u and the observed categorical covariates under any assumption about u. In principle, the ith of the N patients under study has both a binary response $r_{1i}$ that would have resulted if he had received the new treatment, and a binary response $r_{0i}$ that would have resulted if he had received the control treatment. In this formulation, treatment effects are comparisons of $r_{1i}$ and $r_{0i}$, such as $r_{1i} - r_{0i}$. Since each patient receives only one treatment, either $r_{1i}$ or $r_{0i}$ is observed, but not both, and therefore comparisons of $r_{1i}$ and $r_{0i}$ imply some degree of speculation. Treatment effects defined as comparisons of the two potential responses, $r_{1i}$ and $r_{0i}$, of individual patients are implicit in Fisher's (1953) randomization test of the sharp null

1,005 citations

Book ChapterDOI
01 Jun 1974
TL;DR: This paper reviews the effectiveness of matched sampling and statistical adjustment, alone and in combination, in reducing bias due to confounding x-variables when comparing two populations. The adjustment methods were linear regression adjustment for x continuous and direct standardization for x categorical.
Abstract: This paper reviews work on the effectiveness of different methods of matched sampling and statistical adjustment, alone and in combination, in reducing bias due to confounding x-variables when comparing two populations. The adjustment methods were linear regression adjustment for x continuous and direct standardization for x categorical.

994 citations