Open AccessJournal ArticleDOI

Recursive partitioning for heterogeneous causal effects

Susan Athey, +1 more
- 05 Jul 2016 - 
- Vol. 113, Iss: 27, pp 7353-7360
TLDR
This paper provides a data-driven approach to partition the data into subpopulations that differ in the magnitude of their treatment effects, and proposes an “honest” approach to estimation, whereby one sample is used to construct the partition and another to estimate treatment effects for each subpopulation.
Abstract
In this paper we propose methods for estimating heterogeneity in causal effects in experimental and observational studies and for conducting hypothesis tests about the magnitude of differences in treatment effects across subsets of the population. We provide a data-driven approach to partition the data into subpopulations that differ in the magnitude of their treatment effects. The approach enables the construction of valid confidence intervals for treatment effects, even with many covariates relative to the sample size, and without “sparsity” assumptions. We propose an “honest” approach to estimation, whereby one sample is used to construct the partition and another to estimate treatment effects for each subpopulation. Our approach builds on regression tree methods, modified to optimize for goodness of fit in treatment effects and to account for honest estimation. Our model selection criterion anticipates that bias will be eliminated by honest estimation and also accounts for the effect of making additional splits on the variance of treatment effect estimates within each subpopulation. We address the challenge that the “ground truth” for a causal effect is not observed for any individual unit, so that standard approaches to cross-validation must be modified. Through a simulation study, we show that for our preferred method, honest estimation results in nominal coverage for 90% confidence intervals, whereas coverage ranges between 74% and 84% for nonhonest approaches. Honest estimation requires estimating the model with a smaller sample size; the cost in terms of mean squared error of treatment effects for our preferred method ranges between 7% and 22%.
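The sample-splitting idea in the abstract can be illustrated with a small sketch. It is not the paper's algorithm: it assumes synthetic data from a randomized experiment, a single covariate, and a single split, and it uses a simplified splitting rule (maximize the squared gap between difference-in-means estimates on the two sides). All function and variable names are hypothetical. The point is only that the partition is chosen on one half of the data and the leaf effects are estimated on the other half.

```python
import numpy as np

def diff_in_means(y, w):
    """Treated-minus-control mean outcome; NaN if either arm is empty."""
    if w.sum() == 0 or (1 - w).sum() == 0:
        return np.nan
    return y[w == 1].mean() - y[w == 0].mean()

def choose_split(x, y, w, min_arm=20):
    """Pick a threshold on x using a simplified criterion (an assumption,
    not the paper's): the squared gap between the two sides' effects."""
    best_thr, best_gap = None, -np.inf
    for thr in np.quantile(x, np.linspace(0.1, 0.9, 17)):
        left = x <= thr
        right = ~left
        # Require enough treated and control units on each side.
        arm_sizes = [w[left].sum(), (1 - w[left]).sum(),
                     w[right].sum(), (1 - w[right]).sum()]
        if min(arm_sizes) < min_arm:
            continue
        gap = (diff_in_means(y[left], w[left])
               - diff_in_means(y[right], w[right])) ** 2
        if gap > best_gap:
            best_thr, best_gap = thr, gap
    return best_thr

def honest_leaf_effects(x, y, w, seed=0):
    """Build the partition on one half, estimate leaf effects on the other."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(x))
    train, est = idx[: len(x) // 2], idx[len(x) // 2:]
    thr = choose_split(x[train], y[train], w[train])
    if thr is None:
        raise ValueError("no admissible split found on the training half")
    left = x[est] <= thr
    tau_l = diff_in_means(y[est][left], w[est][left])
    tau_r = diff_in_means(y[est][~left], w[est][~left])
    return thr, tau_l, tau_r

# Synthetic experiment: the true effect is 0 for x <= 0 and 2 for x > 0.
rng = np.random.default_rng(42)
n = 4000
x = rng.uniform(-1.0, 1.0, size=n)
w = rng.integers(0, 2, size=n)            # randomized treatment assignment
y = 2.0 * (x > 0) * w + rng.normal(size=n)

thr, tau_left, tau_right = honest_leaf_effects(x, y, w)
print(f"split at x <= {thr:.2f}: left effect {tau_left:.2f}, "
      f"right effect {tau_right:.2f}")
```

Because the estimation half plays no role in choosing the threshold, the leaf estimates are not biased by the search over candidate splits, at the cost of using only half of the sample for each step.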


Citations
Journal ArticleDOI

Infinite but Rare: Valuation and Pricing in Marketplaces for Blockchain-Based Virtual Items

TL;DR: In this paper, the authors study how buyers value and sellers price blockchain-based digital collectables, propose a machine learning approach to value items at scale, and develop a proof-of-concept decision support tool to help sellers value their digital items, addressing the pressing need for information transparency in this new market.
Journal ArticleDOI

Causal interaction trees: Finding subgroups with heterogeneous treatment effects in observational data

TL;DR: In this article, causal interaction tree (CIT) algorithms are introduced for finding subgroups of individuals with heterogeneous treatment effects in observational data. The CIT algorithms are extensions of the classification and regression tree algorithm that use splitting criteria based on subgroup-specific treatment effect estimators appropriate for observational data.
Posted Content

Improving pairwise comparison models using Empirical Bayes shrinkage.

TL;DR: This work derives a collection of methods for estimating the uncertainty of pairwise predictions under different assumptions about the comparison process, and uses them to examine model uncertainty and to perform Empirical Bayes shrinkage estimation of the model parameters.
Posted Content

Estimating Bayesian Optimal Treatment Regimes for Dichotomous Outcomes using Observational Data

TL;DR: In this paper, Bayes optimal treatment regimes are estimated using a loss function defined on the bivariate distribution of dichotomous potential outcomes. This allows more general objectives for the optimal treatment regime (OTR) than maximization of an expected outcome (e.g., survival probability), for example by taking unnecessary treatment burden into account.
Posted ContentDOI

Robust Machine Learning for Treatment Effects in Multilevel Observational Studies Under Cluster-level Unmeasured Confounding.

TL;DR: This paper proposes a family of ML methods that estimate treatment effects in the presence of cluster-level unmeasured confounders, a type of unmeasured confounder that is shared within each cluster and is common in multilevel observational studies.
References
Journal ArticleDOI

Random Forests

TL;DR: Internal estimates monitor error, strength, and correlation and these are used to show the response to increasing the number of features used in the forest, and are also applicable to regression.
Journal ArticleDOI

Regression Shrinkage and Selection via the Lasso

TL;DR: A new method for estimation in linear models called the lasso, which minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a constant, is proposed.
Book

The Nature of Statistical Learning Theory

TL;DR: Setting of the learning problem; consistency of learning processes; bounds on the rate of convergence of learning processes; controlling the generalization ability of learning processes; constructing learning algorithms; what is important in learning theory?

Statistical learning theory

TL;DR: Presenting a method for determining the necessary and sufficient conditions for consistency of learning process, the author covers function estimates from small data pools, applying these estimations to real-life problems, and much more.
Journal ArticleDOI

The central role of the propensity score in observational studies for causal effects

Paul R. Rosenbaum, +1 more
- 01 Apr 1983 - 
TL;DR: The authors discuss the central role of propensity scores and balancing scores in the analysis of observational studies and show that adjustment for the scalar propensity score is sufficient to remove bias due to all observed covariates.