
Showing papers by "Donald B. Rubin" published in 1986


Journal Article
TL;DR: In this paper, several multiple imputation techniques for simple random samples with ignorable nonresponse on a scalar outcome variable are compared using both analytic and Monte Carlo results concerning coverages of the resulting intervals for the population mean.
Abstract: Several multiple imputation techniques are described for simple random samples with ignorable nonresponse on a scalar outcome variable. The methods are compared using both analytic and Monte Carlo results concerning coverages of the resulting intervals for the population mean. Using m = 2 imputations per missing value gives accurate coverages in common cases and is clearly superior to single imputation (m = 1) in all cases. The performances of the methods for various m can be predicted well by linear interpolation in 1/(m - 1) between the results for m = 2 and m = ∞. As a rough guide, to assure coverages of interval estimates within 2% of the nominal level when using the preferred methods, the number of imputations per missing value should increase from 2 to 3 as the nonresponse rate increases from 10% to 60%.

725 citations
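
The interpolation rule stated in the abstract above (linear in 1/(m - 1) between the results for m = 2 and m = ∞) can be sketched as follows. This is only an illustration: the function name `interpolate_coverage` and the two endpoint coverages are hypothetical and are not taken from the paper.

```python
def interpolate_coverage(m, coverage_m2, coverage_inf):
    """Linearly interpolate coverage in 1/(m - 1) between m = 2 and m = infinity."""
    if m < 2:
        raise ValueError("the interpolation is only defined for m >= 2")
    weight = 1.0 / (m - 1)  # equals 1 at m = 2 and tends to 0 as m grows
    return coverage_inf + weight * (coverage_m2 - coverage_inf)


# Hypothetical endpoint coverages for a nominal 95% interval (assumed values).
COVERAGE_M2 = 0.93
COVERAGE_INF = 0.95

if __name__ == "__main__":
    for m in (2, 3, 5, 10):
        print(m, round(interpolate_coverage(m, COVERAGE_M2, COVERAGE_INF), 4))
```

With these assumed endpoints, moving from m = 2 to m = 3 already closes half of the gap to the m = ∞ coverage, because 1/(m - 1) drops from 1 to 1/2.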



Journal Article
TL;DR: The method of file concatenation with adjusted weights and multiple imputations is described and illustrated on an artificial example, showing the ability to display sensitivity of inference to untestable assumptions being made when creating the matched file.
Abstract: Statistically matched files are created in an attempt to solve the practical problem that exists when no single file has the full set of variables needed for drawing important inferences. Previous methods of file matching are reviewed, and the method of file concatenation with adjusted weights and multiple imputations is described and illustrated on an artificial example. A major benefit of this approach is the ability to display sensitivity of inference to untestable assumptions being made when creating the matched file.

436 citations



Journal Article
TL;DR: In this paper, the stable unit treatment value assumption (SUTVA) is presented as central to causal inference: it is the a priori assumption that the value of an outcome variable for unit u when exposed to treatment t will be the same no matter what mechanism is used to assign treatment t to unit u and no matter what treatments the other units receive.
Abstract: I congratulate my friend Paul Holland on his lucidly clear description of the basic perspective for causal inference referred to as Rubin's model. I have been advocating this general perspective for defining problems of causal inference since Rubin (1974), and with very little modification since Rubin (1978). The one point concerning the definition of causal effects that has continued to evolve in my thinking is the key role of the stable-unit-treatment-value assumption (SUTVA, as labeled in Rubin 1980) for deciding which questions are formulated well enough to have causal answers. Under SUTVA, the model's representation of outcomes is adequate. More explicitly, consider the situation with N units indexed by u = 1, ..., N; T treatments indexed by t = 1, ..., T; and outcome variable Y, whose possible values are represented by Y_tu (t = 1, ..., T; u = 1, ..., N). SUTVA is simply the a priori assumption that the value of Y for unit u when exposed to treatment t will be the same no matter what mechanism is used to assign treatment t to unit u and no matter what treatments the other units receive, and this holds for all u = 1, ..., N and all t = 1, ..., T. SUTVA is violated when, for example, there exist unrepresented versions of treatments (Y_tu depends on which version of treatment t was received) or interference between units (Y_tu depends on whether unit u' received treatment t or t').

309 citations
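
The SUTVA definition in the abstract above can be illustrated with a toy table of potential outcomes. The sketch below is an assumption-laden example, not anything from the paper: the outcome values, the function names, and the additive `spillover` effect are all made up, and treatments are indexed 0 and 1 (stored as `Y[u][t]`) rather than 1, ..., T.

```python
# Made-up potential outcomes Y[u][t] for N = 3 units and T = 2 treatments.
Y = [
    [10.0, 12.0],
    [8.0, 8.5],
    [11.0, 15.0],
]


def observed_under_sutva(assignment):
    """Each unit's observed outcome depends only on that unit's own treatment."""
    return [Y[u][t] for u, t in enumerate(assignment)]


def observed_with_interference(assignment, spillover=1.0):
    """Hypothetical SUTVA violation: each unit gains `spillover` for every
    other unit that receives treatment 1."""
    treated = sum(assignment)
    return [Y[u][t] + spillover * (treated - t) for u, t in enumerate(assignment)]


if __name__ == "__main__":
    a = [1, 0, 0]  # only unit 0 receives treatment 1
    b = [1, 1, 1]  # every unit receives treatment 1
    # Under SUTVA, unit 0's outcome is the same under both assignments.
    print(observed_under_sutva(a)[0], observed_under_sutva(b)[0])
    # With interference, unit 0's outcome changes with the others' treatments.
    print(observed_with_interference(a)[0], observed_with_interference(b)[0])
```

In the first function a single number per (unit, treatment) pair is enough to describe every possible assignment, which is the sense in which the representation of outcomes is adequate under SUTVA. In the second, the same pair can map to different observed values, so the table alone no longer determines the outcome.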


Book Chapter
01 Jan 1986
TL;DR: In this article, the authors discuss the performance of two alternative approaches, the selection model approach and the mixture model approach, for obtaining estimates of means and regression estimates when nonresponse depends on the outcome variable.
Abstract: It is sometimes suspected that nonresponse to a sample survey is related to the primary outcome variable. This is the case, for example, in studies of income or of alcohol consumption behaviors. If nonresponse to a survey is related to the level of the outcome variable, then the sample mean of this outcome variable based on the respondents will generally be a biased estimate of the population mean. If this outcome variable has a linear regression on certain predictor variables in the population, then ordinary least squares estimates of the regression coefficients based on the responding units will generally be biased unless nonresponse is a stochastic function of these predictor variables. The purpose of this paper is to discuss the performance of two alternative approaches, the selection model approach and the mixture model approach, for obtaining estimates of means and regression estimates when nonresponse depends on the outcome variable. Both approaches extend readily to the situation when values of the outcome variable are available for a subsample of the nonrespondents, called “follow-ups.” The availability of follow-ups is a feature of the example we use to illustrate comparisons.

160 citations



Journal Article
TL;DR: Methods of simulating the frequency coverage of an interval estimate by simulating the average Bayesian posterior probability coverage are presented; they can be much more efficient than the standard method that simulates the hit rate of the interval.
Abstract: Methods of simulating the frequency coverage of an interval estimate by simulating the average Bayesian posterior probability coverage are presented. The methods can be much more efficient than the standard method that simulates the hit rate of the interval. The possible increased efficiency is illustrated using three examples: estimating a binomial probability, bootstrapping a variance, and multiple imputation intervals for the mean.

29 citations
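
The contrast drawn in the abstract above, between simulating the hit rate of an interval and simulating its average posterior probability coverage, can be sketched for the binomial case. Everything specific here is an assumption made for illustration and is not the paper's procedure: a uniform prior on the binomial probability, the textbook Wald interval, n = 20, and the helper names `wald_interval`, `posterior_cdf`, `simulate`, and `mean_and_se`.

```python
import math
import random


def wald_interval(x, n, z=1.96):
    """Textbook Wald interval for a binomial probability, clipped to [0, 1]."""
    p_hat = x / n
    half = z * math.sqrt(p_hat * (1.0 - p_hat) / n)
    return max(0.0, p_hat - half), min(1.0, p_hat + half)


def posterior_cdf(q, x, n):
    """CDF at q of Beta(x + 1, n - x + 1), the posterior under a uniform prior.

    For integer parameters the Beta CDF equals a binomial tail sum.
    """
    return sum(
        math.comb(n + 1, j) * q**j * (1.0 - q) ** (n + 1 - j)
        for j in range(x + 1, n + 2)
    )


def simulate(n=20, reps=2000, seed=0):
    """Return per-replicate hit indicators and posterior coverages."""
    rng = random.Random(seed)
    hits, post = [], []
    for _ in range(reps):
        p = rng.random()                             # draw p from the uniform prior
        x = sum(rng.random() < p for _ in range(n))  # draw x ~ Binomial(n, p)
        lo, hi = wald_interval(x, n)
        hits.append(1.0 if lo <= p <= hi else 0.0)   # standard method: hit or miss
        post.append(posterior_cdf(hi, x, n) - posterior_cdf(lo, x, n))
    return hits, post


def mean_and_se(values):
    """Sample mean and its Monte Carlo standard error."""
    m = sum(values) / len(values)
    var = sum((v - m) ** 2 for v in values) / (len(values) - 1)
    return m, math.sqrt(var / len(values))


if __name__ == "__main__":
    hits, post = simulate()
    print("hit rate          :", mean_and_se(hits))
    print("posterior coverage:", mean_and_se(post))
```

Both averages estimate the same quantity, the coverage of the interval averaged over the prior. The posterior coverage is the conditional expectation of the hit indicator given the data, so by the law of total variance its Monte Carlo variance cannot exceed that of the hit rate in this setup.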


Book Chapter
01 Jan 1986
TL;DR: In this article, Lord's Paradox is analyzed in terms of a simple mathematical model for causal inference: the descriptive, non-causal conclusions of the two hypothetical statisticians are both correct, while their causal inferences rest on different assumptions that cannot be tested using the data set described in the example.
Abstract: Lord’s Paradox is analyzed in terms of a simple mathematical model for causal inference. The resolution of Lord’s Paradox from this perspective has two aspects. First, the descriptive, non-causal conclusions of the two hypothetical statisticians are both correct. They appear contradictory only because they describe quite different aspects of the data. Second, the causal inferences of the statisticians are neither correct nor incorrect since they are based on different assumptions that our mathematical model makes explicit, but neither assumption can be tested using the data set that is described in the example. We identify these differing assumptions and show how each may be used to justify the differing causal conclusions of the two statisticians. In addition to analyzing the classic “diet” example which Lord used to introduce his paradox, we also examine three other examples that appear in the three papers where Lord discusses the paradox and related matters.

2 citations