
Showing papers by "Donald B. Rubin" published in 1975


Journal Article
TL;DR: In this paper, it was shown that ignoring the process that causes missing data when making sampling distribution inferences about the parameter of the data, θ, is generally appropriate if and only if the missing data are missing at random and the observed data are observed at random, and then such inferences are generally conditional on the observed pattern of missing data.
Abstract: Two results are presented concerning inference when data may be missing. First, ignoring the process that causes missing data when making sampling distribution inferences about the parameter of the data, θ, is generally appropriate if and only if the missing data are “missing at random” and the observed data are “observed at random,” and then such inferences are generally conditional on the observed pattern of missing data. Second, ignoring the process that causes missing data when making Bayesian inferences about θ is generally appropriate if and only if the missing data are missing at random and the parameter of the missing data is “independent” of θ. Examples and discussion indicating the implications of these results are included.

3,914 citations
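
The first result in the abstract above can be illustrated with a small simulation. This is only a sketch under assumed conditions, not the paper's own example: the complete data are hypothetical draws from N(0, 1) with θ = 0, and the two missingness mechanisms (`keep_prob` functions) are invented for illustration. Mechanism A observes each value with a constant probability, so it is both "missing at random" and "observed at random"; mechanism B makes the chance of observation depend on the value itself, so the data are not missing at random.

```python
import random


def observed_mean(values, keep_prob, rng):
    """Mean of the values that survive a missingness mechanism.

    keep_prob(y) is the (hypothetical) probability that value y is observed.
    """
    kept = [y for y in values if rng.random() < keep_prob(y)]
    return sum(kept) / len(kept)


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical complete data: 20,000 draws from N(theta = 0, 1).
y = [rng.gauss(0.0, 1.0) for _ in range(20000)]
full_mean = sum(y) / len(y)

# Mechanism A: each value is observed with probability 0.5, whatever its value.
mean_random = observed_mean(y, lambda v: 0.5, rng)

# Mechanism B: positive values are far more likely to be observed than
# negative ones, so missingness depends on the value itself.
mean_dependent = observed_mean(y, lambda v: 0.9 if v > 0 else 0.1, rng)

print(f"complete-data mean:        {full_mean:.3f}")
print(f"observed mean, mechanism A: {mean_random:.3f}")
print(f"observed mean, mechanism B: {mean_dependent:.3f}")
```

Under mechanism A the mean of the observed values stays close to the complete-data mean, so ignoring the missingness process does little harm. Under mechanism B the same naive estimate is pulled well above θ.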


Journal Article
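TL;DR: In this article, it is proved that a linear combination of independent unbiased estimators of a parameter τ, formed with estimated weights that are independent of the estimators, is unbiased for τ, and the ratio of its variance to that of the minimum-variance unbiased linear estimator is expressed in terms of the mean square errors of the estimated weights.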
Abstract: Let t_1, …, t_k be independent unbiased estimators of a parameter τ. Let t* = Σ α_i t_i have minimum variance among unbiased linear estimators of τ, and let t̂ = Σ α̂_i t_i, with Σ α̂_i = 1, be any linear combination with (α̂_1, …, α̂_k) independent of t_1, …, t_k. We prove that t̂ is unbiased for τ and that the ratio of the variance of t̂ to the variance of t* is 1 + Σ MSE(α̂_i)/α_i, where MSE(α̂_i) is the mean square error of α̂_i as an estimator of α_i.

17 citations
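
The variance-ratio identity that follows from the assumptions in the abstract above, var(t̂)/var(t*) = 1 + Σ MSE(α̂_i)/α_i, can be checked exactly on a toy case. The sketch below is an illustration only: the three estimator variances in `sigma2` and the discrete distribution of estimated weights in `weight_dist` are hypothetical, and exact rational arithmetic is used so that both sides can be compared without rounding error.

```python
from fractions import Fraction as F

# Hypothetical variances of k = 3 independent unbiased estimators t_1, t_2, t_3.
sigma2 = [F(1), F(2), F(4)]

# Minimum-variance weights are proportional to the inverse variances.
inv = [1 / s for s in sigma2]
alpha = [w / sum(inv) for w in inv]
var_opt = 1 / sum(inv)  # variance of t* = sum(alpha_i * t_i)

# Hypothetical distribution of the estimated weights, drawn independently of
# the t_i. Each entry is (probability, weight vector); every vector sums to 1.
weight_dist = [
    (F(1, 2), [F(1, 2), F(1, 4), F(1, 4)]),
    (F(1, 4), [F(3, 5), F(1, 5), F(1, 5)]),
    (F(1, 4), [F(1, 3), F(1, 3), F(1, 3)]),
]

# Var(t_hat) = E[sum(a_i^2 * sigma_i^2)]. The conditional mean of t_hat is tau
# for every weight vector, so the between-weights variance term is zero.
var_hat = sum(
    p * sum(a * a * s for a, s in zip(w, sigma2)) for p, w in weight_dist
)

# Mean square error of each estimated weight about the optimal weight.
mse = [
    sum(p * (w[i] - alpha[i]) ** 2 for p, w in weight_dist)
    for i in range(len(alpha))
]

ratio_direct = var_hat / var_opt
ratio_formula = 1 + sum(m / a for m, a in zip(mse, alpha))

print(ratio_direct, ratio_formula)
```

The two ratios agree exactly, and both exceed 1 because the estimated weights differ from the optimal ones.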


Journal Article
TL;DR: In this paper, a simple example is presented in which an event has taken place that is very rare (in some sense), and useful Bayesian and likelihood inferences are simple to obtain and yet in cases of interest do not agree with simple sampling distribution inferences.
Abstract: A simple example is presented in which an event has taken place that is very rare (in some sense). Useful Bayesian and likelihood inferences are simple to obtain and yet in cases of interest do not agree with simple sampling distribution inferences. Useful sampling distribution inferences seem to be difficult to obtain in these cases.

2 citations


Journal Article
TL;DR: In this paper, a generalization of the multiple correlation coefficient is defined that is appropriate when there are missing values and is identical to the multiple correlation coefficient when no values are missing.
Abstract: The sample multiple correlation coefficient is often used to select a subset of independent variables that “best” predicts a dependent variable, Y. If the data are partially missing, the choice of best predictors often should reflect not only how correlated the predictors are with Y but also how likely they are to be observed. Thus, an independent variable that is highly correlated with Y but also is difficult to record (i.e., is often missing) may not be as useful a predictor of Y as a less correlated but easily recorded independent variable. A generalization of the multiple correlation coefficient is defined which is appropriate when there are missing values but is identical to the multiple correlation coefficient when there are no missing values. An example of its use is presented.

1 citation


Journal Article
TL;DR: In this paper, a method is given for estimating the effect of non-respondents in a sample survey. The method is based on Bayesian techniques and yields a "confidence interval" for the value of the average response in the survey if all non-respondents had responded.
Abstract: A method is given for estimating, in a subjective sense, the effect of non-respondents in a sample survey. The method is based on Bayesian techniques and yields a “confidence interval” for the value of the average response in the survey if all non-respondents had responded. Background information that is recorded for both respondents and non-respondents plays an important role, although it is not needed. The technique is illustrated with real survey data on 660 schools; 488 had 80 dependent variables recorded, 172 had no dependent variables recorded, and all 660 had 35 background variables recorded. On the basis of this example, it appears that the method can be useful in practical problems. The general idea, of which the method presented here is only a specific example, can be applied to any problem with non-respondents and/or missing data.