Journal Article

Inference and missing data

Donald B. Rubin
01 Dec 1976, Vol. 63, Iss. 3, pp. 581-592
Abstract
Two results are presented concerning inference when data may be missing. First, ignoring the process that causes missing data when making sampling distribution inferences about the parameter of the data, θ, is generally appropriate if and only if the missing data are “missing at random” and the observed data are “observed at random,” and then such inferences are generally conditional on the observed pattern of missing data. Second, ignoring the process that causes missing data when making Bayesian inferences about θ is generally appropriate if and only if the missing data are missing at random and the parameter of the missing data is “independent” of θ. Examples and discussion indicating the implications of these results are included.
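The first result can be illustrated with a small simulation. The sketch below is not taken from the article: the function name `simulate`, the standard normal data, and the specific missingness probabilities are all assumptions chosen for illustration. It compares the mean of the observed values under two hypothetical mechanisms: one in which values go missing independently of the data, and one in which the chance of going missing depends on the unobserved value itself, so the data are not missing at random.

```python
import random
import statistics


def simulate(n=20000, seed=0):
    """Compare complete-case means under two hypothetical missingness mechanisms."""
    rng = random.Random(seed)
    # Full data: a univariate sample whose true mean (the parameter of interest) is 0.
    y = [rng.gauss(0.0, 1.0) for _ in range(n)]

    # Mechanism A (assumed): each value is missing with probability 0.5,
    # independently of the data, so the missingness process can be ignored.
    obs_a = [v for v in y if rng.random() >= 0.5]

    # Mechanism B (assumed): positive values are missing with probability 0.8,
    # non-positive values with probability 0.2. Missingness depends on the
    # value that would have been observed, so the data are not missing at random.
    obs_b = [v for v in y if rng.random() >= (0.8 if v > 0 else 0.2)]

    return {
        "full": statistics.fmean(y),
        "ignorable": statistics.fmean(obs_a),
        "nonignorable": statistics.fmean(obs_b),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name:>13}: {value:+.3f}")
```

Under these assumptions, the mean of the observed values stays close to the full-data mean for mechanism A, while mechanism B pulls it clearly below zero, which is the kind of distortion that arises when the missingness process is ignored although it should not be.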


Citations
Journal Article

A Two-Part Random-Effects Model for Semicontinuous Longitudinal Data

TL;DR: In this article, the authors extend the two-part regression approach to longitudinal settings by introducing random coefficients into both the logistic and the linear stages, and obtain maximum likelihood estimates for the fixed coefficients and variance components by an approximate Fisher scoring procedure based on high-order Laplace approximations.
Journal Article

The proportion of missing data should not be used to guide decisions on multiple imputation.

TL;DR: Evidence is provided that for MAR data, valid MI reduces bias even when the proportion of missingness is large, and researchers are advised to use FMI to guide choice of auxiliary variables for efficiency gain in imputation analyses, and that sensitivity analyses including different imputation models may be needed if the number of complete cases is small.
Posted Content

Learning from Noisy Labels with Deep Neural Networks: A Survey

TL;DR: A comprehensive review of 62 state-of-the-art robust training methods, all of which are categorized into five groups according to their methodological difference, followed by a systematic comparison of six properties used to evaluate their superiority.
Journal Article

Semiparametric Regression for Repeated Outcomes With Nonignorable Nonresponse

TL;DR: The authors propose a class of augmented inverse probability of response weighted estimators that are consistent and asymptotically normal (CAN) for estimating β* when the response probabilities can be parametrically modeled and a CAN estimator exists.
Journal Article

Average causal effects from nonrandomized studies: A practical guide and simulated example.

TL;DR: In this article, the authors review Rubin's definition of an average causal effect (ACE) as the average difference between potential outcomes under different treatments, and survey 9 strategies for estimating ACEs based on regression, propensity scores, and doubly robust methods.