Topic

# Imputation (statistics)

About: Imputation (statistics) is a research topic. Over the lifetime, 8203 publications have been published within this topic receiving 315547 citations. The topic is also known as: data imputation.

##### Papers published on a yearly basis

##### Papers

More filters

•

01 Jan 1987

TL;DR: This work states that maximum Likelihood for General Patterns of Missing Data: Introduction and Theory with Ignorable Nonresponse and large-Sample Inference Based on Maximum Likelihood Estimates is likely to be high.

Abstract: Preface.PART I: OVERVIEW AND BASIC APPROACHES.Introduction.Missing Data in Experiments.Complete-Case and Available-Case Analysis, Including Weighting Methods.Single Imputation Methods.Estimation of Imputation Uncertainty.PART II: LIKELIHOOD-BASED APPROACHES TO THE ANALYSIS OF MISSING DATA.Theory of Inference Based on the Likelihood Function.Methods Based on Factoring the Likelihood, Ignoring the Missing-Data Mechanism.Maximum Likelihood for General Patterns of Missing Data: Introduction and Theory with Ignorable Nonresponse.Large-Sample Inference Based on Maximum Likelihood Estimates.Bayes and Multiple Imputation.PART III: LIKELIHOOD-BASED APPROACHES TO THE ANALYSIS OF MISSING DATA: APPLICATIONS TO SOME COMMON MODELS.Multivariate Normal Examples, Ignoring the Missing-Data Mechanism.Models for Robust Estimation.Models for Partially Classified Contingency Tables, Ignoring the Missing-Data Mechanism.Mixed Normal and Nonnormal Data with Missing Values, Ignoring the Missing-Data Mechanism.Nonignorable Missing-Data Models.References.Author Index.Subject Index.

18,201 citations

•

01 Jan 1987

TL;DR: In this article, a survey of drinking behavior among men of retirement age was conducted and the results showed that the majority of the participants reported that they did not receive any benefits from the Social Security Administration.

Abstract: Tables and Figures. Glossary. 1. Introduction. 1.1 Overview. 1.2 Examples of Surveys with Nonresponse. 1.3 Properly Handling Nonresponse. 1.4 Single Imputation. 1.5 Multiple Imputation. 1.6 Numerical Example Using Multiple Imputation. 1.7 Guidance for the Reader. 2. Statistical Background. 2.1 Introduction. 2.2 Variables in the Finite Population. 2.3 Probability Distributions and Related Calculations. 2.4 Probability Specifications for Indicator Variables. 2.5 Probability Specifications for (X,Y). 2.6 Bayesian Inference for a Population Quality. 2.7 Interval Estimation. 2.8 Bayesian Procedures for Constructing Interval Estimates, Including Significance Levels and Point Estimates. 2.9 Evaluating the Performance of Procedures. 2.10 Similarity of Bayesian and Randomization--Based Inferences in Many Practical Cases. 3. Underlying Bayesian Theory. 3.1 Introduction and Summary of Repeated--Imputation Inferences. 3.2 Key Results for Analysis When the Multiple Imputations are Repeated Draws from the Posterior Distribution of the Missing Values. 3.3 Inference for Scalar Estimands from a Modest Number of Repeated Completed--Data Means and Variances. 3.4 Significance Levels for Multicomponent Estimands from a Modest Number of Repeated Completed--Data Means and Variance--Covariance Matrices. 3.5 Significance Levels from Repeated Completed--Data Significance Levels. 3.6 Relating the Completed--Data and Completed--Data Posterior Distributions When the Sampling Mechanism is Ignorable. 4. Randomization--Based Evaluations. 4.1 Introduction. 4.2 General Conditions for the Randomization--Validity of Infinite--m Repeated--Imputation Inferences. 4.3Examples of Proper and Improper Imputation Methods in a Simple Case with Ignorable Nonresponse. 4.4 Further Discussion of Proper Imputation Methods. 4.5 The Asymptotic Distibution of (Qm,Um,Bm) for Proper Imputation Methods. 4.6 Evaluations of Finite--m Inferences with Scalar Estimands. 4.7 Evaluation of Significance Levels from the Moment--Based Statistics Dm and Dm with Multicomponent Estimands. 4.8 Evaluation of Significance Levels Based on Repeated Significance Levels. 5. Procedures with Ignorable Nonresponse. 5.1 Introduction. 5.2 Creating Imputed Values under an Explicit Model. 5.3 Some Explicit Imputation Models with Univariate YI and Covariates. 5.4 Monotone Patterns of Missingness in Multivariate YI. 5.5 Missing Social Security Benefits in the Current Population Survey. 5.6 Beyond Monotone Missingness. 6. Procedures with Nonignorable Nonresponse. 6.1 Introduction. 6.2 Nonignorable Nonresponse with Univariate YI and No XI. 6.3 Formal Tasks with Nonignorable Nonresponse. 6.4 Illustrating Mixture Modeling Using Educational Testing Service Data. 6.5 Illustrating Selection Modeling Using CPS Data. 6.6 Extensions to Surveys with Follow--Ups. 6.7 Follow--Up Response in a Survey of Drinking Behavior Among Men of Retirement Age. References. Author Index. Subject Index. Appendix I. Report Written for the Social Security Administration in 1977. Appendix II. Report Written for the Census Bureau in 1983.

14,574 citations

••

TL;DR: Mice adds new functionality for imputing multilevel data, automatic predictor selection, data handling, post-processing imputed values, specialized pooling routines, model selection tools, and diagnostic graphs.

Abstract: The R package mice imputes incomplete multivariate data by chained equations. The software mice 1.0 appeared in the year 2000 as an S-PLUS library, and in 2001 as an R package. mice 1.0 introduced predictor selection, passive imputation and automatic pooling. This article documents mice, which extends the functionality of mice 1.0 in several ways. In mice, the analysis of imputed data is made completely general, whereas the range of models under which pooling works is substantially extended. mice adds new functionality for imputing multilevel data, automatic predictor selection, data handling, post-processing imputed values, specialized pooling routines, model selection tools, and diagnostic graphs. Imputation of categorical data is improved in order to bypass problems caused by perfect prediction. Special attention is paid to transformations, sum scores, indices and interactions using passive imputation, and to the proper setup of the predictor matrix. mice can be downloaded from the Comprehensive R Archive Network. This article provides a hands-on, stepwise approach to solve applied incomplete data problems.

10,234 citations

••

TL;DR: In this article, it was shown that ignoring the process that causes missing data when making sampling distribution inferences about the parameter of the data, θ, is generally appropriate if and only if the missing data are missing at random and the observed data are observed at random, and then such inferences are generally conditional on the observed pattern of missing data.

Abstract: Two results are presented concerning inference when data may be missing. First, ignoring the process that causes missing data when making sampling distribution inferences about the parameter of the data, θ, is generally appropriate if and only if the missing data are “missing at random” and the observed data are “observed at random,” and then such inferences are generally conditional on the observed pattern of missing data. Second, ignoring the process that causes missing data when making Bayesian inferences about θ is generally appropriate if and only if the missing data are missing at random and the parameter of the missing data is “independent” of θ. Examples and discussion indicating the implications of these results are included.

8,197 citations

••

01 Jan 2001

TL;DR: In this article, the authors present a case study in least squares fitting and interpretation of a linear model, where they use nonparametric transformations of X and Y to fit a linear regression model.

Abstract: Introduction * General Aspects of Fitting Regression Models * Missing Data * Multivariable Modeling Strategies * Resampling, Validating, Describing, and Simplifying the Model * S-PLUS Software * Case Study in Least Squares Fitting and Interpretation of a Linear Model * Case Study in Imputation and Data Reduction * Overview of Maximum Likelihood Estimation * Binary Logistic Regression * Logistic Model Case Study 1: Predicting Cause of Death * Logistic Model Case Study 2: Survival of Titanic Passengers * Ordinal Logistic Regression * Case Study in Ordinal Regrssion, Data Reduction, and Penalization * Models Using Nonparametic Transformations of X and Y * Introduction to Survival Analysis * Parametric Survival Models * Case Study in Parametric Survival Modeling and Model Approximation * Cox Proportional Hazards Regression Model * Case Study in Cox Regression

7,264 citations