Journal Article

Maximum likelihood estimation for mixed continuous and categorical data with missing values

Roderick J. A. Little, +1 more
01 Dec 1985, Vol. 72, Iss. 3, pp. 497-512
TLDR
In this paper, the general location model of Olkin & Tate (1961) and extensions introduced by Krzanowski (1980, 1982) form the basis for the maximum likelihood procedures for analyzing mixed continuous and categorical data with missing values.
Abstract
SUMMARY: Maximum likelihood procedures for analysing mixed continuous and categorical data with missing values are presented. The general location model of Olkin & Tate (1961) and extensions introduced by Krzanowski (1980, 1982) form the basis for our methods. Maximum likelihood estimation with incomplete data is achieved by an application of the EM algorithm (Dempster, Laird & Rubin, 1977). Special cases of the algorithm include Orchard & Woodbury's (1972) algorithm for incomplete normal samples, Fuchs's (1982) algorithms for log linear modelling of partially classified contingency tables, and Day's (1969) algorithm for multivariate normal mixtures. Applications include: (a) imputation of missing values, (b) logistic regression and discriminant analysis with missing predictors and unclassified observations, (c) linear regression with missing continuous and categorical predictors, and (d) parametric cluster analysis with incomplete data. Methods are illustrated using data from the St Louis Risk Research Project.
Some key words: Cluster analysis; Discriminant analysis; EM algorithm; Incomplete data; Linear regression; Logistic regression; Log linear model; Mixture model.
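
The EM idea summarised above can be illustrated with a deliberately small sketch. This is not the paper's algorithm: it assumes a toy location model with one continuous variable that is always observed, one categorical variable that is sometimes missing (the "unclassified observations" case), and a common within-cell variance. The function name `em_location_model` and all parameter names are hypothetical.

```python
import math

def em_location_model(x, w, n_cells, n_iter=200):
    """EM for a toy location model (illustrative assumption, not the paper's code).

    x: continuous values, always observed.
    w: cell labels in 0..n_cells-1, or None when the label is missing.
    Within cell c, x ~ Normal(mu[c], sigma2) with a common variance.
    Assumes x is not constant and every cell keeps some weight.
    """
    n = len(x)
    # Crude starting values: equal cell probabilities, means spread over the range.
    pi = [1.0 / n_cells] * n_cells
    lo, hi = min(x), max(x)
    mu = [lo + (c + 0.5) * (hi - lo) / n_cells for c in range(n_cells)]
    mean_x = sum(x) / n
    sigma2 = sum((xi - mean_x) ** 2 for xi in x) / n

    for _ in range(n_iter):
        # E-step: expected cell membership for each case.
        resp = []
        for xi, wi in zip(x, w):
            if wi is not None:
                # Classified case: membership is known.
                r = [1.0 if c == wi else 0.0 for c in range(n_cells)]
            else:
                # Unclassified case: posterior cell probabilities given x.
                # The normalising constant cancels because sigma2 is shared.
                dens = [pi[c] * math.exp(-0.5 * (xi - mu[c]) ** 2 / sigma2)
                        for c in range(n_cells)]
                total = sum(dens)
                r = [d / total for d in dens]
            resp.append(r)

        # M-step: weighted complete-data maximum likelihood estimates.
        counts = [sum(r[c] for r in resp) for c in range(n_cells)]
        pi = [cnt / n for cnt in counts]
        mu = [sum(r[c] * xi for r, xi in zip(resp, x)) / counts[c]
              for c in range(n_cells)]
        sigma2 = sum(r[c] * (xi - mu[c]) ** 2
                     for r, xi in zip(resp, x)
                     for c in range(n_cells)) / n
    return pi, mu, sigma2

# Six classified cases and two unclassified ones (made-up numbers).
x = [0.0, 0.2, -0.2, 5.0, 5.2, 4.8, 0.1, 4.9]
w = [0, 0, 0, 1, 1, 1, None, None]
pi, mu, sigma2 = em_location_model(x, w, n_cells=2)
```

If every label is missing, the same loop reduces to EM for a univariate normal mixture with equal variances; if every label is observed, a single pass gives the usual complete-data estimates.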


Citations
Journal Article

MissForest—non-parametric missing value imputation for mixed-type data

TL;DR: In this comparative study, missForest outperforms other methods of imputation especially in data settings where complex interactions and non-linear relations are suspected and the out-of-bag imputation error estimates of missForest prove to be adequate in all settings.
Journal Article

A multivariate technique for multiply imputing missing values using a sequence of regression models

TL;DR: In this paper, the authors describe and evaluate a procedure for imputing missing values for a relatively complex data structure when the data are missing at random, by fitting a sequence of regression models and drawing values from corresponding predictive distributions.
Journal Article

Modeling the Drop-Out Mechanism in Repeated-Measures Studies

TL;DR: Methods that simultaneously model the data and the drop-out process within a unified model-based framework are discussed, and possible extensions outlined.
Journal Article

Regression with missing X’s: A review

TL;DR: The literature of regression analysis with missing values of the independent variables is reviewed in this article, where six classes of procedures are distinguished: complete case analysis, available case methods, least squares on imputed data, maximum likelihood, Bayesian methods, and multiple imputation.

References
Book

An Introduction to Multivariate Statistical Analysis

TL;DR: Topics covered include the distribution of the mean vector and the covariance matrix, the generalized T2-statistic, and testing independence of sets of variates.
Journal Article

Inference and missing data

Donald B. Rubin
01 Dec 1976
TL;DR: In this article, it was shown that ignoring the process that causes missing data when making sampling distribution inferences about the parameter of the data, θ, is generally appropriate if and only if the missing data are missing at random and the observed data are observed at random, and then such inferences are generally conditional on the observed pattern of missing data.
Book

Discrete multivariate analysis: theory and practice

TL;DR: Discrete Multivariate Analysis is a comprehensive text and general reference on the analysis of discrete multivariate data, particularly in the form of multidimensional tables, and contains a wealth of material on important topics.