
Showing papers on "Imputation (statistics)" published in 2002


Book
28 Mar 2002
TL;DR: This book covers the design and analysis of longitudinal health-related quality of life (HRQoL) studies, with extensive treatment of missing data: characterization of missingness mechanisms, simple and multiple imputation, pattern mixture and other mixture models, random-effects-dependent dropout, and selection models.
Abstract: Contents.
Introduction and Examples: Health-related quality of life (HRQoL). Measuring health-related quality of life. Study 1: Adjuvant breast cancer trial. Study 2: Migraine prevention trial. Study 3: Advanced lung cancer trial. Study 4: Renal cell carcinoma trial. Study 5: Chemoradiation (CXRT) trial. Study 6: Osteoarthritis trial.
Study Design and Protocol Development: Introduction. Background and rationale. Research objectives and goals. Selection of subjects. Longitudinal designs. Selection of measurement instrument(s). Conduct of HRQoL assessments. Scoring instruments.
Models for Longitudinal Studies I: Introduction. Building models for longitudinal studies. Building repeated measures models: the mean structure. Building repeated measures models: the covariance structure. Estimation and hypothesis testing.
Models for Longitudinal Studies II: Introduction. Building growth curve models: the mean (fixed effects) structure. Building growth curve models: the covariance structure. Model reduction. Hypothesis testing and estimation. An alternative growth-curve model.
Moderation and Mediation: Introduction. Moderation. Mediation. Other exploratory analyses.
Characterization of Missing Data: Introduction. Patterns and causes of missing data. Mechanisms of missing data. Missing completely at random (MCAR). Missing at random (MAR). Missing not at random (MNAR). Example for trial with variation in timing of assessments. Example with different patterns across treatment arms.
Analysis of Studies with Missing Data: Introduction. MCAR. Ignorable missing data. Non-ignorable missing data.
Simple Imputation: Introduction to imputation. Missing items in a multi-item questionnaire. Regression-based methods. Other simple imputation methods. Imputing missing covariates. Underestimation of variance. Final comments.
Multiple Imputation: Introduction. Overview of multiple imputation. Explicit univariate regression. Closest neighbor and predictive mean matching. Approximate Bayesian bootstrap (ABB). Multivariate procedures for non-monotone missing data. Analysis of the M data sets. Miscellaneous issues.
Pattern Mixture and Other Mixture Models: Introduction. Pattern mixture models. Restrictions for growth curve models. Restrictions for repeated measures models. Variance estimation for mixture models.
Random Effects Dependent Dropout: Introduction. Conditional linear model. Varying coefficient models. Joint models with shared parameters.
Selection Models: Introduction. Outcome selection model for monotone dropout.
Multiple Endpoints: Introduction. General strategies for multiple endpoints. Background concepts and definitions. Single step procedures. Sequentially rejective methods. Closed testing and gatekeeper procedures.
Composite Endpoints and Summary Measures: Introduction. Choosing a composite or summary measure. Summarizing across HRQoL domains or subscales. Summary measure across time. Composite endpoints across time.
Quality Adjusted Life-Years (QALYs) and Q-TWiST: Introduction. QALYs. Q-TWiST.
Analysis Plans and Reporting Results: Introduction. General analysis plan. Sample size and power. Reporting results.
Appendix C: Cubic Smoothing Splines. Appendix P: PASW/SPSS Notes. Appendix R: R Notes. Appendix S: SAS Notes. References. A Summary appears at the end of each chapter.

516 citations


Journal Article
TL;DR: This study showed that the theoretically more valid multiple imputation method did not lead to point estimates different from those of the simpler (longitudinal) imputation methods, but its estimated standard errors appeared theoretically more adequate because they reflect the uncertainty in estimation caused by missing values.

486 citations



Judi Scheffer
01 Jan 2002
TL;DR: This paper shows how the mean and standard deviation are affected by different methods of imputation, given different missingness mechanisms.
Abstract: What is done with missing data? Does the missingness mechanism matter? Is it a good idea to just use the default options in the major statistical packages? Even some highly trained statisticians do this, so can the non-statistician analysing their own data cope with some of the better techniques for handling missing data? This paper shows how the mean and standard deviation are affected by different methods of imputation, given different missingness mechanisms. Better options than the standard default options are available in the major statistical software, offering the chance to 'do the right thing' to the statistical and non-statistical community alike.

405 citations
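
A minimal sketch of the paper's central point, on synthetic data (all numbers here are illustrative, not Scheffer's): under MCAR, mean imputation roughly preserves the mean but visibly deflates the standard deviation.

```python
import numpy as np

rng = np.random.default_rng(42)
y = rng.normal(loc=50, scale=10, size=1000)

# MCAR: delete 30% of values completely at random.
miss = rng.random(y.size) < 0.3
y_obs = np.where(miss, np.nan, y)

# Mean imputation: fill every hole with the observed mean.
y_imp = np.where(miss, np.nanmean(y_obs), y_obs)

print("true mean/SD:    %.2f / %.2f" % (y.mean(), y.std(ddof=1)))
print("imputed mean/SD: %.2f / %.2f" % (y_imp.mean(), y_imp.std(ddof=1)))
# The mean survives under MCAR, but the SD shrinks, because every
# imputed value sits exactly at the centre of the observed data.
```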


Journal Article
TL;DR: In this paper, the authors present new computational techniques for multivariate longitudinal or clustered data with missing values by applying a multivariate extension of a popular linear mixed-effects model, creating multiple imputations of missing values for subsequent analyses by a straightforward and effective Markov chain Monte Carlo procedure.
Abstract: This article presents new computational techniques for multivariate longitudinal or clustered data with missing values. Current methodology for linear mixed-effects models can accommodate imbalance or missing data in a single response variable, but it cannot handle missing values in multiple responses or additional covariates. Applying a multivariate extension of a popular linear mixed-effects model, we create multiple imputations of missing values for subsequent analyses by a straightforward and effective Markov chain Monte Carlo procedure. We also derive and implement a new EM algorithm for parameter estimation which converges more rapidly than traditional EM algorithms because it does not treat the random effects as “missing data,” but integrates them out of the likelihood function analytically. These techniques are illustrated on models for adolescent alcohol use in a large school-based prevention trial.

310 citations


01 Jan 2002
TL;DR: This analysis indicates that missing data imputation based on the k-nearest neighbour algorithm can outperform the internal methods used by C4.5 and CN2 to treat missing data.
Abstract: Data quality is a major concern in Machine Learning and other correlated areas such as Knowledge Discovery from Databases (KDD). As most Machine Learning algorithms induce knowledge strictly from data, the quality of the knowledge extracted is largely determined by the quality of the underlying data. One relevant problem in data quality is the presence of missing data. Despite the frequent occurrence of missing data, many Machine Learning algorithms handle missing data in a rather naive way. Missing data treatment should be carefully considered; otherwise, bias might be introduced into the knowledge induced. In this work, we analyse the use of the k-nearest neighbour algorithm as an imputation method. Imputation is a term that denotes a procedure that replaces the missing values in a data set by some plausible values. Our analysis indicates that missing data imputation based on the k-nearest neighbour algorithm can outperform the internal methods used by C4.5 and CN2 to treat missing data.

306 citations
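
The k-nearest neighbour imputation analysed above can be sketched with scikit-learn's KNNImputer, a later implementation of the same idea (the authors' own code and benchmark data are not reproduced here; the array below is synthetic):

```python
import numpy as np
from sklearn.impute import KNNImputer

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
X[:, 3] += 2 * X[:, 0]                   # give the columns structure to exploit
X[rng.random(X.shape) < 0.1] = np.nan    # ~10% missing entries, scattered at random

# Each missing entry is replaced by a distance-weighted average of that
# feature over the k most similar rows (distances ignore missing coordinates).
imputer = KNNImputer(n_neighbors=5, weights="distance")
X_filled = imputer.fit_transform(X)
print(np.isnan(X_filled).sum())          # 0 -- every gap filled
```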


Journal Article
TL;DR: The authors compare and contrast five approaches for dealing with missing data and suggest that mean substitution was the least effective and that regression with an error term and the EM algorithm produced estimates closest to those of the original variables.
Abstract: Researchers are commonly faced with the problem of missing data. This article presents theoretical and empirical information for the selection and application of approaches for handling missing data on a single variable. An actual data set of 492 cases with no missing values was used to create a simulated yet realistic data set with missing at random (MAR) data. The authors compare and contrast five approaches (listwise deletion, mean substitution, simple regression, regression with an error term, and the expectation maximization [EM] algorithm) for dealing with missing data, and compare the effects of each method on descriptive statistics and correlation coefficients for the imputed data (n = 96) and the entire sample (n = 492) when imputed data are included. All methods had limitations, although our findings suggest that mean substitution was the least effective and that regression with an error term and the EM algorithm produced estimates closest to those of the original variables.

278 citations
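
Two of the five approaches can be contrasted in a short, hedged sketch (synthetic data; the study's actual 492-case data set is not used): mean substitution versus regression imputation with an error term.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 492
x = rng.normal(size=n)
y = 2.0 + 1.5 * x + rng.normal(size=n)

miss = rng.random(n) < 0.2               # ~20% of y missing (MCAR for simplicity)
y_obs = np.where(miss, np.nan, y)

# (a) Mean substitution.
y_mean = np.where(miss, np.nanmean(y_obs), y_obs)

# (b) Regression with an error term: fit y ~ x on complete cases, then
# impute the fitted value plus a draw from the residual distribution.
ok = ~miss
beta, alpha = np.polyfit(x[ok], y_obs[ok], 1)    # slope, intercept
resid_sd = np.std(y_obs[ok] - (alpha + beta * x[ok]), ddof=2)
y_reg = y_obs.copy()
y_reg[miss] = alpha + beta * x[miss] + rng.normal(scale=resid_sd, size=miss.sum())

for name, v in (("mean substitution", y_mean), ("regression + error", y_reg)):
    print(f"{name}: sd={v.std(ddof=1):.2f}, corr(x,y)={np.corrcoef(x, v)[0, 1]:.2f}")
# Mean substitution shrinks the SD and attenuates the correlation;
# the error term restores most of both.
```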


Journal Article
TL;DR: In this paper, an adjusted empirical likelihood approach to inference for the mean of the response variable is developed, and a nonparametric version of Wilks' theorem is proved for the adjusted empirical log-likelihood ratio by showing that it has an asymptotic standard chi-squared distribution.
Abstract: Inference under kernel regression imputation for missing response data is considered. An adjusted empirical likelihood approach to inference for the mean of the response variable is developed. A nonparametric version of Wilks' theorem is proved for the adjusted empirical log-likelihood ratio by showing that it has an asymptotic standard chi-squared distribution, and the corresponding empirical likelihood confidence interval for the mean is constructed. With auxiliary information, an empirical likelihood-based estimator is defined and an adjusted empirical log-likelihood ratio is derived. Asymptotic normality of the estimator is proved. Also, it is shown that the adjusted empirical log-likelihood ratio obeys Wilks' theorem. A simulation study is conducted to compare the adjusted empirical likelihood and the normal approximation methods in terms of coverage accuracies and average lengths of confidence intervals. Based on biases and standard errors, a comparison is also made by simulation between the empirical likelihood-based estimator and related estimators. Our simulation indicates that the adjusted empirical likelihood method performs competitively and that the use of auxiliary information provides improved inferences.

267 citations
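
The kernel regression imputation underlying this approach can be sketched as a Nadaraya-Watson estimate of each missing response (a toy illustration only; the paper's empirical-likelihood machinery is not reproduced):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 300
x = rng.uniform(0, 1, n)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=n)
y[rng.random(n) < 0.25] = np.nan         # responses missing for ~25% of units

def kernel_impute(x, y, h=0.05):
    """Fill each missing y with a Nadaraya-Watson kernel regression estimate."""
    ok = ~np.isnan(y)
    out = y.copy()
    for i in np.where(~ok)[0]:
        w = np.exp(-0.5 * ((x[ok] - x[i]) / h) ** 2)   # Gaussian kernel weights
        out[i] = np.sum(w * y[ok]) / np.sum(w)
    return out

y_hat = kernel_impute(x, y)
print(np.nanmean(y), y_hat.mean())       # complete-case vs imputed mean
```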


Journal Article
TL;DR: When cross-sectional data are missing, replacement with the group mean leads to an underestimate of the standard deviation (SD) and inflation of the Type I error rate.
Abstract: Missing data are common in most studies, especially when subjects are followed over time. This can jeopardize the validity of a study because of reduced power to detect differences, and especially because subjects who are lost to follow-up rarely represent the group as a whole. There are several approaches to handling missing data, but some may result in biased estimates of the treatment effect, and others may overestimate the significance of the statistical tests. When cross-sectional data (for example, demographic and background information and a single outcome measurement time) are missing, replacement with the group mean leads to an underestimate of the standard deviation (SD) and inflation of the Type I error rate. Using regression estimates, especially with error built into the imputed value, lessens but does not eliminate this problem. Multiple imputation preserves the estimates of both the mean and the SD, even when a significant proportion of the data are missing. With longitudinal studies, the last observation carried forward (LOCF) approach preserves the sample size, but may make unwarranted assumptions about the missing data, resulting in either underestimating or overestimating the treatment effects. Growth curve analysis makes maximal use of the existing data and makes fewer assumptions.

230 citations
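
The LOCF approach criticised above is one line in pandas; a minimal sketch with a made-up longitudinal table (rows assumed ordered by visit within subject):

```python
import numpy as np
import pandas as pd

long = pd.DataFrame({
    "subject": [1, 1, 1, 2, 2, 2],
    "visit":   [0, 1, 2, 0, 1, 2],
    "score":   [10.0, 12.0, np.nan, 8.0, np.nan, np.nan],
})

# LOCF: within each subject, carry the last observed score forward.
long["score_locf"] = long.groupby("subject")["score"].ffill()
print(long)
# Subject 1's missing visit-2 score becomes 12; subject 2 stays at 8 --
# exactly the "unwarranted assumption" of no change after dropout.
```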


Journal Article
TL;DR: In this paper, the problem of using future multivariate observations with missing data to estimate latent variable scores from an existing principal component analysis (PCA) model is addressed, and several methods for estimating the scores of new individuals with missing observations are presented.
Abstract: This paper addresses the problem of using future multivariate observations with missing data to estimate latent variable scores from an existing principal component analysis (PCA) model. This is a critical issue in multivariate statistical process control (MSPC) schemes where the process is continuously interrogated based on an underlying PCA model. We present several methods for estimating the scores of new individuals with missing data: a so-called trimmed score method (TRI), a single-component projection method (SCP), a method of projection to the model plane (PMP), a method based on the iterative imputation of missing data, a method based on the minimization of the squared prediction error (SPE), a conditional mean replacement method (CMR) and various least squares-based methods: one based on a regression on known data (KDR) and the other based on a regression on trimmed scores (TSR). The basis for each method and the expressions for the score estimators, their covariance matrices and the estimation errors are developed. Some of the methods discussed have already been proposed in the literature (SCP, PMP and CMR), some are original (TRI and TSR) and others are shown to be equivalent to methods already developed by other authors: iterative imputation and SPE methods are equivalent to PMP; KDR is equivalent to CMR. These methods can be seen as different ways to impute values for the missing variables. The efficiency of the methods is studied through simulations based on an industrial data set. The KDR method is shown to be statistically superior to the other methods, except the TSR method in which the matrix to be inverted is of a much smaller size.

214 citations
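
One of the simpler methods discussed, projection to the model plane (PMP), amounts to regressing the observed entries of a new observation on the corresponding rows of the loading matrix. A sketch under assumed synthetic calibration data (not the paper's industrial data set):

```python
import numpy as np

rng = np.random.default_rng(3)
# Calibration data with an (approximately) rank-2 structure, then a PCA model.
T = rng.normal(size=(500, 2))
P_true = rng.normal(size=(8, 2))
X = T @ P_true.T + rng.normal(scale=0.1, size=(500, 8))
mean = X.mean(axis=0)
_, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
P = Vt[:2].T                              # loadings of the existing PCA model

def pmp_scores(x_new, P, mean):
    """Estimate scores from the observed entries only: least squares of
    the observed (centred) values on the matching rows of P."""
    obs = ~np.isnan(x_new)
    t, *_ = np.linalg.lstsq(P[obs], x_new[obs] - mean[obs], rcond=None)
    return t

x_new = X[0].copy()
x_new[[1, 4, 6]] = np.nan                 # three measurements missing
print(pmp_scores(x_new, P, mean))
```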


Journal Article
TL;DR: Until improved methods of imputing county-level crime data are developed, tested, and implemented, such data should not be used, especially in policy studies.
Abstract: County-level crime data have major gaps, and the imputation schemes for filling in the gaps are inadequate and inconsistent. Such data were used in a recent study of guns and crime without considering the errors resulting from imputation. This note describes the errors and how they may have affected this study. Until improved methods of imputing county-level crime data are developed, tested, and implemented, such data should not be used, especially in policy studies.

Journal Article
TL;DR: The purpose of this article is to review the problems associated with missing data, options for handlingMissing data, and recent multiple imputation methods to inform researchers' decisions about whether to delete or impute missing responses and the method best suited to doing so.
Abstract: Missing data occur frequently in survey and longitudinal research. Incomplete data are problematic, particularly in the presence of substantial absent information or systematic nonresponse patterns. Listwise deletion and mean imputation are the most common techniques to reconcile missing data. However, more recent techniques may improve parameter estimates, standard errors, and test statistics. The purpose of this article is to review the problems associated with missing data, options for handling missing data, and recent multiple imputation methods. It informs researchers' decisions about whether to delete or impute missing responses and the method best suited to doing so. An empirical investigation of AIDS care data outcomes illustrates the process of multiple imputation.
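
As a concrete, hedged illustration of the multiple imputation workflow this article reviews, the sketch below generates M completed data sets with scikit-learn's IterativeImputer (a later tool, not one discussed in the article) and pools a mean estimate with Rubin's rules:

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

rng = np.random.default_rng(4)
X = rng.multivariate_normal([0, 0], [[1, 0.6], [0.6, 1]], size=400)
X[rng.random(400) < 0.3, 1] = np.nan      # 30% of the second variable missing

M = 10
estimates, within = [], []
for m in range(M):
    imp = IterativeImputer(sample_posterior=True, random_state=m)
    Xm = imp.fit_transform(X)
    estimates.append(Xm[:, 1].mean())
    within.append(Xm[:, 1].var(ddof=1) / len(Xm))   # variance of the mean

# Rubin's rules: total variance = within + (1 + 1/M) * between.
qbar = np.mean(estimates)
W, B = np.mean(within), np.var(estimates, ddof=1)
se = np.sqrt(W + (1 + 1 / M) * B)
print(f"pooled mean = {qbar:.3f}, pooled SE = {se:.3f}")
```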

Journal Article
TL;DR: It is shown that the performance of the imputation model compares well to other classical methods, and that the use of a self-organising map for data correction provides an effective system for data validation, data correction and data analysis.
Abstract: This paper is dedicated to erroneous data detection and imputation methods in surveys. We describe experiments conducted under the scope of a European project for studying new statistical methods based on neural networks. We show that the self-organising map can be used successfully for these tasks. A self-organising map is calibrated according to the available observations, described through a set of correlated variables handled together. The map can then be used both to detect erroneous data and to impute values to partial observations. We apply these principles to a real size transport survey database. We show that the performance of our imputation model compares well to other classical methods, and that the use of a self-organising map for data correction provides an effective system for data validation, data correction and data analysis.
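
A self-contained sketch of the SOM idea (a deliberately small NumPy implementation, not the project's system): train a map on complete observations, then, for an incomplete record, find the best-matching unit using the observed coordinates only and copy the missing coordinates from its codebook vector.

```python
import numpy as np

rng = np.random.default_rng(5)

def train_som(data, rows=6, cols=6, iters=3000, lr0=0.5, sigma0=2.0):
    """Train a small self-organising map on complete observations."""
    n, d = data.shape
    codebook = rng.normal(size=(rows, cols, d))
    gy, gx = np.mgrid[0:rows, 0:cols]
    for t in range(iters):
        x = data[rng.integers(n)]
        d2 = ((codebook - x) ** 2).sum(axis=2)
        by, bx = np.unravel_index(d2.argmin(), d2.shape)   # best-matching unit
        decay = 1.0 - t / iters
        h = np.exp(-((gy - by) ** 2 + (gx - bx) ** 2)
                   / (2 * (sigma0 * decay + 0.3) ** 2))    # neighbourhood kernel
        codebook += lr0 * decay * h[..., None] * (x - codebook)
    return codebook.reshape(-1, d)

def som_impute(codebook, x):
    """Match on observed coordinates, impute the rest from the winner."""
    obs = ~np.isnan(x)
    bmu = ((codebook[:, obs] - x[obs]) ** 2).sum(axis=1).argmin()
    filled = x.copy()
    filled[~obs] = codebook[bmu, ~obs]
    return filled

data = rng.normal(size=(500, 5))
data[:, 4] = data[:, 0] + data[:, 1]      # a correlated, hence recoverable, column
cb = train_som(data)
x = data[0].copy(); x[4] = np.nan
print(som_impute(cb, x)[4], data[0, 4])   # imputed vs true value
```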

Journal Article
TL;DR: Accessible, user-friendly computer programs are available to perform multiple imputation for missing data, making ad hoc approaches to missing data obsolete.
Abstract: Background Sample loss and missing data are inevitable in multivariate and longitudinal research. Ad hoc approaches such as analysis of incomplete data or substituting the group mean for missing data, while common, may unnecessarily reduce statistical power and threaten study validity. Multiple imputation for missing data is a newly accessible, methodologically rigorous approach to dealing with the problem of missing data. Approach To (a) discuss the problem of missing data in clinical research, and (b) describe the technique of multiple imputation. A case of analysis of multivariate psychosocial data is presented to illustrate the practice of multiple imputation. Results The advantages of multiple imputation are that it (a) results in unbiased estimates, providing more validity than ad hoc approaches to missing data; (b) uses all available data, preserving sample size and statistical power; (c) may be used with standard statistical software; and (d) yields results that are readily interpreted. Discussion Accessible, user-friendly computer programs are available to perform multiple imputation for missing data, making ad hoc approaches to missing data obsolete.

Journal Article
TL;DR: The results suggest that selection bias in the study is of concern, but only slightly, in the very elderly (age 80+ years), both women and men; epidemiologists should consider using multiple imputation more often.
Abstract: Background. Nonresponse bias is a concern in any epidemiologic survey in which a subset of selected individuals declines to participate. Methods. We reviewed multiple imputation, a widely applicable and easy to implement Bayesian methodology to adjust for nonresponse bias. To illustrate the method, we used data from the Canadian Multicentre Osteoporosis Study, a large cohort study of 9423 randomly selected Canadians, designed in part to estimate the prevalence of osteoporosis. Although subjects were randomly selected, only 42% of individuals who were contacted agreed to participate fully in the study. The study design included a brief questionnaire for those invitees who declined further participation in order to collect information on the major risk factors for osteoporosis. These risk factors (which included age, sex, previous fractures, family history of osteoporosis, and current smoking status) were then used to estimate the missing osteoporosis status for nonparticipants using multiple imputation. Both ignorable and nonignorable imputation models are considered. Results. Our results suggest that selection bias in the study is of concern, but only slightly, in the very elderly (age 80+ years), both women and men. Conclusions. Epidemiologists should consider using multiple imputation more often than is current practice. (EPIDEMIOLOGY 2002;13:437–444)
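
A hedged caricature of the two model classes the authors consider (entirely synthetic; not the Canadian Multicentre Osteoporosis Study model): impute a missing binary status from a risk factor with logistic regression, and mimic a nonignorable model by shifting the imputed log-odds with a sensitivity parameter delta.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(6)
n = 2000
age = rng.uniform(25, 95, n)
status = (rng.random(n) < 1 / (1 + np.exp(-(age - 75) / 5))).astype(int)
respond = rng.random(n) < 0.42                   # only 42% participate fully

# Ignorable model: P(status | age) fitted on responders, applied to everyone.
clf = LogisticRegression().fit(age[respond].reshape(-1, 1), status[respond])
p = clf.predict_proba(age[~respond].reshape(-1, 1))[:, 1]

def impute(p, delta):
    """delta = 0 reproduces the ignorable model; delta != 0 shifts the
    nonrespondents' log-odds, a simple nonignorable sensitivity analysis."""
    logit = np.log(p / (1 - p)) + delta
    return (rng.random(p.size) < 1 / (1 + np.exp(-logit))).astype(int)

for delta in (0.0, 0.5):
    full = status.copy()
    full[~respond] = impute(p, delta)
    print(f"delta={delta}: estimated prevalence = {full.mean():.3f}")
```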

Book
01 Jan 2002
TL;DR: The aim of this book is to provide a discussion of the foundations of analytic models for the design of quality of life studies and their applications to quality of life research.
Abstract: Contents.
INTRODUCTION: Health-Related Quality of Life. Measuring Health-Related Quality of Life. Example 1: Adjuvant Breast Cancer Trial. Example 2: Advanced Non-Small-Cell Lung Cancer (NSCLC). Example 3: Renal Cell Carcinoma Trial. Summary.
STUDY DESIGN AND PROTOCOL DEVELOPMENT: Introduction. Background and Rationale. Research Objectives. Selection of Subjects. Longitudinal Designs. Selection of a Quality of Life Measure. Conduct. Summary.
MODELS FOR LONGITUDINAL STUDIES: Introduction. Building the Analytic Models. Building Repeated Measures Models. Building Growth Curve Models. Summary.
MISSING DATA: Introduction. Patterns of Missing Data. Mechanisms of Missing Data. Summary.
ANALYTIC METHODS FOR IGNORABLE MISSING DATA: Introduction. Repeated Univariate Analyses. Multivariate Methods. Baseline Assessment as a Covariate. Change from Baseline. Empirical Bayes Estimates. Summary.
SIMPLE IMPUTATION: Introduction. Mean Value Substitution. Explicit Regression Models. Last Value Carried Forward. Underestimation of Variance. Sensitivity Analysis. Summary.
MULTIPLE IMPUTATION: Introduction. Overview of Multiple Imputation. Explicit Univariate Regression. Closest Neighbor and Predictive Mean Matching. Approximate Bayesian Bootstrap. Multivariate Procedures for Nonmonotone Missing Data. Combining the M Analyses. Sensitivity Analyses. Imputation vs. Analytic Models. Implications for Design. Summary.
PATTERN MIXTURE MODELS: Introduction. Bivariate Data (Two Repeated Measures). Monotone Dropout. Parametric Models. Additional Reading. Algebraic Details. Summary.
RANDOM-EFFECTS MIXTURE, SHARED-PARAMETER, AND SELECTION MODELS: Introduction. Conditional Linear Model. Joint Mixed-Effects and Time to Dropout. Selection Model for Monotone Dropout. Advanced Readings. Summary.
SUMMARY MEASURES: Introduction. Choosing a Summary Measure. Constructing Summary Measures. Summary Statistics across Time. Summarizing Across HRQoL Domains or Subscales. Advanced Notes. Summary.
MULTIPLE ENDPOINTS: Introduction. Background Concepts and Definitions. Multivariate Statistics. Univariate Statistics. Resampling Techniques. Summary.
DESIGN: ANALYSIS PLANS: Introduction. General Analysis Plan. Models for Longitudinal Data. Multiplicity of Endpoints. Sample Size and Power. Reported Results. Summary.
APPENDICES. BIBLIOGRAPHY.

Journal Article
TL;DR: This paper is a case study demonstrating the application of multiple imputation to address important questions related to prostate cancer and urologic symptoms in a data set with missing values.
Abstract: The Flint Men's Health Study is an ongoing population-based study of African-American men designed to address questions related to prostate cancer and urologic symptoms. The initial phase of the study was conducted in 1996-1997 in two stages: an interviewer-administered survey followed by a clinical examination. The response rate in the clinical examination phase was 52%. Thus, some data were missing for clinical examination variables, diminishing the generalizability of the results to the general population. This paper is a case study demonstrating the application of multiple imputation to address important questions related to prostate cancer and urologic symptoms in a data set with missing values. On the basis of the observed clinical examination data, the American Urological Association Symptoms Score showed a surprising reduction in symptoms in the oldest age group, but after multiple imputation there was a monotonically increasing trend with age. It appeared that multiple imputation corrected for nonresponse bias associated with the observed data. For other outcome measures, namely the age-adjusted 95th percentile of prostate-specific antigen level and the association between urologic symptoms and prostate volume, results from the observed data and the multiply imputed data were similar.

Journal Article
TL;DR: A cohort of patients in 20 Colombian intensive care units (ICUs) is used to describe the application of three imputation techniques: single, hot deck, and multiple imputation. Statistically significant differences were found for the area under the ROC curve, but these were not clinically significant.
Abstract: A cohort of intensive care unit (ICU) patients in 20 Colombian ICUs is used to describe the application of three imputation techniques: single, hot deck and multiple imputation. These strategies were used to impute the missing data in the variables used to construct APACHE II scores, a scoring system for the ICU patients that provides an unbiased standardized estimate of the probability of hospital death. Imputed APACHE II scores were then used in the APACHE II model to estimate adjusted hospital mortality rates. The area under the receiver operating characteristic (ROC) curve was used to compare imputation strategies with respect to predictive power. While statistically significant differences were found for the area under the ROC curve, these differences were not clinically significant.
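
Of the three strategies, hot deck imputation is the easiest to misread, so a short sketch may help (synthetic data; the APACHE II variables and the Colombian cohort are not reproduced): each missing value is replaced by a random draw from observed "donor" values within the same ICU.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(7)
df = pd.DataFrame({
    "icu": rng.integers(1, 21, 500),                   # 20 ICUs
    "creatinine": rng.lognormal(0.0, 0.4, 500),        # a stand-in physiologic input
})
df.loc[rng.random(500) < 0.15, "creatinine"] = np.nan  # ~15% missing

def hot_deck(group):
    """Within one ICU, fill each hole with a random observed donor value."""
    donors = group.dropna().to_numpy()
    out = group.copy()
    holes = out.isna()
    if len(donors) and holes.any():
        out[holes] = rng.choice(donors, size=holes.sum())
    return out

df["creatinine"] = df.groupby("icu")["creatinine"].transform(hot_deck)
print(df["creatinine"].isna().sum())     # 0 unless an ICU had no donors at all
```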

Journal Article
TL;DR: In this paper, the problem of nuisance covariate model specification is considered in Cox regression where the maximum semiparametric likelihood method is used to handle the missing covariates and the statistical properties of the proposed method are examined.
Abstract: The problem of nuisance covariate model specification is considered in Cox regression where the maximum semiparametric likelihood method is used to handle the missing covariates. A component of the covariates is modeled nonparametrically to achieve robustness against covariate model misspecification and to reduce the number of possibly intractable integrations involved in the parametric modeling of the covariates. The statistical properties of the proposed method are examined. It is found that in some important situations, the maximum semiparametric likelihood can be applied without making any additional parametric model assumptions on covariates. The proposed method can yield a more efficient estimator than the nonparametric imputation methods and does not require specification of the missingness mechanism when compared with the inverse probability weighting method. A real data example is analyzed to demonstrate use of the proposed method.

Journal Article
TL;DR: Some new approaches to data fusion are presented, in particular one based on homogeneity analysis; future directions are also discussed, with emphasis on validation problems and caveats.

Journal Article
TL;DR: This paper studies the effect of a missing data recovery method, namely the pseudo-nearest-neighbor substitution approach, on Gaussian distributed data sets that represent typical cases in data discrimination and data mining applications.

Journal Article
TL;DR: In this paper, two nonparametric imputation schemes are considered, risk set imputation and Kaplan-Meier imputation, in which a censored time is replaced by a random draw from the observed times amongst those at risk after the censoring time.
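
A simplified sketch of risk set imputation on synthetic survival data (for brevity this variant draws only from observed event times, whereas the full scheme also allows censored donors):

```python
import numpy as np

rng = np.random.default_rng(8)
true_t = rng.exponential(10, 200)
cens_t = rng.exponential(15, 200)
time = np.minimum(true_t, cens_t)
event = true_t <= cens_t                 # False -> right-censored

imputed = time.copy()
for i in np.where(~event)[0]:
    # Donors: observed event times among subjects still at risk
    # after this subject's censoring time.
    risk_set = time[event & (time > time[i])]
    if risk_set.size:
        imputed[i] = rng.choice(risk_set)
    # If no donor remains at risk, the censored time is left unchanged.

print(time.mean(), imputed.mean())       # imputation pushes times upward
```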

Journal Article
TL;DR: A marginal approach for the analysis of the effect of covariates on multivariate interval-censored survival data is considered and a robust estimator for the covariance matrix is developed that accounts for the correlation between events.
Abstract: This paper considers a marginal approach for the analysis of the effect of covariates on multivariate interval-censored survival data. Interval censoring of multivariate events can occur when the events are not directly observable but are detected by periodically performing clinical examinations or laboratory tests. The method assumes the marginal distribution for each event is based on a discrete analogue of the proportional hazards model for interval-censored data. A robust estimator for the covariance matrix is developed that accounts for the correlation between events. A simulation study comparing the performance of this method and a midpoint imputation approach indicates the parameter estimates from the proposed method are less biased. Furthermore, even when the events are only modestly correlated, ignoring the correlation can result in erroneous variance estimators. The method is illustrated using data from an ongoing clinical trial involving subjects with systemic lupus erythematosus.
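
The midpoint imputation comparator is worth seeing explicitly, because its simplicity is exactly what produces the bias the simulation detects. A minimal sketch (illustrative intervals only):

```python
import numpy as np

# Interval-censored observations: each event is only known to lie in (L, R].
L = np.array([0.0, 2.0, 4.0, 1.0])
R = np.array([3.0, 5.0, 8.0, 2.5])

# Midpoint imputation pretends the event occurred at the interval centre,
# after which standard (right-censored) survival methods are applied.
t_mid = (L + R) / 2.0
print(t_mid)
```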

Journal Article
TL;DR: This paper presents a method of data decomposition to avoid the necessity of reasoning on data with missing attribute values, and provides an empirical evaluation of the decomposition method accuracy and model size with use of various decomposition criteria.
Abstract: In this paper we present a method of data decomposition to avoid the necessity of reasoning on data with missing attribute values. This method can be applied to any algorithm of classifier induction. The original incomplete data is decomposed into data subsets without missing values. Next, methods for classifier induction are applied to these sets. Finally, a conflict resolving method is used to obtain the final classification from partial classifiers. We provide an empirical evaluation of the decomposition method's accuracy and model size using various decomposition criteria on data with natural missing values. We also present experiments on data with synthetic missing values to examine the properties of the proposed method under varying ratios of incompleteness.
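
A hedged sketch of the decomposition idea under one plausible criterion, grouping rows by missingness pattern (the paper evaluates several criteria; this is not the authors' implementation): train a classifier per complete sub-table and resolve conflicts by voting.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(9)
X = rng.normal(size=(400, 4))
y = (X[:, 0] + X[:, 2] > 0).astype(int)
X[rng.random(X.shape) < 0.2] = np.nan    # incomplete data; no imputation is done

# Decompose: one complete sub-table per observed-variable pattern.
patterns = {}
for i, gaps in enumerate(np.isnan(X)):
    patterns.setdefault(tuple(~gaps), []).append(i)

models = {}
for cols, idx in patterns.items():
    keep = np.flatnonzero(cols)
    if keep.size == 0 or len(idx) < 10:
        continue                          # skip patterns too small to learn from
    models[cols] = DecisionTreeClassifier(random_state=0).fit(
        X[np.ix_(idx, keep)], y[idx])

def classify(x):
    """Every partial classifier whose variables are all observed votes."""
    votes = [m.predict(x[np.array(c)].reshape(1, -1))[0]
             for c, m in models.items()
             if not np.isnan(x[np.array(c)]).any()]
    return round(np.mean(votes)) if votes else None

print(classify(X[0]), y[0])
```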

Journal Article
TL;DR: In this paper, a fully nonparametric and a semiparametric imputation method are studied, both based on local resampling principles, and it is shown that the final estimator, based on these local imputations, is consistent under fewer or no parametric assumptions.
Abstract: Dealing with missing data via parametric multiple imputation methods usually implies stating several strong assumptions both about the distribution of the data and about underlying regression relationships. If such parametric assumptions do not hold, the multiply imputed data are not appropriate and might produce inconsistent estimators and thus misleading results. In this paper, a fully nonparametric and a semiparametric imputation method are studied, both based on local resampling principles. It is shown that the final estimator, based on these local imputations, is consistent under fewer or no parametric assumptions. Asymptotic expressions for bias, variance and mean squared error are derived, showing the theoretical impact of the different smoothing parameters. Simulations illustrate the usefulness and applicability of the method.
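
The local resampling principle can be caricatured in a few lines for a single covariate (the paper's theory covers much more; this sketch only shows the resampling step): draw each missing response at random from the observed responses of its nearest neighbours.

```python
import numpy as np

rng = np.random.default_rng(10)
n = 400
x = rng.uniform(0, 1, n)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=n)
y[rng.random(n) < 0.3] = np.nan

def local_resample(x, y, k=15):
    """Impute each missing y by drawing from the observed responses of the
    k nearest neighbours in x -- a local hot deck, so no parametric model
    is assumed and the local spread of the data is preserved."""
    ok = ~np.isnan(y)
    xo, yo = x[ok], y[ok]
    out = y.copy()
    for i in np.where(~ok)[0]:
        nbrs = np.argsort(np.abs(xo - x[i]))[:k]
        out[i] = rng.choice(yo[nbrs])
    return out

y_imp = local_resample(x, y)
print(np.isnan(y_imp).sum())             # 0
```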


Journal Article
TL;DR: In this paper, an imputation algorithm derived from a log-multiplicative model was proposed to estimate levels of homicides disaggregated by victim/offender relationship using the Federal Bureau of Investigation's Supplementary Homicide Report (SHR) data for 1996 and 1997, and compared the resulting estimates with those obtained from the application of conventional procedures.
Abstract: This research note critically evaluates conventional methods for allocating homicides with an unknown victim/offender relationship to meaningful categories, and it proposes an alternative approach. We argue that conventional methods are based on a problematic assumption, namely, that the missing data mechanism is “ignorable.” As an alternative to these methods, we propose an imputation algorithm derived from a log-multiplicative model that does not require this assumption. We apply this technique to estimate levels of homicides disaggregated by victim/offender relationship using the Federal Bureau of Investigation's Supplementary Homicide Report (SHR) data for 1996 and 1997, and we compare the resulting estimates with those obtained from the application of conventional procedures. Our results yield a larger proportion of stranger homicides than are obtained from the conventional methods.

Journal Article
TL;DR: It is concluded that some alternative missing grade methods are superior to GPA; data augmentation and stochastic regression imputation appear to be the best missing grade techniques.
Abstract: In this article, grade point average (GPA) is considered as a missing data technique for unavailable grades in school grade records. In Study 1, theoretical and empirical differences between GPA and seven alternative missing grade techniques were considered. These seven techniques are subject mean substitution, corrected subject mean, subject correlation substitution, regression imputation, expectation maximization algorithm imputation, and two multiple imputation methods: stochastic regression imputation and data augmentation. The missing grade techniques differ greatly. Data augmentation and stochastic regression imputation appear to be superior as missing grade techniques. In Study 2, the completed grade records (observed and imputed values) were used in two prediction analyses of academic achievement. One analysis was based on unweighted grades, the other on weighted grades. In both analyses, alternative missing grade methods produced better and more consistent predictions. It is concluded that some alternative missing grade methods are superior to GPA.

Book
01 Jan 2002
TL;DR: Includes chapters such as "The Use of Soft Endpoints in Clinical Trials: The Search for Clinical Significance" and "Item Response Theory (IRT): Applications in Quality of Life Measurement, Analysis and Interpretation".
Abstract: Acknowledgements. Preface. Presenting Authors. Introduction (D.R. Cox).
1: Measurement, Scale Development, and Study Design: Regulatory Aspects of Quality of Life (C. Gnecco, P.A. Lachenbruch). Biases in the Retrospective Calculation of Reliability and Responsiveness from Longitudinal Studies (G. Norman, et al.). Application of the Multi-Attribute Utility Theory to the Development of a Preference-Based Health-Related Quality of Life Instrument (C. Le Gales). Strategy and Methodology for Choice of Items in Psychometric Measurement: Designing a Quality of Life Instrument for Hip and Knee Osteoarthritis (F. Guillemin, et al.). Conception, Development and Validation of Instruments for Quality of Life Assessment: An Overview (A.J. Chwalow, A.B. Adesina). Methodological Issues in the Analysis of Quality of Life Data in Clinical Trials: Illustrations from the National Surgical Adjuvant Breast and Bowel Project (NSABP) Breast Cancer Prevention Trial (S. Land, et al.). Disease-Specific Versus Generic Measurement of Health-Related Quality of Life in Cross-Sectional and Longitudinal Studies: An Inpatient Investigation of the SF-36 and Four Disease-Specific Instruments (S. Briancon, et al.).
2: Analysis and Interpretation of Multiple Endpoints: Analyzing Longitudinal Health-Related Quality of Life Data: Missing Data and Imputation Methods (D.A. Revicki). Comparison of Treatments with Multiple Outcomes (P. Tubert-Bitter, et al.). The Use of Soft Endpoints in Clinical Trials: The Search for Clinical Significance (J. Wittes).
3: Item Response Theory and Rasch Models: Parametric and Nonparametric Item Response Theory Models in Health-Related Quality of Life Measurement (I.W. Molenaar). Questionnaire Reliability Under the Rasch Model (A. Hamon, M. Mesbah). Item Response Theory (IRT): Applications in Quality of Life Measurement, Analysis and Interpretation (D. Cella, et al.). Graphical Rasch Models (S. Kreiner, K.B. Christensen).
4: Joint Analysis of Quality of Life and Survival: Semi-Markov Models for Quality of Life Data with Censoring (N. Heutte, C. Huber-Carol). A Model Relating Quality of Life to Latent Health Status and Survival (M.-L. Ting Lee, G.A. Whitmore). Applying Survival Data Methodology to Analyze Longitudinal Quality of Life Data (L. Awad, et al.). Latent Class Models to Describe Changes Over Time: A Case Study (H.C. van Houwelingen).
5: Quality-Adjusted Survival Analysis and Related Methods: Prevalence Analysis of Recurrent and Transient Health States in Quality of Life Studies (A. Kramar, R. Lancar). Measures of Quality Adjusted Life and Quality of Life Deficiency: Statistical Perspectives (P.K. Sen). Quality-Adjusted Survival Analysis in Cancer Clinical Trials (B.F. Cole, K.L. Kilbridge).
6: Methods for Informatively Missing Longitudinal Quality-of-Life Data: Handling of Missing Data (M. Chavance). Guidelines for Administration of Self-Reported Health-Related Quality of Life Questionnaires: How to Minimize Avoidable Missing Data? (D. Dubois). Joint Analysis of Survival and Nonignorable Missing Longitudinal Quality-of-Life Data (J.-F. Dupuy). Multiple Imputation for Non-Random Missing Data in Longitudinal Studies of Health-Related Quality of Life (D.L. Fairclough). Strategies to Fit Pattern-Mixture Models (G. Molenberghs, et al.). Analysis of Longitudinal Quality of Life Data with Informative Dropout (M.C. Wu, et al.).

Journal Article
TL;DR: In this article, a joint regression imputation method is proposed that preserves unbiasedness for marginal totals, second moments, and correlation coefficients, and that produces estimates more stable than those from marginal nonrandom regression imputation when correlation coefficients are in a c...
Abstract: Regression imputation is commonly used to compensate for item nonresponse when auxiliary data are available. It is common practice to compute survey estimators by treating imputed values as observed data and using the standard unbiased (or nearly unbiased) estimation formulas designed for the case of no nonresponse. Although the commonly used regression imputation method preserves unbiasedness for population marginal totals (i.e., survey estimators computed from imputed data are still nearly unbiased), it does not preserve unbiasedness for population correlation coefficients. A joint regression imputation method is proposed that preserves unbiasedness for marginal totals, second moments, and correlation coefficients. Some simulation results show that the joint regression imputation method produces not only sample correlation coefficients that are nearly unbiased, but also estimates that are more stable than those produced by marginal nonrandom regression imputation when correlation coefficients are in a c...
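
A simplified illustration of the joint idea, not the authors' estimator (synthetic survey items; the design-based details are omitted): impute two items from their regressions on an auxiliary variable, but draw the two residuals from the same donor, so the imputed pairs inherit the observed residual correlation.

```python
import numpy as np

rng = np.random.default_rng(11)
n = 1000
x = rng.normal(size=n)                    # auxiliary variable, always observed
E = rng.multivariate_normal([0, 0], [[1.0, 0.7], [0.7, 1.0]], size=n)
y1 = 1.0 + 2.0 * x + E[:, 0]
y2 = 0.5 - 1.0 * x + E[:, 1]
miss = rng.random(n) < 0.3                # both items missing for nonrespondents

ok = ~miss
b1 = np.polyfit(x[ok], y1[ok], 1)
b2 = np.polyfit(x[ok], y2[ok], 1)
R = np.column_stack([y1[ok] - np.polyval(b1, x[ok]),
                     y2[ok] - np.polyval(b2, x[ok])])

# Joint step: one donor row supplies *both* residuals for a nonrespondent.
donor = rng.integers(R.shape[0], size=miss.sum())
y1_imp, y2_imp = y1.copy(), y2.copy()
y1_imp[miss] = np.polyval(b1, x[miss]) + R[donor, 0]
y2_imp[miss] = np.polyval(b2, x[miss]) + R[donor, 1]
print(np.corrcoef(y1_imp, y2_imp)[0, 1], np.corrcoef(y1, y2)[0, 1])
```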