
A Model-Based Imputation Procedure for Multilevel Regression Models with Random Coefficients, Interaction Effects, and Non-Linear Terms.

TL;DR: Computer simulation results suggest that this new approach can be quite effective when applied to multilevel models with random coefficients and interaction effects; in most scenarios the authors examined, imputation-based parameter estimates were accurate and tracked closely with those of the complete data.
Abstract: Despite the broad appeal of missing data handling approaches that assume a missing at random (MAR) mechanism (e.g., multiple imputation and maximum likelihood estimation), some very common analysis models in the behavioral science literature are known to cause bias-inducing problems for these approaches. Regression models with incomplete interactive or polynomial effects are a particularly important example because they are among the most common analyses in behavioral science research applications. In the context of single-level regression, fully Bayesian (model-based) imputation approaches have shown great promise with these popular analysis models. The purpose of this article is to extend model-based imputation to multilevel models with up to 3 levels, including functionality for mixtures of categorical and continuous variables. Computer simulation results suggest that this new approach can be quite effective when applied to multilevel models with random coefficients and interaction effects. In most scenarios that we examined, imputation-based parameter estimates were quite accurate and tracked closely with those of the complete data. The new procedure is available in the Blimp software application for macOS, Windows, and Linux, and the article includes a data analysis example illustrating its use. (PsycINFO Database Record (c) 2019 APA, all rights reserved).
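To make the target analysis class concrete, a generic two-level model with a random slope and a cross-level interaction can be written as follows (illustrative notation, not necessarily the authors' own):

    y_{ij} = \beta_{0j} + \beta_{1j} x_{ij} + \varepsilon_{ij}, \quad \varepsilon_{ij} \sim N(0, \sigma^2)
    \beta_{0j} = \gamma_{00} + \gamma_{01} w_j + u_{0j}
    \beta_{1j} = \gamma_{10} + \gamma_{11} w_j + u_{1j}

Substituting the level-2 equations into the level-1 equation produces the cross-level product term \gamma_{11} x_{ij} w_j; it is precisely such interactive terms, when x or w is incomplete, that bias conventional MAR-based imputation and motivate a model-based (fully Bayesian) approach.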


Citations
01 Jan 2018
TL;DR: The purpose of this manuscript is to describe a flexible imputation approach that can accommodate a diverse set of 2-level analysis problems that includes any of the aforementioned features.
Abstract: Specialized imputation routines for multilevel data are widely available in software packages, but these methods are generally not equipped to handle a wide range of complexities that are typical of behavioral science data. In particular, existing imputation schemes differ in their ability to handle random slopes, categorical variables, differential relations at Level-1 and Level-2, and incomplete Level-2 variables. Given the limitations of existing imputation tools, the purpose of this manuscript is to describe a flexible imputation approach that can accommodate a diverse set of 2-level analysis problems that includes any of the aforementioned features. The procedure employs a fully conditional specification (also known as chained equations) approach with a latent variable formulation for handling incomplete categorical variables. Computer simulations suggest that the proposed procedure works quite well, with trivial biases in most cases. We provide a software program that implements the imputation strategy, and we use an artificial data set to illustrate its use.
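As a rough illustration of the fully conditional specification (chained equations) idea described above, here is a minimal sketch for continuous variables only; the latent-variable handling of categorical variables and the multilevel structure are omitted, and all names are hypothetical:

    import numpy as np

    def fcs_impute(data, n_iter=20, seed=0):
        # data: 2-D array with np.nan marking missing entries
        rng = np.random.default_rng(seed)
        X = data.copy()
        miss = np.isnan(X)
        # initialize missing entries with column means
        for j in range(X.shape[1]):
            X[miss[:, j], j] = np.nanmean(data[:, j])
        for _ in range(n_iter):
            for j in range(X.shape[1]):   # cycle through incomplete columns
                if not miss[:, j].any():
                    continue
                A = np.column_stack([np.ones(len(X)), np.delete(X, j, axis=1)])
                obs = ~miss[:, j]
                beta, *_ = np.linalg.lstsq(A[obs], X[obs, j], rcond=None)
                resid = X[obs, j] - A[obs] @ beta
                sigma = resid.std(ddof=A.shape[1])
                # redraw missing values from the fitted conditional model
                # (parameter uncertainty is ignored here; a full implementation
                # would draw beta and sigma from their posteriors)
                X[miss[:, j], j] = A[miss[:, j]] @ beta + rng.normal(0.0, sigma, miss[:, j].sum())
        return X

The key design idea is that each incomplete variable gets its own conditional regression on all other variables, and the sampler cycles through these conditionals rather than specifying one joint model.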

40 citations

Gerko Vink
15 Jul 2014
TL;DR: This work proposes a new multiple imputation technique for imputing squares that yields unbiased regression estimates while preserving the quadratic relations in the data, under both missing at random (MAR) and MCAR mechanisms.
Abstract: We propose a new multiple imputation technique for imputing squares. Current methods yield either unbiased regression estimates or preserve data relations; no method, however, seems to deliver both, which limits researchers in the implementation of regression analysis in the presence of missing data. Moreover, current methods work only under a missing completely at random (MCAR) mechanism. Our method for imputing squares uses a polynomial combination. The proposed method yields unbiased regression estimates while preserving the quadratic relations in the data, under both missing at random (MAR) and MCAR mechanisms.
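The algebraic heart of the polynomial combination idea is that once the combination z = b1*x + b2*x^2 has been imputed as a single term, x itself can be recovered by solving a quadratic. A minimal sketch (coefficient names and the root-selection rule are simplified for illustration; the published method assigns roots probabilistically):

    import numpy as np

    def decompose_polynomial_combination(z, b1, b2, seed=1):
        # Solve b2*x**2 + b1*x - z = 0 for x, given an imputed combination z.
        rng = np.random.default_rng(seed)
        disc = np.sqrt(b1**2 + 4.0 * b2 * z)
        root_pos = (-b1 + disc) / (2.0 * b2)
        root_neg = (-b1 - disc) / (2.0 * b2)
        # Placeholder root choice: a fair coin flip stands in for the
        # probabilistic assignment rule of the actual method.
        pick = rng.integers(0, 2, size=np.shape(z))
        x = np.where(pick == 0, root_pos, root_neg)
        return x, x**2

Imputing the combination z, rather than x and x^2 separately, is what keeps the quadratic relation between the two terms intact after imputation.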

18 citations

Journal Article
TL;DR: In this article, the authors identify disparities in mental distress, perceived adversities, resilience, and coping during the COVID-19 pandemic among four age groups (18-34, 35-49, 50-64, and ≥65) and assess the age-moderated time effect on mental distress.
Abstract: The COVID-19 pandemic has caused a mental health crisis worldwide, which may have different age-specific impacts, partly due to age-related differences in resilience and coping. The purposes of this study were to 1) identify disparities in mental distress, perceived adversities, resilience, and coping during the COVID-19 pandemic among four age groups (18-34, 35-49, 50-64, and ≥65); 2) assess the age-moderated time effect on mental distress; and 3) estimate the effects of perceived adversities on mental distress as moderated by age, resilience, and coping. Data were drawn from a longitudinal survey of a nationally representative sample (n = 7830) administered during the pandemic. Weighted means of mental distress and adversities (perceived loneliness, perceived stress, and perceived risk), resilience, and coping were compared among the age groups. Hierarchical random-effects models were used to assess the moderated effects of adversities on mental distress. The youngest age group (18-34) reported the highest mental distress at baseline, with a mean (standard error) of 2.70 (0.12); distress showed an incremental improvement with age (2.27 (0.10), 1.88 (0.08), and 1.29 (0.07) for the 35-49, 50-64, and ≥65 groups, respectively). The older age groups reported lower levels of loneliness and perceived stress, higher perceived risk, greater resilience, and more relaxation coping (ps < .001). Model results showed that mental distress declined slightly over time, and the downward trend was moderated by age group. Perceived adversities, alcohol coping, and social coping were positively, whereas resilience and relaxation were negatively, associated with mental distress. Resilience and age group moderated the slope of each adversity on mental distress. The youngest age group appeared to be most vulnerable during the pandemic. Mental health interventions may provide resilience training to combat everyday adversities for vulnerable individuals and empower them to achieve personal growth that challenges age boundaries.

12 citations

01 Jan 2018
TL;DR: The authors address missing data handling for random coefficient models, noting that the literature on this topic is particularly scant and that the few studies to date have focused on the fully conditional specification framework.
Abstract: Literature addressing missing data handling for random coefficient models is particularly scant, and the few studies to date have focused on the fully conditional specification framework...

11 citations

Journal Article
TL;DR: In this paper, an alternative approach to ERP analysis, linear mixed-effects (LME) modeling, is proposed; it offers unique utility in developmental ERP research and has been shown to yield accurate, unbiased results even when subjects have low trial counts.

11 citations

References
Book
01 Jan 1991
TL;DR: In this book, the authors investigate interactions in multiple regression, focusing in particular on the effects of predictor scaling on the coefficients of regression equations.
Abstract: Introduction. Interactions between Continuous Predictors in Multiple Regression. The Effects of Predictor Scaling on Coefficients of Regression Equations. Testing and Probing Three-Way Interactions. Structuring Regression Equations to Reflect Higher Order Relationships. Model and Effect Testing with Higher Order Terms. Interactions between Categorical and Continuous Variables. Reliability and Statistical Power. Conclusion. Some Contrasts Between ANOVA and MR in Practice.
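One of this book's central points, that mean-centering predictors before forming product terms leaves the interaction coefficient unchanged while making the lower-order coefficients interpretable at the predictor means, is easy to verify numerically. A small self-contained sketch with simulated data:

    import numpy as np

    rng = np.random.default_rng(2)
    n = 500
    x = rng.normal(5, 1, n)    # raw predictors with nonzero means
    z = rng.normal(3, 1, n)
    y = 1 + 0.5*x + 0.3*z + 0.2*x*z + rng.normal(0, 1, n)

    def fit(x, z, y):
        # OLS for y ~ 1 + x + z + x*z
        A = np.column_stack([np.ones_like(x), x, z, x*z])
        return np.linalg.lstsq(A, y, rcond=None)[0]

    b_raw = fit(x, z, y)
    b_centered = fit(x - x.mean(), z - z.mean(), y)
    # The interaction coefficient (last entry) is identical up to
    # rounding; only the lower-order coefficients change meaning.
    print(b_raw[3], b_centered[3])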

27,897 citations

Book
03 Mar 1992
TL;DR: The logic of hierarchical linear models, as discussed by the authors, provides a general framework for estimation and hypothesis testing with nested data, and it has been used in many applications.
Abstract: Introduction. The Logic of Hierarchical Linear Models. Principles of Estimation and Hypothesis Testing for Hierarchical Linear Models. An Illustration. Applications in Organizational Research. Applications in the Study of Individual Change. Applications in Meta-Analysis and Other Cases Where Level-1 Variances Are Known. Three-Level Models. Assessing the Adequacy of Hierarchical Models. Technical Appendix.
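The three-level logic covered in the book (and extended by the imputation procedure above) stacks regression equations across levels; in conventional HLM-style notation, sketched loosely here:

    y_{ijk} = \pi_{0jk} + \pi_{1jk} x_{ijk} + e_{ijk}          % level 1: observations
    \pi_{0jk} = \beta_{00k} + \beta_{01k} w_{jk} + r_{0jk}     % level 2: level-1 coefficients as outcomes
    \beta_{00k} = \gamma_{000} + \gamma_{001} v_k + u_{00k}    % level 3: level-2 coefficients as outcomes

Each level's coefficients become outcomes in the level above, which is what allows an imputation model to respect differential relations at every level.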

23,126 citations

Book
01 Jan 1987
TL;DR: This book develops likelihood-based approaches to the analysis of missing data, including maximum likelihood for general patterns of missing data with ignorable nonresponse and large-sample inference based on maximum likelihood estimates.
Abstract: Preface. PART I: OVERVIEW AND BASIC APPROACHES. Introduction. Missing Data in Experiments. Complete-Case and Available-Case Analysis, Including Weighting Methods. Single Imputation Methods. Estimation of Imputation Uncertainty. PART II: LIKELIHOOD-BASED APPROACHES TO THE ANALYSIS OF MISSING DATA. Theory of Inference Based on the Likelihood Function. Methods Based on Factoring the Likelihood, Ignoring the Missing-Data Mechanism. Maximum Likelihood for General Patterns of Missing Data: Introduction and Theory with Ignorable Nonresponse. Large-Sample Inference Based on Maximum Likelihood Estimates. Bayes and Multiple Imputation. PART III: LIKELIHOOD-BASED APPROACHES TO THE ANALYSIS OF MISSING DATA: APPLICATIONS TO SOME COMMON MODELS. Multivariate Normal Examples, Ignoring the Missing-Data Mechanism. Models for Robust Estimation. Models for Partially Classified Contingency Tables, Ignoring the Missing-Data Mechanism. Mixed Normal and Nonnormal Data with Missing Values, Ignoring the Missing-Data Mechanism. Nonignorable Missing-Data Models. References. Author Index. Subject Index.
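The ignorability conditions at the core of this framework can be stated compactly (standard notation, paraphrased rather than quoted; M is the missingness indicator and \phi its parameters):

    \text{MCAR: } p(M \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}, \phi) = p(M \mid \phi)
    \text{MAR: }  p(M \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}, \phi) = p(M \mid Y_{\mathrm{obs}}, \phi)
    L(\theta \mid Y_{\mathrm{obs}}) \propto \int f(Y_{\mathrm{obs}}, Y_{\mathrm{mis}} \mid \theta) \, dY_{\mathrm{mis}}

Under MAR plus distinctness of \theta and \phi, the missingness mechanism is ignorable and the observed-data likelihood above is valid, which is the assumption invoked throughout the citing article.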

18,201 citations

Journal Article
TL;DR: A generalization of the sampling method introduced by Metropolis et al. (1953) is presented, along with an exposition of the relevant theory, techniques of application, and methods and difficulties of assessing the error in Monte Carlo estimates.
Abstract: SUMMARY A generalization of the sampling method introduced by Metropolis et al. (1953) is presented along with an exposition of the relevant theory, techniques of application and methods and difficulties of assessing the error in Monte Carlo estimates. Examples of the methods, including the generation of random orthogonal matrices and potential applications of the methods to numerical problems arising in statistics, are discussed. For numerical problems in a large number of dimensions, Monte Carlo methods are often more efficient than conventional numerical methods. However, implementation of the Monte Carlo methods requires sampling from high dimensional probability distributions and this may be very difficult and expensive in analysis and computer time. General methods for sampling from, or estimating expectations with respect to, such distributions are as follows. (i) If possible, factorize the distribution into the product of one-dimensional conditional distributions from which samples may be obtained. (ii) Use importance sampling, which may also be used for variance reduction. That is, in order to evaluate the integral $J = \int f(x) p(x) \, dx = E_p(f)$, where $p(x)$ is a probability density function, instead of obtaining independent samples $x_1, \ldots, x_N$ from $p(x)$ and using the estimate $\hat{J}_1 = \sum f(x_i)/N$, we instead obtain the sample from a distribution with density $q(x)$ and use the estimate $\hat{J}_2 = \sum \{f(x_i) p(x_i)\}/\{q(x_i) N\}$. This may be advantageous if it is easier to sample from $q(x)$ than $p(x)$, but it is a difficult method to use in a large number of dimensions, since the values of the weights $w(x_i) = p(x_i)/q(x_i)$ for reasonable values of N may all be extremely small, or a few may be extremely large. In estimating the probability of an event A, however, these difficulties may not be as serious since the only values of $w(x)$ which are important are those for which $x \in A$. Since the methods proposed by Trotter & Tukey (1956) for the estimation of conditional expectations require the use of importance sampling, the same difficulties may be encountered in their use. (iii) Use a simulation technique; that is, if it is difficult to sample directly from $p(x)$ or if $p(x)$ is unknown, sample from some distribution $q(y)$ and obtain the sample x values as some function of the corresponding y values. If we want samples from the conditional dis...
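The importance-sampling estimator described in step (ii) can be sketched directly; the target, proposal, and integrand below are arbitrary choices for illustration:

    import numpy as np

    rng = np.random.default_rng(3)
    N = 100_000
    f = lambda x: x**2                                         # integrand f(x)
    p = lambda x: np.exp(-0.5 * x**2) / np.sqrt(2 * np.pi)     # target density: standard normal
    q_sd = 2.0                                                 # proposal: wider normal
    q = lambda x: np.exp(-0.5 * (x / q_sd)**2) / (q_sd * np.sqrt(2 * np.pi))

    x_p = rng.normal(0.0, 1.0, N)     # J1: sample directly from p
    J1 = f(x_p).mean()
    x_q = rng.normal(0.0, q_sd, N)    # J2: sample from q, reweight by w = p/q
    w = p(x_q) / q(x_q)
    J2 = (f(x_q) * w).mean()
    print(J1, J2)                     # both estimate E_p[f] = 1

As the abstract notes, the weights w can degenerate in high dimensions, which is one motivation for Markov chain methods such as the Metropolis generalization this paper develops.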

14,965 citations


"A Model-Based Imputation Procedure ..." refers methods in this paper

  • …In practice, it is more straightforward to use a Metropolis sampling step (Hastings, 1970) to draw imputations from $p(Y \mid X_r, X_{-r}) \, p(X_r \mid X_{-r})$ because this approach can approximate a target distribution such as Equation…

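A generic random-walk Metropolis update of the kind the excerpt describes, accepting a proposed imputation with probability min(1, target ratio), might look like the following sketch; the log-density is a placeholder standing in for log p(Y | X_r, X_-r) + log p(X_r | X_-r), not Blimp's internal code:

    import numpy as np

    def metropolis_impute_step(x_cur, log_target, proposal_sd, rng):
        # Propose a new imputed value and accept or reject it.
        x_prop = x_cur + rng.normal(0.0, proposal_sd)
        log_ratio = log_target(x_prop) - log_target(x_cur)
        if np.log(rng.uniform()) < log_ratio:
            return x_prop   # accept: the proposal becomes the new imputation
        return x_cur        # reject: keep the current imputation

Within a Gibbs sampler, one such step would run for each missing value at each iteration.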

Journal Article
TL;DR: The focus is on applied inference for Bayesian posterior distributions in real problems, which often tend toward normality after transformations and marginalization; the results are derived as normal-theory approximations to exact Bayesian inference, conditional on the observed simulations.
Abstract: The Gibbs sampler, the algorithm of Metropolis and similar iterative simulation methods are potentially very helpful for summarizing multivariate distributions. Used naively, however, iterative simulation can give misleading answers. Our methods are simple and generally applicable to the output of any iterative simulation; they are designed for researchers primarily interested in the science underlying the data and models they are analyzing, rather than for researchers interested in the probability theory underlying the iterative simulations themselves. Our recommended strategy is to use several independent sequences, with starting points sampled from an overdispersed distribution. At each step of the iterative simulation, we obtain, for each univariate estimand of interest, a distributional estimate and an estimate of how much sharper the distributional estimate might become if the simulations were continued indefinitely. Because our focus is on applied inference for Bayesian posterior distributions in real problems, which often tend toward normality after transformations and marginalization, we derive our results as normal-theory approximations to exact Bayesian inference, conditional on the observed simulations. The methods are illustrated on a random-effects mixture model applied to experimental measurements of reaction times of normal and schizophrenic patients.
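The potential scale reduction factor compares between-chain and within-chain variance; a minimal version of the standard formula (omitting refinements such as chain splitting):

    import numpy as np

    def potential_scale_reduction(chains):
        # chains: array of shape (m, n) -- m sequences of one scalar estimand
        m, n = chains.shape
        B = n * chains.mean(axis=1).var(ddof=1)    # between-chain variance
        W = chains.var(axis=1, ddof=1).mean()      # average within-chain variance
        var_plus = (n - 1) / n * W + B / n         # pooled variance estimate
        return np.sqrt(var_plus / W)               # values near 1 indicate convergence

Starting the m sequences from an overdispersed distribution, as the abstract recommends, is what makes a value near 1 informative about convergence.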

13,884 citations


"A Model-Based Imputation Procedure ..." refers background or methods in this paper

  • ...Blimp can print a table of potential scale reduction factor diagnostics (Gelman & Rubin, 1992), and it optionally saves parameter values that can readily be converted to trace plots in other software (e.g., an R plotting script is provided with the other files for this example)....


  • ...After examining potential scale reduction factors (Gelman et al., 2014; Gelman & Rubin, 1992) from several artificial data sets, we generated 10 imputations from a Gibbs sampler algorithm with 1,000 burn-in and thinning iterations (i.e., imputed data sets were saved at 1,000-iteration increments)....
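The burn-in and thinning schedule described in the excerpt amounts to the loop structure below; gibbs_step is a placeholder for one full update of all parameters and imputed values, not the actual Blimp sampler:

    def draw_imputations(gibbs_step, state, n_imputations=10, burn_in=1000, thin=1000):
        # Discard burn-in iterations, then save one imputed data set
        # every `thin` iterations until n_imputations are collected.
        imputations = []
        for _ in range(burn_in):
            state = gibbs_step(state)
        while len(imputations) < n_imputations:
            for _ in range(thin):
                state = gibbs_step(state)
            imputations.append(state.copy())
        return imputations

Wide spacing between saved data sets reduces autocorrelation among the imputations, which is why the excerpt saves at 1,000-iteration increments.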