Open Access | Journal Article

Commonalities and Differences in IRT-Based Methods for Nonignorable Item Nonresponses

TLDR
In this article, the authors focus on the more challenging case of unplanned missing data, which not only cause a loss of efficiency but can also lead to biased estimation of item and person parameters in the measurement model.
Abstract
Missing data are an inevitable problem for applied researchers. They may occur for many different reasons. For example, participants may not be willing to participate in a study, leading to unit nonresponses, or participants may be unable or unwilling to answer all items of a test. Such item nonresponses typically result from omitted or not-reached items and are common in educational assessments. Furthermore, test takers may provide answers that cannot be scored meaningfully, producing item nonresponses due to not codable item responses. Unplanned missing data resulting from test takers' response behavior must be distinguished from planned missing data due to the design (Graham, Taylor, & Cumsille, 2001; Graham, Taylor, Olchowski, & Cumsille, 2006). Especially in large-scale assessments (LSA), only subsets of items are assigned to test takers to reduce costs, participant burden, fatigue, or potential practice effects. With an appropriate test design, including randomized assignment of the different test forms, planned missing data do not pose a threat to validity. Therefore, we focus on the more challenging case of unplanned missing data, which not only cause a loss of efficiency but can also lead to biased estimation of item and person parameters in the measurement model. In LSA, parameters of the structural model, such as means, variances, and covariances of latent variables, are of primary interest rather than individual proficiency levels; however, these distributional parameters may also be biased by item nonresponses.
Many different approaches to handle missing values have been proposed. Weighting methods, such as inverse probability weighting, are commonly applied to account for unit nonresponses (Li, Shen, Li, & Robins, 2011; Wooldridge, 2007). The simplest approach for item nonresponses is listwise deletion, that is, the inclusion of only complete cases in the statistical analysis. 
Pairwise deletion was proposed as an alternative for models that are based on bivariate statistics, such as structural equation models (SEM) that use covariance matrices as input for parameter estimation. Single and multiple imputation methods rest on the idea of replacing missing values with predicted or plausible values in a first step (imputation phase). The augmented data sets are then analyzed with standard methods in a second step (analysis phase). In contrast, model-based approaches, such as full information maximum likelihood (FIML), allow for parameter estimation with incomplete data sets. The suitability of the different missing data handling methods depends on whether certain assumptions hold. These assumptions can be derived from Rubin's taxonomy of missing data (1976; 2002). He distinguishes among three missing data mechanisms: missing completely at random (MCAR), missing at random (MAR), and not missing at random (NMAR). We will examine these mechanisms in greater detail later in this paper. For now, it suffices to note that missing data that are MCAR or MAR are also called ignorable. In this case, missingness is either completely independent of the observed and unobserved variables under examination (MCAR), or conditionally stochastically independent of the unobserved variables given the observed variables (MAR). These stochastic independencies imply that missingness is not informative with respect to unobserved variables and underlying model parameters and can therefore be ignored. Almost all modern missing data methods rest on the assumption that the missing data mechanism is ignorable. This is also true for methods like FIML and multiple imputation, which are regarded as state-of-the-art methods for item nonresponses (Schafer & Graham, 2002). In contrast, missing data that are NMAR are termed nonignorable. In this case, missingness is not conditionally independent of the unobserved variables given the observed variables. 
Such missingness is also called informative with respect to unobserved variables. …
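
The distinction between the three mechanisms can be made concrete with a small simulation. The following is a minimal sketch and is not taken from the article: the function name `simulate`, the binary covariate `x`, the continuous score `y`, and all missingness probabilities are illustrative assumptions. It compares the complete-case (listwise deletion) mean of `y` with a mean that is adjusted by stratifying on the fully observed covariate `x`.

```python
import math
import random
import statistics


def simulate(mechanism, n=20000, seed=1):
    """Return (true_mean, complete_case_mean, x_adjusted_mean) of y.

    All distributions and probabilities below are illustrative assumptions.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.randint(0, 1)        # covariate observed for everyone
        y = x + rng.gauss(0.0, 1.0)  # score that may go missing
        if mechanism == "MCAR":
            p_miss = 0.4                      # independent of x and y
        elif mechanism == "MAR":
            p_miss = 0.2 if x == 1 else 0.6   # depends only on observed x
        elif mechanism == "NMAR":
            # depends on y itself: low scores are missing more often
            p_miss = 1.0 / (1.0 + math.exp(2.0 * (y - 0.5)))
        else:
            raise ValueError(f"unknown mechanism: {mechanism}")
        observed = rng.random() >= p_miss
        rows.append((x, y, observed))

    true_mean = statistics.fmean([y for _, y, _ in rows])
    # Listwise deletion: use only the cases whose y was observed.
    cc_mean = statistics.fmean([y for _, y, obs in rows if obs])
    # Adjustment: observed mean within each x group, weighted by group share.
    adjusted = 0.0
    for group in (0, 1):
        share = sum(1 for x, _, _ in rows if x == group) / n
        observed_y = [y for x, y, obs in rows if x == group and obs]
        adjusted += share * statistics.fmean(observed_y)
    return true_mean, cc_mean, adjusted


if __name__ == "__main__":
    for mech in ("MCAR", "MAR", "NMAR"):
        true_mean, cc_mean, adjusted = simulate(mech)
        print(f"{mech}: true={true_mean:.3f} "
              f"complete-case={cc_mean:.3f} x-adjusted={adjusted:.3f}")
```

Under these assumed settings, the complete-case mean should stay close to the true mean only under MCAR. Conditioning on the observed covariate should remove the bias under MAR, which is the sense in which that mechanism is ignorable. Under NMAR, both estimates should remain biased, because missingness depends on the unobserved score itself.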


Citations
Journal Article

Statistical Analysis with Missing Data

Martin G. Gibson, 01 Mar 1989
Journal Article

Modeling Omitted and Not-Reached Items in IRT Models

TL;DR: It is demonstrated that response indicators have different statistical properties depending on whether the items were omitted or not reached, and these differences are used to derive a joint model for nonignorable missing responses with ability to appropriately account for both omitted and not-reached items.
Journal Article

Latent variable modelling with non‐ignorable item non‐response: multigroup response propensity models for cross‐national analysis

TL;DR: This work proposes models for non‐response in survey questions that are treated as measures of latent constructs and analysed using latent variable models, and considers in particular such models for the analysis of data from cross‐national surveys, where the non‐response model may also vary across countries.
References
Book

Statistical Analysis with Missing Data

TL;DR: Topics include maximum likelihood for general patterns of missing data (introduction and theory with ignorable nonresponse) and large-sample inference based on maximum likelihood estimates.
Journal Article

Missing data: Our view of the state of the art.

TL;DR: Two general approaches that come highly recommended, maximum likelihood (ML) and Bayesian multiple imputation (MI), are presented, together with developments that may eventually extend the ML and MI methods that currently represent the state of the art.
Journal Article

Inference and missing data

Donald B. Rubin, 01 Dec 1976
TL;DR: In this article, it was shown that ignoring the process that causes missing data when making sampling distribution inferences about the parameter of the data, θ, is generally appropriate if and only if the missing data are missing at random and the observed data are observed at random, and then such inferences are generally conditional on the observed pattern of missing data.
Book

Analysis of Incomplete Multivariate Data

TL;DR: Topics include the normal model, methods for normal data, methods for categorical data, loglinear models, methods for mixed data, and inference by data augmentation.