
Showing papers in "Psychometrika in 2023"


Journal ArticleDOI
TL;DR: In this paper, robust effect size index (RESI) estimators and confidence/credible intervals relying on different covariance estimators are developed and evaluated using statistical theory and simulations.
Abstract: Reporting effect size index estimates with their confidence intervals (CIs) can be an excellent way to simultaneously communicate the strength and precision of the observed evidence. We recently proposed a robust effect size index (RESI) that is advantageous over common indices because it’s widely applicable to different types of data. Here, we use statistical theory and simulations to develop and evaluate RESI estimators and confidence/credible intervals that rely on different covariance estimators. Our results show (1) counter to intuition, the randomness of covariates reduces coverage for Chi-squared and F CIs; (2) when the variance of the estimators is estimated, the non-central Chi-squared and F CIs using the parametric and robust RESI estimators fail to cover the true effect size at the nominal level. Using the robust estimator along with the proposed nonparametric bootstrap or Bayesian (credible) intervals provides valid inference for the RESI, even when model assumptions may be violated. This work forms a unified effect size reporting procedure, such that effect sizes with confidence/credible intervals can be easily reported in an analysis of variance (ANOVA) table format.

2 citations
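The nonparametric bootstrap intervals recommended above are straightforward to emulate for a generic effect size estimator. The sketch below computes a percentile bootstrap confidence interval; the placeholder estimator effect_size (a standardized regression slope) and the percentile method are illustrative assumptions, not the authors' RESI implementation.

    # Minimal sketch: percentile bootstrap confidence interval for a generic
    # effect size estimate. The estimator below is a placeholder, not the RESI.
    import numpy as np

    def effect_size(y, x):
        # Placeholder: standardized slope of a simple linear regression.
        slope = np.cov(x, y, bias=True)[0, 1] / np.var(x)
        return slope * np.std(x) / np.std(y)

    def bootstrap_ci(y, x, stat=effect_size, n_boot=2000, alpha=0.05, seed=0):
        rng = np.random.default_rng(seed)
        n = len(y)
        boots = np.empty(n_boot)
        for b in range(n_boot):
            idx = rng.integers(0, n, n)          # resample cases with replacement
            boots[b] = stat(y[idx], x[idx])
        lower, upper = np.quantile(boots, [alpha / 2, 1 - alpha / 2])
        return stat(y, x), (lower, upper)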



Journal ArticleDOI
TL;DR: In this article, diagnostic tests for violations of the missing at random (MAR) assumption are proposed, together with a test-based estimator that first applies the diagnostics and then proceeds with the corresponding data-deletion estimator.
Abstract: Ignorable likelihood (IL) approaches are often used to handle missing data when estimating a multivariate model, such as a structural equation model. In this case, the likelihood is based on all available data, and no model is specified for the missing data mechanism. Inference proceeds via maximum likelihood or Bayesian methods, including multiple imputation without auxiliary variables. Such IL approaches are valid under a missing at random (MAR) assumption. Rabe-Hesketh and Skrondal (Ignoring non-ignorable missingness. Presidential Address at the International Meeting of the Psychometric Society, Beijing, China, 2015; Psychometrika, 2023) consider a violation of MAR where a variable A can affect missingness of another variable B also when A is not observed. They show that this case can be handled by discarding more data before proceeding with IL approaches. This data-deletion approach is similar to the sequential estimation of Mohan et al. (in: Advances in neural information processing systems, 2013) based on their ordered factorization theorem but is preferable for parametric models. Which kind of data-deletion or ordered factorization to employ depends on the nature of the MAR violation. In this article, we therefore propose two diagnostic tests, a likelihood-ratio test for a heteroscedastic regression model and a kernel conditional independence test. We also develop a test-based estimator that first uses diagnostic tests to determine which MAR violation appears to be present and then proceeds with the corresponding data-deletion estimator. Simulations show that the test-based estimator outperforms IL when the missing data problem is severe and performs similarly otherwise.

1 citation
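The first diagnostic mentioned above, a likelihood-ratio test for a heteroscedastic regression model, can be sketched as follows: fit a normal regression once with constant variance and once with the log-variance depending on the covariate, then compare the maximized log-likelihoods. The log-linear variance specification is an assumption made for illustration, not necessarily the parameterization used in the article.

    # Sketch of a likelihood-ratio test for heteroscedasticity in a normal
    # regression: H0 constant variance vs. H1 log-variance linear in x.
    # The variance model is an illustrative assumption.
    import numpy as np
    from scipy.optimize import minimize
    from scipy.stats import chi2, norm

    def negloglik(params, y, x, heteroscedastic):
        b0, b1 = params[0], params[1]
        if heteroscedastic:
            sigma = np.exp(0.5 * (params[2] + params[3] * x))
        else:
            sigma = np.exp(0.5 * params[2]) * np.ones_like(x)
        return -np.sum(norm.logpdf(y, loc=b0 + b1 * x, scale=sigma))

    def lr_test(y, x):
        fit0 = minimize(negloglik, x0=[0.0, 0.0, 0.0], args=(y, x, False))
        fit1 = minimize(negloglik, x0=[0.0, 0.0, 0.0, 0.0], args=(y, x, True))
        lr = 2.0 * (fit0.fun - fit1.fun)       # .fun is the minimized negative log-likelihood
        return lr, chi2.sf(lr, df=1)           # one extra variance parameter under H1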



Journal ArticleDOI
TL;DR: In this paper, a new family of oblique rotations based on component-wise $$L^p$$ loss functions is proposed for recovering the latent structure underlying multivariate data in exploratory factor analysis (EFA).
Abstract: Researchers have widely used exploratory factor analysis (EFA) to learn the latent structure underlying multivariate data. Rotation and regularised estimation are two classes of methods in EFA that researchers often use to find interpretable loading matrices. In this paper, we propose a new family of oblique rotations based on component-wise $$L^p$$ loss functions $$(0 < p \le 1)$$ that is closely related to an $$L^p$$ regularised estimator. We develop model selection and post-selection inference procedures based on the proposed rotation method. When the true loading matrix is sparse, the proposed method tends to outperform traditional rotation and regularised estimation methods in terms of statistical accuracy and computational cost. Since the proposed loss functions are nonsmooth, we develop an iteratively reweighted gradient projection algorithm for solving the optimisation problem. We also develop theoretical results that establish the statistical consistency of the estimation, model selection, and post-selection inference. We evaluate the proposed method and compare it with regularised estimation and traditional rotation methods via simulation studies. We further illustrate it using an application to the Big Five personality assessment.

1 citation
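One natural reading of the rotation family described above, given here only as a sketch of its general form (the exact constraint set and tuning details are as in the paper): minimize, over oblique rotations of an initial loading matrix, the criterion $$Q(\Lambda) = \sum_{j}\sum_{k} |\lambda_{jk}|^{p}, \quad 0 < p \le 1,$$ where $$\lambda_{jk}$$ are the rotated loadings. Smaller values of p favour sparser loading matrices, which is what ties the rotation family to $$L^p$$ regularised estimation.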


Journal ArticleDOI
TL;DR: In this article, the authors propose two dynamic IRTree models, which account for systematic continuous changes and additional random fluctuations of response strategies by defining item position-dependent trait and response style effects.
Abstract: It is essential to control self-reported trait measurements for response style effects to ensure a valid interpretation of estimates. Traditional psychometric models facilitating such control consider item responses as the result of two kinds of response processes—based on the substantive trait, or based on response styles—and they assume that both of these processes have a constant influence across the items of a questionnaire. However, this homogeneity over items is not always given, for instance, if the respondents’ motivation declines throughout the questionnaire so that heuristic responding driven by response styles may gradually take over from cognitively effortful trait-based responding. The present study proposes two dynamic IRTree models, which account for systematic continuous changes and additional random fluctuations of response strategies, by defining item position-dependent trait and response style effects. Simulation analyses demonstrate that the proposed models accurately capture dynamic trajectories of response processes, as well as reliably detect the absence of dynamics, that is, identify constant response strategies. The continuous version of the dynamic model formalizes the underlying response strategies in a parsimonious way and is highly suitable as a cognitive model for investigating response strategy changes over items. The extended model with random fluctuations of strategies can adapt more closely to the item-specific effects of different response processes and thus is a well-fitting model with high flexibility. By using an empirical data set, the benefits of the proposed dynamic approaches over traditional IRTree models are illustrated under realistic conditions.

1 citation


Journal ArticleDOI
TL;DR: In this article, the authors consider the dependence of chance-corrected weighted agreement coefficients on the weighting scheme that penalizes rater disagreements, obtain the first-order and second-order derivatives of the coefficients with respect to the power parameter, and decompose these derivatives into components corresponding to all pairs of different category distances.
Abstract: We consider the dependence of a broad class of chance-corrected weighted agreement coefficients on the weighting scheme that penalizes rater disagreements. The considered class encompasses many existing coefficients with any number of raters, and one real-valued power parameter defines the weighting scheme that includes linear, quadratic, identity, and radical weights. We obtain the first-order and second-order derivatives of the coefficients with respect to the power parameter and decompose them into components corresponding to all pairs of different category distances. Each component compares its two distances in terms of the ratio of observed to expected-by-chance frequency. A larger ratio for the smaller distance than the larger distance contributes to a positive relationship between the power parameter and the coefficient value; the opposite contributes to a negative relationship. We provide necessary and sufficient conditions for the coefficient value to increase or decrease and the relationship to intensify or weaken as the power parameter increases. We use the first-order and second-order derivatives for corresponding measurement. Furthermore, we show how these two derivatives allow other researchers to obtain quite accurate estimates of the coefficient value for unreported values of the power parameter, even without access to the original data.

1 citation
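One common way to generate such a power-parameter weight family, and a plausible stand-in for the scheme studied above, is w = 1 - (d/(K-1))^r for category distance d among K categories; r = 1, 2, and 0.5 give linear, quadratic, and radical weights, and r near 0 approaches identity weights. The chance correction in the sketch below uses a uniform-chance (Brennan-Prediger-style) baseline purely for illustration.

    # Sketch: power-parameter disagreement weights and a chance-corrected
    # weighted agreement coefficient for two raters. The weight family and the
    # uniform-chance correction are illustrative assumptions.
    import numpy as np

    def weight_matrix(n_cat, power):
        dist = np.abs(np.subtract.outer(np.arange(n_cat), np.arange(n_cat)))
        return 1.0 - (dist / (n_cat - 1)) ** power   # power=1 linear, 2 quadratic, 0.5 radical

    def weighted_agreement(ratings1, ratings2, n_cat, power):
        w = weight_matrix(n_cat, power)
        p_obs = np.mean(w[ratings1, ratings2])       # observed weighted agreement
        p_chance = w.mean()                          # expected weight under uniform chance
        return (p_obs - p_chance) / (1.0 - p_chance)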



Journal ArticleDOI
TL;DR: In this paper, the support of the joint probability distribution of categorical variables in the total population is treated as unknown, and a general subpopulation model with its support equal to the set of all observed score patterns is derived.
Abstract: In this paper, the support of the joint probability distribution of categorical variables in the total population is treated as unknown. From a general total population model with unknown support, a general subpopulation model with its support equal to the set of all observed score patterns is derived. In maximum likelihood estimation of the parameters of any such subpopulation model, the evaluation of the log-likelihood function only requires the summation over a number of terms equal to at most the sample size. It is made clear that the parameters of a hypothesized total population model are consistently and asymptotically efficiently estimated by the values that maximize the log-likelihood function of the corresponding subpopulation model. Next, new likelihood ratio goodness-of-fit tests are proposed as alternatives to the Pearson chi-square goodness-of-fit test and the likelihood ratio test against the saturated model. In a simulation study, the asymptotic bias and efficiency of maximum likelihood estimators and the asymptotic performance of the goodness-of-fit tests are investigated.
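The computational point above, that evaluating the log-likelihood requires summing over at most as many terms as the sample size, is easy to make concrete: a natural reading is that the subpopulation model restricts and renormalizes the total-population probabilities to the observed score patterns. A minimal sketch of that bookkeeping, with pattern_prob standing in for an arbitrary parametric total-population model (an assumption for illustration), follows.

    # Sketch: log-likelihood of a subpopulation model whose support is the set
    # of observed score patterns; pattern_prob(pattern, theta) is a placeholder
    # for a parametric total-population model.
    from collections import Counter
    import numpy as np

    def subpop_loglik(data, pattern_prob, theta):
        counts = Counter(map(tuple, data))                  # at most n distinct patterns
        probs = {s: pattern_prob(s, theta) for s in counts}
        log_norm = np.log(sum(probs.values()))              # renormalize over observed support
        return sum(n_s * (np.log(probs[s]) - log_norm) for s, n_s in counts.items())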



Journal ArticleDOI
TL;DR: In this article, generalized additive latent and mixed models (GALAMMs) are proposed to model the complex lifespan trajectories of episodic memory, working memory, and speed/executive function, measured by the California Verbal Learning Test (CVLT), digit span tests, and Stroop tests.
Abstract: We present generalized additive latent and mixed models (GALAMMs) for analysis of clustered data with responses and latent variables depending smoothly on observed variables. A scalable maximum likelihood estimation algorithm is proposed, utilizing the Laplace approximation, sparse matrix computation, and automatic differentiation. Mixed response types, heteroscedasticity, and crossed random effects are naturally incorporated into the framework. The models developed were motivated by applications in cognitive neuroscience, and two case studies are presented. First, we show how GALAMMs can jointly model the complex lifespan trajectories of episodic memory, working memory, and speed/executive function, measured by the California Verbal Learning Test (CVLT), digit span tests, and Stroop tests, respectively. Next, we study the effect of socioeconomic status on brain structure, using data on education and income together with hippocampal volumes estimated by magnetic resonance imaging. By combining semiparametric estimation with latent variable modeling, GALAMMs allow a more realistic representation of how brain and cognition vary across the lifespan, while simultaneously estimating latent traits from measured items. Simulation experiments suggest that model estimates are accurate even with moderate sample sizes.




Journal ArticleDOI
TL;DR: In this paper, fit indices analogous to the root-mean-square error of approximation (RMSEA) and the comparative fit index (CFI) are derived from an unweighted approximation error estimate instead of the noncentrality parameter estimate obtained from the model fit statistic.
Abstract: Fit indices are frequently used for assessing the goodness of fit of latent variable models. Most prominent fit indices, such as the root-mean-square error of approximation (RMSEA) or the comparative fit index (CFI), are based on a noncentrality parameter estimate derived from the model fit statistic. While a noncentrality parameter estimate is well suited for quantifying the amount of systematic error, the complex weighting function involved in its calculation makes indices derived from it challenging to interpret. Moreover, noncentrality-parameter-based fit indices yield systematically different values, depending on the indicators’ level of measurement. For instance, RMSEA and CFI yield more favorable fit indices for models with categorical as compared to metric variables under otherwise identical conditions. In the present article, approaches for obtaining an approximation discrepancy estimate that is independent of any specific weighting function are considered. From these unweighted approximation error estimates, fit indices analogous to RMSEA and CFI are calculated and their finite sample properties are investigated using simulation studies. The results illustrate that the new fit indices consistently estimate their true value, which, in contrast to other fit indices, is the same value for metric and categorical variables. Advantages with respect to interpretability are discussed and cutoff criteria for the new indices are considered.
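For reference, the noncentrality-based construction the article starts from is usually written as follows (one common convention; some programs use N instead of N - 1): $$\hat{\lambda} = \max(\chi^2 - df,\, 0)$$, $$\mathrm{RMSEA} = \sqrt{\hat{\lambda}/(df\,(N-1))}$$, and $$\mathrm{CFI} = 1 - \hat{\lambda}_{M}/\hat{\lambda}_{B}$$, with M and B indexing the hypothesized and baseline models. The article's argument is that the weighting implicit in the $$\chi^2$$ statistic makes these quantities depend on the indicators' level of measurement, which the proposed unweighted indices avoid.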



Journal ArticleDOI
TL;DR: In this article, the authors show that the polychoric correlation does not always approximate the true latent correlation, even when the observed variables have many categories and the latent marginals are known.
Abstract: The polychoric correlation is a popular measure of association for ordinal data. It estimates a latent correlation, i.e., the correlation of a latent vector. This vector is assumed to be bivariate normal, an assumption that cannot always be justified. When bivariate normality does not hold, the polychoric correlation will not necessarily approximate the true latent correlation, even when the observed variables have many categories. We calculate the sets of possible values of the latent correlation when latent bivariate normality is not necessarily true, but at least the latent marginals are known. The resulting sets are called partial identification sets, and are shown to shrink to the true latent correlation as the number of categories increases. Moreover, we investigate partial identification under the additional assumption that the latent copula is symmetric, and calculate the partial identification set when one variable is ordinal and another is continuous. We show that little can be said about latent correlations, unless we have impractically many categories or we know a great deal about the distribution of the latent vector. An open-source R package is available for applying our results.
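For readers wanting a concrete baseline, a bare-bones polychoric estimate under the bivariate normality assumption can be computed in two steps: thresholds from the marginal proportions, then a one-parameter likelihood maximized over the latent correlation. The sketch below is only that baseline and says nothing about the partial identification sets; it is not the authors' R package.

    # Two-step sketch of the polychoric correlation under bivariate normality:
    # (1) thresholds from marginal proportions, (2) profile the likelihood over rho.
    import numpy as np
    from scipy.optimize import minimize_scalar
    from scipy.stats import multivariate_normal, norm

    def thresholds(margin_counts):
        cum = np.cumsum(margin_counts) / np.sum(margin_counts)
        return np.concatenate(([-10.0], norm.ppf(cum[:-1]), [10.0]))   # +/-10 stands in for +/-inf

    def polychoric(table):
        a = thresholds(table.sum(axis=1))          # row-variable thresholds
        b = thresholds(table.sum(axis=0))          # column-variable thresholds

        def cell_prob(i, j, rho):
            mvn = multivariate_normal(mean=[0.0, 0.0], cov=[[1.0, rho], [rho, 1.0]])
            return (mvn.cdf([a[i + 1], b[j + 1]]) - mvn.cdf([a[i], b[j + 1]])
                    - mvn.cdf([a[i + 1], b[j]]) + mvn.cdf([a[i], b[j]]))

        def negloglik(rho):
            return -sum(table[i, j] * np.log(max(cell_prob(i, j, rho), 1e-12))
                        for i in range(table.shape[0]) for j in range(table.shape[1]))

        return minimize_scalar(negloglik, bounds=(-0.99, 0.99), method="bounded").x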

Journal ArticleDOI
TL;DR: In this paper, the authors compare the properties of lasso approaches used for variable selection with Bayesian variable selection approaches and highlight the advantages of stochastic search variable selection (SSVS) that make it well suited for variable selection applications in psychology.
Abstract: In the current paper, we review existing tools for solving variable selection problems in psychology. Modern regularization methods such as lasso regression have recently been introduced in the field and are incorporated into popular methodologies, such as network analysis. However, several recognized limitations of lasso regularization may limit its suitability for psychological research. In this paper, we compare the properties of lasso approaches used for variable selection to Bayesian variable selection approaches. In particular, we highlight advantages of stochastic search variable selection (SSVS) that make it well suited for variable selection applications in psychology. We demonstrate these advantages and contrast SSVS with lasso type penalization in an application to predict depression symptoms in a large sample and an accompanying simulation study. We investigate the effects of sample size, effect size, and patterns of correlation among predictors on rates of correct and false inclusion and bias in the estimates. SSVS as investigated here is reasonably computationally efficient and sufficiently powerful to detect moderate effects in small sample sizes (or small effects in moderate sample sizes), while protecting against false inclusion and without over-penalizing true effects. We recommend SSVS as a flexible framework that is well-suited for the field, discuss limitations, and suggest directions for future development.
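For readers less familiar with SSVS, its core prior can be stated compactly. In the classic George and McCulloch formulation, each coefficient receives a two-component normal mixture, $$\beta_j \mid \gamma_j \sim (1-\gamma_j)\,N(0, \tau_j^2) + \gamma_j\,N(0, c_j^2 \tau_j^2)$$ with $$\gamma_j \in \{0, 1\}$$, where the small-variance "spike" effectively excludes a predictor and the wide "slab" retains it; posterior inclusion probabilities $$P(\gamma_j = 1 \mid \mathrm{data})$$ then drive selection. Whether the present paper uses exactly this parameterization or a point-mass spike is not stated in the abstract.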

Journal ArticleDOI
TL;DR: The authors propose a class of models called guessing models, which contains most models of how judges make their ratings; every guessing model has an associated measure of agreement called the knowledge coefficient.
Abstract: Several measures of agreement, such as the Perreault-Leigh coefficient, the [Formula: see text], and the recent coefficient of van Oest, are based on explicit models of how judges make their ratings. To handle such measures of agreement under a common umbrella, we propose a class of models called guessing models, which contains most models of how judges make their ratings. Every guessing model has an associated measure of agreement we call the knowledge coefficient. Under certain assumptions on the guessing models, the knowledge coefficient will be equal to the multi-rater Cohen's kappa, Fleiss' kappa, the Brennan-Prediger coefficient, or other less-established measures of agreement. We provide several sample estimators of the knowledge coefficient, valid under varying assumptions, and their asymptotic distributions. After a sensitivity analysis and a simulation study of confidence intervals, we find that the Brennan-Prediger coefficient typically outperforms the others, with much better coverage under unfavorable circumstances.
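As one concrete member of the class discussed above: in its simplest two-rater, unweighted form, the Brennan-Prediger coefficient corrects the observed proportion of agreement $$p_o$$ by uniform chance over Q categories, $$\kappa_{BP} = (p_o - 1/Q)/(1 - 1/Q)$$. How it, and the other coefficients listed, arise as knowledge coefficients under particular guessing models is developed in the paper itself.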


Journal ArticleDOI
TL;DR: In this article, the goodness-of-fit of the unidimensional monotone latent variable model is assessed using the empirical conditions of nonnegative correlations (Mokken in A theory and procedure of scale-analysis, Mouton, The Hague, 1971), manifest monotonicity (Junker in Ann Stat 21:1359-1378, 1993), multivariate total positivity of order 2 (Bartolucci and Forcina in Ann Stat 28:1206-1218, 2000), and nonnegative partial correlations (Ellis in Psychometrika 79:303-316, 2014).
Abstract: The goodness-of-fit of the unidimensional monotone latent variable model can be assessed using the empirical conditions of nonnegative correlations (Mokken in A theory and procedure of scale-analysis, Mouton, The Hague, 1971), manifest monotonicity (Junker in Ann Stat 21:1359–1378, 1993), multivariate total positivity of order 2 (Bartolucci and Forcina in Ann Stat 28:1206–1218, 2000), and nonnegative partial correlations (Ellis in Psychometrika 79:303–316, 2014). We show that multidimensional monotone factor models with independent factors also imply these empirical conditions; therefore, the conditions are insensitive to multidimensionality. Conditional association (Rosenbaum in Psychometrika 49(3):425–435, 1984) can detect multidimensionality, but tests of it (De Gooijer and Yuan in Comput Stat Data Anal 55:34–44, 2011) are usually not feasible for realistic numbers of items. The only existing feasible test procedures that can reveal multidimensionality are Rosenbaum’s (Psychometrika 49(3):425–435, 1984) Case 2 and Case 5, which test the covariance of two items or two subtests conditionally on the unweighted sum of the other items. We improve this procedure by conditioning on a weighted sum of the other items. The weights are estimated in a training sample from a linear regression analysis. Simulations show that the Type I error rate is under control and that, for large samples, the power is higher if one dimension is more important than the other or if there is a third dimension. In small samples and with two equally important dimensions, using the unweighted sum yields greater power.


Journal ArticleDOI
TL;DR: In this paper, the authors discuss boundary conditions and argue that person selection effects on item parameters are not unique to item-specific factors and that the effects presented by Lyu et al. may not generalize to the family of IRTree models as a whole.
Abstract: Lyu et al. (Psychometrika, 2023) demonstrated that item-specific factors can cause spurious effects on the structural parameters of IRTree models for multiple nested response processes per item. Here, we discuss some boundary conditions and argue that person selection effects on item parameters are not unique to item-specific factors and that the effects presented by Lyu et al. (Psychometrika, 2023) may not generalize to the family of IRTree models as a whole. We conclude with the recommendation that IRTree model specification should be guided by theoretical considerations, rather than driven by data, in order to avoid misinterpretations of parameter differences.



Journal ArticleDOI
TL;DR: In this paper, marginal maximum likelihood (ML) estimation methods for hierarchical multinomial processing tree (MPT) models with random and fixed effects are proposed and evaluated, and an illustrative empirical application and an outlook on possible extensions and future applications of the proposed ML approach are given.
Abstract: The present article proposes and evaluates marginal maximum likelihood (ML) estimation methods for hierarchical multinomial processing tree (MPT) models with random and fixed effects. We assume that an identifiable MPT model with S parameters holds for each participant. Of these S parameters, R parameters are assumed to vary randomly between participants, and the remaining $$S-R$$ parameters are assumed to be fixed. We also propose an extended version of the model that includes effects of covariates on MPT model parameters. Because the likelihood functions of both versions of the model are too complex to be tractable, we propose three numerical methods to approximate the integrals that occur in the likelihood function, namely, the Laplace approximation (LA), adaptive Gauss–Hermite quadrature (AGHQ), and Quasi Monte Carlo (QMC) integration. We compare these three methods in a simulation study and show that AGHQ performs well in terms of both bias and coverage rate. QMC also performs well but the number of responses per participant must be sufficiently large. In contrast, LA fails quite often due to undefined standard errors. We also suggest ML-based methods to test the goodness of fit and to compare models taking model complexity into account. The article closes with an illustrative empirical application and an outlook on possible extensions and future applications of the proposed ML approach.
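Of the three integral approximations compared above, adaptive Gauss-Hermite quadrature builds on the standard Gauss-Hermite rule for integrals against a normal density. The sketch below shows only that non-adaptive building block for a single random effect; the integrand is a placeholder (a random-intercept binomial kernel), and the adaptive recentering and rescaling at the posterior mode used by AGHQ is omitted.

    # Gauss-Hermite approximation of E[f(U)] for U ~ Normal(mu, sigma^2):
    # int f(u) phi(u; mu, sigma^2) du  ~=  sum_k w_k f(mu + sqrt(2)*sigma*x_k) / sqrt(pi).
    # Non-adaptive building block only; AGHQ recenters the nodes at the posterior mode.
    from math import comb

    import numpy as np

    def gh_expectation(f, mu, sigma, n_nodes=15):
        nodes, weights = np.polynomial.hermite.hermgauss(n_nodes)
        u = mu + np.sqrt(2.0) * sigma * nodes           # change of variables
        return np.sum(weights * f(u)) / np.sqrt(np.pi)

    # Placeholder integrand: marginal probability of k successes in m Bernoulli
    # trials when the logit of the success probability is a normal random effect.
    def marginal_prob(k, m, mu, sigma):
        def kernel(u):
            p = 1.0 / (1.0 + np.exp(-u))
            return p ** k * (1.0 - p) ** (m - k)
        return comb(m, k) * gh_expectation(kernel, mu, sigma)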