Showing papers by Paul De Boeck published in 2022


Journal ArticleDOI
TL;DR: In this paper, the authors propose a model-based method for studying conditional dependence between response accuracy and response time with the diffusion IRT model; the extended model can explain the behavioral patterns of conditional dependency found in previous psychometric studies.
Abstract: In this paper, we propose a model-based method to study conditional dependence between response accuracy and response time (RT) with the diffusion IRT model (Tuerlinckx and De Boeck in Psychometrika 70(4):629–650, 2005, https://doi.org/10.1007/s11336-000-0810-3; van der Maas et al. in Psychol Rev 118(2):339–356, 2011, https://doi.org/10.1080/20445911.2011.454498). We extend the earlier diffusion IRT model by introducing variability across persons and items in cognitive capacity (drift rate in the evidence accumulation process) and variability in the starting point of the decision processes. We show that the extended model can explain the behavioral patterns of conditional dependency found in previous psychometric studies. Variability in cognitive capacity can predict positive and negative conditional dependency and their interaction with item difficulty. Variability in starting point can account for early changes in response accuracy as a function of RT given the person and item effects. By combining the two variability components, the extended model can produce the curvilinear conditional accuracy functions that have been observed in psychometric data. We also provide a simulation study to validate the parameter recovery of the proposed model and present two empirical applications to show how to implement the model to study conditional dependency underlying response accuracy and RT data.
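
A minimal sketch of the standard two-boundary diffusion identity that underlies the diffusion IRT model may help here (assuming a unit diffusion coefficient and an unbiased starting point z = a/2; the person-by-item drift parameterization shown is only illustrative, not the paper's exact specification):

    P(\text{correct} \mid v, a, z) = \frac{1 - e^{-2vz}}{1 - e^{-2va}}
      \quad\stackrel{z = a/2}{\longrightarrow}\quad
      \frac{1}{1 + e^{-av}}, \qquad v_{pi} = \theta_p - b_i .

With drift v_pi (cognitive capacity relative to item difficulty) and boundary separation a, accuracy is a logistic function of av; the extension described above adds variability in v across persons and items and variability in z, which is what generates the conditional dependence between accuracy and RT.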

8 citations


Journal ArticleDOI
TL;DR: This article uses the risky-choice framing effect as an example, reports on the generalizability of that effect in three metastudies (total N = 2,338), and finds that the framing effect generalized well across most of the potential moderators tested, as expected.
Abstract: A metastudy is a set of many tiny studies (microstudies) created from a much larger collection of possibilities. Metastudies can yield many of the benefits of time-consuming replications and meta-analyses but more efficiently and with greater attention to generalizability and the causal effects of moderators. Statistical precision and power are higher than in studies with the same total sample size but with fewer conditions and more participants per condition. In this article, we describe metastudies and their benefits, demonstrate how to conduct a metastudy using the well-known risky-choice framing effect as an example, and report on the generalizability of that effect. In three metastudies (total N = 2,338), the framing effect generalized well across most of the potential moderators tested, as was expected. Surprisingly, however, the effect was up to twice as large when the certain option was replaced with a slightly risky option; prospect theory predicts the opposite, and fuzzy-trace theory predicts no difference. Metastudies provide a relatively quick and not-so-painful way of examining an effect’s generalizability without waiting for a meta-analysis. Both individual labs and multilab networks are encouraged to shift from traditional studies to metastudies.
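
As a rough illustration of the metastudy idea (hypothetical moderators and variable names, not the authors' materials), the sketch below crosses several candidate moderators into many microstudy cells, assigns a few participants to each cell, and estimates the framing effect and its interactions with the moderators in a single pooled logistic regression.

    # Hypothetical metastudy: many tiny cells formed by crossing candidate moderators
    set.seed(1)
    n_per_cell <- 4
    design <- expand.grid(
      frame       = c("gain", "loss"),
      domain      = c("lives", "money"),
      sure_option = c("certain", "slightly_risky"),
      cell_rep    = 1:50
    )
    dat <- design[rep(seq_len(nrow(design)), each = n_per_cell), ]
    # Simulate risky choices with a framing effect that varies by one moderator
    eta <- -0.2 + 0.8 * (dat$frame == "loss") +
      0.4 * (dat$frame == "loss") * (dat$sure_option == "slightly_risky")
    dat$risky_choice <- rbinom(nrow(dat), 1, plogis(eta))
    # One pooled model recovers the effect and its interactions with the moderators
    fit <- glm(risky_choice ~ frame * (domain + sure_option),
               family = binomial, data = dat)
    summary(fit)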

7 citations


Journal ArticleDOI
TL;DR: This article presents a novel application of a generalized additive logistic regression model, formulated as a generalized additive mixed model (GAMM), to intensive binary time-series eye-tracking data with the data complexities typical of experimental studies of cognitive processes (i.e., designs with within- and between-subjects factors and crossed random effects).
Abstract: Eye-tracking has emerged as a popular method for empirical studies of cognitive processes across multiple substantive research areas. Eye-tracking systems are capable of automatically generating fixation-location data over time at high temporal resolution. Often, the researcher obtains a binary measure of whether or not, at each point in time, the participant is fixating on a critical interest area or object in the real world or in a computerized display. Eye-tracking data are characterized by spatial-temporal correlations and random variability, driven by multiple fine-grained observations taken over small time intervals (e.g., every 10 ms). Ignoring these data complexities leads to biased inferences for the covariates of interest such as experimental condition effects. This article presents a novel application of a generalized additive logistic regression model for intensive binary time series eye-tracking data from a between- and within-subjects experimental design. The model is formulated as a generalized additive mixed model (GAMM) and implemented in the mgcv R package. The generalized additive logistic regression model was illustrated using an empirical data set aimed at understanding the accommodation of regional accents in spoken language processing. Accuracy of parameter estimates and the importance of modeling the spatial-temporal correlations in detecting the experimental condition effects were shown in conditions similar to our empirical data set via a simulation study. (PsycInfo Database Record (c) 2022 APA, all rights reserved).
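
A minimal sketch of the kind of GAMM described above, using the mgcv package named in the abstract (the data frame and variable names are assumptions for illustration): a binary fixation indicator is modeled with condition effects, condition-specific smooths over time, and factor smooths as crossed random curves for subjects and items.

    library(mgcv)
    # eye_dat (hypothetical): one row per sample; columns fixated (0/1), time (ms),
    # condition (factor), subject (factor), item (factor)
    fit <- bam(
      fixated ~ condition +
        s(time, by = condition, k = 20) +      # condition-specific smooths over time
        s(time, subject, bs = "fs", m = 1) +   # random smooths for subjects
        s(time, item, bs = "fs", m = 1),       # random smooths for items
      family = binomial, data = eye_dat, discrete = TRUE
    )
    summary(fit)

The factor smooths act as the crossed random effects mentioned above, letting each subject and item deviate smoothly from the condition-level curves.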

2 citations


Journal ArticleDOI
TL;DR: In this article, the authors provide guidelines for selecting the necessary random effects during model building and discuss existing and newly proposed methods for selecting an optimal set of random effects to account for variability and heteroscedasticity in multilevel data.
Abstract: Multilevel data structures are found in many substantive research areas, and multilevel models (MLMs) are widely used to accommodate them. One important step when applying MLMs is the selection of an optimal set of random effects to account for variability and heteroscedasticity in multilevel data. Literature reviews of current practice in applying MLMs show that diagnostic plots are only rarely used for model selection and model checking. In this study, possible random effects and a generic description of them are provided to guide researchers in selecting the necessary random effects. In addition, based on extensive literature reviews, level-specific diagnostic plots are presented using various kinds of level-specific residuals, and diagnostic measures and statistical tests are suggested for selecting a set of random effects. Existing and newly proposed methods are illustrated using two data sets: a cross-sectional data set and a longitudinal data set. Along with the illustration, we discuss the methods and provide guidelines for selecting the necessary random effects during model building. R code is provided for the analyses.
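
The abstract notes that R code accompanies the analyses; the snippet below is only an illustrative sketch of the general workflow (lme4-based, with hypothetical data and variable names): compare nested random-effects structures and inspect level-specific residuals.

    library(lme4)
    # Hypothetical two-level data: students (level 1) nested in schools (level 2)
    m0 <- lmer(score ~ ses + (1 | school), data = dat, REML = FALSE)
    m1 <- lmer(score ~ ses + (1 + ses | school), data = dat, REML = FALSE)
    anova(m0, m1)   # likelihood-ratio test for the random slope (conservative at the boundary)
    # Level-1 residual diagnostics
    plot(fitted(m1), resid(m1), xlab = "Fitted", ylab = "Level-1 residual")
    # Level-2 diagnostics: empirical Bayes estimates of the school random effects
    re <- ranef(m1)$school
    qqnorm(re[, "(Intercept)"]); qqline(re[, "(Intercept)"])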

1 citation


Journal ArticleDOI
TL;DR: This article examines the effect of both language and memory processing of individual words on list learning and shows that list-learning performance depends on factors beyond the repetition of words over trials.
Abstract: OBJECTIVE A variety of factors affect list learning performance and relatively few studies have examined the impact of word selection on these tests. This study examines the effect of both language and memory processing of individual words on list learning. METHOD Item-response data from 1,219 participants, Mage = 74.41 (SD = 7.13), Medu = 13.30 (SD = 2.72), in the Harmonized Cognitive Assessment Protocol were used. A Bayesian generalized (non)linear multilevel modeling framework was used to specify the measurement and explanatory item-response theory models. Explanatory effects on items due to learning over trials, serial position of words, and six word properties obtained through the English Lexicon Project were modeled. RESULTS A two parameter logistic (2PL) model with trial-specific learning effects produced the best measurement fit. Evidence of the serial position effect on word learning was observed. Robust positive effects on word learning were observed for body-object integration while robust negative effects were observed for word frequency, concreteness, and semantic diversity. A weak negative effect of average age of acquisition and a weak positive effect for the number of phonemes in the word were also observed. CONCLUSIONS Results demonstrate that list learning performance depends on factors beyond the repetition of words. Identification of item factors that predict learning could extend to a range of test development problems including translation, form equating, item revision, and item bias. In data harmonization efforts, these methods can also be used to help link tests via shared item features and testing of whether these features are equally explanatory across samples. (PsycInfo Database Record (c) 2022 APA, all rights reserved).
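
The Bayesian generalized (non)linear multilevel framework referred to above is often implemented with the brms R package; the sketch below assumes brms and hypothetical variable names (neither is confirmed by the abstract) to show a 2PL-style explanatory IRT model with word-property predictors.

    library(brms)
    # Hypothetical long-format data: correct (0/1), person, word, trial, serial_pos,
    # and word properties such as freq, concreteness, and boi (body-object integration)
    formula_2pl <- bf(
      correct ~ exp(logalpha) * eta,                        # 2PL: discrimination x easiness
      eta ~ 1 + trial + serial_pos + freq + concreteness + boi +
        (1 | word) + (1 | person),                          # explanatory item and person effects
      logalpha ~ 1 + (1 | word),                            # item-specific log-discrimination
      nl = TRUE
    )
    fit <- brm(formula_2pl, data = recall_dat,
               family = bernoulli(link = "logit"), chains = 4, cores = 4)
    summary(fit)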

1 citation


Journal ArticleDOI
TL;DR: In this commentary, the authors commend McShane et al.'s approach to analyzing variation and covariation in replication studies and discuss the broader issues of replication, generalization, and integration of findings.
Abstract: We commend McShane et al. on their innovative approach to analyzing variation and covariation in replication studies and appreciate the chance to participate in this discussion of the issues. Three important challenges of empirical disciplines are replication, generalization, and integration of findings. One way to think of these three is sequential: First replicate, then generalize, and finally integrate. This ordering is the motivation for direct replications, including large-scale replication studies of multiple effects (e.g., the Many Labs Project or MLP; Klein et al. 2014). The kind of generalization that is of interest in such studies is generalization across labs and their participant populations, using the exact same design and experimental materials in each lab. This is different from generalization across different ways of implementing a study (DeKay et al. in press; Landy et al. 2020). As a result, large-scale replication studies leave out the larger part of what generalization is. Additionally, as McShane et al. note, large-scale efforts involving multiple phenomena almost always analyze the effects separately, so nothing is learned about how the effects and the corresponding variates might vary and covary. Variates are the dependent variables (DVs) as measured per experimental condition. The contributions of large-scale efforts are thus mixed. On the positive side, we have much greater certainty about the replicability of a growing list of specific effects across labs and settings, which is nothing to sneeze at. On the negative side, however, we have learned very little about the generalizability of these effects across different implementations and methods factors, and almost nothing about how the various effects and corresponding variates might be integrated into a more coherent view of the science. For the most part, generalization and integration are still handled in qualitative reviews that organize the literature on the basis of conceptual similarities and theoretical connections. In comparison to the promising situation for replicability, the dearth of direct empirical evidence for generalizability and (especially) integration can be viewed as a failure of large-scale projects to live up to their potential. But that seems too harsh, particularly for such recent endeavors. A more positive, forward-looking view is that these knowledge gaps represent a grand opportunity for advances in research design and analysis. As explained below, we believe that McShane et al.'s

1 citation


Journal ArticleDOI
TL;DR: This article notes that the role of reliability in the quality of experimental research is not always well understood and proposes a latent variable framework to investigate the relationship between reliability and power.
Abstract: The replication crisis has led to a renewed discussion about the impacts of measurement quality on the precision of psychology research. High measurement quality is associated with low measurement error, yet the role of reliability in the quality of experimental research is not always well understood. In this study, we attempt to understand the role of reliability through its relationship with power while focusing on between-group designs for experimental studies. We outline a latent variable framework to investigate this nuanced relationship through equations. An under-evaluated aspect of the relationship is the variance and homogeneity of the subpopulation from which the study sample is drawn. Higher homogeneity implies a lower reliability, but yields higher power. We proceed to demonstrate the impact of this relationship between reliability and power by imitating different scenarios of large-scale replications with between-group designs. We find negative correlations between reliability and power when there are sizable differences in the latent variable variance and negligible differences in the other parameters across studies. Finally, we analyze the data from the replications of the ego depletion effect (Hagger et al., 2016) and the replications of the grammatical aspect effect (Eerland et al., 2016), each time with between-group designs, and the results align with previous findings. The applications show that a negative relationship between reliability and power is a realistic possibility with consequences for applied work. We suggest that more attention be given to the homogeneity of the subpopulation when study-specific reliability coefficients are reported in between-group studies.
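
The core trade-off the article describes can be sketched with a simplified classical test theory identity (an illustration consistent with the abstract, not the authors' full latent variable framework):

    \rho_{XX'} = \frac{\sigma_T^2}{\sigma_T^2 + \sigma_E^2}, \qquad
    d_{\text{obs}} = \frac{\mu_1 - \mu_2}{\sqrt{\sigma_T^2 + \sigma_E^2}} .

Holding the raw group difference \mu_1 - \mu_2 and the error variance \sigma_E^2 fixed, sampling from a more homogeneous subpopulation (smaller true-score variance \sigma_T^2) lowers the reliability \rho_{XX'} but increases the observed standardized effect d_obs, and therefore power; this is the negative reliability-power relationship demonstrated in the study.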