
Showing papers in "Psychometrika in 2020"


Journal ArticleDOI
TL;DR: This paper provides a general framework that extends GGM modeling with latent variables, including relationships over time, from time-series data or panel data featuring at least three waves of measurement, and takes the form of a graphical vector-autoregression model between latent variables.
Abstract: Researchers in the field of network psychometrics often focus on the estimation of Gaussian graphical models (GGMs)—an undirected network model of partial correlations—between observed variables of cross-sectional data or single-subject time-series data. This assumes that all variables are measured without measurement error, which may be implausible. In addition, cross-sectional data cannot distinguish between within-subject and between-subject effects. This paper provides a general framework that extends GGM modeling with latent variables, including relationships over time. These relationships can be estimated from time-series data or panel data featuring at least three waves of measurement. The model takes the form of a graphical vector-autoregression model between latent variables and is termed the ts-lvgvar when estimated from time-series data and the panel-lvgvar when estimated from panel data. These methods have been implemented in the software package psychonetrics, which is exemplified in two empirical examples, one using time-series data and one using panel data, and evaluated in two large-scale simulation studies. The paper concludes with a discussion on ergodicity and generalizability. Although within-subject effects may in principle be separated from between-subject effects, the interpretation of these results rests on the intensity and the time interval of measurement and on the plausibility of the assumption of stationarity.
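As a schematic sketch (notation mine, not necessarily the paper's), a lag-1 graphical vector-autoregression between latent variables combines a factor-analytic measurement model with an autoregressive structural model; the temporal network is the matrix of lagged regressions, and the contemporaneous network is read off the precision matrix of the innovations:

```latex
% Measurement model: observed indicators load on latent variables at time t
y_t = \tau + \Lambda \eta_t + \varepsilon_t
% Structural model: lag-1 vector autoregression between latent variables
\eta_t = B\,\eta_{t-1} + \zeta_t, \qquad \zeta_t \sim N(0, \Psi)
% Contemporaneous network: partial correlations among the innovations
\Omega = \Psi^{-1}, \qquad \rho_{jk} = -\,\omega_{jk} \big/ \sqrt{\omega_{jj}\,\omega_{kk}}
```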

94 citations


Journal ArticleDOI
TL;DR: A new Bayesian variable selection algorithm is developed that explicitly enforces generic identifiability conditions and monotonicity of item response functions to ensure valid posterior inference in cognitive diagnostic models.
Abstract: Cognitive diagnostic models (CDMs) are latent variable models developed to infer latent skills, knowledge, or personalities that underlie responses to educational, psychological, and social science tests and measures. Recent research focused on theory and methods for using sparse latent class models (SLCMs) in an exploratory fashion to infer the latent processes and structure underlying responses. We report new theoretical results about sufficient conditions for generic identifiability of SLCM parameters. An important contribution for practice is that our new generic identifiability conditions are more likely to be satisfied in empirical applications than existing conditions that ensure strict identifiability. Learning the underlying latent structure can be formulated as a variable selection problem. We develop a new Bayesian variable selection algorithm that explicitly enforces generic identifiability conditions and monotonicity of item response functions to ensure valid posterior inference. We present Monte Carlo simulation results to support accurate inferences and discuss the implications of our findings for future SLCM research and educational testing.

42 citations


Journal ArticleDOI
TL;DR: This article proposed a robust effect size index based on M-estimators, which is invariant across models and has the potential to make communication and comprehension of effect size uniform across the behavioral sciences.
Abstract: Effect size indices are useful tools in study design and reporting because they are unitless measures of association strength that do not depend on sample size. Existing effect size indices are developed for particular parametric models or population parameters. Here, we propose a robust effect size index based on M-estimators. This approach yields an index that is very generalizable because it is unitless across a wide range of models. We demonstrate that the new index is a function of Cohen's d, [Formula: see text], and standardized log odds ratio when each of the parametric models is correctly specified. We show that existing effect size estimators are biased when the parametric models are incorrect (e.g., under unknown heteroskedasticity). We provide simple formulas to compute power and sample size and use simulations to assess the bias and standard error of the effect size estimator in finite samples. Because the new index is invariant across models, it has the potential to make communication and comprehension of effect size uniform across the behavioral sciences.

26 citations


Journal ArticleDOI
TL;DR: In this article, a multidimensional scaling framework is proposed to extract useful information from response processes, and the proposed method is applied to both simulated data and real process data from 14 PSTRE items in PIAAC 2012.
Abstract: Computer-based interactive items have become prevalent in recent educational assessments. In such items, the detailed human–computer interaction process, known as the response process, is recorded in a log file. The recorded response processes provide great opportunities to understand individuals' problem-solving processes. However, these data are difficult to analyze because they are high-dimensional sequences in a nonstandard format. This paper aims at extracting useful information from response processes. In particular, we consider an exploratory analysis that extracts latent variables from process data through a multidimensional scaling framework. A dissimilarity measure is described to quantify the discrepancy between two response processes. The proposed method is applied to both simulated data and real process data from 14 PSTRE items in PIAAC 2012. A prediction procedure is used to examine the information contained in the extracted latent variables. We find that the extracted latent variables preserve a substantial amount of information in the process and have reasonable interpretability. We also show empirically that process data contain more information than classic binary item responses in terms of out-of-sample prediction of many variables.
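For intuition about the scaling step, here is a minimal sketch of classical (Torgerson) multidimensional scaling applied to a generic dissimilarity matrix; the paper's specific dissimilarity measure for response processes is not reproduced, and the toy matrix below is hypothetical:

```python
import numpy as np

def classical_mds(D, k=2):
    """Embed n objects in k dimensions from an n x n dissimilarity matrix D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n              # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                      # double-centered squared dissimilarities
    eigvals, eigvecs = np.linalg.eigh(B)             # eigendecomposition (ascending order)
    idx = np.argsort(eigvals)[::-1][:k]              # indices of the k largest eigenvalues
    scale = np.sqrt(np.clip(eigvals[idx], 0, None))  # guard against tiny negative eigenvalues
    return eigvecs[:, idx] * scale                   # n x k latent coordinates

# toy usage: dissimilarities among four hypothetical response processes
D = np.array([[0., 1., 4., 3.],
              [1., 0., 3., 4.],
              [4., 3., 0., 1.],
              [3., 4., 1., 0.]])
coords = classical_mds(D, k=2)
```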

24 citations


Journal ArticleDOI
TL;DR: An approach that recursively splits the sample based on covariates in order to detect significant differences in the structure of the covariance or correlation matrix and adapt model-based recursive partitioning and conditional inference tree approaches for finding covariate splits in a recursive manner.
Abstract: In many areas of psychology, correlation-based network approaches (i.e., psychometric networks) have become a popular tool. In this paper, we propose an approach that recursively splits the sample based on covariates in order to detect significant differences in the structure of the covariance or correlation matrix. Psychometric networks or other correlation-based models (e.g., factor models) can be subsequently estimated from the resultant splits. We adapt model-based recursive partitioning and conditional inference tree approaches for finding covariate splits in a recursive manner. The empirical power of these approaches is studied in several simulation conditions. Examples are given using real-life data from personality and clinical research.

20 citations


Journal ArticleDOI
TL;DR: An exact approach is proposed for power and sample size calculations in ANCOVA with random assignment and multinormal covariates and the improved solution is illustrated with an example regarding the comparative effectiveness of interventions.
Abstract: The analysis of covariance (ANCOVA) has notably proven to be an effective tool in a broad range of scientific applications. Despite the well-documented literature about its principal uses and statistical properties, the corresponding power analysis for the general linear hypothesis tests of treatment differences remains a less discussed issue. The frequently recommended procedure is a direct application of the ANOVA formula in combination with reduced degrees of freedom and a correlation-adjusted variance. This article aims to explicate the conceptual problems and practical limitations of the common method. An exact approach is proposed for power and sample size calculations in ANCOVA with random assignment and multinormal covariates. Both theoretical examination and numerical simulation are presented to justify the advantages of the suggested technique over the current formula. The improved solution is illustrated with an example regarding the comparative effectiveness of interventions. In order to facilitate the application of the described power and sample size calculations, accompanying computer programs are also presented.
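For concreteness, here is a sketch of the frequently recommended approximate procedure that the article critiques (the ANOVA noncentral-F formula with the error variance scaled by 1 - rho^2 and the error degrees of freedom reduced by the number of covariates); the paper's exact approach is not reproduced, and all numbers are hypothetical:

```python
import numpy as np
from scipy.stats import f, ncf

def ancova_power_approx(group_means, n_per_group, sigma2, rho2, n_cov, alpha=0.05):
    """Approximate ANCOVA power via the ANOVA formula with a
    correlation-adjusted error variance and reduced error df."""
    g = len(group_means)
    N = g * n_per_group
    grand = np.mean(group_means)
    # noncentrality with the error variance multiplied by (1 - rho2)
    ncp = n_per_group * np.sum((np.array(group_means) - grand) ** 2) / (sigma2 * (1 - rho2))
    df1, df2 = g - 1, N - g - n_cov
    crit = f.ppf(1 - alpha, df1, df2)
    return 1 - ncf.cdf(crit, df1, df2, ncp)

# hypothetical example: three groups of 30, one covariate with rho^2 = 0.25
power = ancova_power_approx([0.0, 0.3, 0.6], n_per_group=30, sigma2=1.0, rho2=0.25, n_cov=1)
```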

19 citations


Journal ArticleDOI
TL;DR: Simulations show that the new method is computationally efficient and can outperform previously proposed Bayesian Markov chain Monte-Carlo algorithms in terms of Q matrix recovery, and item and structural parameter estimation.
Abstract: In diagnostic classification models (DCMs), the Q matrix encodes which attributes are required for each item. The Q matrix is usually predetermined by the researcher but may in practice be misspecified, which yields incorrect statistical inference. Instead of using a predetermined Q matrix, it is possible to estimate it simultaneously with the item and structural parameters of the DCM. Unfortunately, current methods are computationally intensive when there are many attributes and items. In addition, the identification constraints necessary for DCMs are not always enforced in the estimation algorithms, which can lead to non-identified models being considered. We address these problems by simultaneously estimating the item, structural and Q matrix parameters of the Deterministic Input Noisy "And" gate model using a constrained Metropolis–Hastings Robbins–Monro algorithm. Simulations show that the new method is computationally efficient and can outperform previously proposed Bayesian Markov chain Monte-Carlo algorithms in terms of Q matrix recovery, and item and structural parameter estimation. We also illustrate our approach using Tatsuoka's fraction–subtraction data and Certificate of Proficiency in English data.
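For reference, the item response function of the DINA (deterministic input, noisy "and" gate) model in standard notation, with slipping parameter s_j, guessing parameter g_j, attribute profile alpha_i, and Q matrix entries q_jk:

```latex
\eta_{ij} = \prod_{k=1}^{K} \alpha_{ik}^{\,q_{jk}},
\qquad
P(X_{ij} = 1 \mid \boldsymbol{\alpha}_i) = (1 - s_j)^{\eta_{ij}}\, g_j^{\,1 - \eta_{ij}}
```

The ideal response eta_ij equals 1 only when respondent i possesses every attribute that row j of the Q matrix requires, which is why a misspecified Q matrix distorts the item parameters.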

15 citations


Journal ArticleDOI
TL;DR: In this paper, two sets of combination rules for the standardized regression coefficients and their confidence intervals are proposed, and their statistical properties are discussed, and two improved point estimators of [Formula: see text] in multiply imputed data are proposed.
Abstract: Whenever statistical analyses are applied to multiply imputed datasets, specific formulas are needed to combine the results into one overall analysis, also called combination rules. In the context of regression analysis, combination rules for the unstandardized regression coefficients, the t-tests of the regression coefficients, and the F-tests for testing R² for significance have long been established. However, there is still no general agreement on how to combine the point estimators of R² in multiple regression applied to multiply imputed datasets. Additionally, no combination rules for standardized regression coefficients and their confidence intervals seem to have been developed at all. In the current article, two sets of combination rules for the standardized regression coefficients and their confidence intervals are proposed, and their statistical properties are discussed. Additionally, two improved point estimators of R² in multiply imputed data are proposed, which in their computation use the pooled standardized regression coefficients. Simulations show that the proposed pooled standardized coefficients produce only small bias and that their 95% confidence intervals produce coverage close to the theoretical 95%. Furthermore, the simulations show that the newly proposed pooled estimates for R² are less biased than two earlier proposed pooled estimates.
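As background for the established rules mentioned above, here is a minimal sketch of Rubin's rules for pooling a scalar estimate across m imputed datasets (the classical recipe, not the paper's new rules for standardized coefficients and R²); the numbers in the usage example are hypothetical:

```python
import numpy as np

def rubin_pool(estimates, variances):
    """Pool point estimates and their within-imputation variances
    from m multiply imputed datasets (Rubin's rules)."""
    estimates, variances = np.asarray(estimates), np.asarray(variances)
    m = len(estimates)
    qbar = estimates.mean()               # pooled point estimate
    w = variances.mean()                  # average within-imputation variance
    b = estimates.var(ddof=1)             # between-imputation variance
    t = w + (1 + 1 / m) * b               # total variance of the pooled estimate
    return qbar, t

# hypothetical regression coefficient estimated in m = 5 imputed datasets
estimate, total_var = rubin_pool([0.42, 0.39, 0.45, 0.41, 0.44],
                                 [0.010, 0.012, 0.011, 0.009, 0.010])
```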

15 citations


Journal ArticleDOI
TL;DR: Extensions to ERGMs are introduced to address limitations: Conway-Maxwell-Binomial distribution to model the marginal dependence among multiple layers; a "layer logic" language to translate familiar ERGM effects to substantively meaningful interactions of observed layers; and nondegenerate triadic and degree effects.
Abstract: Multi-layer networks arise when more than one type of relation is observed on a common set of actors. Modeling such networks within the exponential-family random graph (ERG) framework has been previously limited to special cases and, in particular, to dependence arising from just two layers. Extensions to ERGMs are introduced to address these limitations: Conway-Maxwell-Binomial distribution to model the marginal dependence among multiple layers; a "layer logic" language to translate familiar ERGM effects to substantively meaningful interactions of observed layers; and nondegenerate triadic and degree effects. The developments are demonstrated on two previously published datasets.
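For readers unfamiliar with the framework, the generic exponential-family random graph form is shown below; the extensions described in the paper enter through additional layer-specific and cross-layer terms in the statistic vector g:

```latex
P_{\theta}(Y = y) = \frac{\exp\{\theta^{\top} g(y)\}}{\kappa(\theta)},
\qquad
\kappa(\theta) = \sum_{y' \in \mathcal{Y}} \exp\{\theta^{\top} g(y')\}
```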

14 citations


Journal ArticleDOI
TL;DR: A maximum likelihood estimation routine for two-level structural equation models with random slopes for latent covariates is presented, relying upon a method proposed by du Toit and Cudeck for reformulating the likelihood function so that an often large subset of the random effects can be integrated analytically, reducing the computational burden of high-dimensional numerical integration.
Abstract: A maximum likelihood estimation routine for two-level structural equation models with random slopes for latent covariates is presented. Because the likelihood function does not typically have a closed-form solution, numerical integration over the random effects is required. The routine relies upon a method proposed by du Toit and Cudeck (Psychometrika 74(1):65–82, 2009) for reformulating the likelihood function so that an often large subset of the random effects can be integrated analytically, reducing the computational burden of high-dimensional numerical integration. The method is demonstrated and assessed using a small-scale simulation study and an empirical example.

12 citations


Journal ArticleDOI
TL;DR: A joint Bayesian additive mixed modeling framework that simultaneously assesses brain activation and connectivity patterns from multiple subjects is proposed and applied to a multi-subject fMRI dataset from a balloon-analog risk-taking experiment, showing the effectiveness of the model in providing interpretable joint inference on voxel-level activations and inter-regional connectivity associated with how the brain processes risk.
Abstract: Brain activation and connectivity analyses in task-based functional magnetic resonance imaging (fMRI) experiments with multiple subjects are currently at the forefront of data-driven neuroscience. In such experiments, interest often lies in understanding activation of brain voxels due to external stimuli and strong association or connectivity between the measurements on a set of pre-specified groups of brain voxels, also known as regions of interest (ROI). This article proposes a joint Bayesian additive mixed modeling framework that simultaneously assesses brain activation and connectivity patterns from multiple subjects. In particular, fMRI measurements from each individual obtained in the form of a multi-dimensional array/tensor at each time are regressed on functions of the stimuli. We impose a low-rank parallel factorization decomposition on the tensor regression coefficients corresponding to the stimuli to achieve parsimony. Multiway stick-breaking shrinkage priors are employed to infer activation patterns and associated uncertainties in each voxel. Further, the model introduces region-specific random effects which are jointly modeled with a Bayesian Gaussian graphical prior to account for the connectivity among pairs of ROIs. Empirical investigations under various simulation studies demonstrate the effectiveness of the method as a tool to simultaneously assess brain activation and connectivity. The method is then applied to a multi-subject fMRI dataset from a balloon-analog risk-taking experiment, showing the effectiveness of the model in providing interpretable joint inference on voxel-level activations and inter-regional connectivity associated with how the brain processes risk. The proposed method is also validated through simulation studies and comparisons to other methods used within the neuroscience community.

Journal ArticleDOI
TL;DR: In this article, a novel mixture formulation of the saturated diagnostic classification model (DCM) is proposed that admits conditionally conjugate priors for the model parameters, enabling a scalable variational Bayes (VB) inference algorithm.
Abstract: Saturated diagnostic classification models (DCM) can flexibly accommodate various relationships among attributes to diagnose individual attribute mastery, and include various important DCMs as sub-models. However, the existing formulations of the saturated DCM are not well suited for deriving conditionally conjugate priors of model parameters. Because such a derivation is key to developing a variational Bayes (VB) inference algorithm, in the present study we proposed a novel mixture formulation of the saturated DCM. Based on it, we developed a VB inference algorithm for the saturated DCM that enables us to perform scalable and computationally efficient Bayesian estimation. The simulation study indicated that the proposed algorithm could recover the parameters in various conditions. It has also been demonstrated that the proposed approach is particularly suited to the case when new data become sequentially available over time, such as in computerized diagnostic testing. In addition, a real educational dataset was comparatively analyzed with the proposed VB and Markov chain Monte Carlo (MCMC) algorithms. The result demonstrated that very similar estimates were obtained between the two methods and that the proposed VB inference was much faster than MCMC. The proposed method can be a practical solution to the problem of computational load.
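As background, variational Bayes maximizes the evidence lower bound (ELBO) over a tractable family of approximate posteriors q(z); conditionally conjugate priors are what make the coordinate-ascent updates available in closed form (generic form shown, not the paper's specific updates):

```latex
\log p(x) \;\ge\; \mathrm{ELBO}(q) \;=\;
\mathbb{E}_{q(z)}\!\left[\log p(x, z)\right] \;-\; \mathbb{E}_{q(z)}\!\left[\log q(z)\right]
```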

Journal ArticleDOI
TL;DR: A way to combine the social relations model with the structural equation modeling (SEM) framework is introduced and it is shown how the parameters of the combination can be estimated with a maximum likelihood (ML) approach.
Abstract: The social relations model (SRM) is widely used in psychology to investigate the components that underlie interpersonal perceptions, behaviors, and judgments. SRM researchers are often interested in investigating the multivariate relations between SRM effects. However, at present, it is not possible to investigate such relations without relying on a two-step approach that depends on potentially unreliable estimates of the true SRM effects. Here, we introduce a way to combine the SRM with the structural equation modeling (SEM) framework and show how the parameters of our combination can be estimated with a maximum likelihood (ML) approach. We illustrate the model with an example from personality psychology. We also investigate the statistical properties of the model in a small simulation study showing that our approach performs well in most simulation conditions. An R package (called srm) is available implementing the proposed methods.
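For reference, the round-robin decomposition at the core of the social relations model, in standard notation: the measurement of actor i with respect to partner j splits into a group mean, an actor effect, a partner effect, and a relationship (dyad-specific) effect:

```latex
y_{ij} = \mu + a_i + b_j + g_{ij}
```

The SEM combination described in the paper then specifies structural relations among such actor, partner, and relationship components across multiple variables.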

Journal ArticleDOI
TL;DR: A new model for social influence is introduced, the latent space model for influence, which employs latent space positions so that individuals are affected most by those who are “closest” to them in the latentspace.
Abstract: Social network data represent interactions and relationships among groups of individuals. One aspect of social interaction is social influence, the idea that beliefs or behaviors change as a result of one's social network. The purpose of this article is to introduce a new model for social influence, the latent space model for influence, which employs latent space positions so that individuals are affected most by those who are "closest" to them in the latent space. We describe this model along with some of the contexts in which it can be used and explore the operating characteristics using a series of simulation studies. We conclude with an example of teacher advice-seeking networks to show that changes in beliefs about teaching mathematics may be attributed to network influence.

Journal ArticleDOI
TL;DR: A novel estimator that uses matrix calculus to derive the analytic derivatives of the PIV estimator and is extended to apply to any mixture of binary, ordinal, and continuous variables, and enables a general parameterization that permits the estimation of means, variances, and covariances of the underlying variables to use as input into a SEM analysis with PIV.
Abstract: Methodological development of the model-implied instrumental variable (MIIV) estimation framework has proved fruitful over the last three decades. Major milestones include Bollen’s (Psychometrika 61(1):109–121, 1996) original development of the MIIV estimator and its robustness properties for continuous endogenous variable SEMs, the extension of the MIIV estimator to ordered categorical endogenous variables (Bollen and Maydeu-Olivares in Psychometrika 72(3):309, 2007), and the introduction of a generalized method of moments estimator (Bollen et al., in Psychometrika 79(1):20–50, 2014). This paper furthers these developments by making several unique contributions not present in the prior literature: (1) we use matrix calculus to derive the analytic derivatives of the PIV estimator, (2) we extend the PIV estimator to apply to any mixture of binary, ordinal, and continuous variables, (3) we generalize the PIV model to include intercepts and means, (4) we devise a method to input known threshold values for ordinal observed variables, and (5) we enable a general parameterization that permits the estimation of means, variances, and covariances of the underlying variables to use as input into a SEM analysis with PIV. An empirical example illustrates a mixture of continuous variables and ordinal variables with fixed thresholds. We also include a simulation study to compare the performance of this novel estimator to WLSMV.

Journal ArticleDOI
TL;DR: The dynamic IRTree model was illustrated using an experimental study that employed the visual-world eye-tracking technique and showed that parameter recovery of the model was satisfactory and that ignoring trend and autoregressive effects resulted in biased estimates of experimental condition effects in the same conditions found in the empirical study.
Abstract: This paper presents a dynamic tree-based item response (IRTree) model as a novel extension of the autoregressive generalized linear mixed effect model (dynamic GLMM). We illustrate the unique utility of the dynamic IRTree model in its capability of modeling differentiated processes indicated by intensive polytomous time-series eye-tracking data. The dynamic IRTree was inspired by but is distinct from the dynamic GLMM which was previously presented by Cho, Brown-Schmidt, and Lee (Psychometrika 83(3):751-771, 2018). Unlike the dynamic IRTree, the dynamic GLMM is suitable for modeling intensive binary time-series eye-tracking data to identify visual attention to a single interest area over all other possible fixation locations. The dynamic IRTree model is a general modeling framework which can be used to model change processes (trend and autocorrelation) and which allows for decomposing data into various sources of heterogeneity. The dynamic IRTree model was illustrated using an experimental study that employed the visual-world eye-tracking technique. The results of a simulation study showed that parameter recovery of the model was satisfactory and that ignoring trend and autoregressive effects resulted in biased estimates of experimental condition effects in the same conditions found in the empirical study.

Journal ArticleDOI
TL;DR: A new item response theory model is developed to account for situations in which respondents overreport or underreport their actual opinions on a positive or negative issue by incorporating a deception term into a multidimensional rating scale model.
Abstract: In this study, a new item response theory model is developed to account for situations in which respondents overreport or underreport their actual opinions on a positive or negative issue. Such behavior is supposed to be a result of deception and transfer mechanisms. In the proposed model, this behavior is simulated by incorporating a deception term into a multidimensional rating scale model, followed by multiplication by a transfer term, with the two operations performed by an indicator function and a transition matrix, respectively. The proposed model is presented in a Bayesian framework approximated by Markov chain Monte Carlo algorithms. Through a series of simulations, the parameters of the proposed model are recovered accurately. The methodology is also implemented within an online experimental study to demonstrate its application.

Journal ArticleDOI
TL;DR: In this paper, a continuous-time dynamic choice model is proposed for measuring problem-solving competency in simulated interactive tasks, which requires students to uncover some of the information needed to solve the problem through interactions with a computer-simulated environment.
Abstract: Problem solving has been recognized as a central skill that today's students need to thrive and shape their world. As a result, the measurement of problem-solving competency has received much attention in education in recent years. A popular tool for the measurement of problem solving is simulated interactive tasks, which require students to uncover some of the information needed to solve the problem through interactions with a computer-simulated environment. A computer log file records a student's problem-solving process in detail, including his/her actions and the time stamps of these actions. It thus provides rich information for the measurement of students' problem-solving competency. On the other hand, extracting useful information from log files is a challenging task, due to their complex data structure. In this paper, we show how log file process data can be viewed as a marked point process, based on which we propose a continuous-time dynamic choice model. The proposed model can serve as a measurement model for scaling students along the latent traits of problem-solving competency and action speed, based on data from one or multiple tasks. A real data example is given based on data from the Program for International Student Assessment 2012.

Journal ArticleDOI
TL;DR: Structural equation modeling (SEM) is a statistical analytic framework that allows researchers to specify and test models with observed and latent (or unobservable) variables and their generally linear relationships.
Abstract: Structural equation modeling (SEM) is a statistical analytic framework that allows researchers to specify and test models with observed and latent (or unobservable) variables and their generally linear relationships. In the past decades, SEM has become a standard statistical analysis technique in behavioral, educational, psychological, and social science researchers' repertoire. From a technical perspective, SEM was developed as a mixture of two statistical fields—path analysis and data reduction. Path analysis is used to specify and examine directional relationships between observed variables, whereas data reduction is applied to uncover (unobserved) low-dimensional representations of observed variables, which are referred to as latent variables. Since two different data reduction techniques (i.e., factor analysis and principal component analysis) were available to the statistical community, SEM also evolved into two domains—factor-based and component-based (e.g., Jöreskog and Wold 1982). In factor-based SEM, which the psychometric or psychological measurement tradition has strongly influenced, a (common) factor represents a latent variable under the assumption that each latent variable exists as an entity independent of observed variables, but also serves as the sole source of the associations between the observed variables. Conversely, in component-based SEM, which is more in line with traditional multivariate statistics, a weighted composite or a component of observed variables represents a latent variable under the assumption that the latter is an aggregation (or a direct consequence) of observed variables.

Journal ArticleDOI
TL;DR: In this paper, the authors revisited the singular value decomposition (SVD) algorithm given in Chen et al. (2019b) for exploratory item factor analysis (IFA).
Abstract: We revisit a singular value decomposition (SVD) algorithm given in Chen et al. (Psychometrika 84:124–146, 2019b) for exploratory item factor analysis (IFA). This algorithm estimates a multidimensional IFA model by SVD and was used to obtain a starting point for joint maximum likelihood estimation in Chen et al. (2019b). Thanks to the analytic and computational properties of SVD, this algorithm guarantees a unique solution and has computational advantage over other exploratory IFA methods. Its computational advantage becomes significant when the numbers of respondents, items, and factors are all large. This algorithm can be viewed as a generalization of principal component analysis to binary data. In this note, we provide the statistical underpinning of the algorithm. In particular, we show its statistical consistency under the same double asymptotic setting as in Chen et al. (2019b). We also demonstrate how this algorithm provides a scree plot for investigating the number of factors and provide its asymptotic theory. Further extensions of the algorithm are discussed. Finally, simulation studies suggest that the algorithm has good finite sample performance.
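A minimal sketch of the general idea of treating SVD as a generalization of principal component analysis to a binary response matrix, shown only as a simplified illustration and not as the exact algorithm of Chen et al. (2019b):

```python
import numpy as np

def svd_start(X, k):
    """Crude low-rank summary of a 0/1 response matrix: SVD of the
    column-centered data, keeping the top-k components."""
    Xc = X - X.mean(axis=0)                       # center each item
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :k] * s[:k]                     # person-side components ("factor scores")
    loadings = Vt[:k, :].T                        # item-side components ("loadings")
    return scores, loadings, s                    # singular values s can feed a scree plot

# toy usage: random binary matrix with 500 respondents and 20 items (hypothetical)
rng = np.random.default_rng(0)
X = (rng.random((500, 20)) < 0.5).astype(float)
scores, loadings, singular_values = svd_start(X, k=2)
```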

Journal ArticleDOI
TL;DR: This study shows how the gamma coefficient is always larger in absolute value than Kendall's rank correlation; this discrepancy lessens when the number of categories increases or, given the same number of Categories, when using equally probable categories.
Abstract: We consider a bivariate normal distribution with linear correlation ρ whose random components are discretized according to two assigned sets of thresholds. On the resulting bivariate ordinal random variable, one can compute Goodman and Kruskal's gamma coefficient, γ, which is a common measure of ordinal association. Given the known analytical monotonic relationship between Pearson's ρ and Kendall's rank correlation τ for the bivariate normal distribution, and since in the continuous case Kendall's τ coincides with Goodman and Kruskal's γ, the change of this association measure before and after discretization is worth studying. We consider several experimental settings obtained by varying the two sets of thresholds, or, equivalently, the marginal distributions of the final ordinal variables. This study, confirming previous findings, shows how the gamma coefficient is always larger in absolute value than Kendall's rank correlation; this discrepancy lessens when the number of categories increases or, given the same number of categories, when using equally probable categories. Based on these results, a proposal is suggested to build a bivariate ordinal variable with assigned margins and Goodman and Kruskal's γ by ordinalizing a bivariate normal distribution. Illustrative examples employing artificial and real data are provided.
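Two facts referenced above, written out: for the bivariate normal distribution Kendall's τ is a monotone function of Pearson's ρ, and both γ and the tau-a-type τ are built from the probabilities of concordant (C) and discordant (D) pairs. Because γ conditions on untied pairs while discretization makes P(C) + P(D) < 1, |γ| ≥ |τ_a|:

```latex
\tau = \frac{2}{\pi}\,\arcsin(\rho),
\qquad
\gamma = \frac{P(C) - P(D)}{P(C) + P(D)},
\qquad
\tau_a = P(C) - P(D)
```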

Journal ArticleDOI
TL;DR: A partially confirmatory approach to estimate both structures using Bayesian regression and a covariance Lasso within a unified framework is proposed and found to be flexible in addressing loading selection and local dependence.
Abstract: For test development in the setting of multidimensional item response theory, the exploratory and confirmatory approaches lie on two ends of a continuum in terms of the loading and residual structures. Inspired by the recent development of the Bayesian Lasso (least absolute shrinkage and selection operator), this research proposes a partially confirmatory approach to estimate both structures using Bayesian regression and a covariance Lasso within a unified framework. The Bayesian hierarchical formulation is implemented using Markov chain Monte Carlo estimation, and the shrinkage parameters are estimated simultaneously. The proposed approach with different model variants and constraints was found to be flexible in addressing loading selection and local dependence. Both simulated and real-life data were analyzed to evaluate the performance of the proposed model across different situations.

Journal ArticleDOI
TL;DR: This work relax the diagonality assumption of residual covariance matrix and estimate it using a formal Bayesian Lasso method, which improves goodness of fit and avoids ad hoc one-at-a-time manipulation of entries in the covariance Matrix via modification indexes.
Abstract: Ansari et al. (Psychometrika 67:49–77, 2002) applied a multilevel heterogeneous model for confirmatory factor analysis to repeated measurements on individuals. While the mean and factor loadings in this model vary across individuals, its factor structure is invariant. Allowing the individual-level residuals to be correlated is an important means to alleviate the restriction imposed by configural invariance. We relax the diagonality assumption of residual covariance matrix and estimate it using a formal Bayesian Lasso method. The approach improves goodness of fit and avoids ad hoc one-at-a-time manipulation of entries in the covariance matrix via modification indexes. We illustrate the approach using simulation studies and real data from an ecological momentary assessment.

Journal ArticleDOI
TL;DR: The authors proposed a dyadic Item Response Theory (dIRT) model for measuring interactions of pairs of individuals when the responses to items represent the actions (or behaviors, perceptions, etc.) of each individual (actor) made within the context of a dyad formed with another individual (partner).
Abstract: We propose a dyadic Item Response Theory (dIRT) model for measuring interactions of pairs of individuals when the responses to items represent the actions (or behaviors, perceptions, etc.) of each individual (actor) made within the context of a dyad formed with another individual (partner). Examples of its use include the assessment of collaborative problem solving or the evaluation of intra-team dynamics. The dIRT model generalizes both Item Response Theory models for measurement and the Social Relations Model for dyadic data. The responses of an actor when paired with a partner are modeled as a function of not only the actor's inclination to act and the partner's tendency to elicit that action, but also the unique relationship of the pair, represented by two directional, possibly correlated, interaction latent variables. Generalizations are discussed, such as accommodating triads or larger groups. Estimation is performed using Markov-chain Monte Carlo implemented in Stan, making it straightforward to extend the dIRT model in various ways. Specifically, we show how the basic dIRT model can be extended to accommodate latent regressions, multilevel settings with cluster-level random effects, as well as joint modeling of dyadic data and a distal outcome. A simulation study demonstrates that estimation performs well. We apply our proposed approach to speed-dating data and find new evidence of pairwise interactions between participants, describing a mutual attraction that is inadequately characterized by individual properties alone.

Journal ArticleDOI
TL;DR: A review of Gana and Broc's practical guide to structural equation modeling (SEM) with the R package lavaan, written as a didactic introduction for readers with a limited statistical background.
Abstract: This book is written to be a practical guide to both structural equation modeling (SEM) and to using the R package lavaan (Rosseel 2012) to apply SEM. As one can read in Gana and Broc’s (2019) introduction, the book is meant to be “a didactic book presenting the basics of a technique for beginners who wish to gradually learn structural equation modeling and make use of its flexibility, opportunities, and upgrades and extenxions [sic]” (p. xi). The authors stress that they have put themselves in the shoes of the user with a limited statistical background, who should “easily find in it both a technical introduction and a practical introduction, oriented towards the use of SEM” (p. xii). Open-source software was chosen deliberately on Gana’s part, as the preface mentions his experience “teaching SEM in African universities where it was impossible to have SEM commercial software” (p. ix).

Journal ArticleDOI
TL;DR: This paper estimated a discrete two-tier item response theory model, which allowed for identifying homogeneous student profiles with regard to their ability to solve CPS tasks while taking into account the multidimensionality of the data and the explanatory effect of individual characteristics.
Abstract: Complex problem solving (CPS) is an up-and-coming twenty-first century skill that requires test-takers to solve dynamically changing problems, often assessed using computer-based tests. The log data that users produce when interacting with a computer-based test provide valuable information about each individual behavioral action they undertake, but such data are rather difficult to handle from a statistical point of view. This paper addresses this issue by building upon recent research focused on decoding log data and aims to identify homogeneous student profiles with regard to their ability to solve CPS tasks. Therefore, we estimated a discrete two-tier item response theory model, which allowed us to profile units (i.e., students) while taking into account the multidimensionality of the data and the explanatory effect of individual characteristics. The results indicate that: (1) CPS can be thought of as a three-dimensional latent variable; (2) there are ten latent classes of students with homogeneous profiles regarding the CPS dimensions; (3) students in the higher latent classes generally demonstrate higher cognitive and non-cognitive performances; (4) some of the latent classes seem to profit from learning-by-doing within tasks, whereas others seem to exhibit the reverse behavior; (5) cognitive and non-cognitive skills, as well as gender and to some extent age, contribute to distinguishing among the latent classes.

Journal ArticleDOI
TL;DR: A new model is proposed with restrictions inspired by this new literature to help with the identification issue of the four-parameter item response theory model, developed by placing a hierarchical structure on the DINA model and imposing equality constraints on a priori unknown dyads of items.
Abstract: Recently, there has been a renewed interest in the four-parameter item response theory model as a way to capture guessing and slipping behaviors in responses. Research has shown, however, that the nested three-parameter model suffers from issues of unidentifiability (San Martin et al. in Psychometrika 80:450–467, 2015), which places concern on the identifiability of the four-parameter model. Borrowing from recent advances in the identification of cognitive diagnostic models, in particular, the DINA model (Gu and Xu in Stat Sin https://doi.org/10.5705/ss.202018.0420 , 2019), a new model is proposed with restrictions inspired by this new literature to help with the identification issue. Specifically, we show conditions under which the four-parameter model is strictly and generically identified. These conditions inform the presentation of a new exploratory model, which we call the dyad four-parameter normal ogive (Dyad-4PNO) model. This model is developed by placing a hierarchical structure on the DINA model and imposing equality constraints on a priori unknown dyads of items. We present a Bayesian formulation of this model, and show that model parameters can be accurately recovered. Finally, we apply the model to a real dataset.
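For reference, the four-parameter model in normal-ogive form, in standard (but not necessarily the paper's) notation, with lower asymptote (guessing) γ_j and upper asymptote 1 − ψ_j (slipping):

```latex
P(X_{ij} = 1 \mid \theta_i) = \gamma_j + (1 - \gamma_j - \psi_j)\,\Phi(a_j \theta_i - b_j)
```

The Dyad-4PNO restrictions described above act on these item parameters through a hierarchical DINA-type structure and equality constraints across dyads of items.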

Journal ArticleDOI
TL;DR: It is shown that, under mild conditions, factor uniqueness is preserved even if the specific factors are assumed to be within-variable, or within-occasion, correlated and the model is modified to become scale invariant.
Abstract: Factor analysis is a well-known method for describing the covariance structure among a set of manifest variables through a limited number of unobserved factors. When the observed variables are collected at various occasions on the same statistical units, the data have a three-way structure and standard factor analysis may fail. To overcome these limitations, three-way models, such as the Parafac model, can be adopted. It is often seen as an extension of principal component analysis able to discover unique latent components. The structural version, i.e., as a reparameterization of the covariance matrix, has been also formulated but rarely investigated. In this article, such a formulation is studied by discussing under what conditions factor uniqueness is preserved. It is shown that, under mild conditions, such a property holds even if the specific factors are assumed to be within-variable, or within-occasion, correlated and the model is modified to become scale invariant.
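For reference, the Parafac (CANDECOMP/PARAFAC) decomposition of a three-way array, written element-wise in standard notation; the structural version studied in the article reparameterizes the covariance structure in terms of these component matrices:

```latex
x_{ijk} = \sum_{r=1}^{R} a_{ir}\, b_{jr}\, c_{kr} + e_{ijk}
```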


Journal ArticleDOI
TL;DR: Results show that the proposed probabilistic framework, called polytomous local independence model, can be successfully applied in practice, paving the way to a number of applications of KST outside the area of knowledge and learning assessment.
Abstract: A probabilistic framework for the polytomous extension of knowledge space theory (KST) is proposed. It consists in a probabilistic model, called polytomous local independence model, that is developed as a generalization of the basic local independence model. The algorithms for computing "maximum likelihood" (ML) and "minimum discrepancy" (MD) estimates of the model parameters have been derived and tested in a simulation study. Results show that the algorithms differ in their capability of recovering the true parameter values. The ML algorithm correctly recovers the true values, regardless of the manipulated variables. This is not totally true for the MD algorithm. Finally, the model has been applied to a real polytomous data set collected in the area of psychological assessment. Results show that it can be successfully applied in practice, paving the way to a number of applications of KST outside the area of knowledge and learning assessment.