
Showing papers in "Psychometrika in 2011"


Journal ArticleDOI
TL;DR: The OpenMx data structures are introduced; these novel structures define the user interface framework and provide new opportunities for model specification.
Abstract: OpenMx is free, full-featured, open source, structural equation modeling (SEM) software. OpenMx runs within the R statistical programming environment on Windows, Mac OS X, and Linux computers. The rationale for developing OpenMx is discussed along with the philosophy behind the user interface. The OpenMx data structures are introduced; these novel structures define the user interface framework and provide new opportunities for model specification. Two short example scripts for the specification and fitting of a confirmatory factor model are presented next. We end with an abbreviated list of modeling applications available in OpenMx 1.0 and a discussion of directions for future development.

1,045 citations
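
The two example scripts mentioned in the abstract are not reproduced here, but a minimal one-factor specification in OpenMx's path-style R interface looks like the following sketch, adapted from the standard OpenMx demo and its bundled demoOneFactor data:

```r
library(OpenMx)

data(demoOneFactor)                 # 500 observations on five indicators
manifests <- names(demoOneFactor)
latents   <- "G"

factorModel <- mxModel(
  "OneFactor", type = "RAM",
  manifestVars = manifests,
  latentVars   = latents,
  mxPath(from = latents, to = manifests),        # free factor loadings
  mxPath(from = manifests, arrows = 2),          # residual variances
  mxPath(from = latents, arrows = 2,
         free = FALSE, values = 1.0),            # fix factor variance at 1
  mxData(cov(demoOneFactor), type = "cov", numObs = 500)
)

summary(mxRun(factorModel))         # fit the model and print estimates
```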


Journal ArticleDOI
TL;DR: The G-DINA (generalized deterministic inputs, noisy “and” gate) model is a generalization of the DINA model with more relaxed assumptions and is equivalent to other general models for cognitive diagnosis based on alternative link functions.
Abstract: The G-DINA (generalized deterministic inputs, noisy “and” gate) model is a generalization of the DINA model with more relaxed assumptions. In its saturated form, the G-DINA model is equivalent to other general models for cognitive diagnosis based on alternative link functions. When appropriate constraints are applied, several commonly used cognitive diagnosis models (CDMs) can be shown to be special cases of the general models. In addition to model formulation, the G-DINA model as a general CDM framework includes a component for item-by-item model estimation based on design and weight matrices, and a component for item-by-item model comparison based on the Wald test. The paper illustrates the estimation and application of the G-DINA model as a framework using real and simulated data. It concludes by discussing several potential implications of and relevant issues concerning the proposed framework.

531 citations
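
To make the "special cases" point concrete, here is a minimal R sketch of the DINA model, the conjunctive special case that the G-DINA framework generalizes (toy Q-matrix and parameter values, not the paper's estimation code):

```r
# DINA model: P(X_j = 1 | alpha) = g_j^(1 - eta) * (1 - s_j)^eta,
# where eta = 1 iff the examinee masters every attribute required by item j.
dina_prob <- function(alpha, q, guess, slip) {
  eta <- as.integer(all(alpha[q == 1] == 1))  # conjunctive ("and") gate
  guess^(1 - eta) * (1 - slip)^eta
}

Q     <- rbind(c(1, 0), c(0, 1), c(1, 1))     # 3 items, 2 attributes
alpha <- c(1, 0)                              # masters attribute 1 only
sapply(1:nrow(Q), function(j) dina_prob(alpha, Q[j, ], guess = .2, slip = .1))
# item 1: 0.9 (attributes mastered); items 2 and 3: 0.2 (guessing)
```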


Journal ArticleDOI
TL;DR: Regularized generalized canonical correlation analysis (RGCCA) combines the power of multi-block data analysis methods (maximization of well-identified criteria) and the flexibility of PLS path modeling (the researcher decides which blocks are connected and which are not).
Abstract: Regularized generalized canonical correlation analysis (RGCCA) is a generalization of regularized canonical correlation analysis to three or more sets of variables. It constitutes a general framework for many multi-block data analysis methods. It combines the power of multi-block data analysis methods (maximization of well-identified criteria) and the flexibility of PLS path modeling (the researcher decides which blocks are connected and which are not). By searching for a fixed point of the stationary equations related to RGCCA, a new monotonically convergent algorithm, very similar to the PLS algorithm proposed by Herman Wold, is obtained. Finally, a practical example is discussed.

290 citations
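
For intuition about the algorithm, here is a minimal R sketch of the two-block special case with full regularization (tau = 1), where the criterion reduces to maximizing cov(Xa, Yb) and the alternating update becomes a power iteration on X′Y; the general multi-block algorithm follows the same alternate-and-normalize pattern:

```r
# Two-block special case: maximize cov(X a, Y b) subject to ||a|| = ||b|| = 1.
set.seed(1)
n <- 100
X <- scale(matrix(rnorm(n * 4), n, 4))
Y <- scale(matrix(rnorm(n * 3), n, 3))
Y[, 1] <- Y[, 1] + X[, 1]                 # induce some shared structure

a <- rnorm(ncol(X)); a <- a / sqrt(sum(a^2))
b <- rnorm(ncol(Y)); b <- b / sqrt(sum(b^2))
for (iter in 1:100) {                     # criterion is non-decreasing per update
  a <- crossprod(X, Y %*% b); a <- a / sqrt(sum(a^2))
  b <- crossprod(Y, X %*% a); b <- b / sqrt(sum(b^2))
}
cov(X %*% a, Y %*% b)                     # criterion value at convergence
```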


Journal ArticleDOI
TL;DR: An exploratory form of bi-factor analysis is introduced; bi-factor analysis is a form of confirmatory factor analysis originally introduced by Holzinger, and the proposed approach uses a rotation criterion designed to produce perfect cluster structure in all but the first column of the rotated loading matrix.
Abstract: Bi-factor analysis is a form of confirmatory factor analysis originally introduced by Holzinger. The bi-factor model has a general factor and a number of group factors. The purpose of this paper is to introduce an exploratory form of bi-factor analysis. An advantage of using exploratory bi-factor analysis is that one need not provide a specific bi-factor model a priori. The result of an exploratory bi-factor analysis, however, can be used as an aid in defining a specific bi-factor model. Our exploratory bi-factor analysis is simply exploratory factor analysis using a bi-factor rotation criterion. This is a criterion designed to produce perfect cluster structure in all but the first column of a rotated loading matrix. Examples are given to show how exploratory bi-factor analysis can be used with ideal and real data. The relation of exploratory bi-factor analysis to the Schmid-Leiman method is discussed.

257 citations
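
In R, the proposed approach amounts to an ordinary exploratory extraction followed by a bi-factor rotation. A sketch using the classic Holzinger data, assuming the GPArotation package provides the Jennrich–Bentler bifactorQ criterion (that function's availability is an assumption here, not something stated in the abstract):

```r
library(GPArotation)  # assumed: provides the bifactorQ rotation criterion

# Unrotated maximum-likelihood solution for the Holzinger 24-test battery
fa0 <- factanal(factors = 4, covmat = Harman74.cor, rotation = "none")
L   <- unclass(loadings(fa0))      # 24 x 4 unrotated loading matrix

bi  <- bifactorQ(L)                # exploratory bi-factor rotation
round(bi$loadings, 2)              # column 1 ~ general factor,
                                   # remaining columns ~ cluster structure
```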


Journal ArticleDOI
TL;DR: In this paper, a general discrete latent variable model was used to specify and compare two types of multidimensional item-response-theory (MIRT) models for longitudinal data.
Abstract: The aim of the research presented here is the use of extensions of longitudinal item response theory (IRT) models in the analysis and comparison of group-specific growth in large-scale assessments of educational outcomes. A general discrete latent variable model was used to specify and compare two types of multidimensional item-response-theory (MIRT) models for longitudinal data: (a) a model that handles repeated measurements as multiple, correlated variables over time and (b) a model that assumes one common variable over time and additional variables that quantify the change. Using extensions of these MIRT models, we approach the issue of modeling and comparing group-specific growth in observed and unobserved subpopulations. The analyses presented in this paper aim at answering the question whether academic growth is homogeneous across types of schools defined by academic demands and curricular differences. In order to facilitate answering this research question, (a) a model with a single two-dimensional ability distribution was compared to (b) a model assuming multiple populations with potentially different two-dimensional ability distributions based on type of school and to (c) a model that assumes that the observations are sampled from a discrete mixture of (unobserved) populations, allowing for differences across schools with respect to mixing proportions. For this purpose, we specified a hierarchical-mixture distribution variant of the two MIRT models. The latter model, (c), is a growth-mixture MIRT model that allows for variation of the mixing proportions across clusters in a hierarchically organized sample. We applied the proposed models to the PISA-I-Plus data for assessing learning and change across multiple subpopulations. The results of this study support the hypothesis of differential growth.

76 citations


Journal ArticleDOI
TL;DR: A rigorous investigation of the relationships among four promising item selection methods (D-optimality, the KL information index, continuous entropy, and mutual information) shows that mutual information not only improved the overall estimation accuracy but also yielded the smallest conditional mean squared error in most regions of θ.
Abstract: Over the past thirty years, obtaining diagnostic information from examinees’ item responses has become an increasingly important feature of educational and psychological testing. The objective can be achieved by sequentially selecting multidimensional items to fit the class of latent traits being assessed, and therefore Multidimensional Computerized Adaptive Testing (MCAT) is one reasonable approach to such a task. This study conducts a rigorous investigation of the relationships among four promising item selection methods: D-optimality, KL information index, continuous entropy, and mutual information. Some theoretical connections among the methods are demonstrated to show how information about the unknown vector θ can be gained from different perspectives. Two simulation studies were carried out to compare the performance of the four methods. The simulation results showed that mutual information not only improved the overall estimation accuracy but also yielded the smallest conditional mean squared error in most regions of θ. In the end, the overlap rates were calculated to empirically show the similarities and differences among the four methods.

70 citations
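
A minimal sketch of the mutual-information criterion for one candidate item, shown in one dimension with a dichotomous 2PL item for clarity (the paper's setting is multidimensional; this is illustrative, not the authors' implementation):

```r
# Mutual information I(X; theta) for a candidate 2PL item, computed on a
# grid approximation of the current posterior p(theta | responses so far).
p2pl <- function(theta, a, b) 1 / (1 + exp(-a * (theta - b)))

mutual_info <- function(post, theta, a, b) {
  p1 <- p2pl(theta, a, b)               # P(X = 1 | theta) on the grid
  m1 <- sum(post * p1)                  # marginal P(X = 1)
  sum(post * p1       * log(p1 / m1)) +
  sum(post * (1 - p1) * log((1 - p1) / (1 - m1)))
}

theta <- seq(-4, 4, length.out = 121)
post  <- dnorm(theta); post <- post / sum(post)   # posterior ~ prior here
mutual_info(post, theta, a = 1.5, b = 0)          # select the item maximizing this
```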


Journal ArticleDOI
TL;DR: Two versions of a simplified KL index (SKI), built from the analytical results, are proposed to mimic the behavior of KI while reducing the overall computational burden of MAT.
Abstract: This paper first discusses the relationship between Kullback–Leibler information (KL) and Fisher information in the context of multidimensional item response theory, further interpreted for the two-dimensional case from a geometric perspective. This explication should allow for a better understanding of the various item selection methods in multidimensional adaptive tests (MAT) that are based on these two information measures. The KL information index (KI) method is then discussed, and two theorems are derived to quantify the relationship between KI and item parameters. Because most existing item selection algorithms for MAT are computationally demanding, which substantially limits the applicability of MAT, two versions of a simplified KL index (SKI), built from the analytical results, are proposed to mimic the behavior of KI while reducing the overall computational burden.

55 citations
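
The building block of the KL index is the item-level Kullback–Leibler divergence between the response distributions at two ability points; a one-dimensional dichotomous sketch (illustrative only, with arbitrary parameter values):

```r
# KL divergence between response distributions at theta0 and theta1
# for a single dichotomous 2PL item.
kl_item <- function(theta0, theta1, a, b) {
  p0 <- 1 / (1 + exp(-a * (theta0 - b)))
  p1 <- 1 / (1 + exp(-a * (theta1 - b)))
  p0 * log(p0 / p1) + (1 - p0) * log((1 - p0) / (1 - p1))
}

# KI-type index: integrate KL over a neighborhood of the current estimate.
theta_hat <- 0.3; delta <- 1
grid <- seq(theta_hat - delta, theta_hat + delta, length.out = 41)
mean(kl_item(theta_hat, grid, a = 1.2, b = -0.5)) * 2 * delta  # ~ KI(theta_hat)
```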


Journal ArticleDOI
TL;DR: An application of a hierarchical IRT model is discussed for items in families generated through different combinations of design rules; algorithmic item generation has the potential to increase the cost-effectiveness of item generation as well as the flexibility of item administration.
Abstract: An application of a hierarchical IRT model for items in families generated through the application of different combinations of design rules is discussed. Within the families, the items are assumed to differ only in surface features. The parameters of the model are estimated in a Bayesian framework, using a data-augmented Gibbs sampler. An obvious application of the model is computerized algorithmic item generation. Such algorithms have the potential to increase the cost-effectiveness of item generation as well as the flexibility of item administration. The model is applied to data from a non-verbal intelligence test created using design rules. In addition, results from a simulation study conducted to evaluate parameter recovery are presented.

48 citations


Journal ArticleDOI
TL;DR: This paper contains a systematic study of the Markov properties and the way they can be used to distinguish spurious from genuine evidence of DIF and local dependence and proposes a strategy for initial item screening that will reduce the time needed to identify a graphical loglinear Rasch model that fits the item responses.
Abstract: In behavioural sciences, local dependence and DIF are common, and purification procedures that eliminate items with these weaknesses often result in short scales with poor reliability. Graphical loglinear Rasch models (Kreiner & Christensen, in Statistical Methods for Quality of Life Studies, ed. by M. Mesbah, F.C. Cole & M.T. Lee, Kluwer Academic, pp. 187–203, 2002), where uniform DIF and uniform local dependence are permitted, solve this dilemma by modelling the local dependence and DIF. Identifying loglinear Rasch models by a stepwise model search is often very time consuming, since the initial item analysis may disclose a great deal of spurious and misleading evidence of DIF and local dependence that has to be disposed of during the modelling procedure. Like graphical models, graphical loglinear Rasch models possess Markov properties that are useful during the statistical analysis if they are used methodically. This paper describes how. It contains a systematic study of the Markov properties and the way they can be used to distinguish spurious from genuine evidence of DIF and local dependence, and proposes a strategy for initial item screening that will reduce the time needed to identify a graphical loglinear Rasch model that fits the item responses. The last part of the paper illustrates the item screening procedure on simulated data and on data on the PF subscale measuring physical functioning in the SF-36 Health Survey inventory.

41 citations


Journal ArticleDOI
TL;DR: In this article, the authors propose three latent scales within the framework of nonparametric item response theory for polytomously scored items, which imply an invariant item ordering, meaning that the order of the items is the same for each measurement value on the latent scale.
Abstract: We propose three latent scales within the framework of nonparametric item response theory for polytomously scored items. Latent scales are models that imply an invariant item ordering, meaning that the order of the items is the same for each measurement value on the latent scale. This ordering property may be important in, for example, intelligence testing and person-fit analysis. We derive observable properties of the three latent scales that can each be used to investigate in real data whether the particular model adequately describes the data. We also propose a methodology for analyzing test data in an effort to find support for a latent scale, and we use two real-data examples to illustrate the practical use of this methodology.

36 citations
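
In practice, manifest checks of invariant item ordering for polytomous items are available in the mokken R package; a sketch on mokken's bundled acl data (the use of check.iio here is an assumption about tooling, not the paper's own code):

```r
library(mokken)        # assumed: provides check.iio() for invariant item ordering

data(acl)              # Adjective Checklist data shipped with mokken
comm <- acl[, 1:10]    # a 10-item polytomous scale

iio <- check.iio(comm) # method of manifest invariant item ordering
summary(iio)           # items violating the ordering, backward selection
```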


Journal ArticleDOI
TL;DR: In this paper, a Bayesian hierarchical framework is proposed that allows estimation of the correlation between time intensity and difficulty at the item level, and between speed and ability at the subject level.
Abstract: In the psycholinguistic literature, reaction times and accuracy can be analyzed separately using mixed (logistic) effects models with crossed random effects for item and subject. Given the potential correlation between these two outcomes, a joint model for the reaction time and accuracy may provide further insight. In this paper, a Bayesian hierarchical framework is proposed that allows estimation of the correlation between time intensity and difficulty at the item level, and between speed and ability at the subject level. The framework is shown to be flexible in that reaction times can follow a (log-) normal or (shifted) Weibull distribution. A simulation study reveals the reduction in bias that can be gained by using joint models, and an analysis of an example from a Dutch–English word recognition study illustrates the proposed method.
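
A hedged simulation sketch can make the two-level correlation structure concrete: correlated ability and speed at the person level, correlated difficulty and time intensity at the item level, and lognormal reaction times (all parameter values are illustrative, not the paper's):

```r
library(MASS)   # for mvrnorm

set.seed(42)
N <- 200; J <- 40
pers <- mvrnorm(N, mu = c(0, 0), Sigma = matrix(c(1, .5, .5, 1), 2))  # ability, speed
item <- mvrnorm(J, mu = c(0, 3), Sigma = matrix(c(1, .4, .4, .3), 2)) # difficulty, intensity

theta <- pers[, 1]; tau <- pers[, 2]
b <- item[, 1]; beta <- item[, 2]

p   <- plogis(outer(theta, b, "-"))                # accuracy: Rasch-type model
acc <- matrix(rbinom(N * J, 1, p), N, J)
rt  <- exp(matrix(rnorm(N * J, mean = outer(-tau, beta, "+"), sd = .3), N, J))

cor(rowMeans(acc), rowMeans(log(rt)))  # negative: more able (faster) persons have shorter RTs
```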

Journal ArticleDOI
TL;DR: In this article, the significance of the contribution of variables to the principal components in principal components analysis (PCA) is assessed nonparametrically by the use of permutation tests.
Abstract: In this paper, the statistical significance of the contribution of variables to the principal components in principal components analysis (PCA) is assessed nonparametrically by the use of permutation tests. We compare a new strategy to a strategy used in previous research consisting of permuting the columns (variables) of a data matrix independently and concurrently, thus destroying the entire correlational structure of the data. This strategy is considered appropriate for assessing the significance of the PCA solution as a whole, but is not suitable for assessing the significance of the contribution of single variables. Alternatively, we propose a strategy involving permutation of one variable at a time, while keeping the other variables fixed. We compare the two approaches in a simulation study, considering proportions of Type I and Type II error. We use two corrections for multiple testing: the Bonferroni correction and controlling the False Discovery Rate (FDR). To assess the significance of the variance accounted for by the variables, permuting one variable at a time, combined with FDR correction, yields the most favorable results. This optimal strategy is applied to an empirical data set, and results are compared with bootstrap confidence intervals.
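
A minimal sketch of the winning strategy, permuting one variable at a time with Benjamini–Hochberg FDR correction, using the variance of each variable accounted for by the first component as the test statistic (illustrative, not the authors' code):

```r
# Permutation test for the contribution of each variable to PC1:
# permute one variable at a time, keeping the others fixed.
set.seed(7)
X <- scale(iris[, 1:4]); p <- ncol(X); B <- 499

stat <- function(M) {
  e <- eigen(cor(M))
  (e$vectors[, 1]^2) * e$values[1]   # variance of each variable explained by PC1
}

obs  <- stat(X)
pval <- sapply(1:p, function(j) {
  perm <- replicate(B, {
    Xp <- X; Xp[, j] <- sample(Xp[, j])  # destroy variable j's links only
    stat(Xp)[j]
  })
  (1 + sum(perm >= obs[j])) / (B + 1)
})
p.adjust(pval, method = "BH")            # FDR-corrected p-values
```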

Journal ArticleDOI
TL;DR: In this paper, a fast and effective implementation of tabu search is proposed for analyzing two-mode blockmodeling based on structural equivalence, where the goal is to identify partitions for the row and column objects such that the clusters of the rows and columns form blocks that are either complete (all 1s) or null (all 0s) to the greatest extent possible.
Abstract: Two-mode binary data matrices arise in a variety of social network contexts, such as the attendance or non-attendance of individuals at events, the participation or lack of participation of groups in projects, and the votes of judges on cases. A popular method for analyzing such data is two-mode blockmodeling based on structural equivalence, where the goal is to identify partitions for the row and column objects such that the clusters of the rows and columns form blocks that are either complete (all 1s) or null (all 0s) to the greatest extent possible. The predominant approach for tackling this problem is multiple restarts of an object relocation heuristic that seeks to minimize the number of inconsistencies (i.e., 1s in null blocks and 0s in complete blocks) with ideal block structure. As an alternative, we propose a fast and effective implementation of tabu search. Computational comparisons across a set of 48 large network matrices revealed that the new tabu-search heuristic always provided objective function values that were better than those of the relocation heuristic when the two methods were constrained to the same amount of computation time.
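
The objective function both heuristics minimize is compact enough to state in code; a sketch of the inconsistency count for a given pair of partitions (the tabu search itself is omitted):

```r
# Inconsistencies with ideal block structure: each block is declared
# complete or null, whichever is cheaper, and the errors are summed.
block_cost <- function(A, rowclus, colclus) {
  cost <- 0
  for (r in unique(rowclus)) for (c in unique(colclus)) {
    blk   <- A[rowclus == r, colclus == c, drop = FALSE]
    ones  <- sum(blk)               # cost of declaring the block null
    zeros <- length(blk) - ones     # cost of declaring it complete
    cost  <- cost + min(ones, zeros)
  }
  cost
}

A <- matrix(rbinom(30, 1, .4), 6, 5)   # toy 6 x 5 two-mode network
block_cost(A, rowclus = rep(1:2, each = 3), colclus = c(1, 1, 2, 2, 2))
```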

Journal ArticleDOI
TL;DR: In this paper, it was shown that the weighted kappa with linear weights can be interpreted as a weighted arithmetic mean of the kappas corresponding to the 2×2 tables.
Abstract: An agreement table with n∈ℕ≥3 ordered categories can be collapsed into n−1 distinct 2×2 tables by combining adjacent categories. Vanbelle and Albert (Stat. Methodol. 6:157–163, 2009c) showed that the components of Cohen’s weighted kappa with linear weights can be obtained from these n−1 collapsed 2×2 tables. In this paper we consider several consequences of this result. One is that the weighted kappa with linear weights can be interpreted as a weighted arithmetic mean of the kappas corresponding to the 2×2 tables, where the weights are the denominators of the 2×2 kappas. In addition, it is shown that similar results and interpretations hold for linearly weighted kappas for multiple raters.
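
The central identity can be verified numerically in a few lines; the sketch below computes the linearly weighted kappa directly and as the denominator-weighted mean of the two collapsed 2×2 kappas for a toy 3×3 table:

```r
# Verify: linearly weighted kappa equals the denominator-weighted mean of the
# kappas of the n-1 tables obtained by collapsing at each cut point.
P <- matrix(c(.20, .05, .02,
              .06, .25, .04,
              .01, .07, .30), 3, 3, byrow = TRUE)  # toy joint proportions
n <- nrow(P)

w  <- 1 - abs(outer(1:n, 1:n, "-")) / (n - 1)      # linear agreement weights
pr <- rowSums(P); pc <- colSums(P)
po <- sum(w * P); pe <- sum(w * outer(pr, pc))
kappa_w <- (po - pe) / (1 - pe)

collapse <- sapply(1:(n - 1), function(k) {        # 2x2 table at cut point k
  po_k <- sum(P[1:k, 1:k]) + sum(P[-(1:k), -(1:k)])
  pe_k <- sum(pr[1:k]) * sum(pc[1:k]) + sum(pr[-(1:k)]) * sum(pc[-(1:k)])
  c(num = po_k - pe_k, den = 1 - pe_k)             # den = the 2x2 kappa denominator
})
kappa_mean <- sum(collapse["num", ]) / sum(collapse["den", ])

all.equal(kappa_w, kappa_mean)                     # TRUE
```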

Journal ArticleDOI
TL;DR: Monte Carlo results imply that, for both standardized and unstandardized sample regression coefficients, SE estimates based on asymptotics tend to under-predict the empirical ones at smaller sample sizes.
Abstract: The paper obtains consistent standard errors (SE) and biases of order O(1/n) for the sample standardized regression coefficients with both random and given predictors. Analytical results indicate that the formulas for SEs given in popular text books are consistent only when the population value of the regression coefficient is zero. The sample standardized regression coefficients are also biased in general, although it should not be a concern in practice when the sample size is not too small. Monte Carlo results imply that, for both standardized and unstandardized sample regression coefficients, SE estimates based on asymptotics tend to under-predict the empirical ones at smaller sample sizes.
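
The consistency claim is easy to probe by simulation; a hedged sketch comparing the empirical standard error of a standardized slope with the average textbook SE at n = 25 (parameter values are illustrative):

```r
# Monte Carlo check: empirical SD of the standardized slope vs the naive
# textbook SE from regressing standardized variables.
set.seed(5)
n <- 25; beta_std <- 0.5
res <- replicate(5000, {
  x <- rnorm(n)
  y <- beta_std * x + rnorm(n, sd = sqrt(1 - beta_std^2))
  fit <- summary(lm(scale(y) ~ scale(x)))
  c(est = unname(coef(fit)[2, 1]), se = coef(fit)[2, 2])
})
sd(res["est", ])   # empirical SE of the standardized coefficient
mean(res["se", ])  # average textbook SE: consistent only when beta = 0
```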

Journal ArticleDOI
Johan Braeken
TL;DR: In this article, a boundary mixture model is proposed to account for the local dependence problem by finding a balance between independence on the one side and absolute dependence on the other side, and sharp bounds on these violations are defined.
Abstract: Conditional independence is a fundamental principle in latent variable modeling and item response theory. Violations of this principle, commonly known as local item dependencies, are put in a test information perspective, and sharp bounds on these violations are defined. A modeling approach is proposed that makes use of a mixture representation of these boundaries to account for the local dependence problem by finding a balance between independence on the one side and absolute dependence on the other side. In contrast to alternative approaches, the nature of the proposed boundary mixture model does not necessitate a change in formulation of the typical item characteristic curves used in item response theory. This has attractive interpretational advantages and may be useful for general test construction purposes.

Journal ArticleDOI
TL;DR: In this article, the identification and consistency of Bayesian semiparametric IRT-type models is studied, where the uncertainty on the abilities' distribution is modeled using a prior distribution on the space of probability measures.
Abstract: We study the identification and consistency of Bayesian semiparametric IRT-type models, where the uncertainty on the abilities’ distribution is modeled using a prior distribution on the space of probability measures. We show that for the semiparametric Rasch Poisson counts model, simple restrictions ensure the identification of a general distribution generating the abilities, even for a finite number of probes. For the semiparametric Rasch model, only a finite number of properties of the general abilities’ distribution can be identified by a finite number of items, which are completely characterized. The full identification of the semiparametric Rasch model can be only achieved when an infinite number of items is available. The results are illustrated using simulated data.

Journal ArticleDOI
TL;DR: An analytic method to determine when a choice of fixed weights will incur less mean squared error than OLS as a function of sample size, error variance, and model predictability is presented.
Abstract: Many researchers have demonstrated that fixed, exogenously chosen weights can be useful alternatives to Ordinary Least Squares (OLS) estimation within the linear model (e.g., Dawes, Am. Psychol. 34:571–582, 1979; Einhorn & Hogarth, Org. Behav. Human Perform. 13:171–192, 1975; Wainer, Psychol. Bull. 83:213–217, 1976). Generalizing the approach of Davis-Stober, Dana, and Budescu (Psychometrika 75:521–541, 2010b), I present an analytic method to determine when a choice of fixed weights will incur less mean squared error than OLS as a function of sample size, error variance, and model predictability. Geometrically, I solve for the region of population β that favors a choice of fixed weights over OLS. I derive closed-form upper and lower bounds on the volume of this region, giving tight bounds on the proportion of population β favoring a choice of fixed weights. I illustrate this methodology with several examples and provide a MATLAB© (The MathWorks, Matlab software, version 2009b, 2010) programming implementation of the major results.
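
The basic trade-off is easy to reproduce: at small n with low predictability, the (squared) bias of fixed weights can be far smaller than the sampling variance of OLS. A hedged simulation sketch, not the paper's analytic method:

```r
# Parameter MSE at small n: OLS estimates vs a fixed (exogenous) weight vector.
set.seed(3)
p <- 5
beta <- rep(0.30, p)              # population in a region favoring fixed weights
w    <- rep(0.25, p)              # exogenously chosen fixed weights
mse_ols <- mean(replicate(2000, {
  X <- matrix(rnorm(20 * p), 20, p)            # n = 20, sigma = 2: noisy, small n
  y <- X %*% beta + rnorm(20, sd = 2)
  b <- solve(crossprod(X), crossprod(X, y))    # OLS weights
  sum((b - beta)^2)
}))
mse_fixed <- sum((w - beta)^2)    # fixed weights: zero variance, pure squared bias
c(ols = mse_ols, fixed = mse_fixed)            # fixed weights can win at small n
```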

Journal ArticleDOI
TL;DR: This note describes a methodology for scaling selected off-diagonal rows and columns of an indefinite symmetric matrix to achieve positive definiteness, in contrast to recently developed ridge procedures.
Abstract: Indefinite symmetric matrices that are estimates of positive definite population matrices occur in a variety of contexts, such as correlation matrices computed from pairwise present missing data and multinormal-based theory for discretized variables. This note describes a methodology for scaling selected off-diagonal rows and columns of such a matrix to achieve positive definiteness. In contrast to recently developed ridge procedures, the proposed method does not need variables to contain measurement errors. When minimum trace factor analysis is used to implement the theory, only correlations that are associated with Heywood cases are shrunk.

Journal ArticleDOI
TL;DR: The discussion addresses the consequences of the distinction for the true-score model, the linear factor model, Structural Equation Models, longitudinal and multilevel models, and item-response models.
Abstract: A distinction is proposed between measures and predictors of latent variables. The discussion addresses the consequences of the distinction for the true-score model, the linear factor model, Structural Equation Models, longitudinal and multilevel models, and item-response models. A distribution-free treatment of calibration and error-of-measurement is given, and the contrasting properties of measures and predictors are examined.

Journal ArticleDOI
TL;DR: Asymptotic expansions of the maximum likelihood estimator (MLE) and weighted likelihood estimators (WLE) of an examinee's ability are derived while item parameter estimators are treated as covariates measured with error as discussed by the authors.
Abstract: Asymptotic expansions of the maximum likelihood estimator (MLE) and weighted likelihood estimator (WLE) of an examinee’s ability are derived while item parameter estimators are treated as covariates measured with error. The asymptotic formulae present the amount of bias of the ability estimators due to the uncertainty of item parameter estimators. A numerical example is presented to illustrate how to apply the formulae to evaluate the impact of uncertainty about item parameters on ability estimation and the appropriateness of estimating ability using the regular MLE or WLE method.

Journal ArticleDOI
TL;DR: The results indicate that coherence is relatively stronger for intense emotional stimuli than for neutral stimuli; this is discussed in relation to multivariate methods and emotion theories.
Abstract: We present an approach for evaluating coherence in multivariate systems that considers all the variables simultaneously. We operationalize the multivariate system as a network and define coherence as the efficiency with which a signal is transmitted throughout the network. We illustrate this approach with time series data from 15 psychophysiological signals representing individuals’ moment-by-moment emotional reactions to emotional films. First, we summarize the time series through nonparametric Receiver Operating Characteristic (ROC) curves. Second, we use Spearman rank correlations to calculate relationships between each pair of variables. Third, based on the obtained associations, we construct a network using the variables as nodes. Finally, we examine signal transmission through all the nodes in the network. Our results indicate that the network consisting of the 15 psychophysiological signals has a small-world structure, with three clusters of variables and strong within-cluster connections. This structure supports an effective signal transmission across the entire network. When compared across experimental conditions, our results indicate that coherence is relatively stronger for intense emotional stimuli than for neutral stimuli. These findings are discussed in relation to multivariate methods and emotion theories.
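
Steps two through four of the pipeline can be sketched with the igraph package; toy data stand in for the 15 psychophysiological signals, and small-world structure is screened via clustering and average path length:

```r
library(igraph)

set.seed(11)
ts <- matrix(rnorm(500 * 15), 500, 15)       # toy stand-in for 15 signals
ts[, 1:5] <- ts[, 1:5] + rnorm(500)          # induce one correlated cluster

rho <- cor(ts, method = "spearman")          # step 2: rank correlations
adj <- (abs(rho) > .3) * 1; diag(adj) <- 0   # step 3: threshold into a network
g   <- graph_from_adjacency_matrix(adj, mode = "undirected")

transitivity(g, type = "global")             # clustering coefficient
mean_distance(g)                             # characteristic path length
# small-world: high clustering plus short paths relative to a random graph
```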

Journal ArticleDOI
TL;DR: Because component loadings are linear combinations of factor loadings (and vice versa) when the factor analysis model holds, the authors define new optimization criteria and estimation methods for exploratory factor analysis and show the methodology to be promising.
Abstract: When the factor analysis model holds, component loadings are linear combinations of factor loadings, and vice versa. This interrelation permits us to define new optimization criteria and estimation methods for exploratory factor analysis. Although this article is primarily conceptual in nature, an illustrative example and a small simulation show the methodology to be promising.

Journal ArticleDOI
TL;DR: A structural analysis is proposed for generalized linear models in which some explanatory variables are measured with error and the measurement error variance is a function of the true variables; this leads to a two-stage estimation procedure that constitutes an alternative to a joint model for the outcome variable and the questionnaire responses.
Abstract: This paper proposes a structural analysis for generalized linear models when some explanatory variables are measured with error and the measurement error variance is a function of the true variables. The focus is on latent variables investigated on the basis of questionnaires and estimated using item response theory models. Latent variable estimates are then treated as observed measures of the true variables. This leads to a two-stage estimation procedure which constitutes an alternative to a joint model for the outcome variable and the responses given to the questionnaire. Simulation studies explore the effect of ignoring the true error structure and the performance of the proposed method. Two illustrative examples concern achievement data of university students. Particular attention is given to the Rasch model.

Journal ArticleDOI
TL;DR: Equations are provided for populating the sets B1 and B2 and for demonstrating that maximum enhancement occurs when b is collinear with the eigenvector that is associated with λp (the smallest eigenvalue of the predictor correlation matrix).
Abstract: In linear multiple regression, “enhancement” is said to occur when R2=b′r>r′r, where b is a p×1 vector of standardized regression coefficients and r is a p×1 vector of correlations between a criterion y and a set of standardized regressors, x. When p=1 then b≡r and enhancement cannot occur. When p=2, for all full-rank Rxx≠I, Rxx=E[xx′]=VΛV′ (where VΛV′ denotes the eigen decomposition of Rxx; λ1>λ2), the set B1 := {bi : R2=bi′ri=ri′ri; 0<R2≤1} of weight vectors yielding no enhancement and the set B2 := {bi : R2=bi′ri>ri′ri; 0<R2≤1} of weight vectors yielding enhancement can both be characterized; when p>2 (λ1>λ2>⋯>λp), both sets contain an uncountably infinite number of vectors. Geometrical arguments demonstrate that B1 occurs at the intersection of two hyper-ellipsoids in ℝp. Equations are provided for populating the sets B1 and B2 and for demonstrating that maximum enhancement occurs when b is collinear with the eigenvector that is associated with λp (the smallest eigenvalue of the predictor correlation matrix). These equations are used to illustrate the logic and the underlying geometry of enhancement in population, multiple-regression models. R code for simulating population regression models that exhibit enhancement of any degree and any number of predictors is included in Appendices A and B.
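
The maximum-enhancement result can be checked numerically: choosing b collinear with the eigenvector for the smallest eigenvalue makes r = Rxx b short relative to b, so R2 = b′r exceeds r′r (a sketch with an arbitrary 3-predictor correlation matrix):

```r
# Enhancement demo: R^2 = b'r > r'r when b is collinear with the eigenvector
# belonging to the smallest eigenvalue of Rxx.
Rxx <- matrix(c(1, .5, .3,
                .5, 1, .4,
                .3, .4, 1), 3, 3)
eig <- eigen(Rxx)
b   <- eig$vectors[, 3] * 0.6     # collinear with the last eigenvector
r   <- Rxx %*% b                  # implied criterion correlations: r = Rxx b

R2  <- drop(t(b) %*% r)           # = 0.36 * lambda_3
rr  <- drop(t(r) %*% r)           # = 0.36 * lambda_3^2 < R2 since lambda_3 < 1
c(R2 = R2, r_prime_r = rr)        # enhancement: R2 > r'r
```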

Journal ArticleDOI
TL;DR: Methods are described for assessing all possible criteria and subsets of criteria for regression models with a fixed set of predictors, x (where x is an n×1 vector of independent variables); the geometrical notions can be easily extended to assess the sampling performance of alternate regression weights in models with either fixed or random predictors and for models with any value of R2.
Abstract: We describe methods for assessing all possible criteria (i.e., dependent variables) and subsets of criteria for regression models with a fixed set of predictors, x (where x is an n×1 vector of independent variables). Our methods build upon the geometry of regression coefficients (hereafter called regression weights) in n-dimensional space. For a full-rank predictor correlation matrix, Rxx, of order n, and for regression models with constant R2 (coefficient of determination), the OLS weight vectors for all possible criteria terminate on the surface of an n-dimensional ellipsoid. The population performance of alternate regression weights—such as equal weights, correlation weights, or rounded weights—can be modeled as a function of the Cartesian coordinates of the ellipsoid. These geometrical notions can be easily extended to assess the sampling performance of alternate regression weights in models with either fixed or random predictors and for models with any value of R2. To illustrate these ideas, we describe algorithms and R (R Development Core Team, 2009) code for: (1) generating points that are uniformly distributed on the surface of an n-dimensional ellipsoid, (2) populating the set of regression (weight) vectors that define an elliptical arc in ℝn, and (3) populating the set of regression vectors that have constant cosine with a target vector in ℝn. Each algorithm is illustrated with real data. The examples demonstrate the usefulness of studying all possible criteria when evaluating alternate regression weights in regression models with a fixed set of predictors.

Journal ArticleDOI
TL;DR: The typical rank of a three-way array is the smallest number of rank-one arrays that have the array as their sum, when the array is generated by random sampling from a continuous distribution as discussed by the authors.
Abstract: Matrices can be diagonalized by singular vectors or, when they are symmetric, by eigenvectors. Pairs of square matrices often admit simultaneous diagonalization, and always admit blockwise simultaneous diagonalization. Generalizing these possibilities to more than two (non-square) matrices leads to methods of simplifying three-way arrays by nonsingular transformations. Such transformations have direct applications in Tucker PCA for three-way arrays, where transforming the core array to simplicity is allowed without loss of fit. Simplifying arrays also facilitates the study of array rank. The typical rank of a three-way array is the smallest number of rank-one arrays that have the array as their sum, when the array is generated by random sampling from a continuous distribution. In some applications, the core array of Tucker PCA is constrained to have a vast majority of zero elements. Both simplicity and typical rank results can be applied to distinguish constrained Tucker PCA models from tautologies. An update of typical rank results over the real number field is given in the form of two tables.

Journal ArticleDOI
TL;DR: This work provides some results relating the unconditional covariances to the goodness of fit of the latent profile model, and to its excess multivariate kurtosis, and leads to some useful parameter restrictions related to symmetry.
Abstract: The relationship between linear factor models and latent profile models is addressed within the context of maximum likelihood estimation based on the joint distribution of the manifest variables. Although the two models are well known to imply equivalent covariance decompositions, in general they do not yield equivalent estimates of the unconditional covariances. In particular, a 2-class latent profile model with Gaussian components underestimates the observed covariances but not the variances, when the data are consistent with a unidimensional Gaussian factor model. In explanation of this phenomenon we provide some results relating the unconditional covariances to the goodness of fit of the latent profile model, and to its excess multivariate kurtosis. The analysis also leads to some useful parameter restrictions related to symmetry.

Journal ArticleDOI
TL;DR: It is shown that INDSCAL may fail to identify a common space representative of the observed data structure in presence of heterogeneity, and a new model that removes the rotational invariance of the classical multidimensional scaling problem and specifies K common homogeneous spaces is proposed.
Abstract: A weighted Euclidean distance model for analyzing three-way dissimilarity data (stimuli by stimuli by subjects) for heterogeneous subjects is proposed. First, it is shown that INDSCAL may fail to identify a common space representative of the observed data structure in presence of heterogeneity. A new model that removes the rotational invariance of the classical multidimensional scaling problem and specifies K common homogeneous spaces is proposed. The model, called mixture INDSCAL in K classes, or briefly K-INDSCAL, still includes individual saliencies. However, the large number of parameters in K-INDSCAL may produce instability of the estimates and therefore a parsimonious model will also be discussed. The parameters of the model are estimated in a least-squares fitting context and an efficient coordinate descent algorithm is given. The usefulness of K-INDSCAL is demonstrated by both artificial and real data analyses.

Journal ArticleDOI
TL;DR: In this article, the authors extend a sandwich-type standard error estimator from independent data to multivariate time series data, where one required element is the asymptotic covariance matrix of concurrent and lagged correlations among manifest variables, whose closed-form expression has not been presented in the literature.
Abstract: Structural equation models are increasingly used as a modeling tool for multivariate time series data in the social and behavioral sciences. Standard error estimators of SEM models, originally developed for independent data, require modifications to accommodate the fact that time series data are inherently dependent. In this article, we extend a sandwich-type standard error estimator from independent data to multivariate time series data. One required element of this estimator is the asymptotic covariance matrix of concurrent and lagged correlations among manifest variables, whose closed-form expression has not been presented in the literature. The performance of the adapted sandwich-type standard error estimator is evaluated using a simulation study and further illustrated using an empirical example.