
Showing papers in "Psychometrika in 1995"


Journal ArticleDOI
TL;DR: In this paper, the authors describe a model for estimating effect size under selection based on one-tailed p-values, when the process of publication favors studies with small p-values and hence large effect estimates.
Abstract: When the process of publication favors studies with small p-values, and hence large effect estimates, combined estimates from many studies may be biased. This paper describes a model for estimation of effect size when there is selection based on one-tailed p-values. The model employs the method of maximum likelihood in the context of a mixed (fixed and random) effects general linear model for effect sizes. It offers a test for the presence of publication bias, and corrected estimates of the parameters of the linear model for effect magnitude. The model is illustrated using a well-known data set on the benefits of psychotherapy.
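The selection mechanism described above can be sketched numerically. Below is a minimal, hypothetical Python version in which all studies share a known standard error, only results significant at the one-tailed .05 level are "published", and the mean effect is recovered by maximizing a truncated-normal likelihood. This is a toy illustration of the idea, not the paper's mixed-effects model.

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(0)
theta_true, sigma, c = 0.2, 0.2, norm.ppf(0.95)  # one-tailed p < .05 cutoff

# Simulate study effect estimates; only "significant" results get published.
y_all = rng.normal(theta_true, sigma, size=5000)
y_pub = y_all[y_all / sigma > c]

def neg_loglik(theta):
    # Truncated-normal likelihood: normal density renormalized by the
    # publication probability P(z > c) = Phi(theta/sigma - c).
    log_f = norm.logpdf(y_pub, loc=theta, scale=sigma)
    log_sel = norm.logcdf(theta / sigma - c)
    return -(log_f - log_sel).sum()

naive = y_pub.mean()  # biased upward by the selection
mle = minimize_scalar(neg_loglik, bounds=(-1, 1), method="bounded").x
```

The naive mean of published effects overstates the true effect, while the selection-corrected MLE recovers it.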

259 citations


Journal ArticleDOI
TL;DR: In this article, a generalized least squares (GLS) estimation procedure for structural equation modeling with a mixture of dichotomous, ordered categorical, and continuous measures of latent variables is presented, with emphasis on the asymptotic normality of the estimates obtained in the first and second stages and the validity of the weight matrix used in the GLS estimation of the third stage.
Abstract: Muthén (1984) formulated a general model and estimation procedure for structural equation modeling with a mixture of dichotomous, ordered categorical, and continuous measures of latent variables. A general three-stage procedure was developed to obtain estimates, standard errors, and a chi-square measure of fit for a given structural model. While the last step uses generalized least-squares estimation to fit a structural model, the first two steps involve the computation of the statistics used in this model fitting. A key component in the procedure was the development of a GLS weight matrix corresponding to the asymptotic covariance matrix of the sample statistics computed in the first two stages. This paper extends the description of the asymptotics involved and shows how the Muthén formulas can be derived. The emphasis is placed on showing the asymptotic normality of the estimates obtained in the first and second stages and the validity of the weight matrix used in the GLS estimation of the third stage.

192 citations


Journal ArticleDOI
TL;DR: In this article, the application of a class of Rasch models is discussed for situations where test items are grouped into subsets and the common attributes of items within these subsets bring into question the usual assumption of conditional independence.
Abstract: This paper discusses the application of a class of Rasch models to situations where test items are grouped into subsets and the common attributes of items within these subsets bring into question the usual assumption of conditional independence. The models are all expressed as particular cases of the random coefficients multinomial logit model developed by Adams and Wilson. This formulation allows a very flexible approach to the specification of alternative models, and makes model testing particularly straightforward. The use of the models is illustrated using item bundles constructed in the framework of the SOLO taxonomy of Biggs and Collis.

142 citations


Journal ArticleDOI
TL;DR: In this paper, the authors present some psychometric models that belong to the larger class of latent response models (LRMs), give examples of how they can be applied, and describe a method for obtaining maximum likelihood (ML) and maximum a posteriori (MAP) estimates of the parameters of LRMs.
Abstract: In this paper, some psychometric models will be presented that belong to the larger class of latent response models (LRMs). First, LRMs are introduced by means of an application in the field of componential item response theory (Embretson, 1980, 1984). Second, a general definition of LRMs (not specific for the psychometric subclass) is given. Third, some more psychometric LRMs, and examples of how they can be applied, are presented. Fourth, a method for obtaining maximum likelihood (ML) and some maximum a posteriori (MAP) estimates of the parameters of LRMs is presented. This method is then applied to the conjunctive Rasch model. Fifth and last, an application of the conjunctive Rasch model is presented. This model was applied to responses to typical verbal ability items (open synonym items).

140 citations


Journal ArticleDOI
TL;DR: In this paper, a Monte Carlo experiment is conducted to investigate the performance of bootstrap methods in normal theory maximum likelihood factor analysis, both when the distributional assumption is satisfied and when it is violated.
Abstract: A Monte Carlo experiment is conducted to investigate the performance of the bootstrap methods in normal theory maximum likelihood factor analysis, both when the distributional assumption is satisfied and when it is violated. The parameters and their functions of interest include unrotated loadings, analytically rotated loadings, and unique variances. The results reveal that (a) bootstrap bias estimation sometimes performs poorly for factor loadings and nonstandardized unique variances; (b) bootstrap variance estimation performs well even when the distributional assumption is violated; (c) bootstrap confidence intervals based on the Studentized statistics are recommended; (d) if a structural hypothesis about the population covariance matrix is taken into account, then the bootstrap distribution of the normal theory likelihood ratio test statistic is close to the corresponding sampling distribution, with a slightly heavier right tail.

79 citations


Journal ArticleDOI
David Andrich1
TL;DR: In this paper, the authors study the mathematical implications of the Rasch model for graded responses and show that it is inconsistent with the joining assumption of Jansen and Roskam (1986), which states that if two categories j and k are combined to form category h, then the probability of a response in h should equal the sum of the probabilities of responses in j and k.
Abstract: It is common in educational, psychological, and social measurement in general, to collect data in the form of graded responses and then to combine adjacent categories. It has been argued that because the division of the continuum into categories is arbitrary, any model used for analyzing graded responses should accommodate such action. Specifically, Jansen and Roskam (1986) enunciate a joining assumption which specifies that if two categories j and k are combined to form category h, then the probability of a response in h should equal the sum of the probabilities of responses in j and k. As a result, they question the use of the Rasch model for graded responses, which explicitly prohibits the combining of categories after the data are collected except in more or less degenerate cases. However, the Rasch model is derived from requirements of invariance of comparisons of entities with respect to different instruments, which might include different partitions of the continuum, and is consistent with fundamental measurement. Therefore, there is a strong case that the mathematical implication of the Rasch model should be studied further in order to understand how and why it conflicts with the joining assumption. This paper pursues the mathematics of the Rasch model and establishes, through a special case in which the sizes of the categories are equal and the model is expressed in the multiplicative metric, that its probability distribution reflects the precision with which the data are collected, and that if a pair of categories is collapsed after the data are collected, it no longer reflects the original precision. As a consequence, and not because of a qualitative change in the variable, the joining assumption is destroyed when categories are combined. Implications of the choice between a model which satisfies the joining assumption and one which reflects the precision of the data collection are discussed.
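The conflict with the joining assumption is easy to exhibit numerically. The sketch below (with hypothetical threshold values) computes category probabilities under a partial credit (polytomous Rasch) parameterization, joins two adjacent categories by summing their probabilities, and shows that the resulting adjacent-category log-odds no longer change by one unit per unit change in the trait, as they would under any Rasch model for the collapsed variable.

```python
import numpy as np

def pcm_probs(theta, tau):
    """Partial-credit (polytomous Rasch) category probabilities for one item."""
    scores = np.arange(len(tau) + 1)
    psi = np.concatenate([[0.0], np.cumsum(tau)])
    logits = scores * theta - psi
    p = np.exp(logits - logits.max())
    return p / p.sum()

tau = np.array([-0.5, 0.0, 0.5])  # hypothetical thresholds, categories 1..3

def collapsed_logodds(theta):
    p = pcm_probs(theta, tau)
    p_h = p[1] + p[2]          # join categories 1 and 2 into category h
    return np.log(p_h / p[0])  # adjacent log-odds of h versus 0

# Under any Rasch model for the collapsed variable, this log-odds would be
# theta - tau_h, so a unit step in theta would always change it by exactly 1.
steps = [collapsed_logodds(t + 1) - collapsed_logodds(t) for t in (-1.0, 0.0)]
```

The two unit steps produce unequal changes, neither equal to 1, so the summed probabilities do not follow a Rasch model for the collapsed categories.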

64 citations


Journal ArticleDOI
TL;DR: In this article, a conditional likelihood approach is presented that yields an ML estimator of modifiability for given item parameters, allows one to test hypotheses about change by means of a Clopper-Pearson confidence interval for the modifiability parameter, or estimates modifiability jointly with the item parameters.
Abstract: The paper addresses three neglected questions from IRT. In section 1, the properties of the “measurement” of ability or trait parameters and item difficulty parameters in the Rasch model are discussed. It is shown that the solution to this problem is rather complex and depends both on general assumptions about properties of the item response functions and on assumptions about the available item universe. Section 2 deals with the measurement of individual change or “modifiability” based on a Rasch test. A conditional likelihood approach is presented that yields (a) an ML estimator of modifiability for given item parameters, (b) allows one to test hypotheses about change by means of a Clopper-Pearson confidence interval for the modifiability parameter, or (c) to estimate modifiability jointly with the item parameters. Uniqueness results for all three methods are also presented. In section 3, the Mantel-Haenszel method for detecting DIF is discussed under a novel perspective: What is the most general framework within which the Mantel-Haenszel method correctly detects DIF of a studied item? The answer is that this is a 2PL model where, however, all discrimination parameters are known and the studied item has the same discrimination in both populations. Since these requirements would hardly be satisfied in practical applications, the case of constant discrimination parameters, that is, the Rasch model, is the only realistic framework. A simple Pearson χ² test for DIF of one studied item is proposed as an alternative to the Mantel-Haenszel test; moreover, this test is generalized to the case of two items simultaneously studied for DIF.

63 citations


Journal ArticleDOI
TL;DR: A new model, called the acceleration model, is proposed in the framework of the heterogeneous case of the graded response model, based on processing functions defined for a finite or enumerable number of steps; it is expected to be useful in cognitive assessment and in more traditional areas of application of latent trait models.
Abstract: A new model, called the acceleration model, is proposed in the framework of the heterogeneous case of the graded response model, based on processing functions defined for a finite or enumerable number of steps. The model is expected to be useful in cognitive assessment, as well as in more traditional areas of application of latent trait models. Criteria for evaluating models are proposed, and the soundness and robustness of the acceleration model are discussed. Graded response models based on individual choice behavior are also discussed, and criticisms of model selection in terms of the fit of models to the data are also given.

57 citations


Journal ArticleDOI
TL;DR: It is shown how conjunctive and disjunctive hierarchical classes models relate to Galois lattices, and how hierarchical classes analysis can be useful to construct lattice models of empirical data.
Abstract: This paper describes the conjunctive counterpart of De Boeck and Rosenberg's hierarchical classes model. Both the original model and its conjunctive counterpart represent the set-theoretical structure of a two-way two-mode binary matrix. However, unlike the original model, the new model represents the row-column association as a conjunctive function of a set of hypothetical binary variables. The conjunctive nature of the new model further implies that it may represent some conjunctive higher order dependencies among rows and columns. The substantive significance of the conjunctive model is illustrated with empirical applications. Finally, it is shown how conjunctive and disjunctive hierarchical classes models relate to Galois lattices, and how hierarchical classes analysis can be useful to construct lattice models of empirical data.

55 citations


Journal ArticleDOI
TL;DR: A model is proposed, in which different sets of linear constraints are imposed on different dimensions in component analysis and “classical” multidimensional scaling frameworks, and a simple, efficient, and monotonically convergent algorithm is presented for fitting the model to the data by least squares.
Abstract: Many of the “classical” multivariate data analysis and multidimensional scaling techniques call for approximations by lower dimensional configurations. A model is proposed, in which different sets of linear constraints are imposed on different dimensions in component analysis and “classical” multidimensional scaling frameworks. A simple, efficient, and monotonically convergent algorithm is presented for fitting the model to the data by least squares. The basic algorithm is extended to cover across-dimension constraints imposed in addition to the dimensionwise constraints, and to the case of a symmetric data matrix. Examples are given to demonstrate the use of the method.

50 citations


Journal ArticleDOI
TL;DR: In this paper, the concept of an ordinal instrumental probabilistic comparison is introduced; it relies on an ordinal scale given a priori and on the notion of stochastic dominance.
Abstract: The concept of an ordinal instrumental probabilistic comparison is introduced. It relies on an ordinal scale given a priori and on the concept of stochastic dominance. It is used to define a weakly independently ordered system, or isotonic ordinal probabilistic (ISOP) model, which allows the construction of separate “sample-free” ordinal scales on a set of “subjects” and a set of “items”. The ISOP model is a common nonparametric theoretical structure for unidimensional models for quantitative, ordinal, and dichotomous variables.


Journal ArticleDOI
TL;DR: The authors adapted a modification of the Welch-James procedure for comparing means when population variances are heterogeneous, and obtained a generally robust and powerful analysis with any of the recommended nonorthogonal solutions.
Abstract: Numerous types of analyses for factorial designs having unequal cell frequencies have been discussed in the literature. These analyses test either weighted or unweighted marginal means which, in turn, correspond to different model comparisons. Previous research has indicated, however, that these analyses result in biased (liberal or conservative) tests when cell variances are heterogeneous. We show how to obtain a generally robust and powerful analysis with any of the recommended nonorthogonal solutions by adapting a modification of the Welch-James procedure for comparing means when population variances are heterogeneous.
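A minimal sketch of the idea for a single-degree-of-freedom contrast: a Welch-James type statistic divides a contrast of cell means by a standard error built from cell-specific variances, with Satterthwaite degrees of freedom, so heterogeneous cell variances do not bias the test. This is a simplified illustration, not the full factorial procedure of the paper.

```python
import numpy as np
from scipy import stats

def welch_contrast(groups, c):
    """Welch-James style test of a single-df contrast c on cell means,
    with Satterthwaite degrees of freedom (a sketch)."""
    ybar = np.array([g.mean() for g in groups])
    v = np.array([g.var(ddof=1) / len(g) for g in groups])  # s_j^2 / n_j
    n = np.array([len(g) for g in groups])
    denom2 = (c**2 * v).sum()
    t = (c @ ybar) / np.sqrt(denom2)
    df = denom2**2 / ((c**2 * v) ** 2 / (n - 1)).sum()
    p = 2 * stats.t.sf(abs(t), df)
    return t, df, p

rng = np.random.default_rng(1)
g1 = rng.normal(0.0, 1.0, 20)
g2 = rng.normal(0.5, 3.0, 60)  # heterogeneous variances, unequal n
t, df, p = welch_contrast([g1, g2], np.array([1.0, -1.0]))
```

For the two-group contrast this reduces to Welch's t test, which provides a convenient check against scipy.stats.ttest_ind(..., equal_var=False).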

Journal ArticleDOI
TL;DR: In this article, it is shown that specifically objective comparisons of subjects are not restricted to the 1-parameter logistic latent trait model, but may also be defined within ordinal independence models and even within the 2-parameter logistic model.
Abstract: Comparisons of subjects are specifically objective if they do not depend on the items involved. Such comparisons are not restricted to the 1-parameter logistic latent trait model, but may also be defined within ordinal independence models and even within the 2-parameter logistic model.

Journal ArticleDOI
TL;DR: In this paper, the probability that an examinee chooses a particular option within an item is estimated by averaging over the responses to that item of examinees with similar response patterns for the whole test.
Abstract: The probability that an examinee chooses a particular option within an item is estimated by averaging over the responses to that item of examinees with similar response patterns for the whole test. The approach does not presume any latent variable structure or any dimensionality. But simulated and actual data analyses are presented to show that when the responses are determined by a latent ability variable, this similarity-based smoothing procedure can reveal the dimensionality of ability very satisfactorily.
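A rough sketch of such similarity-based smoothing, with hypothetical choices of distance and neighborhood size: for each examinee, the probability of a correct response on a target item is estimated by averaging that item's responses among the k examinees whose patterns on the remaining items are closest in Hamming distance.

```python
import numpy as np

def smoothed_item_prob(X, item, k=50):
    """Estimate each examinee's P(correct on `item`) by averaging that
    item's responses over the k examinees with the most similar response
    patterns on the remaining items (a nonparametric sketch)."""
    others = np.delete(X, item, axis=1)
    est = np.empty(X.shape[0])
    for i in range(X.shape[0]):
        dist = (others != others[i]).sum(axis=1)    # Hamming distance
        nbrs = np.argsort(dist, kind="stable")[:k]  # includes i itself
        est[i] = X[nbrs, item].mean()
    return est

# Simulated unidimensional data: P(X_ij = 1) = logistic(theta_i - b_j).
rng = np.random.default_rng(2)
theta = rng.normal(size=500)
b = np.linspace(-1.5, 1.5, 21)
P = 1 / (1 + np.exp(-(theta[:, None] - b[None, :])))
X = (rng.random(P.shape) < P).astype(int)

est = smoothed_item_prob(X, item=10, k=50)
```

When a latent ability drives the responses, as in this simulation, the smoothed estimates track the true item response probabilities without any latent variable being assumed by the procedure.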

Journal ArticleDOI
TL;DR: In this article, the authors consider test equating under this situation as an incomplete data problem, that is, examinees have observed scores on one test form and missing scores on the other.
Abstract: In the design of common-item equating, two groups of examinees are administered separate test forms, and each test form contains a common subset of items. We consider test equating under this situation as an incomplete data problem—that is, examinees have observed scores on one test form and missing scores on the other. Through the use of statistical data-imputation techniques, the missing scores can be replaced by reasonable estimates, and consequently the forms may be directly equated as if both forms were administered to both groups. In this paper we discuss different data-imputation techniques that are useful for equipercentile equating; we also use empirical data to evaluate the accuracy of these techniques as compared with chained equipercentile equating.
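The equipercentile step itself can be sketched in a few lines, ignoring the smoothing, imputation, and common-item design aspects of the paper: a score on form X is mapped to the form-Y score with the same percentile rank.

```python
import numpy as np

def equipercentile_equate(x_scores, y_scores, x):
    """Map score x on form X to the form-Y scale by matching percentile
    ranks (a minimal sketch; no smoothing or imputation step)."""
    # Percentile rank of x in the form-X distribution (midpoint convention).
    px = (np.sum(x_scores < x) + 0.5 * np.sum(x_scores == x)) / len(x_scores)
    return np.quantile(y_scores, px)

rng = np.random.default_rng(3)
form_x = rng.normal(50, 10, 2000)
form_y = form_x * 1.2 + 5  # here form Y is an exact linear transform of X

eq = equipercentile_equate(form_x, form_y, 60.0)
```

Because form Y is constructed as 1.2·X + 5 in this toy example, equating a score of 60 should return approximately 77, which makes the sketch easy to sanity-check.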

Journal ArticleDOI
David Andrich1
TL;DR: In this paper, the authors show how the special case of the MPM revealed why the joining assumption and dichotomization are not, in general, properties of the URM for graded responses, and identify the circumstances where one would require that this property did not hold in empirical graded responses.
Abstract: I have tried to indicate not only the analytic differences in some of the points made in the paper and the rejoinder, but also the perspective from which these differences arise. The point of the paper, and these remarks, is to show how the special case of the MPM revealed why the joining assumption and dichotomization are not, in general, properties of the URM for graded responses, and, thereby to identify the circumstances where one would require that this property did not hold in empirical graded responses.

Journal ArticleDOI
TL;DR: In this article, monotonically convergent algorithms are described for maximizing six (constrained) functions of vectors x or matrices X with columns x(1), ..., x(r), in some cases subject to the constraint that X be columnwise orthonormal.
Abstract: Monotonically convergent algorithms are described for maximizing six (constrained) functions of vectors x, or matrices X with columns x₁, ..., x_r. These functions are h₁(x) = Σₖ (x′Aₖx)(x′Cₖx)⁻¹; H₁(X) = Σₖ tr[(X′AₖX)(X′CₖX)⁻¹]; h̃₁(X) = Σₖ Σₗ (xₗ′Aₖxₗ)(xₗ′Cₖxₗ)⁻¹, with X constrained to be columnwise orthonormal; h₂(x) = Σₖ (x′Aₖx)²(x′Cₖx)⁻¹, subject to x′x = 1; H₂(X) = Σₖ tr[(X′AₖX)(X′AₖX)′(X′CₖX)⁻¹], subject to X′X = I; and h̃₂(X) = Σₖ Σₗ (xₗ′Aₖxₗ)²(xₗ′Cₖxₗ)⁻¹, subject to X′X = I. In these functions the matrices Cₖ are assumed to be positive definite; the matrices Aₖ can be arbitrary square matrices. The general formulation of the functions and the algorithms allows for application of the algorithms in various problems that arise in multivariate analysis. Several applications of the general algorithms are given. Specifically, algorithms are given for reciprocal principal components analysis, binormamin rotation, generalized discriminant analysis, variants of generalized principal components analysis, simple structure rotation for one of the latter variants, and set component analysis. For most of these methods the algorithms appear to be new; for the others, the existing algorithms turn out to be special cases of the newly derived general algorithms.
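For intuition, the special case K = 1 of the first function is the generalized Rayleigh quotient, whose maximum is available in closed form as the largest eigenvalue of the pencil (A, C). A sketch for a symmetric A (the quotient depends only on the symmetric part of A, so this loses no generality), with C positive definite as the paper assumes:

```python
import numpy as np
from scipy.linalg import eigh

# K = 1 case of h1(x) = sum_k (x'A_k x)(x'C_k x)^(-1): the maximum of a
# single generalized Rayleigh quotient (x'Ax)/(x'Cx) is the largest
# eigenvalue of the pencil (A, C), attained at its generalized eigenvector.
rng = np.random.default_rng(4)
M = rng.normal(size=(5, 5))
A = M + M.T                   # symmetric numerator matrix
N = rng.normal(size=(5, 5))
C = N @ N.T + 5 * np.eye(5)   # positive definite denominator matrix

vals, vecs = eigh(A, C)       # solves A v = lambda C v, ascending order
x = vecs[:, -1]               # maximizing direction
h1_max = (x @ A @ x) / (x @ C @ x)
```

No random direction can beat the generalized eigenvector, which is what the closed form guarantees; the iterative algorithms of the paper are needed once several (Aₖ, Cₖ) pairs are summed.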

Journal ArticleDOI
TL;DR: In this article, it was shown that Fisher's exact and Pearson's chi-square tests are asymptotically equivalent, and that a formal similarity also exists in small samples.
Abstract: It is demonstrated in this paper that two major tests for 2 × 2 tables are highly related from a Bayesian perspective. Although it is well known that Fisher's exact and Pearson's chi-square tests are asymptotically equivalent, the present analysis shows that a formal similarity also exists in small samples. The key assumption that leads to the resemblance is the presence of a continuous parameter measuring association. In particular, it is shown that Pearson's probability can be obtained by integrating a two-moment approximation to the posterior distribution of the log-odds ratio. Furthermore, Pearson's chi-square test gave an excellent approximation to the actual Bayes probability in all 2 × 2 tables examined, except for those with extremely disproportionate marginal frequencies.
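The two tests being compared can be run side by side with standard library routines; the sketch below applies both to an arbitrary 2 × 2 table, using Pearson's test without the continuity correction to keep the comparison clean. The table entries are hypothetical.

```python
import numpy as np
from scipy.stats import fisher_exact, chi2_contingency

table = np.array([[12, 5],
                  [7, 14]])

# Fisher's exact test conditions on the margins; Pearson's chi-square is
# the familiar large-sample test (correction=False disables Yates'
# continuity correction).
odds_ratio, p_fisher = fisher_exact(table)
chi2, p_pearson, dof, expected = chi2_contingency(table, correction=False)
```

The first return value of fisher_exact is the sample odds ratio ad/bc, and a 2 × 2 table always gives one degree of freedom for the chi-square test.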

Journal ArticleDOI
TL;DR: In this paper, the compatibility of some response time models with psychometric and with information processing approaches to response times is discussed, and five RT models are analyzed with respect to their compatibility with the psychometric properties, with serial-additive processing and with some alternative types of processing.
Abstract: This paper discusses the compatibility of some response time (RT) models with psychometric and with information processing approaches to response times. First, three psychometrically desirable properties of probabilistic models for binary data, related to the principle of specific objectivity, are adapted to the domain of RT models. One of these is the separability of item and subject parameters, and another is double monotonicity. Next, the compatibility of these psychometric properties with one very popular information processing approach, the serial-additive model, is discussed. Finally, five RT models are analyzed with respect to their compatibility with the psychometric properties, with serial-additive processing and with some alternative types of processing. It is concluded that (a) current psychometric models each satisfy one or more of the psychometric properties, but are not (easily) compatible with serial-additive processing, (b) at least one serial-additive processing model satisfies separability of item and subject parameters, and (c) RT models will more easily satisfy double monotonicity than the other two psychometric properties.

Journal ArticleDOI
TL;DR: In this article, the differences between the influence curves based on the covariance and the correlation matrices are derived for the unique variance matrix, factor loadings, and some other parameters, though the influence curves themselves are in complex forms.
Abstract: Influence curves of some parameters under various methods of factor analysis have been given in the literature. These influence curves depend on the influence curves for either the covariance or the correlation matrix used in the analysis. The differences between the influence curves based on the covariance and the correlation matrices are derived in this paper. Simple formulas for the differences of the influence curves, based on the two matrices, for the unique variance matrix, factor loadings, and some other parameters are obtained under scale-invariant estimation methods, though the influence curves themselves are in complex forms.

Journal ArticleDOI
TL;DR: In this article, the authors consider a general framework where the correct category is assumed to have an arbitrary prior distribution, and where classification probabilities vary by correct category, judge, and category of classification.
Abstract: We study a proportional reduction in loss (PRL) measure for the reliability of categorical data and consider the general case in which each of N judges assigns a subject to one of K categories. This measure has been shown to be equivalent to a measure proposed by Perreault and Leigh for a special case when there are two equally competent judges, and the correct category has a uniform prior distribution. We consider a general framework where the correct category is assumed to have an arbitrary prior distribution, and where classification probabilities vary by correct category, judge, and category of classification. In this setting, we consider PRL reliability measures based on two estimators of the correct category—the empirical Bayes estimator and an estimator based on the judges' consensus choice. We also discuss four important special cases of the general model and study several types of lower bounds for PRL reliability.


Journal ArticleDOI
TL;DR: In this article, a parametric, maximum likelihood based procedure for estimating ultrametric trees for the analysis of conditional rank order proximity data is proposed, and the technical aspects of the model and the estimation algorithm are discussed.
Abstract: The psychometric and classification literatures have illustrated the fact that a wide class of discrete or network models (e.g., hierarchical or ultrametric trees) for the analysis of ordinal proximity data are plagued by potential degenerate solutions if estimated using traditional nonmetric procedures (i.e., procedures which optimize a STRESS-based criterion of fit and whose solutions are invariant under a monotone transformation of the input data). This paper proposes a new parametric, maximum likelihood based procedure for estimating ultrametric trees for the analysis of conditional rank order proximity data. We present the technical aspects of the model and the estimation algorithm. Some preliminary Monte Carlo results are discussed. A consumer psychology application is provided examining the similarity of fifteen types of snack/breakfast items. Finally, some directions for future research are provided.

Journal ArticleDOI
TL;DR: In this paper, residuals for checking model fit in the polytomous Rasch model are examined, and comparisons are made between using counts for all response patterns and using item totals for score groups in the construction of the residuals.
Abstract: Residuals for checking model fit in the polytomous Rasch model are examined. Comparisons are made between using counts for all response patterns and using item totals for score groups in the construction of the residuals. For the residuals based on score-group totals, comparisons are also made between using the item totals and using the estimated item parameters as the basis. The developed methods are illustrated by two examples, one from a psychiatric rating scale and one from the Danish Welfare Study.

Journal ArticleDOI
TL;DR: In this article, a least squares strategy is proposed for representing a two-mode proximity matrix as an approximate sum of a small number of matrices that satisfy simple order constraints on their entries.
Abstract: A least-squares strategy is proposed for representing a two-mode proximity matrix as an approximate sum of a small number of matrices that satisfy certain simple order constraints on their entries. The primary class of constraints considered define Q-forms (or anti-Q-forms) for a two-mode matrix, where after suitable and separate row and column reorderings, the entries within each row and within each column are nondecreasing (or nonincreasing) to a maximum (or minimum) and thereafter nonincreasing (or nondecreasing). Several other types of order constraints are also mentioned to show how alternative structures can be considered using the same computational strategy.
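The Q-form condition itself is easy to state in code: every row and every column, after the appropriate reorderings, must be nondecreasing up to a single peak and nonincreasing thereafter. The sketch below checks the condition for a matrix as given; the search over row and column reorderings, and the least-squares fitting itself, are beyond this sketch.

```python
import numpy as np

def is_q_form(M):
    """Check whether every row and column of M rises (weakly) to a single
    peak and then falls -- the Q-form condition for a fixed ordering."""
    def unimodal(v):
        peak = int(np.argmax(v))
        return (np.all(np.diff(v[: peak + 1]) >= 0)
                and np.all(np.diff(v[peak:]) <= 0))
    return (all(unimodal(r) for r in M) and all(unimodal(c) for c in M.T))

# A small matrix that satisfies the Q-form condition in its given order.
M = np.array([[1, 3, 2],
              [2, 5, 3],
              [1, 4, 2]])
```

Anti-Q-forms would be checked the same way with the inequalities reversed (equivalently, by applying is_q_form to -M).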

Journal ArticleDOI
TL;DR: In this article, the skipping phenomenon is demonstrated, and it is shown how to prevent it in Varimax rotation, and a solution to prevent the repeated skipping problem is presented.
Abstract: Varimax rotation consists of iteratively rotating pairs of columns of a matrix to a maximal sum (over columns) of variances of squared elements of the matrix. Without loss of optimality, the two rotated columns can be permuted and/or reflected. Although permutations and reflections are harmless for each planar rotation per se, they can be harmful in Varimax rotation. Specifically, they often give rise to the phenomenon that certain pairs of columns are consistently skipped in the iterative process, whence Varimax will be terminated at a nonstationary point. The skipping phenomenon is demonstrated, and it is shown how to prevent it.
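For contrast, the widely used SVD-based formulation of Varimax updates the whole rotation matrix at once rather than sweeping pairs of columns, so it is not subject to the pair-skipping problem analyzed in the paper. A minimal sketch of that formulation (not the planar-rotation scheme the paper discusses):

```python
import numpy as np

def varimax(A, max_iter=100, tol=1e-8):
    """Raw varimax via the standard SVD-based updates (a sketch; this is
    not the pairwise planar-rotation scheme analyzed in the paper)."""
    p, r = A.shape
    R = np.eye(r)
    crit_old = 0.0
    for _ in range(max_iter):
        L = A @ R
        # Gradient-like matrix for the varimax criterion.
        G = A.T @ (L**3 - L * (L**2).sum(axis=0) / p)
        U, s, Vt = np.linalg.svd(G)
        R = U @ Vt                      # nearest orthogonal matrix to G
        crit = s.sum()
        if crit - crit_old < tol:
            break
        crit_old = crit
    return A @ R, R

rng = np.random.default_rng(5)
A = rng.normal(size=(10, 3))
L, R = varimax(A)
```

The returned R is orthogonal, and the sum over columns of the variance of squared loadings does not decrease relative to the unrotated matrix.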

Journal ArticleDOI
TL;DR: In this paper, the stability of principal components is measured by the expectation of the absolute inner product of the sample principal component with the corresponding population component, and a multiple regression model to predict stability is devised, calibrated, and tested using simulated Normal data.
Abstract: This paper presents an analysis, based on simulation, of the stability of principal components. Stability is measured by the expectation of the absolute inner product of the sample principal component with the corresponding population component. A multiple regression model to predict stability is devised, calibrated, and tested using simulated Normal data. Results show that the model can provide useful predictions of individual principal component stability when working with correlation matrices. Further, the predictive validity of the model is tested against data simulated from three non-Normal distributions. The model predicted very well even when the data departed from normality, thus giving robustness to the proposed measure. Used in conjunction with other existing rules this measure will help the user in determining interpretability of principal components.
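The stability measure itself is straightforward to estimate by simulation, as in this sketch with hypothetical population eigenvalues: draw repeated samples from a population with a dominant first component and average the absolute inner product between the sample and population first eigenvectors.

```python
import numpy as np

rng = np.random.default_rng(6)
p, n, reps = 6, 40, 200

# Population covariance with a dominant first component (hypothetical).
evals = np.array([3.0, 1.2, 0.8, 0.5, 0.3, 0.2])
Q, _ = np.linalg.qr(rng.normal(size=(p, p)))
Sigma = Q @ np.diag(evals) @ Q.T
pop_pc1 = Q[:, 0]

# Stability of PC1: E|<sample eigenvector, population eigenvector>|.
inner = []
for _ in range(reps):
    X = rng.multivariate_normal(np.zeros(p), Sigma, size=n)
    w, V = np.linalg.eigh(np.cov(X, rowvar=False))
    inner.append(abs(V[:, -1] @ pop_pc1))  # eigh: largest eigenvalue last
stability = np.mean(inner)
```

With the first eigenvalue well separated from the rest, the measure sits close to 1; closely spaced eigenvalues would drive it toward the value expected for a random direction.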

Journal ArticleDOI
TL;DR: The manner in which the conditional independence graph of a multiway contingency table affects the fitting and interpretation of the Goodman association model (RC) and of correspondence analysis (CA) is considered, and estimation of the row and column scores is presented.
Abstract: The manner in which the conditional independence graph of a multiway contingency table affects the fitting and interpretation of the Goodman association model (RC) and of correspondence analysis (CA) is considered. Estimation of the row and column scores is presented in this context by developing a unified framework that includes both models. Incorporation of the conditional independence constraints inherent in the graph may lead to equal or additive scores for the corresponding marginal tables, depending on the topology of the graph. An example of doubly additive scores in the analysis of a Burt subtable is given.

Journal ArticleDOI
TL;DR: In this paper, a Monte Carlo study was conducted to investigate the robustness of the assumed error distribution in maximum likelihood estimation models for multidimensional scaling, and the results showed that violations of the log-normal error distribution have virtually no effect on the estimated distance parameters.
Abstract: A Monte Carlo study was conducted to investigate the robustness of the assumed error distribution in maximum likelihood estimation models for multidimensional scaling. Data sets generated according to the lognormal, the normal, and the rectangular distribution were analysed with the log-normal error model in Ramsay's MULTISCALE program package. The results show that violations of the assumed error distribution have virtually no effect on the estimated distance parameters. In a comparison among several dimensionality tests, the corrected version of the χ² test, as proposed by Ramsay, yielded the best results, and turned out to be quite robust against violations of the error model.