
Showing papers in "Psychometrika in 1993"


Journal ArticleDOI
TL;DR: In this article, measurement invariance, structural bias, weak measurement invariance, strong factorial invariance, and strict factorial invariance are defined, and it is argued that strict factorial invariance is required for fairness in employment/admissions testing and for salary equity.
Abstract: Several concepts are introduced and defined: measurement invariance, structural bias, weak measurement invariance, strong factorial invariance, and strict factorial invariance. It is shown that factorial invariance has implications for (weak) measurement invariance. Definitions of fairness in employment/admissions testing and salary equity are provided and it is argued that strict factorial invariance is required for fairness/equity to exist. Implications for item and test bias are developed and it is argued that item or test bias probably depends on the existence of latent variables that are irrelevant to the primary goal of test constructors.

3,638 citations


Journal ArticleDOI
TL;DR: In this paper, a model-based modification of the standardization index based upon a multidimensional IRT bias modeling approach is presented that detects and estimates DIF or item bias simultaneously for several items.
Abstract: A model-based modification (SIBTEST) of the standardization index based upon a multidimensional IRT bias modeling approach is presented that detects and estimates DIF or item bias simultaneously for several items. A distinction between DIF and bias is proposed. SIBTEST detects bias/DIF without the usual Type 1 error inflation due to group target ability differences. In simulations, SIBTEST performs comparably to Mantel-Haenszel for the one item case. SIBTEST investigates bias/DIF for several items at the test score level (multiple item DIF called differential test functioning: DTF), thereby allowing the study of test bias/DIF, in particular bias/DIF amplification or cancellation and the cognitive bases for bias/DIF.

650 citations


Journal ArticleDOI
TL;DR: In this article, inference is considered for the marginal distribution of X when (X, Y) has a truncated bivariate normal distribution, and the relationship of this distribution to Azzalini's "skew-normal" distribution is obtained.
Abstract: Inference is considered for the marginal distribution of X, when (X, Y) has a truncated bivariate normal distribution. The Y variable is truncated, but only the X values are observed. The relationship of this distribution to Azzalini's “skew-normal” distribution is obtained. Method of moments and maximum likelihood estimation are compared for the three-parameter Azzalini distribution. Samples that are uninformative about the skewness of this distribution may occur, even for large n. Profile likelihood methods are employed to describe the uncertainty involved in parameter estimation. A sample of 87 Otis test scores is shown to be well-described by this model.

180 citations
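Azzalini's result can be checked directly by simulation: truncating the unobserved Y at zero leaves the observed X values with a skew-normal distribution whose shape parameter depends only on the correlation. A minimal sketch (the correlation value is illustrative, not from the paper):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
rho = 0.7

# Draw standard bivariate normal pairs; Y is truncated at 0 and unobserved,
# so we keep only the X values from pairs with Y > 0.
cov = [[1.0, rho], [rho, 1.0]]
xy = rng.multivariate_normal([0.0, 0.0], cov, size=200_000)
x = xy[xy[:, 1] > 0, 0]

# Azzalini's skew-normal with shape alpha = rho / sqrt(1 - rho^2)
# is the theoretical marginal of the retained X values.
alpha = rho / np.sqrt(1.0 - rho**2)
sn = stats.skewnorm(alpha)

print(x.mean(), sn.mean())  # the two means should agree closely
```

The same comparison can be made for higher moments or by overlaying the skew-normal density on a histogram of the retained values.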


Journal ArticleDOI
Han de Vries1
TL;DR: In this paper, the authors discuss rowwise matrix correlation, based on the weighted sum of correlations between all pairs of corresponding rows of two proximity matrices, which may both be square (symmetric or asymmetric) or rectangular.
Abstract: This paper discusses rowwise matrix correlation, based on the weighted sum of correlations between all pairs of corresponding rows of two proximity matrices, which may both be square (symmetric or asymmetric) or rectangular. Using the correlation coefficients usually associated with Pearson, Spearman, and Kendall, three different rowwise test statistics and their normalized coefficients are discussed, and subsequently compared with their nonrowwise alternatives like Mantel's Z. It is shown that the rowwise matrix correlation coefficient between two matrices X and Y is the partial correlation between the entries of X and Y controlled for the nominal variable that has the row objects as categories. Given this fact, partial rowwise correlations (as well as multiple regression extensions in the case of Pearson's approach) can be easily developed.

116 citations
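The partial-correlation characterization suggests a very short computation: centering each row of both matrices removes the row effect (i.e., partials out the nominal row variable), and correlating the centered entries gives the rowwise Pearson coefficient. A sketch with hypothetical 10 x 8 rectangular matrices:

```python
import numpy as np

rng = np.random.default_rng(1)
# Two hypothetical 10 x 8 rectangular matrices with related rows.
X = rng.normal(size=(10, 8))
Y = 0.5 * X + rng.normal(size=(10, 8))

# Center each row: this partials out the nominal "row" variable,
# so correlating the centered entries gives the rowwise Pearson coefficient.
Xc = X - X.mean(axis=1, keepdims=True)
Yc = Y - Y.mean(axis=1, keepdims=True)
r_rowwise = (Xc * Yc).sum() / np.sqrt((Xc**2).sum() * (Yc**2).sum())
print(round(r_rowwise, 3))
```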


Journal ArticleDOI
TL;DR: This article showed that the posterior distribution of examinee ability given test response is approximately normal for a long test, under very general and nonrestrictive nonparametric assumptions, for a broad class of latent models.
Abstract: It has long been part of the item response theory (IRT) folklore that under the usual empirical Bayes unidimensional IRT modeling approach, the posterior distribution of examinee ability given test response is approximately normal for a long test. Under very general and nonrestrictive nonparametric assumptions, we make this claim rigorous for a broad class of latent models.

110 citations


Journal ArticleDOI
TL;DR: A weighted Euclidean distance model for analyzing three-way proximity data is proposed that incorporates a latent class approach, removes the rotational invariance of the classical multidimensional scaling model while retaining psychologically meaningful dimensions, and drastically reduces the number of parameters in the traditional INDSCAL model.
Abstract: A weighted Euclidean distance model for analyzing three-way proximity data is proposed that incorporates a latent class approach. In this latent class weighted Euclidean model, the contribution to the distance function between two stimuli is per dimension weighted identically by all subjects in the same latent class. This model removes the rotational invariance of the classical multidimensional scaling model while retaining psychologically meaningful dimensions, and drastically reduces the number of parameters in the traditional INDSCAL model. The probability density function for the data of a subject is posited to be a finite mixture of spherical multivariate normal densities. The maximum likelihood function is optimized by means of an EM algorithm; a modified Fisher scoring method is used to update the parameters in the M-step. A model selection strategy is proposed and illustrated on both real and artificial data.

92 citations
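The core of the distance model is easy to sketch: each latent class c has a single weight vector w_c that rescales the dimensions for every subject in that class, so only one distance matrix per class is needed rather than one per subject as in INDSCAL. A minimal illustration with hypothetical coordinates and weights:

```python
import numpy as np

# Hypothetical 2-D stimulus coordinates and per-class dimension weights.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])   # stimuli
W = np.array([[1.0, 0.2], [0.2, 1.0]])               # one weight vector per class

def class_distances(X, w):
    # Weighted Euclidean distances shared by all subjects in one latent class.
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((w * diff**2).sum(axis=-1))

D = [class_distances(X, w) for w in W]   # one distance matrix per latent class
print(D[0][0, 2], D[1][0, 2])            # same pair, different classes
```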


Journal ArticleDOI
TL;DR: In this paper, a model for describing dynamic processes is constructed by combining the common Rasch model with the concept of structurally incomplete designs, which is accomplished by mapping each item on a collection of virtual items, one of which is assumed to be presented to the respondent dependent on the preceding responses and/or the feedback obtained.
Abstract: In the present paper a model for describing dynamic processes is constructed by combining the common Rasch model with the concept of structurally incomplete designs. This is accomplished by mapping each item on a collection of virtual items, one of which is assumed to be presented to the respondent dependent on the preceding responses and/or the feedback obtained. It is shown that, in the case of subject control, no unique conditional maximum likelihood (CML) estimates exist, whereas marginal maximum likelihood (MML) proves a suitable estimation procedure. A hierarchical family of dynamic models is presented, and it is shown how to test special cases against more general ones. Furthermore, it is shown that the model presented is a generalization of a class of mathematical learning models, known as Luce's beta-model.

83 citations


Journal ArticleDOI
TL;DR: In this article, a conditional mixture, maximum likelihood method for latent class censored regression is proposed to simultaneously estimate separate regression functions and subject membership in K latent classes or groups given a censored dependent variable for a cross-section of subjects.
Abstract: The standard tobit or censored regression model is typically utilized for regression analysis when the dependent variable is censored. This model is generalized by developing a conditional mixture, maximum likelihood method for latent class censored regression. The proposed method simultaneously estimates separate regression functions and subject membership in K latent classes or groups given a censored dependent variable for a cross-section of subjects. Maximum likelihood estimates are obtained using an EM algorithm. The proposed method is illustrated via a consumer psychology application.

80 citations


Journal ArticleDOI
TL;DR: A class of models for gamma distributed random variables is presented, which are shown to be more flexible than the classical linear models with respect to the structure that can be imposed on the expected value.
Abstract: A class of models for gamma distributed random variables is presented. These models are shown to be more flexible than the classical linear models with respect to the structure that can be imposed on the expected value. In particular, additive, multiplicative, and combined additive-multiplicative models can all be formulated. As a special case, a class of psychometric models for reaction times is presented, together with their psychological interpretation. By means of a comparison with existing models, this class of models is shown to offer some possibilities that are not available in existing methods. Parameter estimation by means of maximum likelihood (ML) is shown to have some attractive properties, since the models belong to the exponential family. Then, the results of a simulation study of the bias in the ML estimates are presented. Finally, the application of these models is illustrated by an analysis of the data from a mental rotation experiment. This analysis is preceded by an evaluation of the appropriateness of the gamma distribution for these data.

70 citations
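Because the models belong to the exponential family, ML estimation is straightforward; a multiplicative mean structure corresponds to a gamma model with a log link on the mean. The following sketch simulates gamma "reaction times" with mean exp(b0 + b1*x) and recovers the coefficients by direct likelihood maximization (all parameter values are illustrative, not from the paper):

```python
import numpy as np
from scipy import optimize, special

rng = np.random.default_rng(2)
n = 2000
x = rng.uniform(0, 1, n)
mu = np.exp(0.5 + 1.2 * x)        # multiplicative mean structure
t = rng.gamma(5.0, mu / 5.0)      # gamma "reaction times": shape 5, mean mu

def nll(theta):
    # Negative log-likelihood; gamma density parameterized by mean m, shape nu.
    b0, b1, log_nu = theta
    nu = np.exp(log_nu)
    m = np.exp(b0 + b1 * x)
    return -np.sum(nu * np.log(nu / m) + (nu - 1) * np.log(t)
                   - nu * t / m - special.gammaln(nu))

res = optimize.minimize(nll, x0=[0.0, 0.0, 0.0], method="Nelder-Mead",
                        options={"maxiter": 5000, "xatol": 1e-8, "fatol": 1e-8})
print(res.x[:2])  # estimates of (b0, b1), close to the generating (0.5, 1.2)
```

An additive mean structure would replace the exponential in `m` by a linear combination, with the same likelihood machinery.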


Journal ArticleDOI
TL;DR: In this paper, it was shown that local homogeneity is equivalent to subpopulation invariance of latent trait models, and the homogeneous monotone IRT model holds for a finite or countable item pool if and only if the pool is experimentally independent and pairwise nonnegative association holds in every positive subpopulation.
Abstract: The stochastic subject formulation of latent trait models contends that, within a given subject, the event of obtaining a certain response pattern may be probabilistic. Ordinary latent trait models do not imply that these within-subject probabilities are identical to the conditional probabilities specified by the model. The latter condition is called local homogeneity. It is shown that local homogeneity is equivalent to subpopulation invariance of the model. In case of the monotone IRT model, local homogeneity implies absence of item bias, absence of item specific traits, and the possibility to join overlapping subtests. The following characterization theorem is proved: the homogeneous monotone IRT model holds for a finite or countable item pool if and only if the pool is experimentally independent and pairwise nonnegative association holds in every positive subpopulation.

58 citations


Journal ArticleDOI
TL;DR: In this paper, five different ability estimators (maximum likelihood (MLE), weighted likelihood (WLE), Bayesian modal (BME), expected a posteriori (EAP), and the standardized number-right score (Z (θ)) were used as scores for conventional, multiple-choice tests.
Abstract: Five different ability estimators (maximum likelihood [MLE(θ)], weighted likelihood [WLE(θ)], Bayesian modal [BME(θ)], expected a posteriori [EAP(θ)], and the standardized number-right score [Z(θ)]) were used as scores for conventional, multiple-choice tests. The bias, standard error and reliability of the five ability estimators were evaluated using Monte Carlo estimates of the unknown conditional means and variances of the estimators. The results indicated that ability estimates based on BME(θ), EAP(θ) or WLE(θ) were reasonably unbiased for the range of abilities corresponding to the difficulty of a test, and that their standard errors were relatively small. Also, they were as reliable as the old standby, the number-right score.
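The Bayesian estimators in the comparison reduce to simple quadrature over a prior; EAP(θ), for instance, is the posterior mean evaluated on a grid. A sketch for a five-item 2PL test with hypothetical item parameters and a standard-normal prior (not the paper's simulation design):

```python
import numpy as np

# Hypothetical 2PL item parameters and one observed response pattern.
a = np.array([1.0, 1.2, 0.8, 1.5, 1.1])     # discriminations
b = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])   # difficulties
u = np.array([1, 1, 0, 1, 0])               # item responses (1 = correct)

theta = np.linspace(-4, 4, 81)              # quadrature grid
p = 1.0 / (1.0 + np.exp(-a * (theta[:, None] - b)))   # P(correct | theta)
like = np.prod(np.where(u == 1, p, 1.0 - p), axis=1)  # likelihood at each node
prior = np.exp(-0.5 * theta**2)             # standard-normal prior (unnormalized)
post = like * prior
post /= post.sum()                          # normalized posterior weights
eap = np.sum(theta * post)                  # posterior mean = EAP(theta)
print(round(eap, 3))
```

MLE(θ) would instead maximize `like` alone, and BME(θ) would maximize `like * prior`; the grid makes the differences easy to inspect.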

Journal ArticleDOI
TL;DR: A general model for two-level multivariate data, with responses possibly missing at random, is described, which combines regressions on fixed explanatory variables with structured residual covariance matrices.
Abstract: A general model for two-level multivariate data, with responses possibly missing at random, is described. The model combines regressions on fixed explanatory variables with structured residual covariance matrices. The likelihood function is reduced to a form enabling computational methods for estimating the model to be devised.

Journal ArticleDOI
TL;DR: The slide-vector scaling model as mentioned in this paper attempts to account for the asymmetry of a proximity matrix by a uniform shift in a fixed direction imposed on a symmetric Euclidean representation of the scaled objects.
Abstract: The slide-vector scaling model attempts to account for the asymmetry of a proximity matrix by a uniform shift in a fixed direction imposed on a symmetric Euclidean representation of the scaled objects. Although no method for fitting the slide-vector model seems available in the literature, the model can be viewed as a constrained version of the unfolding model, which does suggest one possible algorithm. The slide-vector model is generalized to handle three-way data, and two examples from market structure analysis are presented.
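The asymmetry mechanism is compact enough to state in a few lines: adding a fixed slide vector z to every coordinate difference makes d(i, j) and d(j, i) differ, while z = 0 recovers the symmetric Euclidean model. A sketch with a hypothetical configuration:

```python
import numpy as np

# Hypothetical 2-D configuration and slide vector.
X = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0]])
z = np.array([1.0, 0.0])

def slide_dist(X, z):
    # Slide-vector distances: a uniform shift z imposed on all differences.
    diff = X[:, None, :] - X[None, :, :] + z
    return np.sqrt((diff**2).sum(axis=-1))

D = slide_dist(X, z)
print(D[0, 1], D[1, 0])   # asymmetric: 2.0 vs 4.0
```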

Journal ArticleDOI
TL;DR: In this paper, a deterministic stimulation-dependent response time model is proposed, where the only role of the criteria in this theoretical language is to numerically calibrate the ordinal-scale axes for the deterministic response processes.
Abstract: Any family of simple response time distributions that correspond to different values of stimulation variables can be modeled by a deterministic stimulation-dependent process that terminates when it crosses a randomly preset criterion. The criterion distribution function is stimulation-independent and can be chosen arbitrarily, provided it is continuous and strictly increasing. Any family of N-alternative choice response time distributions can be modeled by N such process-criterion pairs, with response choice and response time being determined by the process that reaches its criterion first. The joint distribution of the N criteria can be chosen arbitrarily, provided it satisfies certain unrestrictive conditions. In particular, the criteria can be chosen to be stochastically independent. This modeling scheme, therefore, is a descriptive theoretical language rather than an empirically falsifiable model. The only role of the criteria in this theoretical language is to numerically calibrate the ordinal-scale axes for the deterministic response processes.

Journal ArticleDOI
TL;DR: In this paper, a method for avoiding the problem is described for the partial credit model while maintaining the integrity of the original response framework, which is based on a simple re-expression of the basic parameters of the model.
Abstract: A category where the frequency of responses is zero, either for sampling or structural reasons, will be called a null category. One approach for ordered polytomous item response models is to downcode the categories (i.e., reduce the score of each category above the null category by one), thus altering the relationship between the substantive framework and the scoring scheme for items with null categories. It is discussed why this is often not a good idea, and a method for avoiding the problem is described for the partial credit model while maintaining the integrity of the original response framework. This solution is based on a simple reexpression of the basic parameters of the model.

Journal ArticleDOI
TL;DR: A mixture distribution model is formulated that can be considered as a latent class model for continuous single stimulus preference ratings and is applied to political science data concerning party preferences from members of the Dutch Parliament.
Abstract: A multidimensional unfolding model is developed that assumes that the subjects can be clustered into a small number of homogeneous groups or classes. The subjects that belong to the same group are represented by a single ideal point. Since it is not known in advance to which group or class a subject belongs, a mixture distribution model is formulated that can be considered as a latent class model for continuous single stimulus preference ratings. A GEM algorithm is described for estimating the parameters in the model. The M-step of the algorithm is based on a majorization procedure for updating the estimates of the spatial model parameters. A strategy for selecting the appropriate number of classes and the appropriate number of dimensions is proposed and fully illustrated on some artificial data. The latent class unfolding model is applied to political science data concerning party preferences from members of the Dutch Parliament. Finally, some possible extensions of the model are discussed.

Journal ArticleDOI
TL;DR: In this article, an approximation for the bias function of the maximum likelihood estimate of the latent trait, or ability, was developed using the same assumptions for the more general case where item responses are discrete.
Abstract: Lord developed an approximation for the bias function for the maximum likelihood estimate in the context of the three-parameter logistic model. Using Taylor's expansion of the likelihood equation, he obtained an equation that includes the conditional expectation, given true ability, of the discrepancy between the maximum likelihood estimate and true ability. All terms of order higher than n−1 are ignored, where n indicates the number of items. Lord assumed that all item and individual parameters are bounded, all item parameters are known or well-estimated, and the number of items is reasonably large. In the present paper, an approximation for the bias function of the maximum likelihood estimate of the latent trait, or ability, will be developed using the same assumptions for the more general case where item responses are discrete. This will include the dichotomous response level, for which the three-parameter logistic model has been discussed, the graded response level and the nominal response level. Some observations will be made for both dichotomous and graded response levels.

Journal ArticleDOI
TL;DR: In this article, observations are made about the behavior of this bias function for the dichotomous response level in general, and also with respect to several widely used mathematical models, and empirical examples are given.
Abstract: Samejima has recently given an approximation for the bias function for the maximum likelihood estimate of the latent trait in the general case where item responses are discrete, generalizing Lord's bias function in the three-parameter logistic model for the dichotomous response level. In the present paper, observations are made about the behavior of this bias function for the dichotomous response level in general, and also with respect to several widely used mathematical models. Some empirical examples are given.

Journal ArticleDOI
TL;DR: The authors showed that analyzing multiple imputations as if they were multiple indicators does not generally yield correct results; they must instead be analyzed by means concordant with their construction.
Abstract: Rubin's “multiple imputation” approach to missing data creates synthetic data sets, in which each missing variable is replaced by a draw from its predictive distribution, conditional on the observed data. By construction, analyses of such filled-in data sets as if the imputations were true values have the correct expectations for population parameters. In a recent paper, Mislevy showed how this approach can be applied to estimate the distributions of latent variables from complex samples. Multiple imputations for a latent variable bear a surface similarity to classical “multiple indicators” of a latent variable, as might be addressed in structural equation modelling or hierarchical modelling of successive stages of random sampling. This note demonstrates with a simple example why analyzing “multiple imputations” as if they were “multiple indicators” does not generally yield correct results; they must instead be analyzed by means concordant with their construction.
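The "means concordant with their construction" are Rubin's combining rules: average the per-imputation point estimates, then add the between-imputation spread (inflated by 1 + 1/m) to the average within-imputation variance. A sketch with illustrative numbers, not data from the note:

```python
import numpy as np

# Point estimates and within-imputation variances from m = 5 imputed data sets
# (illustrative values).
est = np.array([0.52, 0.48, 0.55, 0.50, 0.47])
var = np.array([0.010, 0.011, 0.009, 0.010, 0.012])

m = len(est)
qbar = est.mean()              # pooled point estimate
W = var.mean()                 # average within-imputation variance
B = est.var(ddof=1)            # between-imputation variance
T = W + (1 + 1 / m) * B        # total variance of the pooled estimate
print(qbar, T)
```

Treating the five imputations as five "indicators" in a measurement model would instead treat B as substantive variance, which is exactly the confusion the note warns against.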

Journal ArticleDOI
TL;DR: The authors developed a Fisher r to Z transformation for the corrected correlation for each of two conditions: (a) the criterion data were missing due to selection on the predictor (the missing data were MAR); and (b) the criterion was missing at random, not due to selection (MCAR).
Abstract: The validity of a test is often estimated in a nonrandom sample of selected individuals. To accurately estimate the relation between the predictor and the criterion we correct this correlation for range restriction. Unfortunately, this corrected correlation cannot be transformed using Fisher's Z transformation, and asymptotic tests of hypotheses based on small or moderate samples are not accurate. We developed a Fisher r to Z transformation for the corrected correlation for each of two conditions: (a) the criterion data were missing due to selection on the predictor (the missing data were MAR); and (b) the criterion was missing at random, not due to selection (the missing data were MCAR). The two Z transformations were evaluated in a computer simulation. The transformations were accurate, and tests of hypotheses and confidence intervals based on the transformations were superior to those that were not based on the transformations.
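For context, the standard correction for direct range restriction on the predictor (Thorndike's Case 2) is shown below; the paper's point is that pushing this corrected r through the ordinary Fisher Z is not accurate for inference, which is why modified transformations are needed. The numbers are illustrative:

```python
import numpy as np

r = 0.30      # correlation observed in the selected (restricted) sample
ratio = 2.0   # SD of predictor in applicant pool / SD in selected sample

# Thorndike Case 2 correction for direct range restriction on the predictor.
rc = r * ratio / np.sqrt(1 + r**2 * (ratio**2 - 1))

# Ordinary Fisher Z of the corrected correlation -- per the paper, this naive
# transformation does not yield accurate small-sample inference.
z_naive = np.arctanh(rc)
print(round(rc, 3), round(z_naive, 3))
```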

Journal ArticleDOI
TL;DR: In this paper, a flexible class of stochastic mixture models for the analysis and interpretation of individual differences in recurrent choice and other types of count data is introduced, which are derived by specifying elements of the choice process at the individual level.
Abstract: This paper introduces a flexible class of stochastic mixture models for the analysis and interpretation of individual differences in recurrent choice and other types of count data. These choice models are derived by specifying elements of the choice process at the individual level. Probability distributions are introduced to describe variations in the choice process among individuals and to obtain a representation of the aggregate choice behavior. Due to the explicit consideration of random effect sources, the choice models are parsimonious and readily interpretable. An easy to implement EM algorithm is presented for parameter estimation. Two applications illustrate the proposed approach.
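A classic instance of such a random-effects choice model is the gamma-Poisson mixture: individual choice rates vary according to a gamma distribution, counts given the rate are Poisson, and the aggregate counts are negative binomial and overdispersed. A simulation sketch (parameters are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(3)
shape, scale = 2.0, 1.5
lam = rng.gamma(shape, scale, size=100_000)  # heterogeneous individual rates
counts = rng.poisson(lam)                    # recurrent-choice counts

# Aggregate behavior is negative binomial:
# mean = shape * scale = 3.0, variance = mean + shape * scale**2 = 7.5,
# so the counts are overdispersed relative to a single Poisson.
print(counts.mean(), counts.var())
```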

Journal ArticleDOI
TL;DR: It is argued that the idea behind points of view analysis deserves new attention, especially as a technique to analyze group differences, and a procedure is proposed that can be viewed as a streamlined, integrated version of the Tucker and Messick Process.
Abstract: Points of view analysis (PVA), proposed by Tucker and Messick in 1963, was one of the first methods to deal explicitly with individual differences in multidimensional scaling, but at some point was apparently superseded by the weighted Euclidean model, well-known as the Carroll and Chang INDSCAL model. This paper argues that the idea behind points of view analysis deserves new attention, especially as a technique to analyze group differences. A procedure is proposed that can be viewed as a streamlined, integrated version of the Tucker and Messick process, which consisted of a number of separate steps. At the same time, our procedure can be regarded as a particularly constrained weighted Euclidean model. While fitting the model, two types of nonlinear data transformations are feasible, either for given dissimilarities, or for variables from which the dissimilarities are derived. Various applications are discussed, where the two types of transformation can be mixed in the same analysis; a quadratic assignment framework is used to evaluate the results.

Journal ArticleDOI
TL;DR: In this article, the authors characterize the total space used by the Guttman-type quantification of contingency tables and multiple-choice data (incidence data), with pertinent discussion of how removing the trivial solution distorts the shape of that space.
Abstract: In quantifying categorical data, constraints play an important role in characterizing the outcome. In the Guttman-type quantification of contingency tables and multiple-choice data (incidence data), the trivial solution due to the marginal constraints is typically removed before quantification; this removal, however, has the effect of distorting the shape of the total space. Awareness of this is important for the interpretation of the quantified outcome. The present study provides some relevant formulas for those cases that are affected by the trivial solution and those cases that are not. The characterization of the total space used by the Guttman-type quantification and pertinent discussion are presented.

Journal ArticleDOI
TL;DR: In this article, a probabilistic choice model is developed for paired comparisons data about psychophysical stimuli, which is based on Thurstone's Law of Comparative Judgment Case V and assumes that each stimulus is measured on a small number of physical variables.
Abstract: A probabilistic choice model is developed for paired comparisons data about psychophysical stimuli. The model is based on Thurstone's Law of Comparative Judgment Case V and assumes that each stimulus is measured on a small number of physical variables. The utility of a stimulus is related to its values on the physical variables either by means of an additive univariate spline model or by means of a multivariate spline model. In the additive univariate spline model, a separate univariate spline transformation is estimated for each physical dimension and the utility of a stimulus is assumed to be an additive combination of these transformed values. In the multivariate spline model, the utility of a stimulus is assumed to be a general multivariate spline function in the physical variables. The use of B splines for estimating the transformation functions is discussed and it is shown how B splines can be generalized to the multivariate case by using as basis functions tensor products of the univariate basis functions. A maximum likelihood estimation procedure for the Thurstone Case V model with spline transformation is described and applied for illustrative purposes to various artificial and real data sets. Finally, the model is extended using a latent class approach to the case where there are unreplicated paired comparisons data from a relatively large number of subjects drawn from a heterogeneous population. An EM algorithm for estimating the parameters in this extended model is outlined and illustrated on some real data.

Journal ArticleDOI
TL;DR: In this article, a family of coefficients of relational agreement for numerical scales is proposed, which is a generalization to multiple judges of the Zegers and ten Berge theory of association coefficients for two variables.
Abstract: A family of coefficients of relational agreement for numerical scales is proposed. The theory is a generalization to multiple judges of the Zegers and ten Berge theory of association coefficients for two variables and is based on the premise that the choice of a coefficient depends on the scale type of the variables, defined by the class of admissible transformations. Coefficients of relational agreement that denote agreement with respect to empirically meaningful relationships are derived for absolute, ratio, interval, and additive scales. The proposed theory is compared to intraclass correlation, and it is shown that the coefficient of additivity is identical to one measure of intraclass correlation.

Journal ArticleDOI
TL;DR: An implementation of the Gauss-Newton algorithm for the analysis of covariance structures that is specifically adapted for high-level computer languages is reviewed and a large class of models can be estimated, including many that utilize functional relationships among the parameters that are not possible in most available computer programs.
Abstract: An implementation of the Gauss-Newton algorithm for the analysis of covariance structures that is specifically adapted for high-level computer languages is reviewed. With this procedure one need only describe the structural form of the population covariance matrix, and provide a sample covariance matrix and initial values for the parameters. The gradient and approximate Hessian, which vary from model to model, are computed numerically. Using this approach, the entire method can be operationalized in a comparatively small program. A large class of models can be estimated, including many that utilize functional relationships among the parameters that are not possible in most available computer programs. Some examples are provided to illustrate how the algorithm can be used.
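The appeal of the approach is that only the structural form Sigma(theta) has to be coded; the gradient information is obtained by numerical differencing. A least-squares Gauss-Newton sketch for a one-factor covariance structure (an illustration of the idea, not the reviewed program):

```python
import numpy as np

p = 4

def sigma(theta):
    # One-factor structure: Sigma = l l' + diag(u). Only this function
    # describes the model; derivatives are obtained numerically.
    l, u = theta[:p], theta[p:]
    return np.outer(l, l) + np.diag(u)

l_true = np.array([0.8, 0.7, 0.6, 0.5])
u_true = np.array([0.36, 0.51, 0.64, 0.75])
S = sigma(np.concatenate([l_true, u_true]))  # population matrix used as "data"

iu = np.triu_indices(p)

def resid(theta):
    return (S - sigma(theta))[iu]            # nonduplicated residual elements

theta = np.full(2 * p, 0.5)                  # crude starting values
for _ in range(50):
    r = resid(theta)
    J = np.empty((r.size, theta.size))
    for k in range(theta.size):              # forward-difference Jacobian
        t = theta.copy()
        t[k] += 1e-6
        J[:, k] = (resid(t) - r) / 1e-6
    theta = theta + np.linalg.lstsq(J, -r, rcond=None)[0]  # Gauss-Newton step

print(np.round(theta[:p], 3))                # recovered loadings
```

A maximum likelihood discrepancy would replace the least-squares residual, but the model description and numerical differencing stay the same, which is what keeps the program small.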

Journal ArticleDOI
TL;DR: If a Shepard-type similarity function accurately describes behavior, then under typical experimental conditions it should be difficult to see the effects of perceptual dependence, which provides strong support for a perceptual independence assumption when using these models.
Abstract: Probabilistic models of same-different and identification judgments are compared (within each paradigm) with regard to their sensitivity to perceptual dependence or the degree to which the underlying psychological dimensions are correlated. Three same-different judgment models are compared. One is a step function or decision bound model and the other two are probabilistic variants of a similarity model proposed by Shepard. Three types of identification models are compared: decision bound models, a probabilistic multidimensional scaling model, and probabilistic models based on the Shepard-Luce choice rule. The decision bound models were found to be most sensitive to perceptual dependence, especially when there is considerable distributional overlap. The same-different model based on the city-block metric and an exponential decay similarity function, and the corresponding identification model were found to be particularly insensitive to perceptual dependence. These results suggest that if a Shepard-type similarity function accurately describes behavior, then under typical experimental conditions it should be difficult to see the effects of perceptual dependence. This result provides strong support for a perceptual independence assumption when using these models. These theoretical results may also play an important role in studying different decision rules employed at different stages of identification training.

Journal ArticleDOI
TL;DR: In this paper, it was shown that IRT's information function for an item is functionally related to local versions of classical test theory's signal/noise ratio and reliability coefficient.
Abstract: It is shown that IRT's information function for an item is functionally related to “local” versions of classical test theory's signal/noise ratio and reliability coefficient.

Journal ArticleDOI
TL;DR: In this article, the authors examined a method of comparing the one-step M-estimators of location corresponding to two independent groups which provided good control over the probability of a Type I error even for unequal sample sizes, unequal variances, and different shaped distributions.
Abstract: Methods for comparing means are known to be highly nonrobust in terms of Type II errors. The problem is that slight shifts from normal distributions toward heavy-tailed distributions inflate the standard error of the sample mean. In contrast, the standard error of various robust measures of location, such as the one-step M-estimator, are relatively unaffected by heavy tails. Wilcox recently examined a method of comparing the one-step M-estimators of location corresponding to two independent groups which provided good control over the probability of a Type I error even for unequal sample sizes, unequal variances, and different shaped distributions. There is a fairly obvious extension of this procedure to pairwise comparisons of more than two independent groups, but simulations reported here indicate that it is unsatisfactory. A slight modification of the procedure is found to give much better results, although some caution must be taken when there are unequal sample sizes and light-tailed distributions. An omnibus test is examined as well.
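The location estimator at the heart of the procedure is the one-step M-estimator with Huber's ψ and tuning constant K = 1.28 (Wilcox's usual choice): flag points more than K robust standard units from the median, then adjust the mean of the remaining values. A sketch of the estimator itself, not of the full pairwise-comparison procedure:

```python
import numpy as np

def one_step_m(x, K=1.28):
    # One-step M-estimator of location (Huber psi, Wilcox's formulation).
    x = np.asarray(x, dtype=float)
    M = np.median(x)
    madn = np.median(np.abs(x - M)) / 0.6745   # MAD rescaled for normality
    low = x < M - K * madn                     # flagged low outliers
    high = x > M + K * madn                    # flagged high outliers
    i1, i2 = low.sum(), high.sum()
    kept = x[~(low | high)]
    return (K * madn * (i2 - i1) + kept.sum()) / (x.size - i1 - i2)

data = [1.2, 1.5, 1.7, 1.8, 2.0, 2.1, 2.3, 9.0]   # one heavy-tailed point
print(round(one_step_m(data), 3))                  # barely moved by the outlier
```

Unlike the sample mean, the estimate is nearly unaffected by the extreme value at 9.0, which is why its standard error stays stable under heavy tails.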

Journal ArticleDOI
TL;DR: In this paper, a reparameterization is formulated that yields estimates of scale-invariant parameters in recursive path models with latent variables, and (asymptotically) correct standard errors, without the use of constrained optimization.
Abstract: A reparameterization is formulated that yields estimates of scale-invariant parameters in recursive path models with latent variables, and (asymptotically) correct standard errors, without the use of constrained optimization. The method is based on the logical structure of the reticular action model.