
Showing papers in "Psychometrika in 1965"


Journal ArticleDOI
TL;DR: It is suggested that if Guttman's latent-root-one lower bound estimate for the rank of a correlation matrix is accepted as a psychometric upper bound, then the rank for a sample matrix should be estimated by subtracting out the component in the latent roots which can be attributed to sampling error.
Abstract: It is suggested that if Guttman's latent-root-one lower bound estimate for the rank of a correlation matrix is accepted as a psychometric upper bound, following the proofs and arguments of Kaiser and Dickman, then the rank for a sample matrix should be estimated by subtracting out the component in the latent roots which can be attributed to sampling error, and least-squares “capitalization” on this error, in the calculation of the correlations and the roots. A procedure based on the generation of random variables is given for estimating the component which needs to be subtracted.
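The random-variable procedure described here was later popularized as parallel analysis. A minimal sketch of the idea, retaining components whose latent roots exceed the mean roots of same-sized random data, is given below; the simulated data, loadings, and simulation count are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

def parallel_analysis(data, n_sims=50, seed=0):
    """Retain components whose observed eigenvalue exceeds the mean
    eigenvalue obtained from random normal data of the same shape."""
    sim_rng = np.random.default_rng(seed)
    n, p = data.shape
    obs_eigs = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))[::-1]
    rand_eigs = np.zeros(p)
    for _ in range(n_sims):
        noise = sim_rng.standard_normal((n, p))
        rand_eigs += np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False))[::-1]
    rand_eigs /= n_sims
    return int(np.sum(obs_eigs > rand_eigs))

# illustrative data: two correlated blocks of variables, i.e. two real factors
f = rng.standard_normal((300, 2))
loadings = np.array([[.8, 0], [.7, 0], [.6, 0], [0, .8], [0, .7], [0, .6]])
x = f @ loadings.T + 0.5 * rng.standard_normal((300, 6))
n_factors = parallel_analysis(x)
```

Note that the plain latent-root-one rule would compare observed roots against 1.0; the simulation replaces that fixed threshold with an estimate of what sampling error alone produces.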

6,722 citations


Journal ArticleDOI
TL;DR: A new method of factor analysis based upon the psychometric concept of generalizability is described, which determines factors that have maximum generalizability in the Kuder-Richardson, or alpha, sense.
Abstract: A distinction is made between statistical inference and psychometric inference in factor analysis. After reviewing Rao's canonical factor analysis (CFA), a fundamental statistical method of factoring, a new method of factor analysis based upon the psychometric concept of generalizability is described. This new procedure (alpha factor analysis, AFA) determines factors which have maximum generalizability in the Kuder-Richardson, or alpha, sense. The two methods, CFA and AFA, each have the important property of giving the same factors regardless of the units of measurement of the observable variables. In determining factors, the principal distinction between the two methods is that CFA operates in the metric of the unique parts of the observable variables while AFA operates in the metric of the common (“communality”) parts. On the other hand, the two methods are substantially different as to how they establish the number of factors. CFA answers this crucial question with a statistical test of significance while AFA retains only those alpha factors with positive generalizability. This difference is discussed at some length. A brief outline of a computer program for AFA is described and an example of the application of AFA is given.
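The retention rule can be made concrete: in the communality-rescaled metric, coefficient alpha for factor j is (p/(p-1))(1 - 1/lambda_j), which is positive exactly when the (positive) root lambda_j exceeds one. The sketch below uses squared multiple correlations as communality estimates; that choice, and the correlation matrix, are illustrative assumptions rather than the paper's full iterative procedure.

```python
import numpy as np

def alpha_factors(R):
    """Eigenvalues of R rescaled into the metric of the common parts;
    alpha for factor j is (p/(p-1)) * (1 - 1/lambda_j)."""
    p = R.shape[0]
    # communality estimates: squared multiple correlations (an illustrative choice)
    h2 = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
    H = np.diag(1.0 / np.sqrt(h2))
    reduced = R - np.eye(p) + np.diag(h2)   # communalities on the diagonal
    eigs = np.linalg.eigvalsh(H @ reduced @ H)[::-1]
    n_keep = int(np.sum(eigs > 1.0))        # positive generalizability among positive roots
    return eigs, n_keep

R = np.array([[1.0, 0.6, 0.5],
              [0.6, 1.0, 0.4],
              [0.5, 0.4, 1.0]])
eigs, n_keep = alpha_factors(R)
```

Contrast with CFA: there the number of factors would come from a statistical significance test, not from the sign of alpha.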

482 citations



Journal ArticleDOI
TL;DR: An approximation to the sampling distribution of Kuder-Richardson reliability formula 20 is derived, using its algebraic equivalent obtained through an items-by-subjects analysis of variance.
Abstract: An approximation to the sampling distribution of Kuder-Richardson reliability formula 20 is derived, using its algebraic equivalent obtained through an items-by-subjects analysis of variance. The theoretical distribution is compared to empirical estimates of the sampling distribution to assess how crucial certain assumptions are. The use of the theoretical distribution for testing hypotheses and deriving confidence intervals is illustrated. A table of equations for approximating 80, 90, and 95 per cent confidence intervals is presented with N ranging from 40 to 500.
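The statistic itself is quick to compute from a persons-by-items matrix of 0/1 responses; the items-by-subjects analysis-of-variance equivalence (Hoyt's result) is what the derivation exploits. A minimal sketch with illustrative data:

```python
import numpy as np

def kr20(X):
    """Kuder-Richardson formula 20 for a persons x items matrix of 0/1 scores.
    Algebraically equivalent to coefficient alpha for dichotomous items."""
    X = np.asarray(X, dtype=float)
    k = X.shape[1]
    p = X.mean(axis=0)                      # item difficulties
    q = 1.0 - p
    total_var = X.sum(axis=1).var(ddof=1)   # variance of total scores
    return (k / (k - 1.0)) * (1.0 - (p * q).sum() / total_var)

X = np.array([[1, 1, 1, 0],
              [1, 1, 0, 0],
              [1, 0, 0, 0],
              [1, 1, 1, 1],
              [0, 0, 0, 0]])
rel = kr20(X)
```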

260 citations


Journal ArticleDOI
TL;DR: In this article, a mathematical model for the relation between observed scores and true scores is developed, which can be used to estimate bivariate distributions from univariate distributions, with good results, as checked by chi-square tests.
Abstract: A “strong” mathematical model for the relation between observed scores and true scores is developed. The model has been tested empirically by using it to estimate bivariate distributions from univariate distributions, with good results, as checked by chi-square tests.

169 citations


Journal ArticleDOI
TL;DR: The authors derived axioms of the classical test theory model from a specified sampling rule and from the assumption that the observed score of an arbitrarily specified or randomly selected person may be considered as an observation of a random variable having finite and positive variance.
Abstract: Following an approach due to Guttman the axioms of the classical test theory model are shown to be derivable as constructions from a specified sampling rule and from the assumption that the observed score of an arbitrarily specified or randomly selected person may be considered as an observation of a random variable having finite and positive variance. Without further assumption the reliability of a test is defined. Parallel measurements are then independently defined, and the concept of replication is explicated. The derived axioms of the classical test theory model are then stated in a refined form of Woodbury's stochastic process notation, and the basic results of this model are derived. The assumptions of experimental independence, homogeneity of error distribution, and conditional independence are related to the classical model and to each other. Finally, a brief sketch of some stronger models assuming the independence of error and true scores or the existence of higher-order moments of error distributions or those making specific distributional assumptions is given.

155 citations


Journal ArticleDOI
TL;DR: The generalizability theory is applied to a universe in which observations are classifiable according to two independent variable aspects of the measuring procedure and several types of universe scores are developed and the variance components ascertained for each type.
Abstract: Generalizability theory concerns the adequacy with which a “universe” score can be inferred from a set of observations. In this paper the theory is applied to a universe in which observations are classifiable according to two independent variable aspects of the measuring procedure. Several types of universe scores are developed and the variance components ascertained for each type. The composition of expected observed-score variance and the adequacy of inference to a particular type of universe score is a function of the procedure used in gathering data. A generalizability study provides estimates of variance components which can be used in designing an efficient procedure for a particular decision purpose.

133 citations


Journal ArticleDOI
TL;DR: It is argued that the use of inappropriate statistics leads to the formulation of statements which are either semantically meaningless or empirically nonsignificant.
Abstract: A formal theory of appropriateness for statistical operations is presented which incorporates features of Stevens' theory of appropriate statistics and Suppes' theory of empirical meaningfulness. It is proposed that a statistic be regarded as appropriate relative to statements made about it in case the truths of these statements are invariant under permissible transformations of the measurement scale. It is argued that the use of inappropriate statistics leads to the formulation of statements which are either semantically meaningless or empirically nonsignificant.

96 citations



Journal ArticleDOI
TL;DR: An analysis of the method of paired comparisons shows that Case V and Case VI, the latter characterized by log-normal distributions and Weber's law for subjective continua, are fundamentally indistinguishable.
Abstract: (1) An analysis of the method of paired comparisons shows that Case V and Case VI, the latter characterized by log-normal distributions and Weber's law for subjective continua, are fundamentally indistinguishable. Case VI produces a logarithmic interval scale of subjective magnitude. (2) It is demonstrated that the difference between discrimination scales according to Case V and from category rating is due to the difference between intra- and interindividual variability yielding different Weber functions.
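For readers unfamiliar with the machinery being compared, a minimal Case V computation is sketched below: scale values are means of the normal deviates of the paired-comparison proportions. The data are illustrative, and the sketch does not reproduce the paper's Case VI comparison.

```python
import numpy as np
from statistics import NormalDist

def case_v_scale(P):
    """Thurstone Case V: the scale value of stimulus j is the mean normal
    deviate of the proportions preferring j to each of the other stimuli."""
    Z = np.vectorize(NormalDist().inv_cdf)(P)
    np.fill_diagonal(Z, 0.0)     # self-comparisons carry no information
    s = Z.mean(axis=0)           # column means estimate scale values
    return s - s.min()           # origin is arbitrary; anchor at zero

# P[i, j] = proportion of judges preferring stimulus j over stimulus i (made-up data)
P = np.array([[.5, .7, .9],
              [.3, .5, .8],
              [.1, .2, .5]])
scale = case_v_scale(P)
```

The paper's point is that replacing the normal deviates with log-normal assumptions (Case VI) yields data patterns this procedure cannot distinguish from Case V.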

71 citations


Journal ArticleDOI
TL;DR: While the traditional multiple correlation coefficient appears to be inherently an asymmetrical statistic, it is actually a special case of a more general measure of linear relationship between two sets of variables that rests upon the generalized variance of a multivariate distribution.
Abstract: While the traditional multiple correlation coefficient appears to be inherently an asymmetrical statistic, it is actually a special case of a more general measure of linear relationship between two sets of variables. Another symmetric generalization of linear correlation is to the total relatedness within a set of variables. Both of these developments rest upon the generalized variance of a multivariate distribution, which is seen to be the fundamental concept of linear correlational theory.
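One concrete form such a symmetric measure can take, inferred here from the generalized-variance framing rather than quoted from the paper, is 1 - det(R) / (det(R11) * det(R22)), which equals 1 minus the product of (1 - rho_i^2) over the canonical correlations and collapses to the ordinary squared multiple correlation when one set contains a single variable. The data and index partition below are illustrative.

```python
import numpy as np

def set_correlation(R, idx1, idx2):
    """Symmetric measure of linear relation between two variable sets,
    built from ratios of generalized variances (determinants)."""
    R11 = R[np.ix_(idx1, idx1)]
    R22 = R[np.ix_(idx2, idx2)]
    d = np.linalg.det
    return 1.0 - d(R) / (d(R11) * d(R22))

R = np.array([[1.0, 0.3, 0.5, 0.4],
              [0.3, 1.0, 0.4, 0.5],
              [0.5, 0.4, 1.0, 0.2],
              [0.4, 0.5, 0.2, 1.0]])
r2 = set_correlation(R, [0, 1], [2, 3])

# single-variable case reduces to the classical squared multiple correlation
r2_single = set_correlation(R, [0], [1, 2, 3])
```

The symmetry is visible in the formula: swapping the two index sets leaves the determinant ratio unchanged.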

Journal ArticleDOI
TL;DR: A method of generating any number of score and correlation matrices with arbitrary population parameters, implemented as an IBM 1620 program, is described.
Abstract: A method of generating any number of score and correlation matrices with arbitrary population parameters is described. Either Z scores or stanines are sampled from a normal population by an IBM 1620 program to represent factor scores. These are converted to variates from a population with an a priori factor structure. The effectiveness of the method is illustrated with research data. Some further modifications and uses of the method are discussed.
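The conversion step, mixing sampled factor scores through a target loading matrix and adding uniqueness so each variable has unit variance, can be sketched in a few lines; the loading pattern and sample size below are illustrative, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(42)

def generate_scores(loadings, n):
    """Sample N(0,1) factor scores and mix them through an a priori
    loading matrix, adding unique parts for unit total variance."""
    p, m = loadings.shape
    uniq_sd = np.sqrt(1.0 - (loadings**2).sum(axis=1))   # unique standard deviations
    F = rng.standard_normal((n, m))                      # factor scores (Z form)
    E = rng.standard_normal((n, p))                      # unique-part scores
    return F @ loadings.T + E * uniq_sd

A = np.array([[.8, .0],
              [.7, .0],
              [.0, .8],
              [.0, .7]])
X = generate_scores(A, 5000)
R = np.corrcoef(X, rowvar=False)   # should approximate A @ A.T off the diagonal
```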

Journal ArticleDOI
TL;DR: A nonparametric model is proposed for the errors underlying such judgments, and conditions are given under which Cochran's Q statistic is valid for testing the hypothesis of no systematic differences among the judgments of the different observers.
Abstract: A reliability study is assumed to be carried out with each of a number of observers making a dichotomous judgment concerning each of a sample of subjects. A nonparametric model is proposed for the errors underlying such judgments, and conditions are given under which Cochran's Q statistic is valid for testing the hypothesis of no systematic differences among the judgments of the different observers. Inferences concerning the probabilities of error are shown to be possible in terms of the intraclass correlation coefficient. A numerical example is given.
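Cochran's Q itself is straightforward to compute from the subjects-by-observers matrix of dichotomous judgments; the data below are illustrative, not the paper's numerical example.

```python
import numpy as np

def cochrans_q(X):
    """Cochran's Q for a subjects x observers matrix of 0/1 judgments.
    Under H0 (no systematic observer differences) Q is approximately
    chi-square distributed with k - 1 degrees of freedom."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    col = X.sum(axis=0)   # per-observer totals
    row = X.sum(axis=1)   # per-subject totals
    num = (k - 1) * (k * (col**2).sum() - col.sum()**2)
    den = k * row.sum() - (row**2).sum()
    return num / den

X = np.array([[1, 1, 0],
              [1, 1, 1],
              [1, 0, 0],
              [0, 1, 0],
              [1, 1, 0],
              [1, 1, 1]])
Q = cochrans_q(X)
```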

Journal ArticleDOI
TL;DR: This paper seeks to meet the need for a general treatment of the problem of error in classification within an m-attribute classificatory system by finding particular systems O = f(T, E) which are solvable for T and E given O.
Abstract: This paper seeks to meet the need for a general treatment of the problem of error in classification. Within an m-attribute classificatory system, an object's typical subclass is that subclass to which it is most often allocated under repeated experimentally independent applications of the classificatory criteria. In these terms, an error of classification is an atypical subclass allocation. This leads to definition of probabilities O of occasional subclass membership, probabilities T of typical subclass membership, and probabilities E of error or, more generally, occasional subclass membership conditional upon typical subclass membership. In the relationship f: (O, T, E) the relative incidence of independent O, T, and E values is such that generally one can specify O values given T and E, but one cannot generally specify T and E values given O. Under the restrictions of homogeneity of E values for all members of a given typical subclass, mutual stochastic independence of errors of classification, and suitable conditions of replication, one can find particular systems O = f(T, E) which are solvable for T and E given O. A minimum of three replications of occasional classification is necessary for a solution of systems for marginal attributes, and a minimum of two replications is needed with any cross-classification. Although for such systems one can always specify T and E values given O values, the solution is unique for dichotomous systems only.

Journal ArticleDOI
TL;DR: This paper found that at least two such basic personality traits operate in the area of acquiescence; two more appear in the areas of desirable responding, and two or three may underlie extremity of response.
Abstract: Factor analytic and correlational studies of response styles have been reviewed for evidence relating stylistic responding to objective or performance measures of personality traits. The evidence strongly suggests that correlations among measures of any one style, such as acquiescence, are determined by more than one basic personality disposition. At least two such basic traits operate in the area of acquiescence; two more appear in the area of desirable responding, and two or three may underlie extremity of response. The evidence also tends to suggest that scores on many current response style variables are nonlinear functions of the basic trait variables. Theories about the psychological nature of these underlying traits are discussed, developed, and compared with findings in the current literature.

Journal ArticleDOI
TL;DR: There have been a number of important developments by several authors which are based on the assumption that test taking behavior is stochastic, that is, it is assumed that the probability that an individual will "endorse" or "pass" a dichotomous item can be represented by some function of an underlying, "latent" attribute called the item trace line or item characteristic curve.
Abstract: There have been a number of important developments by several authors which are based on the assumption that test taking behavior is stochastic. That is, it is assumed that the probability that an individual will "endorse" or "pass" a dichotomous item can be represented by some function of an underlying, "latent" attribute (possibly multivariate) called the item trace line or item characteristic curve. Various forms for trace lines have been postulated such as polynomial [9, 11], step function [9], normal ogive [13, 19], and logistic [1, 2]. An interesting model due to Rasch [18] utilizes only a subject parameter and item parameter to determine the probability of passing an item. The only model which does not make some formal assumptions about trace lines is Lazarsfeld's latent class model. All of these models for test behavior are unified by two basic assumptions. One is the stochastic assumption of a probability of passing or endorsing an item. The probability of passing a specified item may be construed as an attribute of each individual with a given score on the latent attribute, or it may be construed as the proportion passing the item in the class of individuals with a given value on the latent attribute. Lazarsfeld has presented an excellent discussion of this topic [10]. Unfortunately, since these two different postulates usually yield the same results, they cannot be distinguished empirically. On the other hand, the same techniques may be applied without commitment to one or the other assumptions. The other assumption common to this class of models is the assumption of local independence. If the probability of passing an item is construed as an attribute of an individual, the local independence assumption asserts that the probability an individual will pass two or more items in a test is the product of his separate probabilities of passing. 
If the probability of passing is construed as a proportion of the class of individuals with a given latent attribute score, the local independence assumption asserts that the proportion of the class passing two or more items in a test is the product of the separate proportions passing characteristic of that class.
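The two shared assumptions can be sketched with a logistic trace line; the item parameters below are hypothetical and purely illustrative.

```python
import numpy as np

def logistic_icc(theta, a, b):
    """Logistic item characteristic curve: P(pass | theta) for an item
    with discrimination a and difficulty b."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

# stochastic assumption: passing an item is probabilistic given theta
theta = 1.0
p1 = logistic_icc(theta, a=1.5, b=0.0)
p2 = logistic_icc(theta, a=1.0, b=0.5)

# local independence: at fixed theta, the probability of passing both
# items is the product of the separate probabilities
p_both = p1 * p2
```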

Journal ArticleDOI
TL;DR: The paired-comparison model may be derived from a variety of different initial assumptions about the nature of paired comparisons, some of these assumptions appearing to be more appropriate to a description of the preference-decision process than others.
Abstract: The first contribution of this paper is to demonstrate that the paired-comparison model may be derived from a variety of different initial assumptions about the nature of paired comparisons, some of these assumptions appearing to the author to be more appropriate to a description of the preference-decision process than others. The second contribution of this paper is to note that the generalization of the model to triple comparisons chosen earlier is not the one compatible with the Lehmann model even though it possesses other desirable properties. Limited numerical calculations suggest that both models for triple comparisons give comparable results in applications. Their asymptotic properties should be similar.

Journal ArticleDOI
TL;DR: It is shown that certain research studies could use item-sampling methods to improve their efficiency of experimental design and various sampling error formulas are derived, making it possible to determine whether and to what extent various uses of item sampling will increase the information gained by the research worker.
Abstract: New and less new concepts and results in the item-sampling model for mental test theory are summarized. It is shown that certain research studies, typically those concerned with group means, could use item-sampling methods to improve their efficiency of experimental design. Various sampling error formulas are derived, making it possible to determine whether and to what extent various uses of item sampling will increase the information gained by the research worker.

Journal ArticleDOI
TL;DR: Two of the job aspects (work itself and opportunity for achievement), both motivators, were sufficient to account for the variance in overall satisfaction.
Abstract: Ratings of four motivator job aspects, four hygiene job aspects, and overall job satisfaction were obtained from 93 male subjects who were equally satisfied with both the motivator and the hygiene aspects of their jobs. Two of the job aspects (work itself and opportunity for achievement), both motivators, were sufficient to account for the variance in overall satisfaction.

Journal ArticleDOI
TL;DR: When the purpose of the experiment is to compare treatments, the Sequences × Positions Latin Square has been employed to control unwanted effects attributable to individuals, position, and sequence; if subject interactions are present, square uniqueness may be used as the error term and the bias in the test of treatments will be conservative.
Abstract: When the purpose of the experiment is to compare treatments, the Sequences × Positions Latin Square has been employed to control unwanted effects attributable to individuals, position, and sequence. This particular Latin Square has been subjected to criticism on the grounds there is confounding due to structure, random variables, and subject interactions. The Special Latin Square, a subclass of the Sequences × Positions Latin Square, is basically a p × p factorial design in blocks of size p. The two factors are treatments (T) and positions (P). Sequence is one component of the TP interaction, and square uniqueness is the sum of the remaining components. This completely replicated factorial design has no structural or random variable confounding; if subject interactions are present, square uniqueness may be used as the error term and the bias in the test of treatments will be conservative.

Journal ArticleDOI
TL;DR: It is commonly assumed that the proportion of correct answers to an item has a normal-ogive or logistic relationship to total test score; this is shown to be a mistaken and undesirable notion.
Abstract: It is common to assume that the proportion of correct answers to an item has a normal-ogive or logistic relationship to total test score. However, this is shown to be a mistaken and an undesirable notion.

Journal ArticleDOI
TL;DR: Several techniques of making grade adjustments are reviewed in this article. Evaluations and comparisons of the various techniques are made in terms of the properties of various models and on the basis of the empirical results based on the application of these models.
Abstract: Several techniques of making grade adjustments are reviewed. Evaluations and comparisons of the various techniques are made in terms of the properties of the various models and on the basis of the empirical results based on the application of these models. Although there are several relatively elegant models available, the empirical results have shown negligible improvement in predictive accuracy due to grade adjustments when (a) the results have been cross-validated, and (b) one or more standardized test variables have been included as predictors.

Journal ArticleDOI
TL;DR: The present model treats the scaling of pair-comparison preference judgments among a unidimensional set of stimuli across a population of individuals, yielding predictions Pi of the proportion of individuals whose first choice among the elements of S is Si.
Abstract: The present model treats the scaling of pair-comparison preference judgments among a unidimensional set of stimuli across a population of individuals. Given a set S of n stimuli, S = {S1, S2, …, Sn}, the model yields a partially ordered metric on the interstimulus distances which may be used to construct an interval scale of values for S. Obtained also is a set of predictions P = {P1, P2, …, Pn} where Pi is the proportion of individuals in the population whose first choice among the elements of S is Si. A numerical illustration is offered and comparisons are drawn with Coombs' unfolding technique.

Journal ArticleDOI
TL;DR: General features of a probability model for errors of classification are recapitulated as an introduction to particular cases and applications and a procedure for relating models and data is described.
Abstract: General features of a probability model for errors of classification are recapitulated as an introduction to particular cases and applications. Several models for dichotomous and nondichotomous systems are examined in sufficient detail to elaborate a procedure for dealing with any particular case. The system O = f(T, E) has empirical reference where, as statistic or parameter, the probability of occasional subclass membership is given by observation, and one seeks to recover T and E values from O. A procedure for relating models and data is described. Applications of the concepts and methods are illustrated for several areas of psychological research.

Journal ArticleDOI
TL;DR: By using cumulants in place of moments, considerable simplification of the treatment of convoluted distributions is obtained, particularly if one of the components is normally distributed.
Abstract: This paper presents an adaptation of the method of moments for comparing observed and theoretical distributions of reaction time. By using cumulants in place of moments, considerable simplification of the treatment of convoluted distributions is obtained, particularly if one of the components is normally distributed. Stochastic latency models are often poorly fitted by reaction time data. This may be because a simple latency distribution is convoluted with a normal or high-order gamma distribution. The comparison method described will assist investigation of this and other interpretations of reaction time distributions.
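The additivity that makes cumulants convenient can be checked numerically: for the convolution of a normal and an exponential component (the ex-Gaussian shape often fitted to reaction times), each cumulant of the sum is the sum of the component cumulants, and the third cumulant comes entirely from the exponential part since the normal contributes nothing beyond second order. The parameters below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

# normal component N(mu, sigma^2) convolved with exponential(rate lam)
mu, sigma, lam = 0.3, 0.05, 10.0
n = 200_000
rt = rng.normal(mu, sigma, n) + rng.exponential(1.0 / lam, n)

k1 = rt.mean()                        # kappa_1 = mu + 1/lam        = 0.4
k2 = rt.var()                         # kappa_2 = sigma^2 + 1/lam^2 = 0.0125
k3 = ((rt - rt.mean())**3).mean()     # kappa_3 = 0 + 2/lam^3       = 0.002
```

Working with raw moments instead, the moments of the convolution would be awkward cross-sums; in cumulant form each order is a simple addition, which is the simplification the paper exploits.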

Journal ArticleDOI
TL;DR: An objective method for the orthogonal rotation of factors which gives results closer to the graphic method is proposed and it is shown that they are very close to the factors obtained from empirical studies both in values and in signs.
Abstract: An objective method for the orthogonal rotation of factors which gives results closer to the graphic method is proposed. First, the fact that the varimax method does not always satisfy simple-structure criteria, e.g., the positive manifold and the level contributions of all factors, is pointed out. Next, the principles of our method, which are based on “geometric vectors,” are discussed, and the computational procedures for this method are explained using Harman and Holzinger's eight physical variables. Finally, six numerical examples by our method are presented, and it is shown that they are very close to the factors obtained from empirical studies both in values and in signs.

Journal ArticleDOI
TL;DR: The one-factor and the multiple-factor models for reliability are compared and conditions under which one or more of the principal components should be utilized are discussed.
Abstract: As an alternative to the analysis of variance approach to reliability, a multiple-factor analysis approach is illustrated. The one-factor and the multiple-factor models for reliability are compared. Tests on the latent roots associated with the principal components of intercorrelation matrices are used to determine the number of components to be retained. Conditions under which one or more of the principal components should be utilized are discussed.

Journal ArticleDOI
TL;DR: In this article, the scale properties of the interest index were investigated and it was concluded that the Interest Index performs at least as well as any comparable test and, in addition, has three advantages over most.
Abstract: Extensive research by John French on the Interest Index produced a 12-scale interest test with 16 items per scale. The present study investigated certain scale properties of the Interest Index. Internal consistency coefficients were found to cluster about .91, item-scale assignment was verified for 99% of the items, scale test-retest correlations (three-week interval) were found to cluster about .86, and the individual scales were found to be remarkably independent of one another. It was concluded that, on all these measures, the Interest Index performs at least as well as any comparable test and, in addition, has three advantages over most. It is a very efficient instrument (in terms of number of items per scale), it utilizes subject matter experiences familiar to most students, and it measures interest in a wide variety of subject matter course areas. Partial confirmation of the construct validity of this test is provided, and it is recommended that (a) appropriate norms be developed and (b) further validation studies be conducted.

Journal ArticleDOI
TL;DR: In this article, a study was conducted to determine whether group members recognize the occurrence of group-induced shifts toward enhanced risk taking; the findings suggested that females may display greater interpersonal attentiveness than do males.
Abstract: The present investigation was designed to determine whether group members recognize the occurrence of group-induced shifts toward enhanced risk taking. The subjects, 122 male and 139 female university students, were assigned at random within sex to decision-making groups calling for discussion with or without a consensus requirement. As in previous work, groups gravitated toward increased risk taking under both conditions. Upon the completion of all decisions, the group members made judgments concerning the effect of the discussion on the groups' risk-taking levels. These judgments were significantly biased in the risky direction, but fell significantly below the actual magnitude of the risky shifts. Results were highly uniform for both sexes and for both types of group. We then inquired whether the trend toward veridicality in the judgment data reflected a genuine awareness of group outcomes or a process of assimilative projection, i.e., attribution of one's own shift behavior to the group. The findings pointed to greater genuine awareness for the females and stronger assimilative projection for males, suggesting that females may display greater interpersonal attentiveness than do males. In the case of both types of group and both sexes, group members ascribed greater forcefulness to those who in fact had higher initial risk-taking levels.

Journal ArticleDOI
TL;DR: In this article, the problem of computing estimates of factor loadings, specific variances, and communalities for a factor analytic model is considered, and the equations for maximum likelihood estimators are discussed.
Abstract: This paper considers the problem of computing estimates of factor loadings, specific variances, and communalities for a factor analytic model. The equations for maximum-likelihood estimators are discussed. Iterative formulas are developed to solve the maximum-likelihood equations and a simple and efficient method of implementation on a digital computer is described. Use of the iterative formulas and computing techniques for other estimators of factor loadings and communalities is also considered to provide a very general approach for this aspect of factor analysis.