
Showing papers in "Psychometrika in 1958"


Journal ArticleDOI
TL;DR: In this article, an analytic criterion for rotation is defined and the scientific advantage of analytic criteria over subjective (graphical) rotational procedures is discussed, and a computational outline for the orthogonal normal varimax is appended.
Abstract: An analytic criterion for rotation is defined. The scientific advantage of analytic criteria over subjective (graphical) rotational procedures is discussed. Carroll's criterion and the quartimax criterion are briefly reviewed; the varimax criterion is outlined in detail and contrasted both logically and numerically with the quartimax criterion. It is shown that the normal varimax solution probably coincides closely to the application of the principle of simple structure. However, it is proposed that the ultimate criterion of a rotational procedure is factorial invariance, not simple structure, although the two notions appear to be highly related. The normal varimax criterion is shown to be a two-dimensional generalization of the classic Spearman case, i.e., it shows perfect factorial invariance for two pure clusters. An example is given of the invariance of a normal varimax solution for more than two factors. The oblique normal varimax criterion is stated. A computational outline for the orthogonal normal varimax is appended.

6,754 citations
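
The orthogonal normal varimax rotation described in the abstract above can be illustrated with a short sketch. It assumes the pairwise planar-rotation scheme as it is commonly stated for varimax (rows scaled to unit communality, then each pair of factors rotated through a closed-form angle); it is not a transcription of the computational outline appended to the paper. The names `normal_varimax`, `pure`, `mixed`, and `rotated` are hypothetical, and rows with zero communality are simply left unscaled.

```python
import math


def normal_varimax(loadings, sweeps=50, eps=1e-10):
    """Rotate a loading matrix (list of rows) by pairwise planar rotations."""
    p, k = len(loadings), len(loadings[0])
    # Normalization step: scale every test (row) to unit communality.
    h = [math.sqrt(sum(a * a for a in row)) or 1.0 for row in loadings]
    rows = [[a / hi for a in row] for row, hi in zip(loadings, h)]
    for _ in range(sweeps):
        largest = 0.0
        for f in range(k - 1):
            for g in range(f + 1, k):
                u = [r[f] ** 2 - r[g] ** 2 for r in rows]
                v = [2.0 * r[f] * r[g] for r in rows]
                A, B = sum(u), sum(v)
                C = sum(ui * ui - vi * vi for ui, vi in zip(u, v))
                D = 2.0 * sum(ui * vi for ui, vi in zip(u, v))
                # Closed-form angle for this pair: tan(4*phi) = num / den.
                num = D - 2.0 * A * B / p
                den = C - (A * A - B * B) / p
                phi = 0.25 * math.atan2(num, den)
                largest = max(largest, abs(phi))
                c, s = math.cos(phi), math.sin(phi)
                for r in rows:
                    x, y = r[f], r[g]
                    r[f], r[g] = x * c + y * s, -x * s + y * c
        if largest < eps:  # no pair needed a noticeable rotation
            break
    # Undo the normalization so communalities match the input.
    return [[a * hi for a in row] for row, hi in zip(rows, h)]


# Illustrative data (made up): two pure clusters, rotated 30 degrees away
# from their simple-structure position.
pure = [[0.8, 0.0], [0.7, 0.0], [0.6, 0.0], [0.0, 0.8], [0.0, 0.7]]
c30, s30 = math.cos(math.radians(30)), math.sin(math.radians(30))
mixed = [[a * c30 - b * s30, a * s30 + b * c30] for a, b in pure]
rotated = normal_varimax(mixed)
```

For the two-cluster data, each row of `rotated` should again load on a single factor, which is the two-pure-cluster invariance property the abstract mentions.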


Journal ArticleDOI
TL;DR: The inter-battery method of factor analysis was devised to provide information on the stability of factors over different selections of tests; the correlations between corresponding factors determined from the two batteries are taken as factor reliability coefficients.
Abstract: The inter-battery method of factor analysis was devised to provide information relevant to the stability of factors over different selections of tests. Two batteries of tests, postulated to depend on the same common factors, but not parallel tests, are given to one sample of individuals. Factors are determined from the correlation of the tests in one battery with the tests in the other battery. These factors are only those that are common to the two batteries. No communality estimates are required. A statistical test is provided for judging the minimum number of factors involved. Rotation of axes is carried out independently for the two batteries. A final step provides the correlation between factors determined by scores on the tests in the two batteries. The correlations between corresponding factors are taken as factor reliability coefficients.

366 citations


Journal ArticleDOI
TL;DR: This article considers how the parameters of a functional relation between two variables can be determined by factor analysis techniques; when the function cannot be separated into a sum of products, approximate solutions may be determined instead.
Abstract: Consideration is given to determination of parameters of a functional relation between two variables by the means of factor analysis techniques. If the function can be separated into a sum of products of functions of the individual parameters and corresponding functions of the independent variable, particular values of the functions of the parameters and of the functions of the independent variables might be found by factor analysis. Otherwise approximate solutions may be determined. These solutions may represent important results from experimental investigations.

277 citations


Journal ArticleDOI
TL;DR: In this paper, three techniques are commonly employed to capitalize on a concomitant variate and improve the precision of treatment comparisons: stratification of the experimental samples and use of a factorial design, analysis of covariance, and analysis of variance of difference scores.
Abstract: Three techniques are commonly employed to capitalize on a concomitant variate and improve the precision of treatment comparisons: (1) stratification of the experimental samples and use of a factorial design, (2) analysis of covariance, and (3) analysis of variance of difference scores. The purpose of this paper is to compare the effectiveness of these alternatives in improving experimental precision, to identify the most precise design and the conditions under which its advantage holds, and to derive, in the case of the factorial approach, recommendations as to the optimal numbers of levels.

114 citations


Journal ArticleDOI
TL;DR: In this paper, a variance-components analysis is presented for paired comparisons in terms of three components: s, the scale value of the stimuli; d, a deviation from the linear model specified by the law of comparative judgment; and b, a binomial error component.
Abstract: A variance-components analysis is presented for paired comparisons in terms of three components: s, the scale value of the stimuli; d, a deviation from the linear model specified by the law of comparative judgment; and b, a binomial error component. Estimates are given for each of the three variances, $\sigma_s^2$, $\sigma_d^2$, and $\sigma_b^2$. Several coefficients, analogous to reliability coefficients, based on these three variances are indicated. The techniques are illustrated in a replicated comparison of handwriting specimens.

91 citations


Journal ArticleDOI
TL;DR: Guttman's principal components for the weighting system are the item scoring weights that maximize the generalized Kuder-Richardson reliability coefficient. The principal component for any item is effectively the same as the factor loading of the item divided by the item standard deviation.
Abstract: Guttman's principal components for the weighting system are the item scoring weights that maximize the generalized Kuder-Richardson reliability coefficient. The principal component for any item is effectively the same as the factor loading of the item divided by the item standard deviation, the factor loadings being obtained from an ordinary factor analysis of the item intercorrelation matrix.

67 citations


Journal ArticleDOI
TL;DR: This paper examines whether the null hypothesis about the number of common factors underlying a given set of correlations should be that this number is small. It argues that a more appropriate null hypothesis is that the number is relatively large, and that smallness should be but an alternative hypothesis.
Abstract: The question is raised as to whether the null hypothesis concerning the number of common factors underlying a given set of correlations should be that this number is small. Psychological and algebraic evidence indicate that a more appropriate null hypothesis is that the number is relatively large, and that smallness should be but an alternative hypothesis. The question is also raised as to why approximation procedures should be aimed primarily at the observed correlation matrix $R$ and not at, say, $R^{-1}$. What may be best for $R$ may be worst for $R^{-1}$, and conversely, yet $R^{-1}$ is directly involved in problems of multiple and partial regressions. It is shown that a widely accepted inequality for the possible rank to which $R$ can be reduced, when modified by communalities, is indeed false.

59 citations


Journal ArticleDOI
TL;DR: When the control variable contains errors of measurement, the usual analysis of covariance fails to adjust adequately for initial differences between groups. This article presents a large-sample statistical procedure for making the desired significance test after appropriate adjustment for the fallibility of the control variable.
Abstract: When the control variable contains errors of measurement, the usual analysis of covariance fails to adjust adequately for initial differences between groups. A large-sample statistical procedure is presented for making the desired significance test after appropriate adjustment for the fallibility of the control variable.

58 citations


Journal ArticleDOI
TL;DR: In this paper, it is shown how measurement error decreases the sensitivity of a test of significance, and a method of reducing such loss of sensitivity is described and recommended for general practice.
Abstract: Implications of random error of measurement for the sensitivity of the F test of differences between means are elaborated. By considering the mathematical models appropriate to design situations involving true and fallible measures, it is shown how measurement error decreases the sensitivity of a test of significance. A method of reducing such loss of sensitivity is described and recommended for general practice.

47 citations


Journal ArticleDOI
TL;DR: This paper presents charts that make possible a simple and direct estimate of the sample size required for F tests of specified power in the analysis of variance.
Abstract: The specification of sample size is an important aspect of the planning of every experiment. When the investigator intends to use the techniques of analysis of variance in the study of treatment effects, he should, in specifying sample size, take into consideration the power of the F tests which will be made. The charts presented in this paper make possible a simple and direct estimate of the sample size required for F tests of specified power.

44 citations


Journal ArticleDOI
TL;DR: A theory for discrimination learning which incorporates the concept of an observing response is presented and applications of the model to cases involving probabilistic and nonprobabilistic schedules of reinforcement are considered.
Abstract: A theory for discrimination learning which incorporates the concept of an observing response is presented. The theory is developed in detail for experimental procedures in which two stimuli are employed and two responses are available to the subject. Applications of the model to cases involving probabilistic and nonprobabilistic schedules of reinforcement are considered; some predictions are derived and compared with experimental results.

Journal ArticleDOI
TL;DR: A formal set of axioms is presented for the method of successive intervals, and directly testable consequences of the scaling assumptions are derived. The scaling model is then generalized to non-normal stimulus distributions of both specified and unspecified form.
Abstract: A formal set of axioms is presented for the method of successive intervals, and directly testable consequences of the scaling assumptions are derived. Then by a systematic modification of basic axioms the scaling model is generalized to non-normal stimulus distributions of both specified and unspecified form.

Journal ArticleDOI
TL;DR: In this article, Thurstone's analytical method for simple structure is applied to four separate cases and found to yield satisfactory results. A modification of the method, suited to high-speed computation, is then developed and illustrated; although its results do not fully correspond to a previous graphical solution, it can be argued that they show an improved simple structure.
Abstract: The analytical method for simple structure proposed by Thurstone is applied to four separate cases and found to yield satisfactory results. The simple structure obtained by Thurstone's method is found to match closely that obtained by other methods and corresponds to the true structure of the matrix in those cases where true structure is known. Difficulties about the choice of the correct trial vector led the writer to develop a modification of Thurstone's method, useful where high speed computational facilities are available. Instructions are given for this so-called mass modification, and the procedure is illustrated with a 5-factor, 14-variable example. While the results do not fully correspond to a previous graphical solution, it can be argued that the results obtained by the new method show an improved simple structure. The modified method is applied to three other correlation matrices, yielding in each case a satisfactory simple structure.

Journal ArticleDOI
TL;DR: A significance test is proposed for determining whether a correlation coefficient is less than unity by an amount greater than that attributable to errors of measurement.
Abstract: A significance test is proposed for determining whether a correlation coefficient is less than unity by an amount greater than that attributable to errors of measurement.

Journal ArticleDOI
TL;DR: In this article, the proportion of correct judgments that will be made about individual students is determined as a function of the reliability coefficient of the difference between two test scores, and the method for using such difference scores is examined in detail.
Abstract: The difference between two test scores may be quite useful even when this difference has very low reliability by conventional standards. One method for using such difference scores is examined in detail: the proportion of correct judgments that will be made about individual students is determined as a function of the reliability coefficient of the difference.

Journal ArticleDOI
TL;DR: Computational formulas for two types of inconsistency which may arise are derived, and examples illustrating the use of these formulas are presented.
Abstract: Consistency in paired comparison data is defined. Two types of inconsistency which may arise are defined. Computational formulas for these types of inconsistency are derived, and examples illustrating the use of these formulas are presented.

Journal ArticleDOI
TL;DR: In this paper, a method of studying the problem of correction for guessing and other problems associated with behavior in the test situation is described, and an illustrative example is presented. The approach is believed to be novel, yet it covers many of the practical and theoretical points raised by other writers.
Abstract: A method of studying the problem of correction for guessing and other problems associated with behavior in the test situation is described and an illustrative example presented. As far as the writers are aware this method of approach is novel but, at the same time, it covers many of the practical and theoretical points raised by other writers as reviewed in the introduction.

Journal ArticleDOI
TL;DR: In this paper, a three-component model for comparative judgment which allows for individual differences in preference is proposed, and bounds may be set for the correlation effect which make a valid test possible in some cases and provide useful standard errors for the estimated affective values.
Abstract: A three-component model for comparative judgment which allows for individual differences in preference is proposed. An implication of the model is that errors in the observed proportions due to sampling individuals in paired comparisons experiments are correlated. By neglecting this correlation, Mosteller's test for the method of paired comparisons tends to accept falsely the goodness of fit of the Case V solution. It is shown that bounds may be set for the correlation effect which make a valid test possible in some cases and provide useful standard errors for the estimated affective values.


Journal ArticleDOI
TL;DR: This article found that the subjects expected the test directions to serve a different purpose than the one they were designed to serve, and there were differences in the degree to which these subjects treated the problems as exercises in logic or exercises in perception.
Abstract: During the past twenty-five years or so, a number of factor-analytic studies have identified a group of tests as belonging to a category which is frequently called Spatial Relations. However, continued use of a single experimental technique may lead to diminishing returns, and new techniques may be needed to develop new kinds of information. The present study was undertaken to find what information could be gained by relatively intensive and extended interviewing. Five subjects took part in about two hours of testing and six hours of interviews and discussion. The purpose of this investigation included studying the uses of interviewing as a method as well as studying the nature of spatial relations tests and the abilities they measure. It was observed that the subjects expected the test directions to serve a different purpose than the one they were designed to serve. It was also found that there were differences in the degree to which these subjects treated the problems as exercises in logic or exercises in perception; and it was found that the units of thought and perception varied with the subjects, the problems, and with the subjects' purposes.

Journal ArticleDOI
TL;DR: In this paper, two different linear models are presented for the four-dimensional classification system in which correlations exist between certain pairs of observations. Except for the assumption of correlated observations, classical assumptions associated with classification systems are made.
Abstract: Two different linear models are presented for the four-dimensional classification system in which correlations exist between certain pairs of observations. Except for the assumption of correlated observations, classical assumptions associated with classification systems are made. The models considered are modifications of those which underlie the split-plot design and the split-split-plot design. In the first model the correlations between observations of the levels of one dimension are all set equal to $\rho$. In the second model the observations of the levels of one dimension are assumed correlated to degree $\rho_1$, whereas the observations of a second dimension are correlated to degree $\rho_2$. Analyses for the two models and tests of hypotheses for various parameters are indicated.

Journal ArticleDOI
TL;DR: In this presidential address, the author draws on his fondness for science fiction and detective stories to introduce a number of puzzles that he believes need more detective work within the Society.
Abstract: Until recently, two happy vices stole a good share of my time. Of those burglars, the first was the reading of science-fiction stories, a source of entertainment that I discovered at the age of eight, but kept manfully in check because I never stumbled upon a continuous supply. Of late, however, real life has mocked this branch of fiction so closely that I now miss in it the elements of fantasy and escape that used to delight. You might say that, being crowded out by the real thing, we armchair spacemen face technological obsolescence. My second time-stealer has been "Murder for Pleasure," as Howard Haycraft calls the detective story. Fortunately, I did not discover its existence until my twenties, so I had a little time for study before then. The great value of the detective story, quite apart from its educational aspects -- after all, where else can one gather so much data about useful everyday questions such as tasteless poisons, homicidal law, and criminal psychology -- is the fine training one gets in applied psychology and in the logic of everyday life. Sigmund Freud and Carl Gustav Jung can look to their laurels because, after even a short course, we mystery-story fans can twist the slightest tongue-slip into a full confession, and impractical logic-choppers like Alfred North Whitehead and Bertrand Russell are pitifully pedestrian in their thinking, compared to the agile mental leaps we make -- alongside Sherlock Holmes. We have to leap because Sherlock has a great reluctance to share his evidence with the reader. In fact, to put it bluntly, Sherlock cheated, but we excuse him because he performed before the rules were written. Now that we have rules of evidence for modern detective stories, all readers find it easy to search out the killer. (Anyway all readers I've ever met.)
Indeed Robert Louis Stevenson commented upon this remarkable ability, "It is the difficulty of the police romance that the reader is always a person of such vastly greater ingenuity than the writer." And I personally have a sneaking suspicion that this is also one of the difficulties of Presidential Addresses. In best American style I wish to capitalize on my own deficiencies and take advantage of your ingenuity by pointing to a number of puzzles that seem to me to need more detective work in this Society -- some minor like parking

Journal ArticleDOI
TL;DR: In this paper, it is argued that the factors that emerge when tests are inter-correlated represent to some extent these formal characteristics rather than desired intellectual functions, which detracts from their validity, especially in counseling or differentiating abilities along different lines.
Abstract: Evidence is here brought together suggesting that educational and other psychological tests are appreciably dependent not only on the abilities that they are supposed to measure, but on the form and material of the tests and their items, sets and attitudes in the testees, and various conditions of testing. The factors that emerge when tests are inter-correlated represent to some extent these formal characteristics rather than desired intellectual functions. In particular this applies to current educational tests which, for sound educational reasons, avoid simpler, more factual, items and prefer complex comprehension-type or problem items. Part II shows that, while tests of the same objectives employing different forms tend to give discrepant results (e.g., essay and new-type), tests in the same form which are aimed at different school subjects or different intellectual functions inter-correlate very highly. This detracts from their validity, especially in counseling or differentiating abilities along different lines. For many purposes the simpler tests show superior validity, and it is doubtful how far the more complex ones do bring in the ‘higher’ intellectual functions at which they are aimed. Part III argues that, in like manner, most of the cognitive factors established by psychologists show little relevance to educational or other external criteria, possibly because they represent factors within paper-and-pencil tests rather than genuine factors of intellect. Part IV. A first essential is to map out more precisely the extent and nature of form influences, so that, in so far as they prove to be important, account can be taken of them in test construction. A number of types of predictive measures are discussed which might add more to the validity of educational selection, particularly at college level, than do complex forms of reading tests. 
These include: better use of essay examinations and school grades, listening comprehension tests, speed and error scores, indices of study habits, tests based on learning situations, and tests of rational thinking, flexibility and creativity. When tests are used primarily for evaluation rather than prediction, investigations should be guided by the conception of intrinsic or construct validity, which – unlike logical or face validity – involves the collection of empirical evidence. Some of the more promising approaches to such problems are through factorial studies of educational tests along with tests of psychological factors and ‘real-life’ criteria, and experimental studies of group differences and of the gains brought about by different types of teaching.

Journal ArticleDOI
TL;DR: The authors compared a quartimax rotation of the centroid factor loadings for Thurstone's Primary Mental Abilities Test Battery with factorings of the same correlation matrix by Thurstone (simple structure), Zimmerman (revised simple structure), Holzinger and Harman (bi-factor analysis), and Eysenck (group factor analysis).
Abstract: This study compares a quartimax rotation of the centroid factor loadings for Thurstone's Primary Mental Abilities Test Battery with factorings of the same correlation matrix by Thurstone (simple structure), Zimmerman (revised simple structure), Holzinger and Harman (bi-factor analysis), and Eysenck (group factor analysis). The quartimax results agree very closely with the solutions of Holzinger and Harman and of Eysenck, and reasonably well with the two simple structure analyses. The principal difference is the general factor provided by the quartimax solution. Reproduction of the factorial structure is sufficiently good to justify its use at least as the first stage of rotation. More extensive trial of the method will be needed with more varied data before it will be possible to decide whether quartimax factors meet psychological requirements sufficiently well without further rotation.

Journal ArticleDOI
TL;DR: In this article, Case IV of the Kuder-Richardson series, their formula (21), is derived as a generalized split-half Spearman-Brown coefficient, and the basic assumption employed is shown to be sufficient to justify the various assumptions used in derivations by other authors.
Abstract: Case IV of the Kuder-Richardson series, their formula (21), is derived as a generalized split-half Spearman-Brown coefficient. The basic assumption employed is shown to be sufficient to justify the various assumptions used in derivations by other authors. Some of the implications of this assumption are discussed.
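
For orientation, Kuder-Richardson formula (21) is commonly written as $r = \frac{k}{k-1}\left(1 - \frac{M(k-M)}{k\,s^2}\right)$, where $k$ is the number of dichotomously scored items and $M$ and $s^2$ are the mean and variance of total scores. The sketch below evaluates that textbook form; it does not reproduce the split-half derivation given in the paper. The function name `kr21` and the example numbers are hypothetical.

```python
def kr21(k, mean, variance):
    """Kuder-Richardson formula 21 from item count, score mean, score variance."""
    if k < 2:
        raise ValueError("need at least two items")
    if variance <= 0:
        raise ValueError("total-score variance must be positive")
    return (k / (k - 1)) * (1.0 - mean * (k - mean) / (k * variance))


# Made-up example: a 50-item test with mean score 30 and score variance 64.
example_reliability = kr21(50, 30.0, 64.0)
```

The formula needs only summary statistics of the total scores, which is why it is often used when item-level data are unavailable.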

Journal ArticleDOI
TL;DR: In this article, the expected value of mean square concept is used to determine the effects of the presence of interactions in the single Latin square design on F tests. The results indicate that as the number of random effects included in the experiment increases, more F tests are unbiased, and that some of these are valid F tests.
Abstract: The expected value of mean square concept is used to determine the effects of the presence of interactions in the single Latin square design on F tests. The results indicate that as the number of random effects included in the experiment increases, more F tests are unbiased, and that some of these are valid F tests. However, when F test bias does occur it is almost always of a negative nature so that the conclusions stated are conservative ones. Positive F test bias may occur when the triple interaction is extant and when zero or one random variate is included in the experiment.

Journal ArticleDOI
TL;DR: This note presents the average Spearman rank correlation between m independent rankings and an untied criterion ranking, corrected for ties in any or all of the independent rankings.
Abstract: This note presents the average Spearman rank correlation between m independent rankings and an untied criterion ranking, corrected for ties in any or all of the independent rankings.
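
A rough way to see what such an average looks like numerically: a tie-corrected Spearman coefficient can be obtained as the Pearson correlation of midranks, and the m coefficients can then be averaged. The sketch below takes that route; it is an assumption-laden stand-in and not the closed-form expression given in the note. All names (`midranks`, `pearson`, `mean_spearman`) are hypothetical, and a ranking in which every item is tied is not handled.

```python
import math


def midranks(values):
    """Ranks starting at 1, with tied values sharing the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0  # average of positions i..j, 1-based
        for t in range(i, j + 1):
            ranks[order[t]] = shared
        i = j + 1
    return ranks


def pearson(a, b):
    """Product-moment correlation of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / math.sqrt(saa * sbb)


def mean_spearman(rankings, criterion):
    """Average tie-corrected Spearman correlation with the criterion ranking."""
    crit = midranks(criterion)
    return sum(pearson(midranks(r), crit) for r in rankings) / len(rankings)


# Made-up example: three judges rank four objects; the second judge ties two.
criterion = [1, 2, 3, 4]
judges = [[1, 2, 3, 4], [1, 2, 2, 4], [4, 3, 2, 1]]
average_rho = mean_spearman(judges, criterion)
```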

Journal ArticleDOI
TL;DR: In this article, it is suggested that the ambiguity of a set of paired comparison judgments may be measured by the quantity $\sqrt{\sigma_i^2 + \sigma_j^2 - 2 r_{ij}\sigma_i\sigma_j}$, which is termed the comparatal dispersion.
Abstract: It is suggested that the ambiguity of a set of paired comparison judgments may be measured by the quantity $\sqrt{\sigma_i^2 + \sigma_j^2 - 2 r_{ij}\sigma_i\sigma_j}$. This quantity is termed the comparatal dispersion. A simultaneous solution for scale values and ratios of comparatal dispersions has been presented and applied to some data on food preferences.
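
As a small numerical illustration, the comparatal dispersion $\sqrt{\sigma_i^2 + \sigma_j^2 - 2 r_{ij}\sigma_i\sigma_j}$ can be evaluated directly once the two discriminal dispersions and their correlation are given. The sketch below does only that; it does not attempt the simultaneous solution for scale values mentioned in the abstract. The function name `comparatal_dispersion` and the input values are hypothetical.

```python
import math


def comparatal_dispersion(sigma_i, sigma_j, r_ij):
    """sqrt(sigma_i^2 + sigma_j^2 - 2 * r_ij * sigma_i * sigma_j)."""
    if not -1.0 <= r_ij <= 1.0:
        raise ValueError("r_ij must lie in [-1, 1]")
    squared = sigma_i ** 2 + sigma_j ** 2 - 2.0 * r_ij * sigma_i * sigma_j
    # Guard against a tiny negative value caused by floating-point rounding.
    return math.sqrt(max(squared, 0.0))


# Made-up inputs: uncorrelated stimuli with dispersions 3 and 4.
uncorrelated = comparatal_dispersion(3.0, 4.0, 0.0)
# Perfectly correlated stimuli with equal dispersions.
perfectly_correlated = comparatal_dispersion(1.0, 1.0, 1.0)
```

With equal dispersions, the quantity shrinks as the correlation between the two stimuli rises, reaching zero at a correlation of one.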


Journal ArticleDOI
TL;DR: In this paper, a stochastic process applicable to the learning behavior of an individual subject is discussed, which describes both the response times and the sequence of choices obtained from a situation involving two alternatives.
Abstract: A stochastic process applicable to the learning behavior of an individual subject is discussed. The process describes both the response times and the sequence of choices obtained from a situation involving two alternatives. Parameter estimates and techniques for assessing goodness of fit are considered.