
Showing papers on "Item response theory" published in 1978


Journal ArticleDOI
TL;DR: In this paper, a latent trait measurement model in which ordered response categories are both parameterized and scored with successive integers is investigated and applied to a summated rating or Likert questionnaire.
Abstract: A latent trait measurement model in which ordered response categories are both parameterized and scored with successive integers is investigated and applied to a summated rating or Likert questionnaire. In addition to each category, each item of the questionnaire and each subject are parameterized in the model, and maximum likelihood estimates for these parameters are derived. Among the features of the model which make it attractive for applications to Likert questionnaires is that the total score is a sufficient statistic for a subject's attitude measure. Thus, the model provides a formalization of a familiar and practical procedure for measuring attitudes.
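
The sufficiency property highlighted above is easiest to see from the algebraic form of such a model. As a hedged sketch, one common way to write a rating-scale-type latent trait model (the symbols θ_n, δ_i and τ_k for the person, item and category-threshold parameters are illustrative, not necessarily the paper's own notation):

    P(X_{ni}=x) = \frac{\exp\bigl[x(\theta_n-\delta_i)-\sum_{k=1}^{x}\tau_k\bigr]}{\sum_{j=0}^{m}\exp\bigl[j(\theta_n-\delta_i)-\sum_{k=1}^{j}\tau_k\bigr]}, \qquad x=0,1,\dots,m,

with the empty sum for x = 0 taken as zero. Because θ_n enters the numerator only through the product xθ_n, the likelihood over all items depends on a subject's responses only through the total score Σ_i x_{ni}, which is the sufficiency result the abstract emphasizes.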

403 citations


Journal ArticleDOI
TL;DR: When the logistic function is substituted for the normal, Thurstone's Case V specialization of the law of comparative judgment for paired comparison responses gives an identical equation for the especial case as discussed by the authors.
Abstract: When the logistic function is substituted for the normal, Thurstone's Case V specialization of the law of comparative judgment for paired comparison responses gives an identical equation for the es...
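
For readers who want the algebra behind the TL;DR, a brief sketch in standard notation (not necessarily the paper's): Case V of the law of comparative judgment gives the choice probability P(i \succ j) = \Phi(\mu_i - \mu_j), where \Phi is the standard normal distribution function and \mu_i, \mu_j are scale values. Substituting the logistic function for \Phi yields

    P(i \succ j) = \frac{1}{1+e^{-(\mu_i-\mu_j)}} = \frac{e^{\mu_i}}{e^{\mu_i}+e^{\mu_j}},

the familiar choice-ratio (Bradley-Terry-Luce) form; numerically the two versions are close when the logistic argument is rescaled by roughly 1.7.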

134 citations


Journal ArticleDOI
TL;DR: In this article, the authors point out the shortcomings of standard testing and measurement technology, such as the values of standard item parameters (item difficulty and item discrimination) are not invariant across groups of examinees that differ in ability.
Abstract: There are many shortcomings of standard testing and measurement technology. For one, the values of standard item parameters (item difficulty and item discrimination) are not invariant across groups of examinees that differ in ability. This means that standard item statistics are only useful in test construction for examinee populations very similar to the sample of examinees in which the item statistics were obtained. Another shortcoming is that comparisons of examinees on an ability measured by a set of test items comprising a test are limited to situations where examinees are administered the same (or parallel) test items. Finally, standard testing technology provides no basis for determining what a particular examinee might do when confronted with a

95 citations


01 Dec 1978
TL;DR: Item Response Theory (IRT), also called item characteristic curve theory or latent trait theory, is introduced in this book for the testing practitioner with minimum training in statistics and psychometrics.
Abstract: This book is an introduction to Item Response Theory (IRT), also called Item Characteristic Curve Theory or latent trait theory. It is written for the testing practitioner with minimum training in statistics and psychometrics. It presents in simple language and with examples the basic mathematical concepts needed to understand the theory. Then, building upon those concepts, it develops the basic concepts of Item Response Theory: item parameters, item response function, test characteristic curve, item information functions, test information curve, relative efficiency curve, and score information curve. The maximum likelihood and Bayesian modal estimates of ability are described with illustrative examples. After a discussion of assumptions and available computer programs, some practical applications are presented, i.e., equating scales, tailored testing, item cultural bias, and setting pass-fail cut-offs. (Author)
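
The quantities the book builds on (item response function, item and test information) can be illustrated numerically. A minimal sketch using the three-parameter logistic form; the parameter values and the scaling constant 1.7 are conventional textbook choices, not taken from the book itself:

    import math

    def irf_3pl(theta, a, b, c):
        """Three-parameter logistic item response function."""
        return c + (1.0 - c) / (1.0 + math.exp(-1.7 * a * (theta - b)))

    def item_information(theta, a, b, c):
        """Fisher information contributed by a 3PL item at ability theta."""
        p = irf_3pl(theta, a, b, c)
        return (1.7 * a) ** 2 * ((1.0 - p) / p) * ((p - c) / (1.0 - c)) ** 2

    # Hypothetical item: discrimination a, difficulty b, pseudo-guessing c.
    a, b, c = 1.2, 0.0, 0.2
    for theta in (-2, -1, 0, 1, 2):
        print(theta, round(irf_3pl(theta, a, b, c), 3),
              round(item_information(theta, a, b, c), 3))

Summing the item information values over the items of a test gives the test information curve mentioned in the abstract.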

37 citations



Journal ArticleDOI
TL;DR: In this paper, a correction is proposed that takes account of the fact that a master who is not able to produce the right answer to an item may guess and the meaning of this correction and its consequences for estimating the model parameters are discussed.
Abstract: Macready and Dayton (1977) introduced two probabilistic models for mastery assessment based on an idealistic all-or-none conception of mastery. Although these models are in statistical respects complete, the question is whether they are a plausible rendering of what happens when an examinee responds to an item. First, a correction is proposed that takes account of the fact that a master who is not able to produce the right answer to an item may guess. The meaning of this correction and its consequences for estimating the model parameters are discussed. Second, Macready and Dayton’s latent class models are confronted with the three-parameter logistic model extended with the conception of mastery as a region on a latent variable. It appears that from a latent trait theoretic point of view, the Macready and Dayton models assume item characteristic curves that have the unrealistic form of a step function with a single step. The implications of the all-or-none conception of mastery for the learning process wil...
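
The contrast the authors draw can be written compactly. As an illustrative sketch (the symbols are generic, not the paper's): an all-or-none latent class model with guessing rate α_i for non-masters and error rate β_i for masters implies, when mastery is re-expressed as a region θ ≥ θ_0 on a latent trait, the single-step item characteristic curve

    P_i(\theta)=\begin{cases}\alpha_i, & \theta<\theta_0\\ 1-\beta_i, & \theta\ge\theta_0\end{cases}
    \qquad\text{versus}\qquad
    P_i(\theta)=c_i+\frac{1-c_i}{1+\exp[-a_i(\theta-b_i)]}

for the three-parameter logistic model, whose smooth curve is the form the authors regard as more realistic.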

17 citations



Journal ArticleDOI
TL;DR: In this paper, the authors suggest latent partition analysis as a means of empirically demonstrating the conceptual homogeneity of an item population.
Abstract: The purpose of this study is to suggest latent partition analysis (Wiley, 1967) as a means of empirically demonstrating the conceptual homogeneity of an item population. Throughout the psychometric literature, there is general agreement that homogeneity would be a desirable characteristic of an item population. However, the question of exactly what homogeneity should mean or how it should be measured has never been resolved.

13 citations


Journal ArticleDOI
TL;DR: In this paper, the same items were presented in clusters (32 triads, 2 dyads) with instructions to pick the one true (or the one false) statement, and the relation between the difficulty of the component true-false items and of the multiple choice clusters was examined.
Abstract: Graduate students in education responded twice at the same sitting to 100 true-false questions on educational measurement. The items were presented as Part 1 and Part 2 of a midterm test. In Part 1, the items were presented separately with instructions to the students to mark each statement true or false. In Part 2, the same items were presented in clusters (32 triads, 2 dyads) with instructions to pick the one true (or the one false) statement. Scores on Part 1 were much more reliable than scores on Part 2. These results support the suggestion from test specialists that test constructors should avoid use of multiple true-false items. The relation between the difficulty of the component true-false items and of the multiple choice clusters was examined.

13 citations


Journal ArticleDOI
TL;DR: In this paper, the authors survey the steps involved in item analysis for personality scales, questionnaires, and inventories, noting that rational scale construction has often amounted to little more than defining a construct, writing items which appear on the surface to be tapping the construct, and putting the test into practice.
Abstract: Although hundreds of published and unpublished personality scales, questionnaires, and inventories have been developed since World War II, relatively little formal exposition is available concerning the steps involved in item analysis. It is felt that this situation has been compounded by past adherence to a strictly rational or empirical construction schema, since each has implied that only certain statistical item-analytic techniques are appropriate. At its extreme, rational scale construction has involved only a few agreed upon steps: (a) select and define a construct of interest, (b) write a series of items which appear on the surface to be tapping the construct, and (c) put the test into practice, perhaps attempting to differentiate criterion groups on the basis of obtained score. In many instances, only the most rudimentary item-analytic procedures, if any, have been used.

11 citations


Journal ArticleDOI
TL;DR: In this article, a method for utilizing response-pattern information is suggested that weights each alternative on the test, including no response, to yield maximum coefficient alpha (generalized Kuder-Richardson formula 20).
Abstract: When multiple-choice tests are scored in the usual manner, giving each correct answer one point, information concerning response patterns is lost. A method for utilizing this information is suggested that weights each alternative, including no response, on the test to yield maximum coefficient alpha (generalized Kuder-Richardson formula 20). An example is presented, and the suggested method of scoring is compared with two conventional methods of scoring for this example.
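
The criterion being maximized, coefficient alpha computed from whatever numeric weights are assigned to the alternatives, is easy to state in code. A minimal sketch (this is the statistic the method optimizes, not the paper's weight-finding algorithm; the data matrix is invented):

    def coefficient_alpha(scores):
        """Coefficient alpha for a persons-by-items matrix of item scores."""
        k = len(scores[0])
        def var(xs):
            m = sum(xs) / len(xs)
            return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
        item_vars = [var([row[i] for row in scores]) for i in range(k)]
        total_var = var([sum(row) for row in scores])
        return k / (k - 1) * (1.0 - sum(item_vars) / total_var)

    # With 0/1 keyed scores this reduces to KR-20; weighting the alternatives
    # simply changes the entries of the matrix before alpha is computed.
    data = [[1, 0, 1, 1],
            [0, 0, 1, 0],
            [1, 1, 1, 1],
            [0, 1, 0, 1],
            [1, 1, 0, 0]]
    print(round(coefficient_alpha(data), 3))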

Journal ArticleDOI
TL;DR: In this article, an alternative algorithm for item analysis is described, in which item discrimination indices have been defined for item distractors in addition to their traditional definition for the scored alternative, and the Campbell and Fiske concept of convergent and discriminant validity was reconceptualized from the test to the item level and proposed as an aid in interpretation of results of item analysis.
Abstract: Described is an alternative algorithm for item analysis in which item discrimination indices have been defined for item distractors in addition to their traditional definition for the scored alternative. Also, the Campbell and Fiske concept of convergent and discriminant validity was reconceptualized from the test to the item level and proposed as an aid in interpretation of results of item analysis.
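
One concrete way to obtain a discrimination index for every alternative, distractors included, is the point-biserial correlation between choosing that alternative and the total test score; whether this is exactly the authors' index is not stated in the abstract, and the data below are invented:

    def point_biserial(choices, totals, alternative):
        """Correlation between choosing `alternative` (coded 0/1) and total score."""
        x = [1.0 if c == alternative else 0.0 for c in choices]
        n = len(x)
        mx, my = sum(x) / n, sum(totals) / n
        cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, totals)) / n
        sx = (sum((xi - mx) ** 2 for xi in x) / n) ** 0.5
        sy = (sum((yi - my) ** 2 for yi in totals) / n) ** 0.5
        return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0

    # choices: alternative picked on one item; totals: each examinee's test score.
    choices = ['A', 'B', 'A', 'C', 'A', 'D', 'B', 'A']
    totals = [38, 22, 35, 19, 40, 17, 25, 33]
    for alt in 'ABCD':
        print(alt, round(point_biserial(choices, totals, alt), 2))

On a well-behaved item the keyed alternative shows a positive index and each distractor a negative one, which is the item-level analogue of the convergent and discriminant evidence the authors describe.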

Journal ArticleDOI
TL;DR: In this paper, a modified version of Horst's model for examinee behavior was used to compare the effect of guessing on item reliability for the answer-until-correct (AUC) and zero-one (ZO) scoring procedures.
Abstract: The answer-until-correct (AUC) procedure requires that examinees respond to a multiple-choice item until they answer it correctly. The examinee's score on the item is then based on the number of responses required for the item. It was expected that the additional responses obtained under the AUC procedure would improve reliability by providing additional information on those examinees who fail to choose the correct alternative on their first attempt. However, when compared to the zero-one (ZO) scoring procedure, the AUC procedure has failed to yield consistent improvements in reliability. Using a modified version of Horst's model for examinee behavior, this paper compares the effect of guessing on item reliability for the AUC procedure and the ZO procedure. The analysis shows that the relative efficiency of the two procedures depends strongly on the nature of the item alternatives and implies that the appropriate criteria for item selection are different for each procedure. Conflicting results rep...
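
A small sketch of the two scoring rules being compared; the abstract states only that the AUC item score is based on the number of responses required, so the particular linear rule below is an assumption:

    def zo_score(first_choice, key):
        """Zero-one scoring: credit only a correct first response."""
        return 1 if first_choice == key else 0

    def auc_score(n_attempts, n_alternatives):
        """Illustrative answer-until-correct rule: fewer attempts, more credit."""
        return (n_alternatives - n_attempts) / (n_alternatives - 1)

    # A four-alternative item answered correctly on the second attempt:
    print(zo_score(first_choice='B', key='A'))        # 0
    print(auc_score(n_attempts=2, n_alternatives=4))  # about 0.67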




Journal ArticleDOI
TL;DR: Three algorithms for selecting a subset of originally available items to maximize coefficient alpha, including one advanced by Serlin and Kaiser (1976), were compared on the size of the resulting alpha and computation time required with nine sets of data.
Abstract: Three algorithms for selecting a subset of originally available items to maximize coefficient alpha, including one advanced by Serlin and Kaiser (1976), were compared on the size of the resulting alpha and computation time required with nine sets of data. Results indicated that a combination of the two alternate algorithms proposed would perform better than the Serlin-Kaiser method. The characteristics of a computer program to perform these item analyses are described.
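
The kind of search these algorithms perform can be sketched generically. The greedy backward elimination below illustrates the idea (drop whichever item most increases alpha); it is not a reimplementation of the Serlin-Kaiser method or of the two proposed alternatives, and the data are invented:

    from statistics import pvariance

    def alpha(matrix, items):
        """Coefficient alpha for the item subset `items` of a persons-by-items matrix."""
        k = len(items)
        total_var = pvariance([sum(row[i] for i in items) for row in matrix])
        if total_var == 0:
            return 0.0
        item_vars = [pvariance([row[i] for row in matrix]) for i in items]
        return k / (k - 1) * (1 - sum(item_vars) / total_var)

    def greedy_max_alpha(matrix, min_items=2):
        """Repeatedly drop the item whose removal most increases alpha."""
        items = list(range(len(matrix[0])))
        best = alpha(matrix, items)
        while len(items) > min_items:
            cand_alpha, drop = max(
                (alpha(matrix, [j for j in items if j != i]), i) for i in items)
            if cand_alpha <= best:
                break
            best, items = cand_alpha, [j for j in items if j != drop]
        return items, best

    data = [[1, 0, 1, 1, 0],
            [0, 0, 1, 0, 1],
            [1, 1, 1, 1, 0],
            [0, 1, 0, 1, 1],
            [1, 1, 1, 0, 0]]
    print(greedy_max_alpha(data))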

Journal ArticleDOI
TL;DR: It is shown that answer changes were more likely to be made on items occurring early in a group of items and toward the end of a test.
Abstract: In an attempt to identify some of the causes of answer-changing behavior, the effects of four test- and item-specific variables were evaluated. Three samples of New Zealand school children of different ages were administered tests of study skills. The number of answer changes per item was compared with the position of each item in a group of items, the position of each item in the test, the discrimination index, and the difficulty index of each item. It is shown that answer changes were more likely to be made on items occurring early in a group of items and toward the end of a test. There was also a tendency for difficult items and items with poor discrimination to be changed more frequently. Some implications of answer changing for the design of tests are discussed.

10 Dec 1978
TL;DR: In this paper, the Pearson System Method and the Two-Parameter Beta Method are used for both the Degree 3 and Degree 4 cases, and the results are compared with previous ones; the two item parameters of the normal ogive model are also estimated for each item and compared with those obtained by the other two procedures, with mean square errors adopted in evaluating both the estimated item characteristic functions and the probability density functions of ability.
Abstract: Following Simple Sum Procedure and Weighted Sum Procedure, another method, Proportioned Sum Procedure, is introduced in the context of the Conditional P.D.F. Approach. The new method is somewhat different from the previous two, however, in the sense that the set of conditional density functions is not exclusively recategorized into the item score groups, but they are proportioned into each item score category. The same hypothetical data, i.e., the maximum likelihood estimates of the five hundred hypothetical subjects and their responses to the ten binary items, each of which follows the normal ogive model, are used to try the method. The criterion item characteristic function for each binary item is obtained and compared with those obtained by the other two procedures. The Pearson System Method and the Two-Parameter Beta Method are used for both Degree 3 and 4 Cases, and the results are compared with the previous ones. The mean square errors are adopted in evaluating both the resultant estimated item characteristic functions and probability density functions of ability. The two item parameters in the normal ogive model are also estimated for each item, and the results are compared with the previous ones. (Author)
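
For reference, the normal ogive model that the report's two item parameters refer to is conventionally written (standard form; the notation is not necessarily the report's) as

    P_i(\theta) = \Phi\bigl(a_i(\theta-b_i)\bigr) = \int_{-\infty}^{a_i(\theta-b_i)} \frac{1}{\sqrt{2\pi}}\, e^{-t^2/2}\, dt,

where a_i is the item discrimination and b_i the item difficulty, these being the two parameters estimated for each of the ten binary items.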

Journal ArticleDOI
TL;DR: This procedure provides for a more efficient display of test data and could be used to supplement computerized item-analysis programs.
Abstract: A procedure is described in this paper for the display of item analysis data using two types of control charts. The item control chart consists of a plot of item difficulty versus each item with action and warning limits drawn to denote boundaries for the 95% and 99% confidence intervals. A reference line is drawn at a mean item difficulty of 50%. The second control chart consists of a plot of item discrimination versus item difficulty for all test items. This procedure provides for a more efficient display of test data and could be used to supplement computerized item-analysis programs.
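
A hedged numerical sketch of the item control chart's limits, using the normal approximation to the binomial around the 50% reference difficulty; the abstract does not give the paper's exact formulas, so the limits and data below are only illustrative:

    import math

    def control_limits(n_examinees, p_ref=0.5):
        """Approximate 95% (warning) and 99% (action) limits for item difficulty."""
        se = math.sqrt(p_ref * (1 - p_ref) / n_examinees)
        warning = (p_ref - 1.96 * se, p_ref + 1.96 * se)
        action = (p_ref - 2.58 * se, p_ref + 2.58 * se)
        return warning, action

    n = 100                                  # examinees
    correct = [43, 61, 75, 28, 52]           # hypothetical number correct per item
    warning, action = control_limits(n)
    for i, c in enumerate(correct, start=1):
        p = c / n
        if not (action[0] <= p <= action[1]):
            flag = "outside action limits"
        elif not (warning[0] <= p <= warning[1]):
            flag = "outside warning limits"
        else:
            flag = "in control"
        print(f"item {i}: difficulty {p:.0%} -> {flag}")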

Journal ArticleDOI
TL;DR: A response set (or response style) has been defined as "a habit or a momentary set causing the subject to earn a different score from the one he would earn if the same items were presented in a different form".
Abstract: A response-set (or response-style) has been defined as "a habit or a momentary set causing the subject to earn a different score from the one he would earn if the same items were presented in a different form" (Cronbach, 1970, p. 148). Several different types of response sets (e.g., acquiescence, willingness-to-guess, evasiveness, etc.) have been postulated and supported by empirical research. In personality instruments, the operation of response sets is often desirable to permit the identification of individuals possessing certain traits. In ability and achievement tests, however, response sets are to be avoided since they "dilute a test with factors not intended to form part of the test content, and so reduce its logical validity" (Cronbach, 1950, p. 3). In administering objective classroom examinations, this author has observed that some students miss test items, not because they lack the information or skills necessary to answer correctly, but rather because they read the question with insufficient care. They think they know what the question says and respond accordingly, when the question poses a different problem. For example, a true-false test item might be presented as follows: