
Showing papers on "Differential item functioning published in 1991"


Book
23 Jul 1991
TL;DR: This book introduces the fundamentals of item response theory, covering basic concepts and models, ability and item parameter estimation, assessment of model-data fit, the ability scale, item and test information and efficiency functions, test construction, identification of potentially biased test items, test score equating, computerized adaptive testing, and future directions of the field.
Abstract: Contents: Background Concepts, Models, and Features; Ability and Item Parameter Estimation; Assessment of Model-Data Fit; The Ability Scale; Item and Test Information and Efficiency Functions; Test Construction; Identification of Potentially Biased Test Items; Test Score Equating; Computerized Adaptive Testing; Future Directions of Item Response Theory.

2,583 citations


Journal ArticleDOI
TL;DR: It is concluded that for multidimensional data a common factor analysis on the matrix of tetrachoric correlations performs at least as well as the theoretically appropriate multidimensional item response models.
Abstract: Many factor analysis and multidimensional item response models for dichotomous variables have been proposed in the literature. The models and various methods for estimating the item parameters are reviewed briefly. In a simulation study these methods are compared with respect to their estimates of the item parameters, both in terms of an item response theory formulation and in terms of a factor analysis formulation. It is concluded that for multidimensional data a common factor analysis on the matrix of tetrachoric correlations performs at least as well as the theoretically appropriate multidimensional item response models.
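
As background for the factor-analytic approach compared above, the sketch below estimates a single tetrachoric correlation from the 2x2 table of two dichotomous items by maximum likelihood; a full analysis would assemble such correlations into a matrix and factor-analyze it. This is a minimal illustration only, not the authors' code, and the function name and interface are assumptions.

# Minimal maximum-likelihood tetrachoric correlation (illustrative sketch).
# Each item is assumed to dichotomize a standard bivariate normal latent
# variable at a threshold; rho is the latent correlation to be estimated.
import numpy as np
from scipy.stats import norm, multivariate_normal
from scipy.optimize import minimize_scalar

def tetrachoric(table):
    """table[i][j] = number of examinees scoring i on item 1 and j on item 2 (0/1)."""
    n = np.asarray(table, dtype=float)
    p1 = n[1, :].sum() / n.sum()                   # proportion correct on item 1
    p2 = n[:, 1].sum() / n.sum()                   # proportion correct on item 2
    t1, t2 = norm.ppf(1 - p1), norm.ppf(1 - p2)    # dichotomization thresholds

    def negloglik(rho):
        bvn = multivariate_normal([0.0, 0.0], [[1.0, rho], [rho, 1.0]])
        p00 = bvn.cdf([t1, t2])                    # both items incorrect
        p01 = norm.cdf(t1) - p00                   # item 1 incorrect, item 2 correct
        p10 = norm.cdf(t2) - p00                   # item 1 correct, item 2 incorrect
        p11 = 1.0 - p00 - p01 - p10                # both items correct
        probs = np.clip([p00, p01, p10, p11], 1e-12, 1.0)
        counts = [n[0, 0], n[0, 1], n[1, 0], n[1, 1]]
        return -sum(c * np.log(p) for c, p in zip(counts, probs))

    return minimize_scalar(negloglik, bounds=(-0.99, 0.99), method='bounded').x

# Example: tetrachoric([[40, 10], [15, 35]]) returns a clearly positive correlation.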

189 citations


Journal ArticleDOI
TL;DR: Differential item functioning (DIF) assessment attempts to identify items or item types for which subpopulations of examinees exhibit performance differentials that are not consistent with the performance differentials typically seen for those subpopulations on collections of items that purport to measure a common construct.
Abstract: Differential item functioning (DIF) assessment attempts to identify items or item types for which subpopulations of examinees exhibit performance differentials that are not consistent with the performance differentials typically seen for those subpopulations on collections of items that purport to measure a common construct. DIF assessment requires a rule for scoring items and a matching variable on which different subpopulations can be viewed as comparable for purposes of assessing their performance on items. Typically, DIF is operationally defined as a difference in item performance between subpopulations, e.g., Blacks and Whites, that exists after members of the different subpopulations have been matched on some total score. Constructed-response items move beyond traditional multiple-choice items, for which DIF methodology is well-defined, towards item types involving selection or identification, reordering or rearrangement, substitution or correction, completion, construction, and performance or presentation. This paper defines DIF, describes two standard procedures for measuring DIF and indicates how DIF might be assessed for certain constructed-response item types. The description of DIF assessment presented in this paper is applicable to computer-delivered constructed-response items as well as paper and pencil delivered items.
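
One standard procedure of the kind described above is the Mantel-Haenszel approach, in which examinees are matched on a total score and a common odds ratio is pooled across score levels. The sketch below is illustrative only and is not taken from the paper; the function name mh_dif, the data layout, and the group codes are assumptions.

# Minimal Mantel-Haenszel DIF sketch (illustrative only).
# For each matching-score level k a 2x2 table is formed:
#   reference group: A_k correct, B_k incorrect
#   focal group:     C_k correct, D_k incorrect
import math
from collections import defaultdict

def mh_dif(scores, groups, responses):
    """scores: matching variable (e.g., total score) per examinee
    groups: 'R' (reference) or 'F' (focal) per examinee
    responses: 1 if the studied item was answered correctly, else 0
    Returns the MH common odds ratio and the ETS MH D-DIF effect size."""
    tables = defaultdict(lambda: [0, 0, 0, 0])     # [A, B, C, D] per score level
    for s, g, r in zip(scores, groups, responses):
        cell = (0 if r else 1) if g == 'R' else (2 if r else 3)
        tables[s][cell] += 1
    num = den = 0.0
    for A, B, C, D in tables.values():
        N = A + B + C + D
        if N > 0:
            num += A * D / N
            den += B * C / N
    alpha = num / den                    # MH common odds ratio
    mh_d_dif = -2.35 * math.log(alpha)   # negative values indicate DIF against the focal group
    return alpha, mh_d_dif

# Toy usage (placeholder data, not from the paper):
# alpha, d = mh_dif([10, 10, 10, 10, 12, 12, 12, 12],
#                   ['R', 'R', 'F', 'F', 'R', 'R', 'F', 'F'],
#                   [1, 0, 1, 0, 1, 1, 1, 0])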

99 citations


Journal ArticleDOI
TL;DR: The area between two item response functions is often used as a measure of differential item functioning under item response theory as mentioned in this paper, and this area can be measured over either an open interval (i.e., ex...
Abstract: The area between two item response functions is often used as a measure of differential item functioning under item response theory. This area can be measured over either an open interval (i.e., ex...
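
For reference, with item response functions P_R(θ) and P_F(θ) for the reference and focal groups, the area measures used in this literature are usually written as

\text{signed area} = \int_{-\infty}^{\infty} \bigl[ P_R(\theta) - P_F(\theta) \bigr]\, d\theta,
\qquad
\text{unsigned area} = \int_{-\infty}^{\infty} \bigl| P_R(\theta) - P_F(\theta) \bigr|\, d\theta,

with a closed-interval version obtained by replacing the infinite limits with finite bounds on θ. The paper's specific discussion of the two interval choices is truncated in the abstract above.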

53 citations


Journal ArticleDOI
TL;DR: The authors compared two approximation techniques for detecting differential item functioning (DIF) in an English as a second language (ESL) placement test when the group sizes are too small to use other possible methods (e.g., the three parameter item response theory method).
Abstract: This paper compares two approximation techniques for detecting differential item functioning (DIF) in an English as a second language (ESL) placement test when the group sizes are too small to use other possible methods (e.g., the three parameter item response theory method). An application of the Angoff delta-plot method (Angoff and Ford, 1973) utilizing the one parameter Rasch model adopted in Chen and Henning (1985), and Scheuneman's chi-square method (Scheuneman, 1979) were chosen because they are among the few methods appropriate for a sample size smaller than 100. Two linguistically and culturally diverse groups (Chinese and Spanish speaking) served as the subjects of this study. The results reveal that there was only marginal overlap between DIF items detected by Chen and Henning's method and Scheuneman's method; the former detected fewer DIF items with less variety than the latter. Moreover, Chen and Henning's method tended to detect easier items with smaller differences in p-value between the t...
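
A sketch of the delta-plot idea referred to above, assuming the usual ETS delta transformation of item p-values; the helper names and the principal-axis distance rule are illustrative and are not a reproduction of Chen and Henning's procedure.

# Illustrative delta-plot sketch (after Angoff & Ford, 1973), not the study's code.
# Each item's proportion correct in each group is mapped to the ETS delta scale,
# the two sets of deltas are compared, and items lying far from the principal
# axis of the scatter are flagged as potential DIF.
import numpy as np
from scipy.stats import norm

def deltas(p_values):
    # ETS delta: 13 + 4*z, where z is the normal deviate of the proportion incorrect
    p = np.clip(np.asarray(p_values, dtype=float), 0.001, 0.999)
    return 13.0 + 4.0 * norm.ppf(1.0 - p)

def delta_plot_distances(p_group1, p_group2):
    x, y = deltas(p_group1), deltas(p_group2)
    vx, vy, cxy = x.var(ddof=1), y.var(ddof=1), np.cov(x, y)[0, 1]
    # Slope and intercept of the principal (major) axis of the delta scatter
    b = (vy - vx + np.sqrt((vy - vx) ** 2 + 4.0 * cxy ** 2)) / (2.0 * cxy)
    a = y.mean() - b * x.mean()
    # Signed perpendicular distance of each item from the principal axis
    return (b * x - y + a) / np.sqrt(b ** 2 + 1.0)

# Items whose |distance| exceeds some cutoff (e.g., 1.5 delta units) would be flagged.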

49 citations


Journal ArticleDOI
TL;DR: In this article, the item characteristic curve (ICC) is estimated by using polynomial regression splines, which provide a more flexible family of functions than is given by the three-parameter logistic family.
Abstract: The item characteristic curve (ICC), defining the relation between ability and the probability of choosing a particular option for a test item, can be estimated by using polynomial regression splines. These provide a more flexible family of functions than is given by the three-parameter logistic family. The estimation of spline ICCs is described by maximizing the marginal likelihood formed by integrating ability over a beta prior distribution. Some simulation results compare this approach with the joint estimation of ability and item parameters.
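
In notation that is assumed here rather than taken from the paper, the setup can be written as follows: the ICC of item j is a logistic transform of a regression-spline expansion of ability, and the spline coefficients are chosen to maximize the marginal likelihood obtained by integrating ability over a beta prior on (0, 1),

P_j(\theta) = \frac{1}{1 + \exp\!\bigl(-\sum_{k} c_{jk}\, B_k(\theta)\bigr)},
\qquad
L(\mathbf{c}) = \prod_{i=1}^{N} \int_{0}^{1} \prod_{j=1}^{n} P_j(\theta)^{u_{ij}} \bigl(1 - P_j(\theta)\bigr)^{1 - u_{ij}} \,\mathrm{Beta}(\theta; \alpha, \beta)\, d\theta,

where the B_k are polynomial spline basis functions, u_ij is examinee i's score on item j, and the integral is evaluated by quadrature in practice.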

34 citations


Journal ArticleDOI
TL;DR: In this article, a model-testing approach for evaluating the stability of IRT item parameter estimates in a pretest-posttest design is illustrated using a random sample of pretest and posttest responses to a 19-item math test from a group of children assessed on two different occasions.
Abstract: Item parameter instability can threaten the validity of inferences about changes in student achievement when using Item Response Theory- (IRT) based test scores obtained on different occasions. This article illustrates a model-testing approach for evaluating the stability of IRT item parameter estimates in a pretest-posttest design. Stability of item parameter estimates was assessed for a random sample of pretest and posttest responses to a 19-item math test. Using MULTILOG (Thissen, 1986), IRT models were estimated in which item parameter estimates were constrained to be equal across samples (reflecting stability) and item parameter estimates were free to vary across samples (reflecting instability). These competing models were then compared statistically in order to test the invariance assumption. The results indicated a moderately high degree of stability in the item parameter estimates for a group of children assessed on two different occasions.
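
The model comparison described above amounts to a likelihood-ratio test of the constrained (invariant) model against the unconstrained one. The sketch below is a minimal illustration; the log-likelihood values and parameter count in the usage comment are placeholders, not results from the study.

# Likelihood-ratio test of item-parameter invariance (illustrative sketch).
from scipy.stats import chi2

def invariance_lr_test(loglik_constrained, loglik_free, n_extra_params):
    """Compare a model with item parameters constrained equal across occasions
    against one in which they are free to vary. G2 = 2*(logL_free - logL_constrained)
    is referred to a chi-square distribution with df equal to the number of
    additional free parameters in the unconstrained model."""
    g2 = 2.0 * (loglik_free - loglik_constrained)
    p_value = chi2.sf(g2, n_extra_params)
    return g2, n_extra_params, p_value

# Hypothetical usage: 19 items with 2 parameters each freed on the second occasion
# g2, df, p = invariance_lr_test(-5130.4, -5112.7, 38)
# A non-significant p-value would support invariance (stability) of the estimates.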

17 citations


Journal ArticleDOI
TL;DR: In this paper, the effectiveness of the Mantel-Haenszel (MH) statistic in detecting differentially functioning test items when the internal criterion was varied was investigated, and the results revealed that the choice of criterion, total test score versus subtest score, had a substantial influence on the classification of items as to whether or not they were differentially functioning in the Anglo-American and Native American groups.
Abstract: This study investigated the effectiveness of the Mantel-Haenszel (MH) statistic in detecting differentially functioning (DIF) test items when the internal criterion was varied. Using a dataset from a statewide administration of a life skills examination, a sample of 1,000 Anglo-American and 1,000 Native American examinee item response sets were analyzed. The MH procedure was first applied to all the items involved. The items were then categorized as belonging to one or more of four subtests based on the skills or knowledge needed to select the correct response. Each subtest was then analyzed as a separate test, using the MH procedure. Three control subtests were also established using random assignment of test items and were analyzed using the MH procedure. The results revealed that the choice of criterion, total test score versus subtest score, had a substantial influence on the classification of items as to whether or not they were differentially functioning in the Anglo-American and Native American group...

17 citations


01 Jan 1991
TL;DR: In this paper, items in the verbal (Hebrew and English) sections of the Psychometric Entrance Test (PET) administered for university admission in Israel were studied for differential item functioning (DIF) between the sexes.
Abstract: Items in the verbal (Hebrew and English) sections of the Psychometric Entrance Test (PET) administered for university admission in Israel were studied for differential item functioning (DIF) between the sexes. Analyses were conducted for 4,354 males and 4,901 females taking Form 3 of the PET in April 1984, and 3,785 males and 3,615 females taking Form 17 of the PET in April 1987. Three subtests were examined: (1) verbal reasoning; (2) English; and (3) mathematical reasoning (a control non-verbal test). DIF was determined for the 1984 population through: the weighted sum of the differences between the two groups across all ability groups; and the root of the mean squared differences as defined above. These two indices and a Mantel-Haenszel chi-square test examined DIF for the 1987 group. About one-third of the items in the verbal and mathematical reasoning parts were found to have DIF, but few English subtest items did so. The content of some of the items exhibiting DIF was clearly related to stereotypical perceptions of feminine and masculine areas of interest. Implications for test content are discussed.
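
A hedged formalization of the two 1984 indices described above, with the notation and weights assumed rather than quoted from the report (p_Mk and p_Fk are the proportions correct for males and females in ability group k, and w_k is the weight given to that group):

D_1 = \sum_{k} w_k \bigl( p_{Mk} - p_{Fk} \bigr),
\qquad
D_2 = \sqrt{\sum_{k} w_k \bigl( p_{Mk} - p_{Fk} \bigr)^2 }.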

13 citations


Journal ArticleDOI
TL;DR: In this paper, two editions of the Verbal and Mathematical portions of the Scholastic Aptitude Test (SAT) were used to study differential speededness and differential omission and the relationships among differential item functioning (DIF), differential omission, and item difficulty for Asian-Americans, Blacks, Hispanics, and females.
Abstract: Two editions of the Verbal and Mathematical portions of the Scholastic Aptitude Test (SAT) were used to study differential speededness and differential omission and the relationships among differential item functioning (DIF), differential omission, and item difficulty for Asian-Americans, Blacks, Hispanics, and females. Consistent and replicable evidence of differential speededness was found for Blacks and Hispanics. Use of an unspeeded criterion for matching in place of the traditional total score, which contains speeded items, does not affect the DIF analyses of the speeded items. A strong artifactual negative relationship between DIF and differential omission was found. The relationship between differential omission and difficulty was consistently positive on the Verbal sections for all comparison groups except the Asian-American group, for whom it was consistently negative. On the Mathematical sections, this relationship was only consistently found for the female/male comparison, for whom it was negative. Finally, the relationship between difficulty and DIF was negative but smaller than previously observed.

12 citations


Journal ArticleDOI
TL;DR: In this paper, the authors investigated the detection of differential item functioning (DIF) on items intentionally constructed to favor one group over another in two item response theory-based computer programs, LOGIST and BILOG.
Abstract: Detection of differential item functioning (DIF) on items intentionally constructed to favor one group over another was investigated on item parameter estimates obtained from two item response theory-based computer programs, LOGIST and BILOG. Signed- and unsigned-area measures based on joint maximum likelihood estimation, marginal maximum likelihood estimation, and two marginal maximum a posteriori estimation procedures were compared with each other to determine whether detection of DIF could be improved using prior distributions. Results indicated that item parameter estimates obtained using either prior condition were less deviant than when priors were not used. Differences in detection of DIF appeared to be related to item parameter estimation condition and to some extent to sample size.

Journal ArticleDOI
TL;DR: In this paper, Algebra Placement (AP) and Student-Produced Response (SPR) prototype items were evaluated for DIF and differential speededness and contrasted with the current SAT-Math item types, and both prototypes showed slightly higher levels of differential speededness than the SAT-M.
Abstract: Alternative mathematical items administered as prototypes at the Spring 1989 Field Trials are evaluated for differential item functioning (DIF) and differential speededness. Results for Algebra Placement (AP) and Student-Produced Response (SPR) items are presented and contrasted with results obtained on the two current SAT-Math item types: Regular Math and Quantitative Comparison. Analyses comparing female examinees with comparable male examinees, and Asian-American, Black, and Hispanic examinees with comparable White examinees, indicate that both of these alternative item types appear to have DIF. Additional DIF analyses comparing the use of an internal versus an external matching criterion for the SPR items show evidence of negative DIF with either criterion. Results using the MH D-DIF statistic are more extreme than DIF results using the STD P-DIF index. The metric used to calculate the DIF indices may account for the differences observed. Differential speededness results indicate that the two Math prototypes have slightly higher levels of differential speededness than the SAT-M. The SPR items pose an interesting problem for DIF. The definition of an appropriate DIF matching criterion for constructed-response item types needs more study. Metric differences between methods and their effect on difficult or easy items also need further exploration. Until these methodological issues are resolved, results of DIF studies on constructed-response items should be interpreted with caution.
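
The metric issue raised above reflects the fact that the two indices are reported on different scales: MH D-DIF is expressed on the ETS delta scale, while STD P-DIF is expressed on the proportion-correct scale. In their commonly used forms (assumed here, not quoted from the paper),

\text{MH D-DIF} = -2.35 \,\ln \hat{\alpha}_{MH},
\qquad
\text{STD P-DIF} = \frac{\sum_k N_{Fk} \bigl( p_{Fk} - p_{Rk} \bigr)}{\sum_k N_{Fk}},

where α̂_MH is the Mantel-Haenszel common odds ratio and N_Fk, p_Fk, and p_Rk are the focal-group count and the focal- and reference-group proportions correct at matching-score level k.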


Journal ArticleDOI
TL;DR: A FORTRAN 77 program is presented in this paper, which performs analyses of differential item performance in psychometric tests and computes various additional classical indices of differential item functioning (including discrimination indices) as well as associated effect size measures.
Abstract: A FORTRAN 77 program is presented which performs analyses of differential item performance in psychometric tests. The program performs the Mantel-Haenszel procedure and computes various additional classical indices of differential item functioning (including discrimination indices) as well as associated effect size measures.


Journal ArticleDOI
TL;DR: In this article, an alternative three-parameter logistic model was proposed, in which the asymptote parameter is a linear component within the logit of the function.
Abstract: Birnbaum's three-parameter logistic function has become a common basis for item response theory modeling, especially within situations where significant guessing behavior is evident. This model is formed through a linear transformation of the two-parameter logistic function in order to facilitate a lower asymptote. This paper discusses an alternative three-parameter logistic model in which the asymptote parameter is a linear component within the logit of the function. This alternative is derived from a more general four-parameter model based on a transformed hyperbola.
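
For reference, Birnbaum's three-parameter logistic model referred to above is

P(\theta) = c + (1 - c)\, \frac{1}{1 + \exp\bigl(-a(\theta - b)\bigr)},

that is, a two-parameter logistic rescaled to have lower asymptote c. The alternative model discussed in the paper instead enters the asymptote-related parameter as a linear term inside the logit; its exact parameterization is not reproduced here.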

01 Apr 1991
TL;DR: Dancer et al. examined differential item functioning in Likert-type items using log-linear models, in a paper presented at the 1991 annual meeting of the American Educational Research Association (AERA).
Abstract: Dancer, L. Suzanne, and others. "Examination of Differential Item Functioning in Likert-Type Items Using Log-Linear Models." Sponsoring agency: University of Wisconsin, Milwaukee. Paper presented at the Annual Meeting of the American Educational Research Association (Chicago, IL, April 3-7, 1991). 20 pp.



Journal ArticleDOI
TL;DR: In this article, the authors proposed IRT for item response time and showed the utility of this theory by applying it to practical data and showed that it can be used to evaluate examinee's ability for response time.
Abstract: Item response time data can now be obtained easily through computerized testing, so examinees can be evaluated not only on their test scores but also on their response times. Item Response Theory (IRT) is well known to be useful for item analysis based on test scores, and the same idea can be applied to the analysis of examinees' response times. The authors propose an IRT for item response time. In this paper, the authors demonstrate (1) the validity of the theory, (2) item analysis using the theory, and (3) estimation of examinees' ability with respect to response time, and they show the utility of the theory by applying it to practical data.