
Showing papers on "Item response theory published in 1989"


Book
07 Dec 1989
TL;DR: In this book, the authors cover the basic concepts of scale development: devising the items, scaling responses, selecting the items, moving from items to scales, and establishing reliability and validity.
Abstract: 1. Introduction 2. Basic concepts 3. Devising the items 4. Scaling responses 5. Selecting the items 6. Biases in responding 7. From items to scales 8. Reliability 9. Generalizability theory 10. Validity 11. Measuring change 12. Item response theory 13. Methods of administration 14. Ethical considerations 15. Reporting test results Appendices

9,316 citations


01 Jan 1989

3,037 citations


Journal ArticleDOI
TL;DR: Weighted likelihood estimation (WLE), as discussed by the authors, removes the first-order bias term from MLE and is proved to be less biased than MLE while having the same asymptotic variance and normal distribution.
Abstract: Applications of item response theory, which depend upon its parameter invariance property, require that parameter estimates be unbiased. A new method, weighted likelihood estimation (WLE), is derived and proved to be less biased than maximum likelihood estimation (MLE) with the same asymptotic variance and normal distribution. WLE removes the first-order bias term from MLE. Two Monte Carlo studies compare WLE with MLE and Bayesian modal estimation (BME) of ability in conventional tests and tailored tests, assuming the item parameters are known constants. The Monte Carlo studies favor WLE over MLE and BME on several criteria over a wide range of the ability scale.

965 citations
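For readers who want to see the mechanics of the WLE paper above, the correction can be sketched for the simplest case. Under the Rasch model with known item difficulties, the MLE solves sum(x_i - P_i(theta)) = 0, and the WLE adds the first-order bias-correction term J(theta)/(2 I(theta)). A minimal Python sketch under those assumptions; the function name and the five-item example are invented, not taken from the paper.

import numpy as np
from scipy.optimize import brentq

def wle_theta(x, b, lo=-6.0, hi=6.0):
    # x: 0/1 response vector; b: known Rasch item difficulties.
    x, b = np.asarray(x, float), np.asarray(b, float)
    def estimating_eq(theta):
        p = 1.0 / (1.0 + np.exp(-(theta - b)))       # Rasch P_i(theta)
        info = np.sum(p * (1.0 - p))                 # test information I(theta)
        j = np.sum(p * (1.0 - p) * (1.0 - 2.0 * p))  # J(theta), Rasch case
        # MLE solves sum(x - p) = 0; WLE adds the bias-correction term.
        return np.sum(x - p) + j / (2.0 * info)
    return brentq(estimating_eq, lo, hi)

# Hypothetical example: five items, three correct responses.
print(wle_theta([1, 1, 0, 1, 0], [-1.0, -0.5, 0.0, 0.5, 1.0]))

Unlike the MLE, the weighted estimate stays finite even for all-correct or all-incorrect response patterns, which is one practical reason it is attractive in tailored testing.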


Journal ArticleDOI
TL;DR: In this article, the authors describe the construction of a Job in General (JIG) scale, a global scale to accompany the facet scales of the Job Descriptive Index.
Abstract: We describe the construction of a Job in General (JIG) scale, a global scale to accompany the facet scales of the Job Descriptive Index. We applied both traditional and item response theory procedures for item analysis to data from three large heterogeneous samples (N = 1,149, 3,566, and 4,490). Alpha was .91 and above for the resulting 18-item scale in successive samples. Convergent and discriminant validity and differential response to treatments were demonstrated. Global scales are contrasted with composite and with facet scales in psychological measurement. We show that global scales are not equivalent to summated facet scales. Both facet and global scales were useful in another organization (N = 648). Some principles are suggested for choosing specific (facet), composite, or global measures for practical and theoretical problems. The correlations between global and facet scales suggest that work may be the most important facet in relation to general job satisfaction.

740 citations


Journal ArticleDOI
TL;DR: In this paper, the authors discuss the definition, detection, and explanation of item bias; four research strategies are described: qualitative, correlational, quasi-experimental, and experimental.

474 citations


Book
17 Nov 1989
TL;DR: In this book, the authors present a step-by-step guide to constructing a psychometric questionnaire, progressing through all the stages of test construction from definition of the original purpose to eventual validation; both knowledge-based tests of ability, aptitude, and achievement and person-based tests of personality, clinical symptoms, mood, and attitude are covered.
Abstract: John Rust and Susan Golombok provide a readable introduction to modern psychometrics. The first part deals with theoretical and more general issues in psychometrics and acknowledges that if psychometrics is to fulfil its function of fair assessment and selection it must take a stand on issues of racism and injustice. The second part is a step-by-step guide on how to construct a psychometric questionnaire, progressing through all the stages of test construction from definition of the original purpose to eventual validation. Item response theory, criterion-referenced testing, profiling, and minimum competency testing are included, as are knowledge-based tests of ability, aptitude, and achievement and person-based tests of personality, clinical symptoms, mood, and attitude.

450 citations


Journal ArticleDOI
TL;DR: In this article, a new method for using certain restricted latent class models, referred to as binary skills models, to determine the skills required by a set of test items is presented.
Abstract: This paper presents a new method for using certain restricted latent class models, referred to as binary skills models, to determine the skills required by a set of test items. The method is applied to reading achievement data from a nationally representative sample of fourth-grade students and offers useful perspectives on test structure and examinee ability, distinct from those provided by other methods of analysis. Models fitted to small, overlapping sets of items are integrated into a common skill map, and the nature of each skill is then inferred from the characteristics of the items for which it is required. The reading comprehension items examined conform closely to a unidimensional scale with six discrete skill levels that range from an inability to comprehend or match isolated words in a reading passage to the abilities required to integrate passage content with general knowledge and to recognize the main ideas of the most difficult passages on the test.

389 citations



Journal ArticleDOI
TL;DR: It is shown that local independence fails at the level of the individual questions of a test of reading comprehension, and the application to testlet scoring of some multiple-category models originally developed for individual items is discussed.
Abstract: It is not always convenient or appropriate to construct tests in which individual items are fungible. There are situations in which small clusters of items (testlets) are the units that are assembled to create a test. Using data from a test of reading comprehension constructed of four passages with several questions following each passage, we show that local independence fails at the level of the individual questions. The questions following each passage, however, constitute a testlet. We discuss the application to testlet scoring of some multiple-category models originally developed for individual items. In the example examined, the concurrent validity of the testlet scoring equaled or exceeded that of individual-item-level scoring.

195 citations


Book
01 Jan 1989
TL;DR: In this paper, the authors focus on the problem of constructing test items for standardized tests of achievement, ability, and aptitude, which is a task of enormous importance and one fraught with difficulty.
Abstract: Constructing test items for standardized tests of achievement, ability, and aptitude is a task of enormous importance—and one fraught with difficulty. The task is important because test items are the foundation of written tests of mental attributes, and the ideas they express must be articulated precisely and succinctly. Being able to draw valid and reliable inferences from a test’s scores rests in great measure upon attention to the construction of test items. If a test’s scores are to yield valid inferences about an examinee’s mental attributes, its items must reflect a specific psychological construct or domain of content. Without a strong association between a test item and a psychological construct or domain of content, the test item lacks meaning and purpose, like a mere free-floating thought on a page with no rhyme or reason for being there at all.

176 citations


Patent
26 Oct 1989
TL;DR: A computerized mastery testing system providing for the computerized implementation of sequential testing in order to reduce test length without sacrificing mastery classification accuracy is described; test item units are randomly and sequentially presented to the examinee by a computer test administrator.
Abstract: A computerized mastery testing system providing for the computerized implementation of sequential testing in order to reduce test length without sacrificing mastery classification accuracy. The mastery testing system is based on Item Response Theory and Bayesian Decision Theory, which are used to qualify collections of test items, administered as a unit, and determine the decision rules regarding examinees' responses thereto. The test item units are randomly and sequentially presented to the examinee by a computer test administrator. The administrator periodically determines, based on previous responses, whether the examinee may be classified as a nonmaster or master or whether more responses are necessary. If more responses are necessary it will present as many additional test item units as required for classification. The method provides for determining the test specifications, creating an item pool, obtaining IRT statistics for each item, determining ability values, assembling items into testlets, verifying the testlets, selecting loss functions and prior probability of mastery, estimating cutscores, packaging the test for administration, and randomly and sequentially administering testlets to the examinee until a pass/fail decision can be made.
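The sequential decision logic the patent describes can be illustrated in miniature: administer testlets one at a time, update the posterior probability of mastery from IRT likelihoods at two hypothesized ability levels, and stop once the posterior crosses a decision threshold. In the sketch below every number (ability points, prior, thresholds, item difficulties) is an illustrative assumption, and the simple posterior-threshold rule stands in for the patent's loss-function and cutscore machinery.

import numpy as np

def testlet_likelihood(responses, difficulties, theta):
    # Rasch likelihood of a testlet's 0/1 responses at ability theta.
    p = 1.0 / (1.0 + np.exp(-(theta - np.asarray(difficulties, float))))
    r = np.asarray(responses)
    return float(np.prod(np.where(r == 1, p, 1.0 - p)))

def classify(testlets, theta_non=-0.5, theta_mas=0.5,
             prior_mastery=0.5, lower=0.05, upper=0.95):
    post = prior_mastery
    for responses, difficulties in testlets:      # administered one at a time
        lm = testlet_likelihood(responses, difficulties, theta_mas)
        ln = testlet_likelihood(responses, difficulties, theta_non)
        post = post * lm / (post * lm + (1.0 - post) * ln)  # Bayes update
        if post >= upper:
            return "master", post
        if post <= lower:
            return "nonmaster", post
    return "undecided", post                      # testlet pool exhausted

testlets = [([1, 1, 0], [-0.2, 0.1, 0.4]), ([1, 1, 1], [0.0, 0.3, 0.6])]
print(classify(testlets))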

Journal ArticleDOI
TL;DR: A maximin model for IRT-based test design is proposed that serves as a constraint subject to which a linear programming algorithm maximizes the information in the test.
Abstract: A maximin model for IRT-based test design is proposed. In the model only the relative shape of the target test information function is specified. It serves as a constraint subject to which a linear programming algorithm maximizes the information in the test. In the practice of test construction, several demands can be formulated as linear constraints in the model. A worked example of a test construction problem with practical constraints is presented. The paper concludes with a discussion of some alternative models of test construction.
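The maximin formulation translates directly into a mixed-integer program: with 0-1 item indicators x_i, maximize y subject to the test information at each ability point theta_k being at least r_k times y, plus practical constraints such as test length. A sketch using scipy.optimize.milp (SciPy 1.9+); the item information matrix and target shape below are invented toy data, and the paper's own algorithm and constraint set may differ.

import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

rng = np.random.default_rng(0)
n_items, thetas = 30, [-1.0, 0.0, 1.0]
info = rng.uniform(0.05, 0.6, size=(len(thetas), n_items))  # item information at each theta_k
r = np.array([1.0, 2.0, 1.0])   # relative target shape of the information function
length = 10                     # required number of items

# Decision vector [x_1 ... x_n, y]; maximizing y == minimizing -y.
c = np.zeros(n_items + 1)
c[-1] = -1.0
# Shape constraints: info @ x - r_k * y >= 0 at every ability point.
shape = LinearConstraint(np.hstack([info, -r.reshape(-1, 1)]), lb=0.0)
# Length constraint: sum(x) == length.
size = LinearConstraint(np.hstack([np.ones((1, n_items)), [[0.0]]]), lb=length, ub=length)

res = milp(c, constraints=[shape, size],
           integrality=np.r_[np.ones(n_items, dtype=int), 0],  # x binary, y continuous
           bounds=Bounds(lb=0.0, ub=np.r_[np.ones(n_items), np.inf]))
picked = np.flatnonzero(res.x[:n_items] > 0.5)
print("selected items:", picked, "achieved y:", res.x[-1])

Because only the relative shape r_k is fixed, the solver is free to push the whole information function as high as the item pool allows, which is exactly the maximin idea.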

Journal ArticleDOI
TL;DR: In this paper, the 1-, 2-, and 3-parameter logistic item response theory models are discussed, and the effects of changing the a, b, or c parameters are compared.
Abstract: This module discusses the 1-, 2-, and 3-parameter logistic item response theory models. Mathematical formulas are given for each model, and comparisons among the three models are made. Figures are included to illustrate the effects of changing the a, b, or c parameter, and a single data set is used to illustrate the effects of estimating parameter values (as opposed to the true parameter values) and to compare parameter estimates achieved through applying the different models. The estimation procedure itself is discussed briefly. Discussions of model assumptions, such as dimensionality and local independence, can be found in many of the annotated references (e.g., Hambleton, 1988).
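For reference, the three models covered by this module have the following standard forms; this is the conventional textbook parameterization (with D the usual scaling constant), not notation reproduced from the module itself:

$$P_i(\theta) = c_i + (1 - c_i)\,\frac{1}{1 + e^{-D a_i (\theta - b_i)}} \qquad \text{(3PL)},$$

where $b_i$ is the difficulty, $a_i$ the discrimination, and $c_i$ the lower asymptote ("guessing") parameter. Setting $c_i = 0$ gives the 2PL, and additionally constraining all $a_i$ to a common value gives the 1PL.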

Journal ArticleDOI
TL;DR: In this article, a method is proposed to evaluate l'equivalence de mesure des traductions des tests d'intelligence americains and allemands dans les deux sens.
Abstract: Utilisation des methodes basees sur la theorie de la reponse par item pour evaluer l'equivalence de mesure des traductions des tests d'intelligence americains et allemands dans les deux sens. Identification des items jouant un role differentiel et analyse de contenu pour en determiner la cause (culturelle ou linguistique)

Journal ArticleDOI
TL;DR: Results from 1,967 teachers in Western Australia who completed the 30-item form of the GHQ show that the items conform reasonably well to the model at a general or macro level of analysis, and the original ordering of categories is supported.
Abstract: This study examines the Likert-style successive integer scoring of Goldberg's (1972, 1978) General Health Questionnaire (GHQ) with a psychometric model in which the thresholds between successive categories within each item can be estimated. The model is particularly appropriate because the scoring by successive integers of the successive categories, which are not named in the same way across items, has received substantial discussion in the literature. Results from 1,967 teachers in Western Australia who completed the 30-item form of the GHQ show that the items conform reasonably well to the model at a general or macro level of analysis. In particular, the original ordering of categories is supported. However, as expected, there are systematic differences between distances among thresholds within items and systematic differences among thresholds between items. The differences between positively and negatively orientated items confirm a suggestion in the literature that these two classes of items form sufficiently different scales, so that they could be treated as separate, though reasonably correlated, scales.
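The threshold structure being tested here is the Rasch-family adjacent-categories form; as a standard statement (not the paper's own notation), the threshold $\tau_{ix}$ between categories $x-1$ and $x$ of item $i$ enters as

$$\ln \frac{P(X_i = x \mid \theta)}{P(X_i = x - 1 \mid \theta)} = \theta - \tau_{ix},$$

so the threshold is the ability at which two adjacent categories are equally probable, and ordered category functioning corresponds to ordered threshold estimates, which is the property the study confirms at the macro level.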

Journal ArticleDOI
TL;DR: This paper looks at 50 years of IRM and finds a disappointing lack of advance; it shows how a linear model framework, involving different response transformations, unifies separate approaches to the study of test item responses.
Abstract: An historical and theoretical review is provided of so-called item response theory (IRT), more accurately described as item response modelling (IRM). This paper looks at 50 years of IRM and finds a disappointing lack of advance. It is shown how a linear model framework, involving different response transformations, unifies separate approaches to the study of test item responses.
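The unifying framework the review argues for can be stated in one line: model a transformed response probability as linear in the latent trait,

$$g\bigl(P_i(\theta)\bigr) = a_i(\theta - b_i),$$

where choosing $g$ as the logit yields the two-parameter logistic model and choosing $g = \Phi^{-1}$ yields the normal-ogive model. This is a plausible reading of the abstract's "linear model framework, involving different response transformations", not a formula quoted from the paper.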

Journal ArticleDOI
TL;DR: In this article, an item response model for multiple-choice items and its application in item analysis is described. Butler et al. used the model for the detection of flawed items, for item design and development, and for test construction.
Abstract: This paper describes an item response model for multiple-choice items and illustrates its application in item analysis. The model provides parametric and graphical summaries of the performance of each alternative associated with a multiple-choice item; the summaries describe each alternative's relationship to the proficiency being measured. The interpretation of the parameters of the multiple-choice model and the use of the model in item analysis are illustrated using data obtained from a pilot test of mathematics achievement items. The use of such item analysis for the detection of flawed items, for item design and development, and for test construction is discussed.
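Models of this type build on the nominal response framework, which gives every alternative $k$ of item $i$ its own trace line; a standard statement of that backbone (the full multiple-choice model adds machinery for a latent "don't know" category) is

$$P(X_i = k \mid \theta) = \frac{\exp(a_{ik}\theta + c_{ik})}{\sum_{h} \exp(a_{ih}\theta + c_{ih})},$$

so each distractor, not just the keyed answer, receives a parametric summary of its relationship to the proficiency being measured.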

Journal ArticleDOI
TL;DR: This article introduces the theory behind and applications of adaptive personality assessment based on item response theory; two adaptive testing strategies are compared: fixed test length and clinical decision.
Abstract: This article introduces the theory behind and applications of adaptive personality assessment based on item response theory. Two adaptive testing strategies were compared: (a) fixed test length and (b) clinical decision. Real-data simulations, based on the item responses from 1,000 subjects who had previously taken the 34-item Absorption scale (Tellegen, 1982) by means of paper-and-pencil format, were used to illustrate these strategies. Results suggest that computerized adaptive personality assessment works impressively well. With the fixed-test-length strategy, a 50% savings in administered items was achieved with little loss of measurement precision. In the clinical-decision testing strategy, individuals who were extreme on the Absorption trait were identified with perfect accuracy using, on average, 25% of the available items. The implications of these results for personality research and assessment are discussed.
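The fixed-test-length strategy follows the generic adaptive-testing loop: score the examinee provisionally, administer the most informative remaining item, and stop at the chosen length. A minimal simulation sketch under a 2PL model; the item parameters, grid-based scoring, and stopping length are illustrative assumptions rather than the authors' procedure.

import numpy as np

rng = np.random.default_rng(1)
a = rng.uniform(0.8, 2.0, 34)      # discriminations (34-item pool, as in the article)
b = rng.normal(0.0, 1.0, 34)       # difficulties
grid = np.linspace(-4.0, 4.0, 161) # theta grid for posterior-mean scoring

def p2pl(theta, a_i, b_i):
    return 1.0 / (1.0 + np.exp(-a_i * (theta - b_i)))

def cat(true_theta, length=17):    # ~50% of the pool, mirroring the reported savings
    remaining, administered = list(range(34)), []
    post = np.ones_like(grid)      # flat prior over the grid
    for _ in range(length):
        theta_hat = np.sum(grid * post) / np.sum(post)
        item_info = [a[i] ** 2 * p2pl(theta_hat, a[i], b[i])
                     * (1.0 - p2pl(theta_hat, a[i], b[i])) for i in remaining]
        i = remaining.pop(int(np.argmax(item_info)))     # most informative item
        x = rng.random() < p2pl(true_theta, a[i], b[i])  # simulated response
        post = post * (p2pl(grid, a[i], b[i]) if x else 1.0 - p2pl(grid, a[i], b[i]))
        administered.append(i)
    return np.sum(grid * post) / np.sum(post), administered

theta_hat, items_used = cat(true_theta=1.0)
print(round(theta_hat, 2), items_used[:5])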

26 Apr 1989
Abstract: Combinations of five methods of equating test forms and two methods of selecting samples of students for equating were compared for accuracy. The two sampling methods were representative sampling from the population and matching samples on the anchor test score. The equating methods were the Tucker, Levine equally reliable, chained equipercentile, frequency estimation, and item response theory (IRT) 3PL methods. The tests were the Verbal and Mathematical sections of the Scholastic Aptitude Test. The criteria for accuracy were measures of agreement with an equivalent-groups equating based on more than 115,000 students taking each form. Much of the inaccuracy in the equatings could be attributed to overall bias. The results for all equating methods in the matched samples were similar to those for the Tucker and frequency estimation methods in the representative samples; these equatings made too small an adjustment for the difference in the difficulty of the test forms. In the representative samples, the cha...
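Among the methods this study compares, the equipercentile idea is the easiest to state: map each form-X score to the form-Y score with the same percentile rank. A schematic Python sketch on synthetic score distributions; the data are invented, and operational equating (including the chained, anchor-test version evaluated here) adds smoothing and further steps.

import numpy as np

rng = np.random.default_rng(2)
scores_x = rng.binomial(60, 0.55, 5000)  # form X: slightly easier form
scores_y = rng.binomial(60, 0.50, 5000)  # form Y

def equipercentile(x, sx, sy):
    prank = np.mean(sx <= x)             # percentile rank of score x on form X
    return np.quantile(sy, prank)        # form Y score at the same rank

print(equipercentile(35, scores_x, scores_y))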

Journal ArticleDOI
TL;DR: In this article, a marginal maximum likelihood estimation procedure is developed which allows for incomplete data and linear restrictions on both the item and the population parameters, and two statistical tests for evaluating model fit are presented: the former has power against violation of the assumption about the ability distribution; the latter offers the possibility of identifying specific items that do not fit the model.
Abstract: The partial credit model, developed by Masters (1982), is a unidimensional latent trait model for responses scored in two or more ordered categories. In the present paper some extensions of the model are presented. First, a marginal maximum likelihood estimation procedure is developed which allows for incomplete data and linear restrictions on both the item and the population parameters. Secondly, two statistical tests for evaluating model fit are presented: the former test has power against violation of the assumption about the ability distribution; the latter test offers the possibility of identifying specific items that do not fit the model.
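For context, Masters's partial credit model, which these extensions build on, has the standard form

$$P(X_i = x \mid \theta) = \frac{\exp \sum_{j=1}^{x} (\theta - \delta_{ij})}{\sum_{h=0}^{m_i} \exp \sum_{j=1}^{h} (\theta - \delta_{ij})}, \qquad x = 0, 1, \dots, m_i,$$

with step parameters $\delta_{ij}$ and the empty sum for $h = 0$ defined as zero; the paper's contribution is marginal (rather than conditional or joint) maximum likelihood estimation of such models under incomplete designs and linear restrictions.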

01 Mar 1989
TL;DR: Simultaneous and sequential parallel test construction methods based on the use of 0–1 programming are examined for the Rasch and 3-parameter logistic model.

Journal ArticleDOI
TL;DR: In this paper, the authors investigated the accuracy of marginal maximum likelihood estimation of the item parameters of the two-parameter logistic model and found that marginal estimation was substantially better than joint maximum likelihood estimation for items with extreme difficulty or discrimination parameters.
Abstract: The accuracy of marginal maximum likelihood estimates of the item parameters of the two-parameter logistic model was investigated. Estimates were obtained for four sample sizes and four test lengths; joint maximum likelihood estimates were also computed for the two longer test lengths. Each condition was replicated 10 times, which allowed evaluation of the accuracy of estimated item characteristic curves, item parameter estimates, and estimated standard errors of item parameter estimates for individual items. Items that are typical of a widely used job satisfaction scale and moderately easy tests had satisfactory marginal estimates for all sample sizes and test lengths. Larger samples were required for items with extreme difficulty or discrimination parameters. Marginal estimation was substantially better than joint maximum likelihood estimation. Index terms: Fletcher-Powell algorithm, item parameter estimation, item response theory, joint maximum likelihood estimation, marginal maximum likelihood...

Journal ArticleDOI
TL;DR: In this article, latent variable models are applied to the study of the validity of a psychological test: when the test predicts a criterion by measuring a unidimensional latent construct, not only must the total score predict the criterion, but the joint distribution of criterion scores and item responses must exhibit a certain pattern.
Abstract: Established results on latent variable models are applied to the study of the validity of a psychological test. When the test predicts a criterion by measuring a unidimensional latent construct, not only must the total score predict the criterion, but the joint distribution of criterion scores and item responses must exhibit a certain pattern. The presence of this population pattern may be tested with sample data using the stratified Wilcoxon rank sum test. Often, criterion information is available only for selected examinees, for instance, those who are admitted or hired. Three cases are discussed: (i) selection at random, (ii) selection based on the current test, and (iii) selection based on other measures of the latent construct. Discriminant validity is also discussed.

Journal ArticleDOI
Abstract: The purpose of the present research was to develop general guidelines to assist practitioners in setting up operational computerized adaptive testing (CAT) systems based on the graded response model...

Journal ArticleDOI
TL;DR: In this paper, the authors used monotone regression splines to define p(x, θ) and applied them to the representation of test items as functions of examinee ability.
Abstract: A binomial regression function p(x, θ) models the probability of r_j successes in n_j trials as a function of the values of an observed covariate x_j and/or a latent variable θ_j (j = 1, ..., J). This article explores the use of monotone regression splines to define p, and applies them to the representation of test items as functions of examinee ability. Some illustrative data suggest that the flexibility of monotone splines permits the detection of item characteristics not observable using logistic-based or log-linear approaches. A simulation study indicates that estimates of both item-characteristic curves and ability are reasonably precise for numbers of items and examinees typical of large university lectures. Given a set of such binomial regression functions, it can be useful to study the principal components of functional variation. The extension of multivariate principal-components analysis to permit the analysis of many item-characteristic curves is described.
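The monotonicity device in this line of work is to expand the curve in I-spline (integrated M-spline) basis functions with nonnegative coefficients; schematically, and with the constraints paraphrased rather than quoted from the article,

$$p(\theta) = \beta_0 + \sum_{k=1}^{K} \beta_k I_k(\theta), \qquad \beta_k \ge 0 \;(k \ge 1).$$

Since each $I_k$ is nondecreasing, any such combination is itself nondecreasing, and additional constraints keep $p$ within $[0, 1]$; this is what lets the curve flex freely while remaining a legitimate item characteristic curve.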

Journal ArticleDOI
TL;DR: In this article, the authors extend the generalized Rasch model to designs with any number of time points and even with different sets of items presented on different occasions, provided that one unidimensional subscale is available per latent trait.
Abstract: The LLRA (linear logistic model with relaxed assumptions; Fischer, 1974, 1977a, 1977b, 1983a) was developed, within the framework of generalized Rasch models, for assessing change in dichotomous item score matrices between two points in time; it makes it possible to quantify change on latent trait dimensions and to explain change in terms of treatment effects, treatment interactions, and a trend effect. A remarkable feature of the model is that unidimensionality of the item set is not required. The present paper extends this model to designs with any number of time points and even with different sets of items presented on different occasions, provided that one unidimensional subscale is available per latent trait. Thus unidimensionality assumptions within subscales are combined with multidimensionality of the item set. Conditional maximum likelihood methods for parameter estimation and hypothesis testing are developed, and a necessary and sufficient condition for unique identification of the model, given the data, is derived. Finally, a sample application is presented.

Journal ArticleDOI
TL;DR: In this article, a method for the detection of item bias with respect to observed or unobserved subgroups is proposed, which uses quasi-loglinear models for the incomplete subgroup × test score × item 1 × ... × item k contingency table.
Abstract: A method is proposed for the detection of item bias with respect to observed or unobserved subgroups. The method uses quasi-loglinear models for the incomplete subgroup × test score × item 1 × ... × item k contingency table. If subgroup membership is unknown, the models are Haberman's incomplete-latent-class models. The (conditional) Rasch model is formulated as a quasi-loglinear model. The parameters in this loglinear model that correspond to the main effects of the item responses are the conditional estimates of the parameters in the Rasch model. Item bias can then be tested by comparing the quasi-loglinear Rasch model with models that contain parameters for the interaction of item responses and the subgroups.

Journal ArticleDOI
TL;DR: In this paper, the psychometric requirements for adaptive testing are reviewed and the historical antecedents are considered and an analysis of these two factors reveals the importance of the concept of the item/person interaction.
Abstract: The psychometric requirements for adaptive testing are reviewed and the historical antecedents are considered. An analysis of these two factors reveals the importance of the concept of the item/person interaction. Future areas for advancement of adaptive testing are discussed.

Journal ArticleDOI
TL;DR: In this paper, the effects of using a unidimensional IRT model when the assumption of unidimensionality was violated were examined; adaptive and nonadaptive tests were formed from two-dimensional item sets.
Abstract: This study examined some effects of using a unidimensional IRT model when the assumption of unidimensionality was violated. Adaptive and nonadaptive tests were formed from two-dimensional item sets. The tests were administered to simulated examinee populations with different correlations of the two underlying abilities. Scores from the adaptive tests tended to be related to one or the other ability rather than to a composite. Similar but less disparate results were obtained with IRT scoring of nonadaptive tests, whereas the conventional standardized number-correct score was equally related to both abilities. Differences in item selection from the adaptive administration and in item parameter estimation were also examined and related to differences in ability estimation. Index terms: ability estimation, adaptive testing, item parameter estimation, item response theory, multidimensionality.

Journal ArticleDOI
TL;DR: An overview of item response theory is presented; basic models are described, and applications including instrument construction and computerized testing are illustrated.
Abstract: An overview of item response theory is presented. Basic models are described, and various applications are illustrated. Applications discussed include instrument construction and computerized testing.