Statistical issues in computerized adaptive testing

Open Access Book

TLDR

This thesis critiques and improves the efficiency of proficiency testing by finding optimal statistical designs for CAT procedures, focusing on the problem of deciding when to cease testing in order to make the classification decision.

Abstract:

Computerized adaptive testing (CAT) serves as a more efficient alternative to traditional pencil-and-paper tests by (i) adaptively selecting questions that are appropriate for the examinee being tested; and (ii) ending the examination once enough information has accumulated. An important use of CAT is so-called “proficiency testing”, where the goal is a pass/fail classification of each examinee rather than accurate estimation of that examinee's ability. The main goal of this thesis is to critique and improve the efficiency of proficiency testing by finding optimal statistical designs for CAT procedures. Attention focuses on the problem of deciding when to cease testing in order to make the classification decision. Because questions are not interchangeable with one another, item response theory (IRT) is used to account for the fact that different test takers have not necessarily received questions with similar measurement properties. Statistical theory is developed without the common sequential analysis assumption of iid variables. A traditional method in sequential analysis, the truncated sequential probability ratio test (Wald, 1947), is shown not to be supported by theoretical results. More modern sequential analysis procedures are suggested, including those based on stochastic curtailment (Lan, Simon, & Halperin, 1982) and self-tuning methods (Lai & Shih, 2003). Different methods of ending the examination, referred to as “stopping rules” in sequential analysis, are compared in simulation, the results of which indicate that the traditional method can be shortened without compromising error rates. Practical challenges of implementing CAT, such as a possible lack of test security, are also incorporated into analysis of the methods' properties so that the different statistical approaches can be compared in a realistic setting. 
Theory and simulation both suggest considerable practical gain from stochastic curtailment and self-tuning methods in relation to the traditional method. Another section of the thesis introduces sequential confidence intervals as a way to make inferences about an examinee's ability once the test is over. Finally, the link between CAT methods and other applications of sequential analysis, such as clinical studies and psychological diagnosis, is explored.
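As an illustration of the "traditional method" named in the abstract, the following is a minimal sketch of a truncated SPRT stopping rule for pass/fail classification when items have different measurement properties. It is not taken from the thesis. The two-parameter logistic (2PL) response model, the indifference region `delta` around the cut point, the default error rates, and the midpoint rule used at truncation are all assumptions chosen for illustration, as are the function names.

```python
import math


def p_correct(theta, a, b):
    """Assumed 2PL item response function: P(correct | ability theta)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def tsprt_classify(responses, items, theta_cut=0.0, delta=0.25,
                   alpha=0.05, beta=0.05, max_items=50):
    """Truncated SPRT of H0: theta = theta_cut - delta
    against H1: theta = theta_cut + delta.

    responses: iterable of 0/1 item scores, in administration order.
    items:     list of (a, b) pairs (discrimination, difficulty), hypothetical values.
    Returns (decision, number_of_items_used).
    """
    # Wald's boundaries on the log-likelihood-ratio scale.
    lower = math.log(beta / (1.0 - alpha))
    upper = math.log((1.0 - beta) / alpha)
    theta0, theta1 = theta_cut - delta, theta_cut + delta

    llr = 0.0
    n = 0
    for u, (a, b) in zip(responses, items):
        # Each item contributes its own increment, so the terms are
        # independent but not identically distributed.
        p0 = p_correct(theta0, a, b)
        p1 = p_correct(theta1, a, b)
        if u == 1:
            llr += math.log(p1 / p0)
        else:
            llr += math.log((1.0 - p1) / (1.0 - p0))
        n += 1
        if llr >= upper:
            return "pass", n
        if llr <= lower:
            return "fail", n
        if n >= max_items:
            break

    # Truncation (assumed convention): force a decision by comparing the
    # log-likelihood ratio with the midpoint of the two boundaries.
    midpoint = (lower + upper) / 2.0
    return ("pass" if llr >= midpoint else "fail"), n
```

For example, `tsprt_classify([1] * 50, [(1.0, 0.0)] * 50)` stops as soon as the accumulated log-likelihood ratio crosses the upper boundary, well before the 50-item maximum. Stochastic curtailment and self-tuning rules, which the abstract proposes as improvements, would replace or supplement the boundary checks inside the loop and are not sketched here.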

Citations

Journal Article

On Using Stochastic Curtailment to Shorten the SPRT in Sequential Mastery Testing

Matthew Finkelman

01 Dec 2008

Journal of Educational and Behavioral St...

TL;DR: The application of stochastic curtailment in SMT to shorten the truncated sequential probability ratio test without substantially compromising error rates is introduced.

Computerized classification testing in more than two categories by using stochastic curtailment

Theodorus Johannes Hendrikus Maria Eggen, +1 more

TL;DR: The current study replicates Finkelman's results, including in realistic settings, and then generalizes the SCSPRT to three categories with adaptive item selection, showing increased efficiency with both one and two cut points.

Book Chapter

Sequential Probability Ratio Test

T.J.H.M. Eggen

TL;DR: The sequential probability ratio test (SPRT) is a sequential statistical test developed by Wald (1947) and used in computerized adaptive testing (CAT).
