
Showing papers by "National League for Nursing published in 1978"


Journal ArticleDOI
TL;DR: Kane, Crooks, and colleagues showed that the variability of student responses is consistently greater than the variability within students, indicating that increasing the number of students generally has a much greater impact on dependability than increasing the number of items.
Abstract: Previous studies (Gillmore, Kane, and Naccarato, 1976; Kane, Crooks, and Gillmore, 1976; Kane, Gillmore, and Crooks, 1976) have shown that generalizability theory (Cronbach, Gleser, Nanda, and Rajaratnam, 1972) can be a valuable tool in analyzing the dependability of student ratings of instruction. Specifically, these studies have discussed three distinct generalizability coefficients within the common paradigm under which these ratings are collected. The coefficient that is appropriate in a given setting depends upon whether one wishes to generalize over both students and items, or only over one or the other. Furthermore, the characteristics of each of these coefficients can be easily traced as a function of changes in the number of raters and items. In addition, the multifaceted interpretation which underlies generalizability theory was useful in showing that the variability of student responses is consistently greater than the variability within students, indicating that increasing the number of students generally has a much greater impact on dependability than increasing the number of items.

Perhaps the most important practical conclusion arising from the series of analyses reported was that, even in the most conservative case, when generalization is over both students and items, generalizability coefficients of .70 and greater were found when a class was rated by fifteen or more students responding to four or more items. This result was reported in the above studies for three distinct instruments, in two locations, and for various disciplinary groupings.

The object of measurement for the above studies was the class, defined as a particular course (e.g., Psychology 101) taught by a particular teacher. However, the class as the object of measurement confounds the teacher effect with the course effect. This confounding is acceptable for some uses of student ratings, e.g., feedback to the teacher on his performance in a particular course, but it does not permit estimation of the dependability of student ratings for two important applications of such ratings. First, in evaluating a teacher's instructional effectiveness for a promotion decision, one would probably want to generalize over all courses that the teacher might teach rather than focusing on the results of one or more specific courses. Second, in evaluating a specific curriculum offering, one would probably want to assess the course's rating by students, generalizing over the universe of all possible teachers who might teach it. Analyses in which the teacher and course effects are confounded do not allow an assessment of the dependability of student rating data for either of these purposes.
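The dependability pattern the abstract describes can be sketched numerically. In the standard two-facet generalizability formula (class as object of measurement, students and items as facets), the coefficient is the class variance divided by itself plus the facet error terms, each shrinking with its sample size. The variance components below are illustrative assumptions chosen so that between-class and student-related variance dominate; they are not values from the cited studies.

```python
# Sketch of a two-facet generalizability coefficient, generalizing
# over both students and items (the most conservative case above).
# Variance components are illustrative assumptions, not estimates
# from Gillmore, Kane, and colleagues.

def g_coefficient(var_class, var_cs, var_ci, var_resid,
                  n_students, n_items):
    """G coefficient for a class rated by n_students on n_items.

    Error variance pools the class-by-student interaction, the
    class-by-item interaction, and the residual, each divided by
    the number of observations over which it is averaged.
    """
    error = (var_cs / n_students
             + var_ci / n_items
             + var_resid / (n_students * n_items))
    return var_class / (var_class + error)

# Illustrative (assumed) components: student-related variance large,
# item-related variance small.
components = dict(var_class=0.40, var_cs=0.90, var_ci=0.05,
                  var_resid=0.60)

for n_s, n_i in [(5, 4), (15, 4), (15, 8), (30, 4)]:
    g = g_coefficient(**components, n_students=n_s, n_items=n_i)
    print(f"n_students={n_s:2d}, n_items={n_i}: G = {g:.2f}")
```

Under these assumed components the coefficient exceeds .70 at fifteen students and four items, and adding students raises it faster than adding items, mirroring the abstract's conclusion.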

106 citations


Journal ArticleDOI
TL;DR: In this paper, a modified version of Horst's model for examinee behavior was used to compare the effect of guessing on item reliability for the answer-until-correct (AUC) and zero-one (ZO) scoring procedures.
Abstract: The answer-until-correct (AUC) procedure requires that examinees respond to a multiple-choice item until they answer it correctly. The examinee's score on the item is then based on the number of responses required for the item. It was expected that the additional responses obtained under the AUC procedure would improve reliability by providing additional information on those examinees who fail to choose the correct alternative on their first attempt. However, when compared to the zero-one (ZO) scoring procedure, the AUC procedure has failed to yield consistent improvements in reliability. Using a modified version of Horst's model for examinee behavior, this paper compares the effect of guessing on item reliability for the AUC procedure and the ZO procedure. The analysis shows that the relative efficiency of the two procedures depends strongly on the nature of the item alternatives and implies that the appropriate criteria for item selection are different for each procedure. Conflicting results rep...
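The contrast between the two scoring rules can be sketched concretely. Under ZO scoring only the first response counts; under AUC scoring the item score declines with the number of attempts needed. The linear rule used here (full credit on the first attempt, zero when every alternative had to be tried) is one common convention, assumed for illustration rather than taken from the paper.

```python
# Sketch of the two scoring procedures contrasted in the abstract.
# The linear AUC rule below is an assumed convention for illustration.

def zo_score(attempts: int) -> int:
    """Zero-one scoring: 1 if correct on the first response, else 0."""
    return 1 if attempts == 1 else 0

def auc_score(attempts: int, n_alternatives: int) -> float:
    """Answer-until-correct scoring: credit declines linearly from
    1.0 (first-try success) to 0.0 (all alternatives exhausted)."""
    return (n_alternatives - attempts) / (n_alternatives - 1)

# A four-alternative item, examinees needing 1..4 attempts:
for attempts in range(1, 5):
    print(f"attempts={attempts}: ZO={zo_score(attempts)}, "
          f"AUC={auc_score(attempts, 4):.2f}")
```

The extra gradations (here 0.67 and 0.33) are exactly the additional information AUC was expected to provide about examinees who miss on the first attempt; whether they help reliability depends, as the analysis shows, on how examinees guess among the remaining alternatives.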

8 citations