
Showing papers by "John E. Hunter" published in 2000


Journal Article
TL;DR: The authors showed that fixed effects (FE) meta-analysis models manifest a substantial Type I bias in significance tests for mean effect sizes and for moderator variables (interactions), and yield overly narrow confidence intervals, while random effects (RE) models do not.
Abstract: Research conclusions in the social sciences are increasingly based on meta-analysis, making questions of the accuracy of meta-analysis critical to the integrity of the base of cumulative knowledge. Both fixed effects (FE) and random effects (RE) meta-analysis models have been used widely in published meta-analyses. This article shows that FE models typically manifest a substantial Type I bias in significance tests for mean effect sizes and for moderator variables (interactions), while RE models do not. Likewise, FE models, but not RE models, yield confidence intervals for mean effect sizes that are narrower than their nominal width, thereby overstating the degree of precision in meta-analysis findings. This article demonstrates analytically that these biases in FE procedures are large enough to create serious distortions in conclusions about cumulative knowledge in the research literature. We therefore recommend that RE methods routinely be employed in meta-analysis in preference to FE methods.
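
The FE/RE contrast is easy to make concrete. The Python sketch below, using hypothetical effect sizes and within-study variances, implements the standard inverse-variance fixed-effect estimator and the DerSimonian-Laird random-effects estimator; it is a generic illustration of the two model families, not Hunter and Schmidt's own procedure. It shows how the RE weights fold in an estimate of between-study variance (tau-squared), so the RE confidence interval widens when studies are heterogeneous, while the FE interval stays artificially narrow.

```python
# Minimal sketch: inverse-variance fixed-effect (FE) vs. DerSimonian-Laird
# random-effects (RE) meta-analysis on hypothetical data. Illustrative only;
# not the Hunter-Schmidt procedure.
import math

# Hypothetical per-study effect sizes and within-study sampling variances.
effects = [0.30, 0.10, 0.55, 0.20, 0.45]
variances = [0.02, 0.03, 0.02, 0.04, 0.03]
k = len(effects)

# FE estimate: each study weighted by 1 / within-study variance.
w_fe = [1.0 / v for v in variances]
mean_fe = sum(w * y for w, y in zip(w_fe, effects)) / sum(w_fe)
se_fe = math.sqrt(1.0 / sum(w_fe))

# DerSimonian-Laird estimate of the between-study variance tau^2,
# derived from Cochran's Q statistic.
q = sum(w * (y - mean_fe) ** 2 for w, y in zip(w_fe, effects))
c = sum(w_fe) - sum(w * w for w in w_fe) / sum(w_fe)
tau2 = max(0.0, (q - (k - 1)) / c)

# RE estimate: weights add tau^2, so the standard error (and the CI)
# grows whenever the studies are heterogeneous.
w_re = [1.0 / (v + tau2) for v in variances]
mean_re = sum(w * y for w, y in zip(w_re, effects)) / sum(w_re)
se_re = math.sqrt(1.0 / sum(w_re))

for label, mean, se in (("FE", mean_fe, se_fe), ("RE", mean_re, se_re)):
    print(f"{label}: mean = {mean:.3f}, "
          f"95% CI [{mean - 1.96 * se:.3f}, {mean + 1.96 * se:.3f}]")
```

With these inputs the FE interval comes out noticeably narrower than the RE interval, which is exactly the overstated precision the abstract describes.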

684 citations


Journal Article
TL;DR: After accounting for statistical artifacts, the authors found no evidence that items on currently used ability and achievement tests function differently across racial and gender groups.
Abstract: The study of potential racial and gender bias in individual test items is a major research area today. The fact that research has established that total scores on ability and achievement tests are predictively unbiased raises the question of whether there is in fact any real bias at the item level. No theoretical rationale for expecting such bias has been advanced. It appears that findings of item bias (differential item functioning; DIF) can be explained by three factors: failure to control for measurement error in ability estimates, violations of the unidimensionality assumption required by DIF detection methods, and reliance on significance testing (causing tiny artifactual DIF effects to be statistically significant because sample sizes are very large). After taking these artifacts into account, there appears to be no evidence that items on currently used tests function differently in different racial and gender groups. For the past 30 years, civil rights lawyers, journalists, and others have alleged that cognitive ability and educational achievement tests are predictively biased against minorities. That is, they have argued that when test scores are equal, minorities have higher average levels of educational and work performance, meaning that test scores underestimate the real-world performance of minorities. Thousands of test bias studies have been conducted, and these studies have disconfirmed that hypothesis. The National Academy of Sciences appointed two blue-ribbon committees to study the data from these studies, and both committees concluded that professionally developed tests are not predictively biased (Hartigan & Wigdor, 1989; Wigdor & Garner, 1982). Thus, the issue of test bias is scientifically dead. But for the past 15 years, there have been repeated claims that while tests may not be biased, many individual test items are biased. These claims have spawned major efforts to identify and remove biased test items. These claims are logically inconsistent. Consider the issue of racial bias: if a large portion of the items on a test were biased against Blacks, then the test as a whole would be biased against Blacks. Because the test as a whole is known to be unbiased, there must be something wrong with the claim that there are a large number of biased items on the test. The currently proposed solution to this logical problem is that items are not biased only against minority groups; rather, there are items biased against Whites as well as items biased against minorities.
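
The first of the three artifacts, failure to control for measurement error in ability estimates, can be illustrated with a short simulation. In the hypothetical sketch below (illustrative parameters, not the authors' analysis), both groups answer an item governed by identical parameters, so there is no true DIF; but because examinees are matched on an error-laden observed score, the group with the higher ability mean retains a higher expected true ability at any matched score, and the item appears to function differently.

```python
# Minimal simulation of artifactual DIF caused by measurement error in the
# matching score. Hypothetical parameters; a sketch of the mechanism, not
# the authors' analysis.
import math
import random

random.seed(0)

def matched_pass_rate(group_mean, n=200_000):
    """Pass rate on one item among examinees matched at observed score ~ 0."""
    hits = total = 0
    for _ in range(n):
        theta = random.gauss(group_mean, 1.0)       # true ability
        observed = theta + random.gauss(0.0, 0.7)   # error-laden matching score
        if abs(observed) < 0.1:                     # match groups on observed score
            p = 1.0 / (1.0 + math.exp(-theta))      # identical item parameters
            hits += random.random() < p
            total += 1
    return hits / total

# No true DIF exists, yet the higher-ability group passes more often at the
# matched observed score, because its expected true ability there is higher.
print("Group A (ability mean 0.0):", round(matched_pass_rate(0.0), 3))
print("Group B (ability mean 1.0):", round(matched_pass_rate(1.0), 3))
```

Running this shows group B passing the item more often than group A at the same matched observed score, a purely artifactual DIF effect of the kind the abstract attributes to uncontrolled measurement error.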

53 citations