Journal Article

Are there benefits from NHST?

TL;DR
Although induction (hypothesis testing) is central to science, neither NHST nor any other form of statistical significance testing is required for hypothesis testing; significance testing instead retards the search for knowledge by producing false conclusions about research literatures.
Abstract
Krueger’s (January 2001) thesis can be summarized as follows: Induction is central to science (true). From a philosophical perspective, induction cannot be defended logically but can be defended pragmatically, because it leads to progress in science (true). Null hypothesis significance testing (NHST) is a good tool, perhaps the essential tool, for induction and inference from research data (false). NHST is like induction: It cannot be defended logically (true) but can be defended pragmatically (false). The pragmatic defense Krueger offered is the contention that NHST has “proven useful” (p. 24) and that it “rewards the pragmatic scientist” (p. 23; both false).

Induction as used by Krueger (2001) is the same as hypothesis testing, and it is undisputed that hypothesis testing is indispensable for science. Krueger’s position, then, reduces to the proposition that NHST is the best procedure, and perhaps the essential procedure, for testing hypotheses. This false argument has long been offered as a defense of significance testing (Schmidt & Hunter, 1997, pp. 42–44). In its strong form, the argument is that without significance testing, psychologists could not have a science because they would no longer be able to test hypotheses. The physical sciences, such as physics and chemistry, do not use NHST or statistical significance tests of any kind, yet these sciences test hypotheses and have done so for centuries. In fact, in contrast to many psychologists, most researchers in the physical sciences regard reliance on significance testing as unscientific (Schmidt, 1996). If the argument is that hypothesis testing requires the use of significance tests, then the logical implications are that physicists and chemists are not really testing hypotheses and that their research is not really scientific. How plausible is this? If the argument is instead that significance testing is the best method of testing hypotheses in science, then the logical implication is that the hypothesis-testing methods used in physics and chemistry are suboptimal and inferior to those based on NHST and typically used in psychology. Does anyone really believe this? Hence, although induction (hypothesis testing) is central to science, neither NHST nor any other form of statistical significance testing is required for hypothesis testing.

Significance testing almost invariably retards the search for knowledge by producing false conclusions about research literature. The evidence is strong that the null hypothesis is almost always false in psychological research. For example, Lipsey and Wilson (1993) examined 302 meta-analyses of psychological interventions of all kinds in many areas of psychology. In only 2 of these meta-analyses (fewer than 1% of the 302) were the effect sizes (ESs) zero or near zero. An examination of all published meta-analyses would produce a similar figure for psychology as a whole. If the null hypothesis is typically false, then Type I error is unimportant, because it is impossible to make a Type I error when the null is false. What matters is Type II error: failing to detect the effect or relation that is there. One minus the Type II error rate is the statistical power of the study: the probability of detecting the effect or relation. The evidence is clear that the average level of statistical power in psychological research is between .40 and .60 (e.g., see Cohen, 1962, 1994; Sedlmeier & Gigerenzer, 1989).
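To make the power figures concrete, the following minimal sketch (illustrative, not from the article; the effect size d = 0.5, the group size of 32, and the use of SciPy's noncentral t distribution are assumptions) computes the power of a two-tailed, two-sample t test for a typical psychology design. The result lands near .50, inside the .40–.60 range just cited.

```python
# Illustrative sketch (assumptions: d = 0.5, n = 32 per group, alpha = .05);
# not taken from the article itself.
from scipy import stats

def two_sample_power(d: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Power of a two-tailed, two-sample t test with equal group sizes."""
    df = 2 * n_per_group - 2
    ncp = d * (n_per_group / 2) ** 0.5        # noncentrality parameter
    t_crit = stats.t.ppf(1 - alpha / 2, df)   # two-tailed critical value
    # P(|T| > t_crit) when the true effect is d (noncentral t distribution).
    return (1 - stats.nct.cdf(t_crit, df, ncp)) + stats.nct.cdf(-t_crit, df, ncp)

print(f"power = {two_sample_power(0.5, 32):.2f}")  # ~0.50: a coin flip
```

For comparison, two_sample_power(0.5, 64) returns roughly .80, the conventional benchmark, which shows how far short typical designs fall.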
The operational decision rule used by researchers is “if it is significant, it is real; if it is not significant, it is zero” (Schmidt, 1996). Hence, the error rate in the typical psychological research literature is approximately 50%; that is, half of all studies reach false conclusions about the null hypothesis, a situation of maximal apparent conflict in the literature. As discussed by Schmidt (1996), this leads to one of two false conclusions about the meaning of the literature. The first is that the literature is so conflicting that nothing can be concluded. The second is that there are interactions or moderator variables that cause the effect to exist in some studies and to be nonexistent in others, and that research should be directed at finding these moderator variables. Meta-analysis typically indicates that both of these conclusions are false, by revealing that the effect exists in all studies (Schmidt, 1996).

Significance tests are a disastrous method for testing hypotheses, but a better method does exist: use of point estimates (ESs) and confidence intervals (CIs). First, unlike significance tests, CIs hold the real error rate to .05 (or whatever confidence level is set); there is no possibility of a higher error rate, as there is with significance tests. In particular, the true error rate will never be 50% when the researcher thinks it is 5% (because the alpha level is set at .05). Second, almost all of the CIs from different studies overlap each other, correctly suggesting that the studies are not contradictory. Third, the CI clearly reveals the level of uncertainty in the study results; unlike the significance test, the CI provides an index of the effects of sampling error on the results. Finally, the ES provides the information needed for subsequent meta-analyses, whereas the significance test does not.

Krueger (2001) stated, “In daily research activities, NHST has proven useful. Researchers make decisions concerning the validity of hypotheses, and although their decisions sometimes disagree, they are not random or arbitrary” (p. 24). First, Krueger presented no evidence to support his assertion of NHST’s usefulness. As shown above, significance testing creates confusion and false conclusions about research literature. How is this useful? Second, researchers’ conclusions disagree more often than sometimes: in the typical research literature, they disagree 50% of the time.
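The arithmetic behind these claims can be checked by simulation. The sketch below is illustrative, not taken from the article; the true effect size (d = 0.5), group size (n = 30), number of studies (1,000), and the SciPy-based implementation are all assumptions chosen to mirror a typical underpowered literature. Roughly half of the simulated studies come out nonsignificant and would be read as “zero” under the decision rule quoted above, while about 95% of the CIs cover the true effect (so the CIs from different studies overlap), and simple aggregation across studies recovers the true effect size, as meta-analysis does.

```python
# Simulation sketch (assumed values: true effect d = 0.5, n = 30 per group,
# 1,000 studies, alpha = .05); not taken from the article itself.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
d_true, n, reps = 0.5, 30, 1_000

significant, ci_covers, estimates = 0, 0, []
for _ in range(reps):
    control = rng.normal(0.0, 1.0, n)
    treated = rng.normal(d_true, 1.0, n)        # true effect = 0.5 SD
    _, p = stats.ttest_ind(treated, control)    # pooled-variance t test
    significant += p < 0.05

    # 95% CI for the mean difference (pooled variance, df = 2n - 2).
    diff = treated.mean() - control.mean()
    sp2 = ((n - 1) * treated.var(ddof=1)
           + (n - 1) * control.var(ddof=1)) / (2 * n - 2)
    half_width = stats.t.ppf(0.975, 2 * n - 2) * np.sqrt(sp2 * 2 / n)
    ci_covers += (diff - half_width) <= d_true <= (diff + half_width)
    estimates.append(diff)

print(f"studies read as 'real effect': {significant / reps:.0%}")  # roughly half
print(f"95% CIs covering the true ES:  {ci_covers / reps:.0%}")    # about 95%
print(f"mean ES across all studies:    {np.mean(estimates):.2f}")  # near 0.50
```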


Citations
Journal Article

Does the Positive Psychology Movement Have Legs?

TL;DR: The author states that if individuals engage in positive thinking and feeling and abandon or minimize their preoccupation with the harsh and tragic (the stressful) side of life, they will have found a magic elixir of health and well-being.
Journal Article

Conspiracy Beliefs, Rejection of Vaccination, and Support for Hydroxychloroquine: A Conceptual Replication-Extension in the COVID-19 Pandemic Context

TL;DR: COVID-19 conspiracy beliefs (among them, conspiracy beliefs about chloroquine), as well as a conspiracy mentality (i.e., a predisposition to believe in conspiracy theories), negatively predicted participants’ intentions to be vaccinated against COVID-19 in the future.
Journal Article

The Chrysalis Effect: How Ugly Initial Results Metamorphosize Into Beautiful Articles

TL;DR: In this article, the authors outline the means, motives, and opportunities for researchers to better their chances of publication independent of rigor and relevance, and assess the frequency of questionable research practices in management research by tracking differences between dissertations and their resulting journal publications.
Journal Article

Statistical inference with PLSc using bootstrap confidence intervals

TL;DR: Evidence is provided of the value of employing bootstrap confidence intervals in conjunction with PLSc, which is a more appropriate alternative to PLS for many of the research scenarios that are of interest to the field.
Journal Article

Different lengths of times for progressions in adolescent substance involvement

TL;DR: Results suggested that faster transitions were driven more by drug-related constructs than by intrapersonal constructs; the shortest transition times were for opiates, followed by cocaine, cannabis, tobacco, and alcohol, in that order.
References
Book

Methods of Meta-Analysis: Correcting Error and Bias in Research Findings

TL;DR: In this book, the authors present meta-analysis methods based on artifact distributions and their impact on study outcomes, focusing mainly on second-order sampling error and related issues.
Journal Article

The earth is round (p < .05)

TL;DR: The author reviewed the problems with null hypothesis significance testing, including the near-universal misinterpretation of p as the probability that H₀ is false, the misinterpretation that its complement is the probability of successful replication, and the mistaken assumption that if one rejects H₀ one thereby affirms the theory that led to the test.
Journal Article

The efficacy of psychological, educational, and behavioral treatment: Confirmation from meta-analysis

TL;DR: Meta-analytic reviews show a strong, dramatic pattern of positive overall effects that cannot readily be explained as artifacts of meta-analytic technique or as generalized placebo effects.
Journal Article

Statistical Significance Testing and Cumulative Knowledge in Psychology: Implications for Training of Researchers

TL;DR: This article showed that the benefits that researchers believe flow from the use of significance testing are illusory, and that significance tests should be replaced with point estimates and confidence intervals in individual studies and with meta-analyses in the integration of findings across multiple studies.