Journal ArticleDOI

Using Effect Size-or Why the P Value Is Not Enough.

01 Sep 2012-Journal of Graduate Medical Education (J Grad Med Educ)-Vol. 4, Iss: 3, pp 279-282
TL;DR: Effect size helps readers understand the magnitude of differences found, whereas statistical significance examines whether the findings are likely to be due to chance; both are essential for readers to understand the full impact of your work.
Abstract: Effect size helps readers understand the magnitude of differences found, whereas statistical significance examines whether the findings are likely to be due to chance. Both are essential for readers to understand the full impact of your work. Report both in the Abstract and Results sections.
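As a minimal illustration of the advice to report both numbers, the sketch below pairs a two-sample t-test p-value (via SciPy) with a hand-rolled Cohen's d; the group data and names are invented for illustration.

```python
import math
from statistics import mean, variance

from scipy import stats  # assumed available; used only for the t-test

def cohens_d(a, b):
    """Standardized mean difference using the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled_var)

# Hypothetical exam scores under two teaching methods
control   = [72, 75, 68, 71, 74, 70, 73, 69]
treatment = [78, 80, 74, 77, 81, 76, 79, 75]

t, p = stats.ttest_ind(treatment, control)
d = cohens_d(treatment, control)
print(f"t = {t:.2f}, p = {p:.4f}, d = {d:.2f}")
```

Reporting only p here would hide how large the difference actually is; reporting d alongside it conveys the magnitude directly, as the article recommends.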


Citations
01 Jan 2014
TL;DR: The main objectives of this contribution are to promote various effect size measures in sport sciences by once again bringing the benefits of reporting them to readers' attention, and to present examples of such estimates, with a greater focus on those that can be calculated for non-parametric tests.
Abstract: Recent years have witnessed a growing number of published reports that point out the need to report various effect size estimates in the context of null hypothesis testing (H0), in response to a tendency to report tests of statistical significance only, with less attention to other important aspects of statistical analysis. Despite considerable changes over the past several years, a neglect to report effect size estimates may still be noted in fields such as medical science, psychology, applied linguistics, and pedagogy. Nor have sport sciences managed to totally escape the grip of this suboptimal practice: statistical analyses in even some current research reports do not go much further than computing p-values. The p-value, however, is not meant to provide information on the actual strength of the relationship between variables, and does not allow the researcher to determine the effect of one variable on another. Effect size measures serve this purpose well. While the number of reports containing statistical estimates of effect sizes calculated after applying parametric tests is steadily increasing, reporting effect sizes with non-parametric tests is still very rare. Hence, the main objectives of this contribution are to promote various effect size measures in sport sciences by once again bringing the benefits of reporting them to readers' attention, and to present examples of such estimates, with a greater focus on those that can be calculated for non-parametric tests.
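For the non-parametric tests the abstract emphasizes, one straightforward effect size is the rank-biserial correlation obtained from the Mann-Whitney U statistic. A hedged sketch; the data and helper names are illustrative, not from the article:

```python
def mann_whitney_u(a, b):
    """U for group a: pairs with a_i > b_j count 1, ties count 0.5."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0
               for x in a for y in b)

def rank_biserial(a, b):
    """Rank-biserial correlation r = 2U/(n1*n2) - 1, ranging over [-1, 1]."""
    return 2 * mann_whitney_u(a, b) / (len(a) * len(b)) - 1

# Hypothetical symptom scores for two small groups
drug    = [14, 11, 12, 15, 13]
placebo = [10, 9, 12, 11, 8]
print(rank_biserial(drug, placebo))  # positive: drug ranks mostly higher
```

Because it is built from pairwise comparisons rather than means, this index stays meaningful for the ordinal or skewed data that motivate non-parametric tests in the first place.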

732 citations


Cites background from "Using Effect Size-or Why the P Valu..."

  • ...Relying on the p-value alone for statistical inference does not permit an evaluation of the magnitude and importance of the obtained result [10, 12, 13]....


  • ...Sometimes a result that is statistically significant mainly indicates that a huge sample size was used [10, 11]....


Journal ArticleDOI
TL;DR: In this article, the authors provided evidence-based recommendations on early intervention in clinical high risk (CHR) states of psychosis, assessed according to the EPA guidance on early detection, derived from a meta-analysis of current empirical evidence on the efficacy of psychological and pharmacological interventions in CHR samples.

432 citations

Journal ArticleDOI
TL;DR: This study describes and compares several Bayesian indices, provides intuitive visual representations of their "behavior" in relation to common sources of variance such as sample size, magnitude of effects, and frequentist significance, and contributes to an intuitive understanding of the values that researchers report, which is critical for the standardization of scientific reporting.
Abstract: Turmoil has engulfed psychological science. Causes and consequences of the reproducibility crisis are in dispute. With the hope of addressing some of its aspects, Bayesian methods are gaining increasing attention in psychological science. Some of their advantages, as opposed to the frequentist framework, are the ability to describe parameters in probabilistic terms and to explicitly incorporate prior knowledge about them into the model. These issues are crucial in particular regarding the current debate about statistical significance. Bayesian methods are not necessarily the only remedy against incorrect interpretations or wrong conclusions, but there is increasing agreement that they are one of the keys to avoiding such fallacies. Nevertheless, their flexible nature is both their power and their weakness, for there is no agreement about which indices of "significance" should be computed or reported. This lack of a consensual index or guidelines, analogous to the frequentist p-value, further contributes to the unnecessary opacity that many unfamiliar readers perceive in Bayesian statistics. Thus, this study describes and compares several Bayesian indices and provides intuitive visual representations of their "behavior" in relation to common sources of variance such as sample size, magnitude of effects, and frequentist significance. The results contribute to the development of an intuitive understanding of the values that researchers report, allowing sensible recommendations to be drawn for the description of Bayesian statistics, which is critical for the standardization of scientific reporting.
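Two of the indices such comparisons typically cover, the probability of direction (pd) and the proportion of the posterior inside a region of practical equivalence (ROPE), can be computed directly from posterior samples. A minimal sketch; the simulated Gaussian posterior stands in for real MCMC output, and the ROPE bounds are arbitrary defaults:

```python
import random

def probability_of_direction(samples):
    """Proportion of the posterior sharing the sign of its median (0.5 to 1)."""
    med = sorted(samples)[len(samples) // 2]
    sign = 1 if med >= 0 else -1
    return sum(1 for s in samples if s * sign > 0) / len(samples)

def rope_fraction(samples, low=-0.1, high=0.1):
    """Share of the posterior falling inside the equivalence region."""
    return sum(1 for s in samples if low <= s <= high) / len(samples)

random.seed(42)
posterior = [random.gauss(0.3, 0.2) for _ in range(10_000)]
pd = probability_of_direction(posterior)
rope = rope_fraction(posterior)
print(f"pd = {pd:.3f}, in ROPE = {rope:.3f}")
```

pd is closely tied to the frequentist p-value (a high pd corresponds to a small two-sided p), while the ROPE fraction speaks to the magnitude question that p cannot answer.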

327 citations


Cites background from "Using Effect Size-or Why the P Valu..."

  • ...It is interesting to note that this perspective unites significance testing with the focus on effect size (involving a discrete separation between at least two categories: negligible and non-negligible), which finds an echo in recent statistical recommendations (Ellis and Steyn, 2003; Sullivan and Feinn, 2012; Simonsohn et al., 2014)....


Journal ArticleDOI
TL;DR: In conclusion, insect tasting sessions are important to decrease food neophobia, as they encourage people to “take the first step” and become acquainted with entomophagy.

315 citations

01 Jan 2008

274 citations

References
Journal ArticleDOI
TL;DR: The use of effect size reporting in the analysis of social science data remains inconsistent, interpretation of effect size estimates continues to be confused, and clinicians may have little guidance in interpreting the effect sizes relevant to clinical practice.
Abstract: Increasing emphasis has been placed on the use of effect size reporting in the analysis of social science data. Nonetheless, the use of effect size reporting remains inconsistent, and interpretation of effect size estimates continues to be confused. Researchers are presented with numerous effect sizes estimate options, not all of which are appropriate for every research question. Clinicians also may have little guidance in the interpretation of effect sizes relevant for clinical practice. The current article provides a primer of effect size estimates for the social sciences. Common effect sizes estimates, their use, and interpretations are presented as a guide for researchers.
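Because the options the primer surveys span different effect size families, converting between the group-difference metric d and the association metric r is a common need. A sketch of the standard conversion formulas; the function names are mine, not from the primer:

```python
import math

def d_to_r(d, n1, n2):
    """Convert Cohen's d to a point-biserial r, with a = (n1+n2)^2 / (n1*n2)."""
    a = (n1 + n2) ** 2 / (n1 * n2)
    return d / math.sqrt(d * d + a)

def r_to_d(r):
    """Convert r back to d, assuming equal group sizes (so a = 4)."""
    return 2 * r / math.sqrt(1 - r * r)

print(round(d_to_r(0.5, 50, 50), 3))  # a "medium" d maps to a modest r
```

The asymmetry of the round trip is worth noting: `r_to_d` is exact only for equal group sizes, one reason not every estimate is appropriate for every research question.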

2,680 citations

Journal ArticleDOI
Jacob Cohen1
TL;DR: Cohen recounts what he has learned about the application of statistics to psychology and the other sociobiomedical sciences, including the principles "less is more" (fewer variables, more highly targeted issues, sharp rounding off), "simple is better" (graphic representation, unit weighting for linear composites), and "some things you learn aren't so."
Abstract: This is an account of what I have learned (so far) about the application of statistics to psychology and the other sociobiomedical sciences. It includes the principles "less is more" (fewer variables, more highly targeted issues, sharp rounding off), "simple is better" (graphic representation, unit weighting for linear composites), and "some things you learn aren't so." I have learned to avoid the many misconceptions that surround Fisherian null hypothesis testing. I have also learned the importance of power analysis and the determination of just how big (rather than how statistically significant) are the effects that we study. Finally, I have learned that there is no royal road to statistical induction, that the informed judgment of the investigator is the crucial element in the interpretation of data, and that things take time.
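Cohen's point about determining how big an effect is, rather than how significant, connects directly to the power analysis he stresses: the expected effect size drives how many participants a study needs. A rough sketch using the normal approximation (the exact t-based answer is slightly larger); the defaults are conventional values, not taken from this article:

```python
import math
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate per-group n for a two-sample comparison to detect effect d."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_power) ** 2 / d ** 2)

# A "medium" effect (d = 0.5) needs about 63 per group;
# a "small" one (d = 0.2) needs nearly 400.
print(n_per_group(0.5), n_per_group(0.2))
```

The quadratic dependence on d makes Cohen's argument concrete: halving the expected effect size quadruples the required sample.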

1,764 citations

BookDOI
01 Jan 2004
TL;DR: Beyond Significance Testing offers integrative and clear presentations of the limitations of statistical tests and reviews alternative methods of data analysis, such as effect size estimation (at both the group and case levels) and interval estimation (i.e., confidence intervals).
Abstract: Practices of data analysis in psychology and related disciplines are changing. This is evident in the longstanding controversy about statistical tests in the behavioral sciences and the increasing number of journals requiring effect size information. Beyond Significance Testing offers integrative and clear presentations about the limitations of statistical tests and reviews alternative methods of data analysis, such as effect size estimation (at both the group and case levels) and interval estimation (i.e., confidence intervals). Written in a clear and accessible style, the book is intended for applied researchers and students who may not have strong quantitative backgrounds. Readers will learn how to measure effect size on continuous or dichotomous outcomes in comparative studies with independent or dependent samples. They will also learn how to calculate and correctly interpret confidence intervals for effect sizes. Numerous research examples from a wide range of areas illustrate the application of these principles and how to estimate substantive significance instead of just statistical significance.
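A short sketch of the interval-estimation idea the book advocates: an approximate confidence interval for Cohen's d built from its large-sample standard error. Exact noncentral-t intervals differ slightly, and the d and sample sizes below are invented for illustration:

```python
import math
from statistics import NormalDist

def d_confidence_interval(d, n1, n2, level=0.95):
    """Approximate CI for Cohen's d via the large-sample standard error."""
    se = math.sqrt((n1 + n2) / (n1 * n2) + d * d / (2 * (n1 + n2)))
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return d - z * se, d + z * se

lo, hi = d_confidence_interval(0.6, 40, 40)
print(f"d = 0.60, 95% CI [{lo:.2f}, {hi:.2f}]")
```

An interval excluding zero conveys statistical significance, while its width shows how precisely the effect is estimated, information a bare p-value omits.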

924 citations

01 Jan 2008

274 citations

Journal ArticleDOI
TL;DR: Kline reviews the controversy regarding significance testing, offers methods for effect size and confidence interval estimation, and suggests alternative methodologies, while recognizing that there is no "magical alternative" to statistical tests and that such tests are appropriate in some circumstances when applied correctly.
Abstract: REX B. KLINE Beyond Significance Testing: Reforming Data Analysis Methods in Behavioral Research Washington, DC: American Psychological Association, 2004, 336 pages (ISBN 1-59147-118-4, US$49.95 Hardcover) In 1999, a blue-ribbon task force assembled by the American Psychological Association published its findings with regard to the long-standing controversy over null hypothesis significance testing (NHST). The task force dictated that effect sizes and confidence intervals be reported, and that p values and dichotomous accept-reject decisions be given less weight. Editorial policies in a number of journals came to reflect the views of the task force, as did a subsequent revision to the American Psychological Association Publication Manual. Rex B. Kline wrote Beyond Significance Testing: Reforming Data Analysis Methods in Behavioral Research as a follow-up to both the task force recommendations and the revision to the publication manual. Kline's 1998 book Principles and Practice of Structural Equation Modeling (Guilford Press) was well received, and a second edition is being published this fall. In Beyond Significance Testing, Kline reviews the controversy regarding significance testing, offers methods for effect size and confidence interval estimation, and suggests some alternative methodologies. There is an accompanying website that includes resources for instructors and students. Part I of the book is a review of fundamental concepts and the debate regarding significance testing. Part II provides statistics for effect size and confidence interval estimation for parametric and nonparametric two-group, one-way, and factorial designs. Part III examines meta-analysis, resampling, and Bayesian estimation procedures. In the first chapter, Kline provides a scholarly summary of the null hypothesis testing debate, concluding with the APA task force findings and what Kline regards as ambiguous recommendations in the publication manual.
Kline predicts the future will see a smaller role for traditional statistical testing (p values) in psychology. This change will take time and may not occur until the next generation of researchers is trained, but Kline anticipates the social sciences will then become more like the natural sciences in that "we will report the directions and magnitudes of our effects, determine whether they replicate, and evaluate them for their theoretical, clinical, or practical significance" (p. 15). Chapter 2 is a review of fundamental concepts of research design, including sampling and estimation, the logic of statistical significance testing, and t, F, and chi-square tests. The problems with statistical tests are revisited in Chapter 3. What follows is a long list of errors in the interpretation of p values and of conclusions drawn after null hypothesis testing. The emphasis on null hypothesis significance testing in psychology is also argued to inhibit the advancement of the discipline. To be fair, Kline recognizes that there is as yet no "magical alternative" to statistical tests and that such tests are appropriate in some circumstances when applied correctly. Nonetheless, Kline envisions a future where effect sizes and confidence intervals are reported, substantive rather than statistical significance predominates, and "NHST-centric" thinking has diminished. Part II covers effect size and confidence interval calculations. Chapter 4 is a presentation of parametric effect size indexes. Independent and dependent sample statistics are covered separately. The textbook's website has a supplementary chapter on two-group multivariate designs. Group difference indexes such as d are distinguished from measures of association such as r. Case-level analyses of group differences are also reviewed. Sections not relevant to a reader's needs can be skipped without loss of continuity.
Interpretive guidelines for effect size magnitude and how one might be fooled by effect size estimation are sections that should not be passed over. …

174 citations