Monograph

The essential guide to effect sizes: statistical power, meta-analysis, and the interpretation of research results

01 Jul 2010
TL;DR: This book introduces effect sizes, statistical power analysis, and meta-analysis, and shows how each informs the interpretation of research results and the choice of sample sizes.
Abstract: List of figures List of tables List of boxes Introduction Part I. Effect Sizes and the Interpretation of Results: 1. Introduction to effect sizes 2. Interpreting effects Part II. The Analysis of Statistical Power: 3. Power analysis and the detection of effects 4. The painful lessons of power research Part III. Meta-Analysis: 5. Drawing conclusions using meta-analysis 6. Minimizing bias in meta-analysis Last word: thirty recommendations for researchers Appendices: 1. Minimum sample sizes 2. Alternative methods for meta-analysis Bibliography Index.
Citations
Book
01 Jun 2015
TL;DR: A practical primer on how to calculate and report effect sizes for t-tests and ANOVAs such that effect sizes can be used in a-priori power analyses and meta-analyses, together with a detailed overview of the similarities and differences between within- and between-subjects designs.
Abstract: Effect sizes are the most important outcome of empirical studies. Most articles on effect sizes highlight their importance for communicating the practical significance of results. For scientists themselves, effect sizes are most useful because they facilitate cumulative science. Effect sizes can be used to determine the sample size for follow-up studies, or to examine effects across studies. This article aims to provide a practical primer on how to calculate and report effect sizes for t-tests and ANOVAs such that effect sizes can be used in a-priori power analyses and meta-analyses. Whereas many articles about effect sizes focus on between-subjects designs and address within-subjects designs only briefly, I provide a detailed overview of the similarities and differences between within- and between-subjects designs. I suggest that some research questions in experimental psychology examine inherently intra-individual effects, which makes effect sizes that incorporate the correlation between measures the best summary of the results. Finally, a supplementary spreadsheet is provided to make it as easy as possible for researchers to incorporate effect size calculations into their workflow.
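Among the calculations such a primer covers, the standardized mean difference for two independent groups is the simplest case; below is a minimal sketch in Python (the ratings and the function name are our own, purely illustrative):

```python
from statistics import mean, stdev

def cohens_d(group1, group2):
    """Cohen's d for two independent groups, using the pooled SD."""
    n1, n2 = len(group1), len(group2)
    # Pooled standard deviation from the two Bessel-corrected variances
    s_pooled = (((n1 - 1) * stdev(group1) ** 2 + (n2 - 1) * stdev(group2) ** 2)
                / (n1 + n2 - 2)) ** 0.5
    return (mean(group1) - mean(group2)) / s_pooled

# Hypothetical ratings for two groups of ten participants each
movie1 = [9, 8, 9, 10, 8, 9, 8, 9, 8, 9]
movie2 = [8, 7, 8, 9, 7, 8, 7, 8, 7, 8]
d = cohens_d(movie1, movie2)
```

For within-subjects designs, as the article stresses, the correlation between measures must enter the denominator, so this between-subjects formula does not carry over unchanged.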

5,374 citations


Cites methods from "The essential guide to effect sizes..."

  • ...To report this analysis, researchers could write in the procedure section that: “Twenty participants evaluated either Movie 1 (n = 10) or Movie 2 (n = 10). Participants reported higher evaluations of Movie 1 (M = 8.7, SD = 0.82) than Movie 2 (M = 7.7, SD = 0.95), F(1, 18) = 6.34, p = 0.022, ηp² = 0.26, 90% CI [0.02, 0.48].” Whereas in a t-test, we compare two groups, and can therefore calculate a confidence interval for the mean difference, we can perform F-tests for comparisons between more than two groups. To be able to communicate the uncertainty in the data, we should still report a confidence interval, but now we report the confidence interval around the effect size. An excellent explanation of confidence intervals around effect size estimates for F-tests, which is accompanied by easy-to-use syntax files for a range of statistical software packages (including SPSS), is provided by Smithson (2001)....


  • ...Although η² is an efficient way to compare the sizes of effects within a study (given that every effect is interpreted in relation to the total variance, all η² from a single study sum to 100%), eta squared cannot easily be compared between studies, because the total variability in a study (SStotal) depends on the design of a study, and increases when additional variables are manipulated. Keppel (1991) has recommended partial eta squared (ηp²) to improve the comparability of effect sizes between studies, which expresses the sum of squares of the effect in relation to the sum of squares of the effect and the sum of squares of the error associated with the effect....

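The contrast drawn in the excerpt above reduces to two ratios of sums of squares; a minimal sketch, with invented SS values from a hypothetical two-factor ANOVA:

```python
def eta_squared(ss_effect, ss_total):
    """Classical eta squared: effect variance relative to total variance."""
    return ss_effect / ss_total

def partial_eta_squared(ss_effect, ss_error):
    """Partial eta squared: effect relative to effect plus its error term only,
    so it does not shrink when other manipulated factors add to SStotal."""
    return ss_effect / (ss_effect + ss_error)

# Invented sums of squares for factor A in a two-factor design
ss_a, ss_error, ss_total = 40.0, 100.0, 150.0
eta_a = eta_squared(ss_a, ss_total)            # 40/150 ≈ 0.267
partial_eta_a = partial_eta_squared(ss_a, ss_error)  # 40/140 ≈ 0.286
```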

Journal ArticleDOI
TL;DR: A straightforward guide to understanding, selecting, calculating, and interpreting effect sizes for many types of data and to methods for calculating effect size confidence intervals and power analysis is provided.
Abstract: The Publication Manual of the American Psychological Association (American Psychological Association, 2001, American Psychological Association, 2010) calls for the reporting of effect sizes and their confidence intervals. Estimates of effect size are useful for determining the practical or theoretical importance of an effect, the relative contributions of factors, and the power of an analysis. We surveyed articles published in 2009 and 2010 in the Journal of Experimental Psychology: General, noting the statistical analyses reported and the associated reporting of effect size estimates. Effect sizes were reported for fewer than half of the analyses; no article reported a confidence interval for an effect size. The most often reported analysis was analysis of variance, and almost half of these reports were not accompanied by effect sizes. Partial η2 was the most commonly reported effect size estimate for analysis of variance. For t tests, 2/3 of the articles did not report an associated effect size estimate; Cohen's d was the most often reported. We provide a straightforward guide to understanding, selecting, calculating, and interpreting effect sizes for many types of data and to methods for calculating effect size confidence intervals and power analysis.
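Since the survey found that no article reported a confidence interval for an effect size, a sketch of one common route to such an interval may help: the large-sample variance approximation for Cohen's d (exact intervals require the noncentral t distribution; the function name is ours):

```python
from statistics import NormalDist

def d_confidence_interval(d, n1, n2, level=0.95):
    """Approximate CI for Cohen's d via the common large-sample variance
    formula; a rough sketch, not a substitute for noncentral-t methods."""
    var_d = (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * var_d ** 0.5
    return d - half_width, d + half_width

lo, hi = d_confidence_interval(0.5, 30, 30)
```

Note how wide the interval is at n = 30 per group: with a medium effect, the 95% CI still brushes zero, which is exactly the uncertainty the authors argue should be reported.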

3,117 citations


Cites background from "The essential guide to effect sizes..."

  • ...Readers may prefer to consult specialized statistics books addressing effect sizes (e.g., Cumming, 2012; Ellis, 2010; Grissom & Kim, 2005; Rosenthal et al., 2000)....


  • ...…columns of a contingency table represent a predictor and a predicted variable, Goodman–Kruskal’s lambda (L) describes how much the prediction is improved by knowing the category for the predictor, a potentially useful description of the size of the effect (Ellis, 2010; Siegel & Castellan, 1988)....


  • ...Rosnow and Rosenthal (1989), for example, illustrated how a very small effect relating to life-threatening situations, such as the reduction of heart attacks, is important in the context of saving lives on a worldwide basis (see Table 12 and Ellis, 2010)....


  • ...They provide a description of the size of observed effects that is independent of the possibly misleading influences of sample size....


  • ...A discussion of the rather confusing history of the chosen symbols for these statistics can be found in Ellis (2010)....


Journal ArticleDOI
TL;DR: An eight-step new-statistics strategy for research with integrity is described, which starts with formulation of research questions in estimation terms, has no place for NHST, and is aimed at building a cumulative quantitative discipline.
Abstract: We need to make substantial changes to how we conduct research. First, in response to heightened concern that our published research literature is incomplete and untrustworthy, we need new requirements to ensure research integrity. These include prespecification of studies whenever possible, avoidance of selection and other inappropriate data-analytic practices, complete reporting, and encouragement of replication. Second, in response to renewed recognition of the severe flaws of null-hypothesis significance testing (NHST), we need to shift from reliance on NHST to estimation and other preferred techniques. The new statistics refers to recommended practices, including estimation based on effect sizes, confidence intervals, and meta-analysis. The techniques are not new, but adopting them widely would be new for many researchers, as well as highly beneficial. This article explains why the new statistics are important and offers guidance for their use. It describes an eight-step new-statistics strategy for research with integrity, which starts with formulation of research questions in estimation terms, has no place for NHST, and is aimed at building a cumulative quantitative discipline.
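The estimation techniques recommended here culminate in meta-analysis, whose core is inverse-variance weighting; a sketch with invented numbers (fixed-effect model only; random-effects models add a between-study variance term):

```python
def fixed_effect_meta(effects, std_errors):
    """Fixed-effect meta-analysis: pool study effects weighted by the
    inverse of their variances, returning the pooled estimate and its SE."""
    weights = [1 / se ** 2 for se in std_errors]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    pooled_se = (1 / sum(weights)) ** 0.5
    return pooled, pooled_se

# Two invented studies: the more precise one (SE = 0.1) dominates the pool
pooled, pooled_se = fixed_effect_meta([0.4, 0.6], [0.1, 0.2])
```

The pooled SE is smaller than either study's own SE, which is the cumulative payoff the article argues for: each estimate narrows the interval around the effect.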

2,339 citations


Cites background from "The essential guide to effect sizes..."

  • ...Ellis (2010) provided an accessible introduction to a range of ES measures....


Journal ArticleDOI
TL;DR: Data are presented on the prevalence, impairment and demographic correlates of depression from 18 high- and low- to middle-income countries in the World Mental Health Survey Initiative, and future research is called for to investigate the combination of demographic risk factors most strongly associated with MDE in the specific countries included in the WMH.
Abstract: Major depression is one of the leading causes of disability worldwide, yet epidemiologic data are not available for many countries, particularly low- to middle-income countries. In this paper, we present data on the prevalence, impairment and demographic correlates of depression from 18 high- and low- to middle-income countries in the World Mental Health Survey Initiative. Major depressive episodes (MDE) as defined by the Diagnostic and Statistical Manual of Mental Disorders, fourth edition (DSM-IV) were evaluated in face-to-face interviews using the World Health Organization Composite International Diagnostic Interview (CIDI). Data from 18 countries were analyzed in this report (n = 89,037). All countries surveyed representative, population-based samples of adults. The average lifetime and 12-month prevalence estimates of DSM-IV MDE were 14.6% and 5.5% in the ten high-income and 11.1% and 5.9% in the eight low- to middle-income countries. The average age of onset ascertained retrospectively was 25.7 in the high-income and 24.0 in low- to middle-income countries. Functional impairment was associated with recency of MDE. The female:male ratio was about 2:1. In high-income countries, younger age was associated with higher 12-month prevalence; by contrast, in several low- to middle-income countries, older age was associated with greater likelihood of MDE. The strongest demographic correlate in high-income countries was being separated from a partner, and in low- to middle-income countries, was being divorced or widowed. MDE is a significant public-health concern across all regions of the world and is strongly linked to social conditions. Future research is needed to investigate the combination of demographic risk factors that are most strongly associated with MDE in the specific countries included in the WMH.

1,681 citations

Journal ArticleDOI
TL;DR: There is a sufficient body of evidence to accept with level A (definite efficacy) the analgesic effect of high-frequency rTMS of the primary motor cortex (M1) contralateral to the pain and the antidepressant effect of HF-rTMS of the left dorsolateral prefrontal cortex (DLPFC).

1,554 citations

References
Journal ArticleDOI
TL;DR: G*Power 3 provides improved effect size calculators and graphic options, supports both distribution-based and design-based input modes, and offers all types of power analyses in which users might be interested.
Abstract: G*Power (Erdfelder, Faul, & Buchner, 1996) was designed as a general stand-alone power analysis program for statistical tests commonly used in social and behavioral research. G*Power 3 is a major extension of, and improvement over, the previous versions. It runs on widely used computer platforms (i.e., Windows XP, Windows Vista, and Mac OS X 10.4) and covers many different statistical tests of the t, F, and χ2 test families. In addition, it includes power analyses for z tests and some exact tests. G*Power 3 provides improved effect size calculators and graphic options, supports both distribution-based and design-based input modes, and offers all types of power analyses in which users might be interested. Like its predecessors, G*Power 3 is free.

40,195 citations

Journal ArticleDOI
Jacob Cohen1
TL;DR: A convenient, although not comprehensive, presentation of required sample sizes is provided here; the sample sizes necessary for .80 power to detect effects at these levels are tabled for eight standard statistical tests.
Abstract: One possible reason for the continued neglect of statistical power analysis in research in the behavioral sciences is the inaccessibility of or difficulty with the standard material. A convenient, although not comprehensive, presentation of required sample sizes is provided here. Effect-size indexes and conventional values for these are given for operationally defined small, medium, and large effects. The sample sizes necessary for .80 power to detect effects at these levels are tabled for eight standard statistical tests: (a) the difference between independent means, (b) the significance of a product-moment correlation, (c) the difference between independent rs, (d) the sign test, (e) the difference between independent proportions, (f) chi-square tests for goodness of fit and contingency tables, (g) one-way analysis of variance, and (h) the significance of a multiple or multiple partial correlation.
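The tabled sample sizes can be approximated in code. The sketch below uses the normal approximation for a two-sided two-sample t test, which lands slightly under Cohen's exact tabled values (e.g. 63 versus the tabled 64 per group for a medium effect, d = .5, at .80 power):

```python
from math import ceil
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided, two-sample t test on the
    difference between independent means, via the normal approximation.
    Slightly underestimates the exact n, which requires the noncentral t."""
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2)   # critical value for the two-sided test
    z_beta = z(power)            # quantile corresponding to desired power
    return ceil(2 * ((z_alpha + z_beta) / d) ** 2)

n_medium = n_per_group(0.5)   # ≈ 63 per group (Cohen's table: 64)
n_large = n_per_group(0.8)    # ≈ 25 per group
```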

38,291 citations

Journal ArticleDOI
13 Sep 1997 - BMJ
TL;DR: Funnel plots, plots of the trials' effect estimates against sample size, are skewed and asymmetrical in the presence of publication bias and other biases; funnel plot asymmetry, measured by regression analysis, predicts discordance of results when meta-analyses are compared with single large trials.
Abstract: Objective: Funnel plots (plots of effect estimates against sample size) may be useful to detect bias in meta-analyses that were later contradicted by large trials. We examined whether a simple test of asymmetry of funnel plots predicts discordance of results when meta-analyses are compared to large trials, and we assessed the prevalence of bias in published meta-analyses. Design: Medline search to identify pairs consisting of a meta-analysis and a single large trial (concordance of results was assumed if effects were in the same direction and the meta-analytic estimate was within 30% of the trial); analysis of funnel plots from 37 meta-analyses identified from a hand search of four leading general medicine journals 1993-6 and 38 meta-analyses from the second 1996 issue of the Cochrane Database of Systematic Reviews . Main outcome measure: Degree of funnel plot asymmetry as measured by the intercept from regression of standard normal deviates against precision. Results: In the eight pairs of meta-analysis and large trial that were identified (five from cardiovascular medicine, one from diabetic medicine, one from geriatric medicine, one from perinatal medicine) there were four concordant and four discordant pairs. In all cases discordance was due to meta-analyses showing larger effects. Funnel plot asymmetry was present in three out of four discordant pairs but in none of concordant pairs. In 14 (38%) journal meta-analyses and 5 (13%) Cochrane reviews, funnel plot asymmetry indicated that there was bias. Conclusions: A simple analysis of funnel plots provides a useful test for the likely presence of bias in meta-analyses, but as the capacity to detect bias will be limited when meta-analyses are based on a limited number of small trials the results from such analyses should be treated with considerable caution. 
Key messages:
  • Systematic reviews of randomised trials are the best strategy for appraising evidence; however, the findings of some meta-analyses were later contradicted by large trials
  • Funnel plots, plots of the trials' effect estimates against sample size, are skewed and asymmetrical in the presence of publication bias and other biases
  • Funnel plot asymmetry, measured by regression analysis, predicts discordance of results when meta-analyses are compared with single large trials
  • Funnel plot asymmetry was found in 38% of meta-analyses published in leading general medicine journals and in 13% of reviews from the Cochrane Database of Systematic Reviews
  • Critical examination of systematic reviews for publication and related biases should be considered a routine procedure
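The regression described in the main outcome measure (standard normal deviates against precision) is straightforward to sketch with ordinary least squares; the data and the function name below are invented:

```python
def asymmetry_intercept(effects, std_errors):
    """Regress each trial's standard normal deviate (effect / SE) on its
    precision (1 / SE); an intercept far from zero suggests funnel plot
    asymmetry, as in the regression test described in this paper."""
    snd = [e / se for e, se in zip(effects, std_errors)]
    precision = [1 / se for se in std_errors]
    n = len(snd)
    mean_x = sum(precision) / n
    mean_y = sum(snd) / n
    # Ordinary least squares slope and intercept
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(precision, snd))
    s_xx = sum((x - mean_x) ** 2 for x in precision)
    slope = s_xy / s_xx
    return mean_y - slope * mean_x

# Invented trials: small trials (large SE) report inflated effects
intercept = asymmetry_intercept([2.5, 3.0, 4.0], [0.5, 1.0, 2.0])
```

A full test would also report the standard error of the intercept to judge how far from zero it lies; this sketch shows only the point estimate.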

37,989 citations

Book
01 Jan 1981
TL;DR: A comprehensive reference on statistical methods for rates and proportions, covering inference for a single proportion, fourfold tables, the sample sizes needed to detect a difference between two proportions, and the analysis of matched and correlated categorical data.
Abstract: Preface.Preface to the Second Edition.Preface to the First Edition.1. An Introduction to Applied Probability.2. Statistical Inference for a Single Proportion.3. Assessing Significance in a Fourfold Table.4. Determining Sample Sizes Needed to Detect a Difference Between Two Proportions.5. How to Randomize.6. Comparative Studies: Cross-Sectional, Naturalistic, or Multinomial Sampling.7. Comparative Studies: Prospective and Retrospective Sampling.8. Randomized Controlled Trials.9. The Comparison of Proportions from Several Independent Samples.10. Combining Evidence from Fourfold Tables.11. Logistic Regression.12. Poisson Regression.13. Analysis of Data from Matched Samples.14. Regression Models for Matched Samples.15. Analysis of Correlated Binary Data.16. Missing Data.17. Misclassification Errors: Effects, Control, and Adjustment.18. The Measurement of Interrater Agreement.19. The Standardization of Rates.Appendix A. Numerical Tables.Appendix B. The Basic Theory of Maximum Likelihood Estimation.Appendix C. Answers to Selected Problems.Author Index.Subject Index.

16,435 citations

Book Chapter
28 Apr 2004
TL;DR: The comprehensive and accessible nature of this collection will make it an essential and lasting handbook for researchers and students studying organizations.
Abstract: Essential Guide to Qualitative Methods in Organizational Research is an excellent resource for students and researchers in the areas of organization studies, management research and organizational psychology, bringing together in one volume the range of methods available for undertaking qualitative data collection and analysis. The volume includes 30 chapters, each focusing on a specific technique. The chapters cover traditional research methods, analysis techniques, and interventions as well as the latest developments in the field. Each chapter reviews how the method has been used in organizational research, discusses the advantages and disadvantages of using the method, and presents a case study example of the method in use. A list of further reading is supplied for those requiring additional information about a given method. The comprehensive and accessible nature of this collection will make it an essential and lasting handbook for researchers and students studying organizations.

16,383 citations