Cryptic multiple hypotheses testing in linear models: overestimated effect sizes and the winner's curse.
TL;DR: Full model tests and P value adjustments can be used as a guide to how frequently type I errors arise by sampling variation alone. The authors favour presenting full models, since these best reflect the range of predictors investigated and ensure that non-significant results are also represented in a balanced way.
Abstract:
Fitting generalised linear models (GLMs) with more than one predictor has become the standard method of analysis in evolutionary and behavioural research. Often, GLMs are used for exploratory data analysis, where one starts with a complex full model including interaction terms and then simplifies by removing non-significant terms. While this approach can be useful, it is problematic if significant effects are interpreted as if they arose from a single a priori hypothesis test. This is because model selection involves cryptic multiple hypothesis testing, a fact that has only rarely been acknowledged or quantified. We show that the probability of finding at least one ‘significant’ effect is high, even if all null hypotheses are true (e.g. 40% when starting with four predictors and their two-way interactions). This probability is close to theoretical expectations when the sample size (N) is large relative to the number of predictors including interactions (k). In contrast, type I error rates strongly exceed even those expectations when model simplification is applied to models that are over-fitted before simplification (low N/k ratio). The increase in false-positive results arises primarily from an overestimation of effect sizes among significant predictors, leading to upward-biased effect sizes that often cannot be reproduced in follow-up studies (‘the winner's curse’). Despite having their own problems, full model tests and P value adjustments can be used as a guide to how frequently type I errors arise by sampling variation alone. We favour the presentation of full models, since they best reflect the range of predictors investigated and ensure a balanced representation also of non-significant results.
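The 40% figure quoted in the abstract can be reproduced with a short calculation. The sketch below is only an illustration and is not taken from the paper. It assumes the tests are independent and use a significance threshold of 0.05, and the function names (`n_terms`, `familywise_error_rate`, `holm_adjust`) are hypothetical. Four predictors plus their six two-way interactions give k = 10 terms, so the chance of at least one false positive is 1 - 0.95^10. The Holm step-down procedure is included as one common example of a P value adjustment; the abstract does not say which adjustment the authors used.

```python
from math import comb


def n_terms(n_predictors: int) -> int:
    """Number of tested terms: main effects plus all two-way interactions."""
    return n_predictors + comb(n_predictors, 2)


def familywise_error_rate(k: int, alpha: float = 0.05) -> float:
    """P(at least one p < alpha) over k tests when every null is true.

    Assumes the k tests are independent (an assumption of this sketch).
    """
    return 1.0 - (1.0 - alpha) ** k


def holm_adjust(p_values):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(p_values)
    # Indices of the p-values, from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


k = n_terms(4)  # 4 main effects + 6 two-way interactions
print(k, round(familywise_error_rate(k), 3))  # prints: 10 0.401
```

Under these assumptions, an adjusted P value below 0.05 keeps the chance of any false positive across the full set of terms at or below 5%, at the cost of reduced power.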
Citations
Journal Article
The rate of telomere loss is related to maximum lifespan in birds
Gianna M Tricola, Mirre J. P. Simons, Els Atema, Raoul K. Boughton, Jerram L. Brown, Donald C. Dearborn, George J. Divoky, John A. Eimes, Charles E. Huntington, Alexander S. Kitaysky, Frans A. Juola, David B. Lank, Hannah P. Litwa, Ellis Mulder, Ian C. T. Nisbet, Kazuo Okanoya, Rebecca J. Safran, Stephan J. Schoech, E. A. Schreiber, Paul M. Thompson, Simon Verhulst, Nathaniel T. Wheelwright, David W. Winkler, Rebecca C. Young, Carol M. Vleck, Mark F. Haussmann
TL;DR: Bird species with longer lifespans are found to lose fewer telomeric repeats each year than species with shorter lifespans, which suggests that the physiological causes of telomere shortening, or the ability to maintain telomeres, may be responsible for, or have co-evolved with, the different lifespans observed across species.
Journal Article
High dose deferoxamine in intracerebral hemorrhage (HI-DEF) trial: rationale, design, and methods.
TL;DR: The HI-DEF trial is expected to advance the understanding of the pathophysiology of secondary neuronal injury in ICH and will provide a crucial “Go/No Go” signal as to whether a Phase III trial to investigate the efficacy of DFO is warranted.
Journal Article
Developing multiple hypotheses in behavioral ecology
TL;DR: This work outlines and provides examples of three approaches for multiple hypothesis evaluation, and discusses two practical issues behavioral ecologists are likely to face.
Journal Article
Information-theoretic approaches to statistical analysis in behavioural ecology: an introduction
TL;DR: This special issue examines the suitability of the IT method for analysing data with multiple predictors, which researchers in the field frequently encounter, and brings together different viewpoints to aid behavioural ecologists in understanding the method.
Journal Article
Uncertainty and Surprise Jointly Predict Musical Pleasure and Amygdala, Hippocampus, and Auditory Cortex Activity
Vincent Ka Ming Cheung, Peter M. C. Harrison, Lars Meyer, Marcus T. Pearce, John-Dylan Haynes, Stefan Koelsch
TL;DR: It is demonstrated that pleasure varies nonlinearly as a function of the listener's uncertainty when anticipating a musical event and of the surprise the event evokes when it deviates from expectations.