Cross-validation failure: Small sample sizes lead to large error bars.
TLDR
In this article, the authors raise awareness of the error bars of cross-validation, which are often underestimated, and propose solutions for increasing sample size while tackling possible increases in the heterogeneity of the data.

About
This article was published in NeuroImage on 2017-06-24 and is currently open access. It has received 408 citations to date. The article focuses on the topics: Sample size determination & Cross-validation.
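The size of the problem can be illustrated with a short back-of-the-envelope sketch (not taken from the article): modelling each test-set prediction as an independent Bernoulli trial gives a lower bound on the error bar of an accuracy estimate. The accuracy 0.70 and the 1.96 normal quantile below are illustrative choices, not values from the paper.

```python
import math

def accuracy_half_width(p, n, z=1.96):
    """Approximate 95% CI half-width for an accuracy estimate computed
    from n test predictions, each modelled as a Bernoulli(p) trial.
    Real cross-validation error bars are typically wider still, because
    folds share training data and are not independent."""
    return z * math.sqrt(p * (1 - p) / n)

for n in (30, 100, 1000):
    print(f"n={n:4d}: accuracy 0.70 +/- {accuracy_half_width(0.70, n):.3f}")
```

Even under this optimistic independence assumption, 30 test samples give a half-width above 0.16, and around 100 samples the error bar is still on the order of ±0.09 — the same order of magnitude as the underestimation the article warns about.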
Citations
Journal Article
Consolidation alters motor sequence-specific distributed representations.
Basile Pinsard, Arnaud Boutin, Ella Gabitov, Ovidiu Lungu, Habib Benali, Julien Doyon +6 more
TL;DR: It is shown, for the first time in humans, that complementary sequence-specific motor representations evolve distinctively during critical phases of skill acquisition and consolidation.
Journal Article
Machine learning algorithm validation with a limited sample size
TL;DR: The authors' simulations show that K-fold cross-validation (CV) produces strongly biased performance estimates with small sample sizes, and the bias is still evident at a sample size of 1000, while nested CV and train/test split approaches produce robust and unbiased performance estimates regardless of sample size.
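The motivation for nested CV — keeping model selection away from the data used for the final estimate — can be shown with a small pure-Python simulation (a sketch of the general phenomenon, not a reproduction of the cited study): on pure-noise labels, picking the best of several chance-level "models" on a validation set inflates the winner's score, while a fresh test set reveals it as chance.

```python
import random
import statistics

random.seed(0)

def simulate(n_samples, n_models, n_sims=200):
    """On pure-noise binary labels, select the best of n_models random
    predictors by validation accuracy, then score the winner on a fresh
    test set. Selection inflates the validation estimate above chance."""
    val_best, test_of_best = [], []
    for _ in range(n_sims):
        y_val = [random.randint(0, 1) for _ in range(n_samples)]
        y_test = [random.randint(0, 1) for _ in range(n_samples)]
        # each "model" is a fixed random labelling -- chance-level by design
        models = [[random.randint(0, 1) for _ in range(n_samples)]
                  for _ in range(n_models)]
        accs = [sum(p == t for p, t in zip(m, y_val)) / n_samples
                for m in models]
        best = max(range(n_models), key=lambda i: accs[i])
        val_best.append(accs[best])
        # the winner has no real skill, so on fresh data it falls back to ~0.5
        test_of_best.append(
            sum(p == t for p, t in zip(models[best], y_test)) / n_samples)
    return statistics.mean(val_best), statistics.mean(test_of_best)

val, test = simulate(n_samples=30, n_models=20)
print(f"selected-model accuracy on selection data: {val:.2f}")  # well above 0.5
print(f"same model on held-out data:               {test:.2f}")  # near chance
```

The gap between the two numbers is exactly the optimistic bias that nesting (or a strictly held-out test set) removes.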
Journal Article
Reproducible brain-wide association studies require thousands of individuals
Scott Marek, Brenden Tervo-Clemmens, Finnegan J. Calabro, David F. Montez, Benjamin P Kay, Alexander S. Hatoum, Meghan Rose Donohue, Will Foran, Ryland L. Miller, Timothy Hendrickson, Stephen M. Malone, Sridhar Kandala, Eric Feczko, Oscar Miranda-Dominguez, Alice M. Graham, Eric Earl, Anders Perrone, Michaela Cordova, Olivia Doyle, Lucille A. Moore, Gregory Mark Conan, Johnny Uriarte, Katherine Allene Snider, Benjamin J. Lynch, James C. Wilgenbusch, Thomas Pengo, Angela Tam, Jianzhong Chen, Dillan J. Newbold, Annie Zheng, Nicole A Seider, Andrew N. Van, Athanasia Metoki, Roselyne Chauvin, Timothy O. Laumann, Deanna J. Greene, Steven E. Petersen, Hugh Garavan, Wesley K. Thompson, Thomas E. Nichols, B.T. Thomas Yeo, Deanna M. Barch, Beatriz Luna, Damien A. Fair, Nico U.F. Dosenbach +44 more
TL;DR: In this article, the authors used three of the largest neuroimaging datasets currently available, with a total sample size of around 50,000 individuals, to quantify brain-wide association study effect sizes and reproducibility as a function of sample size.
Journal Article
Machine Learning for Precision Psychiatry: Opportunities and Challenges.
TL;DR: This primer aims to introduce clinicians and researchers to the opportunities and challenges in bringing machine intelligence into psychiatric practice.
References
Journal Article
Statistical Comparisons of Classifiers over Multiple Data Sets
TL;DR: A set of simple, yet safe and robust non-parametric tests for statistical comparisons of classifiers is recommended: the Wilcoxon signed ranks test for comparison of two classifiers and the Friedman test with the corresponding post-hoc tests for comparisons of more classifiers over multiple data sets.
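The recommended pairwise comparison can be sketched in a few stdlib-only lines. The function below computes the Wilcoxon signed-rank statistic for paired per-dataset scores; it is an illustrative sketch, not a validated routine — it drops zero differences, averages tied ranks, and omits the p-value lookup (use `scipy.stats.wilcoxon` in practice).

```python
def wilcoxon_signed_rank(scores_a, scores_b):
    """Wilcoxon signed-rank statistic for paired per-dataset scores of
    two classifiers. Returns the smaller of the positive/negative rank
    sums; small values suggest a systematic difference."""
    diffs = [a - b for a, b in zip(scores_a, scores_b) if a != b]
    ranked = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(ranked):
        # find the run of ties in |d| starting at position i
        j = i
        while (j + 1 < len(ranked)
               and abs(diffs[ranked[j + 1]]) == abs(diffs[ranked[i]])):
            j += 1
        avg = (i + j) / 2 + 1  # average 1-based rank of the tied run
        for k in range(i, j + 1):
            ranks[ranked[k]] = avg
        i = j + 1
    r_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    r_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return min(r_plus, r_minus)
```

For example, accuracies (in %) of two classifiers on four datasets, [90, 80, 70, 85] versus [85, 75, 72, 80], give rank sums 9 and 1, so the statistic is 1.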
Journal Article
The file drawer problem and tolerance for null results
TL;DR: Quantitative procedures for computing the tolerance for filed and future null results are reported and illustrated, and the implications are discussed.
Journal Article
Power failure: why small sample size undermines the reliability of neuroscience
Katherine S. Button, John P. A. Ioannidis, Claire Mokrysz, Brian A. Nosek, Jonathan Flint, Emma S J Robinson, Marcus R. Munafò +6 more
TL;DR: It is shown that the average statistical power of studies in the neurosciences is very low, and the consequences include overestimates of effect size and low reproducibility of results.
Journal Article
Why Most Published Research Findings Are False
TL;DR: In this paper, the authors discuss the implications of these problems for the conduct and interpretation of research and conclude that the probability that a research claim is true may depend on study power and bias, the number of other studies on the same question, and the ratio of true to no relationships among the relationships probed in each scientific field.