Open Access · Journal Article · DOI

Sample size, statistical power, and false conclusions in infant looking-time research.

Lisa M. Oakes
- 01 Jul 2017
- Vol. 22, Iss. 4, pp. 436-469
TL;DR
Examining the effect of sample size on statistical power and the conclusions drawn from infant looking time research revealed that despite clear results with the original large samples, the results with smaller subsamples were highly variable, yielding both false positive and false negative outcomes.
Abstract
Infant research is hard. It is difficult, expensive, and time consuming to identify, recruit and test infants. As a result, ours is a field of small sample sizes. Many studies using infant looking time as a measure have samples of 8 to 12 infants per cell, and studies with more than 24 infants per cell are uncommon. This paper examines the effect of such sample sizes on statistical power and the conclusions drawn from infant looking time research. An examination of the state of the current literature suggests that most published looking time studies have low power, which leads in the long run to an increase in both false positive and false negative results. Three data sets with large samples (>30 infants) were used to simulate experiments with smaller sample sizes; 1000 random subsamples of 8, 12, 16, 20, and 24 infants from the overall samples were selected, making it possible to examine the systematic effect of sample size on the results. This approach revealed that despite clear results with the original large samples, the results with smaller subsamples were highly variable, yielding both false positive and false negative outcomes. Finally, a number of emerging possible solutions are discussed.
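For readers who want to see the shape of this procedure, below is a minimal Python sketch of the kind of resampling simulation the abstract describes. It assumes each infant contributes a single looking-time difference score and that a one-sample t-test against zero is the test of interest; the synthetic data, effect size, and variable names are illustrative stand-ins, not the paper's actual data or code.

```python
# Illustrative sketch of the subsampling simulation described in the abstract:
# draw many random subsamples of a given size from one large sample and track
# how often the test reaches significance. Data here are synthetic.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Stand-in for a "large sample" (>30 infants) of looking-time difference scores.
full_sample = rng.normal(loc=1.5, scale=3.0, size=40)

n_simulations = 1000
for n in (8, 12, 16, 20, 24):
    significant = 0
    for _ in range(n_simulations):
        subsample = rng.choice(full_sample, size=n, replace=False)
        result = stats.ttest_1samp(subsample, popmean=0.0)
        significant += result.pvalue < 0.05
    print(f"n = {n:2d}: {significant / n_simulations:.1%} of subsamples significant")
```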


Citations
Journal Article · DOI

Quantifying sources of variability in infancy research using the infant-directed-speech preference

Michael C. Frank, +148 more
TL;DR: In this paper, the authors report a large-scale, multisite study aimed at assessing the overall replicability of a single theoretically important phenomenon and at examining methodological, cultural, and developmental moderators.
Journal Article · DOI

Promoting Replicability in Developmental Research Through Meta‐analyses: Insights From Language Acquisition Research

TL;DR: This paper analyzes a collection of 12 standardized meta‐analyses on language development between birth and 5 years and concludes with a discussion of how to increase replicability in both language acquisition studies specifically and developmental research more generally.
Journal Article · DOI

Infants' evaluation of prosocial and antisocial agents: A meta-analysis.

TL;DR: There was evidence of publication bias, suggesting that the effect size in published studies is likely inflated; the distribution of children who chose the prosocial agent in experiments with N = 16 also suggested a file-drawer problem.

Outlier removal, sum scores, and the inflation of the Type I error rate

TL;DR: Results of simulations with artificial and actual psychological data show that removing outliers based on commonly used Z-value thresholds severely inflates the Type I error rate.
Journal Article · DOI

Meta-analytic review of the development of face discrimination in infancy: Face race, face gender, infant age, and methodology moderate face discrimination.

TL;DR: Infants’ capacity to discriminate faces is sensitive to face characteristics including race, gender, and emotion, as well as to the methods used, including task timing, coding method, and visual angle.
References
Journal Article · DOI

A power primer.

TL;DR: A convenient, although not comprehensive, presentation of required sample sizes is provided: the sample sizes necessary for .80 power to detect small, medium, and large effects are tabled for eight standard statistical tests.
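As a rough illustration of the sample sizes such a primer tables, the sketch below uses statsmodels to compute the per-group n needed for .80 power in a two-sided independent-samples t-test at alpha = .05, at Cohen's conventional small, medium, and large effect sizes; the exact values in the paper may differ slightly due to rounding and test-specific assumptions.

```python
# Per-group sample sizes for .80 power at alpha = .05 in an independent-samples
# t-test, at Cohen's conventional effect sizes. Values are approximate.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
for label, d in (("small", 0.2), ("medium", 0.5), ("large", 0.8)):
    n = analysis.solve_power(effect_size=d, alpha=0.05, power=0.80,
                             alternative="two-sided")
    print(f"{label} effect (d = {d}): about {n:.0f} participants per group")
```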
Journal Article · DOI

Power failure: why small sample size undermines the reliability of neuroscience

TL;DR: It is shown that the average statistical power of studies in the neurosciences is very low, and the consequences include overestimates of effect size and low reproducibility of results.
Journal Article · DOI

Estimating the reproducibility of psychological science

Alexander A. Aarts, +290 more
- 28 Aug 2015
TL;DR: A large-scale assessment suggests that experimental reproducibility in psychology leaves a lot to be desired, and correlational tests suggest that replication success was better predicted by the strength of original evidence than by characteristics of the original and replication teams.
Journal Article

Calculating and reporting effect sizes to facilitate cumulative science: a practical primer for t-tests and ANOVAs

TL;DR: A practical primer on how to calculate and report effect sizes for t-tests and ANOVAs such that effect sizes can be used in a priori power analyses and meta-analyses; a detailed overview of the similarities and differences between within- and between-subjects designs is also provided.
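The sketch below shows the standard between-subjects Cohen's d (the mean difference divided by the pooled standard deviation), one of the effect sizes such a primer covers; the two groups of looking times are invented for illustration.

```python
# Between-subjects Cohen's d: mean difference over the pooled standard deviation.
# The looking-time values below are illustrative only.
import numpy as np

def cohens_d(group1, group2):
    g1, g2 = np.asarray(group1, float), np.asarray(group2, float)
    n1, n2 = len(g1), len(g2)
    pooled_var = ((n1 - 1) * g1.var(ddof=1) + (n2 - 1) * g2.var(ddof=1)) / (n1 + n2 - 2)
    return (g1.mean() - g2.mean()) / np.sqrt(pooled_var)

novel = [9.8, 12.1, 7.5, 10.4, 11.0, 8.9]
familiar = [7.2, 8.0, 9.1, 6.5, 7.8, 8.4]
print(f"Cohen's d = {cohens_d(novel, familiar):.2f}")
```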
Journal Article · DOI

False-Positive Psychology: Undisclosed Flexibility in Data Collection and Analysis Allows Presenting Anything as Significant

TL;DR: It is shown that despite empirical psychologists’ nominal endorsement of a low rate of false-positive findings, flexibility in data collection, analysis, and reporting dramatically increases actual false-positive rates, and a simple, low-cost, and straightforwardly effective disclosure-based solution is suggested.