Why Most Published Research Findings Are False

15 Aug 2006-Vol. 1, Iss: 4, pp 1-08
TL;DR: In this paper, the authors discuss the implications of these problems for the conduct and interpretation of research and suggest that claimed research findings may often be simply accurate measures of the prevailing bias.
Abstract: There is increasing concern that most current published research findings are false. The probability that a research claim is true may depend on study power and bias, the number of other studies on the same question, and, importantly, the ratio of true to no relationships among the relationships probed in each scientific field. In this framework, a research finding is less likely to be true when the studies conducted in a field are smaller; when effect sizes are smaller; when there is a greater number and lesser pre-selection of tested relationships; where there is greater flexibility in designs, definitions, outcomes, and analytical modes; when there is greater financial and other interest and prejudice; and when more teams are involved in a scientific field in chase of statistical significance. Simulations show that for most study designs and settings, it is more likely for a research claim to be false than true. Moreover, for many current scientific fields, claimed research findings may often be simply accurate measures of the prevailing bias. In this essay, I discuss the implications of these problems for the conduct and interpretation of research.
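The dependence on power, bias, and the ratio of true to no relationships described in the abstract can be made concrete with a small calculation. The sketch below is illustrative only: the function name `ppv`, its parameter names, and the scenario values are assumptions chosen for this example, and the calculation simply counts expected true and false positive claims under the stated definitions.

```python
def ppv(prior_odds, power, alpha=0.05, bias=0.0):
    """Probability that a claimed (statistically significant) finding is true.

    prior_odds: ratio of true to no relationships among those probed (R)
    power:      probability of detecting a true relationship (1 - beta)
    alpha:      type I error rate
    bias:       fraction of would-be non-findings that end up reported
                as findings anyway (u); 0 means no bias
    """
    beta = 1.0 - power
    # Claims arising from true relationships: detected, or missed but
    # reported as findings because of bias.
    true_claims = prior_odds * (power + bias * beta)
    # Claims arising from null relationships: chance false positives, or
    # true negatives reported as findings because of bias.
    false_claims = alpha + bias * (1.0 - alpha)
    return true_claims / (true_claims + false_claims)


# Hypothetical scenarios, chosen only to show the direction of each effect.
scenarios = [
    ("well powered, 1:1 odds, no bias", 1.0, 0.80, 0.0),
    ("well powered, 1:1 odds, some bias", 1.0, 0.80, 0.1),
    ("underpowered, 1:10 odds, no bias", 0.1, 0.20, 0.0),
    ("underpowered, 1:10 odds, some bias", 0.1, 0.20, 0.2),
]
for label, odds, power, bias in scenarios:
    print(f"{label}: PPV = {ppv(odds, power, bias=bias):.2f}")
```

Under these assumptions, a claim is more likely false than true whenever the returned value falls below 0.5, which happens in the underpowered, low-prior-odds scenarios.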
Citations
Journal Article
22 Apr 2013-PLOS ONE
TL;DR: The phyloseq project for R is a new open-source software package dedicated to the object-oriented representation and analysis of microbiome census data in R, which supports importing data from a variety of common formats, as well as many analysis techniques.
Abstract: Background: The analysis of microbial communities through DNA sequencing brings many challenges: the integration of different types of data with methods from ecology, genetics, phylogenetics, multivariate statistics, visualization and testing. With the increased breadth of experimental designs now being pursued, project-specific statistical analyses are often needed, and these analyses are often difficult (or impossible) for peer researchers to independently reproduce. The vast majority of the requisite tools for performing these analyses reproducibly are already implemented in R and its extensions (packages), but with limited support for high throughput microbiome census data. Results: Here we describe a software project, phyloseq, dedicated to the object-oriented representation and analysis of microbiome census data in R. It supports importing data from a variety of common formats, as well as many analysis techniques. These include calibration, filtering, subsetting, agglomeration, multi-table comparisons, diversity analysis, parallelized Fast UniFrac, ordination methods, and production of publication-quality graphics; all in a manner that is easy to document, share, and modify. We show how to apply functions from other R packages to phyloseq-represented data, illustrating the availability of a large number of open source analysis techniques. We discuss the use of phyloseq with tools for reproducible research, a practice common in other fields but still rare in the analysis of highly parallel microbiome census data. We have made available all of the materials necessary to completely reproduce the analysis and figures included in this article, an example of best practices for reproducible research. Conclusions: The phyloseq project for R is a new open-source software package, freely available on the web from both GitHub and Bioconductor.

11,272 citations

Journal Article
TL;DR: It is shown that the average statistical power of studies in the neurosciences is very low, and the consequences include overestimates of effect size and low reproducibility of results.
Abstract: A study with low statistical power has a reduced chance of detecting a true effect, but it is less well appreciated that low power also reduces the likelihood that a statistically significant result reflects a true effect. Here, we show that the average statistical power of studies in the neurosciences is very low. The consequences of this include overestimates of effect size and low reproducibility of results. There are also ethical dimensions to this problem, as unreliable research is inefficient and wasteful. Improving reproducibility in neuroscience is a key priority and requires attention to well-established but often ignored methodological principles.
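The link between low power and overestimated effect sizes can be illustrated with a short simulation. This is a sketch under stated assumptions, not the authors' analysis: the true effect (0.3), the group sizes, the unit-variance outcome, and the function name `significant_results` are all hypothetical, and each study's estimate is drawn directly from its normal sampling distribution rather than from raw data.

```python
import math
import random


def significant_results(true_effect, n_per_group, n_studies=20000, seed=0):
    """Simulate many two-group studies and summarise the significant ones.

    Returns (empirical power, mean effect estimate among significant studies).
    Assumes a unit-variance outcome, so the difference in group means has
    standard error sqrt(2 / n_per_group).
    """
    rng = random.Random(seed)
    se = math.sqrt(2.0 / n_per_group)
    z_crit = 1.959964  # two-sided 5% critical value of the standard normal
    kept = []
    for _ in range(n_studies):
        estimate = rng.gauss(true_effect, se)
        if abs(estimate) / se > z_crit:
            kept.append(estimate)
    return len(kept) / n_studies, sum(kept) / len(kept)


TRUE_EFFECT = 0.3  # hypothetical standardized effect
for n in (20, 200):
    power, mean_estimate = significant_results(TRUE_EFFECT, n)
    print(f"n = {n} per group: power = {power:.2f}, "
          f"mean significant estimate = {mean_estimate:.2f} "
          f"(true effect = {TRUE_EFFECT})")
```

Because only estimates that clear the significance threshold are kept, the small-sample studies that reach significance report effects well above the true value, while the large-sample studies stay close to it.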

5,683 citations

Journal Article
28 Aug 2015-Science
TL;DR: A large-scale assessment suggests that experimental reproducibility in psychology leaves a lot to be desired, and correlational tests suggest that replication success was better predicted by the strength of original evidence than by characteristics of the original and replication teams.
Abstract: Reproducibility is a defining feature of science, but the extent to which it characterizes current research is unknown. We conducted replications of 100 experimental and correlational studies published in three psychology journals using high-powered designs and original materials when available. Replication effects were half the magnitude of original effects, representing a substantial decline. Ninety-seven percent of original studies had statistically significant results. Thirty-six percent of replications had statistically significant results; 47% of original effect sizes were in the 95% confidence interval of the replication effect size; 39% of effects were subjectively rated to have replicated the original result; and if no bias in original results is assumed, combining original and replication results left 68% with statistically significant effects. Correlational tests suggest that replication success was better predicted by the strength of original evidence than by characteristics of the original and replication teams.

5,532 citations

Journal Article
TL;DR: It is shown that despite empirical psychologists’ nominal endorsement of a low rate of false-positive findings, flexibility in data collection, analysis, and reporting dramatically increases actual false-positive rates, and a simple, low-cost, and straightforwardly effective disclosure-based solution is suggested.
Abstract: In this article, we accomplish two things. First, we show that despite empirical psychologists' nominal endorsement of a low rate of false-positive findings (≤ .05), flexibility in data collection, analysis, and reporting dramatically increases actual false-positive rates. In many cases, a researcher is more likely to falsely find evidence that an effect exists than to correctly find evidence that it does not. We present computer simulations and a pair of actual experiments that demonstrate how unacceptably easy it is to accumulate (and report) statistically significant evidence for a false hypothesis. Second, we suggest a simple, low-cost, and straightforwardly effective disclosure-based solution to this problem. The solution involves six concrete requirements for authors and four guidelines for reviewers, all of which impose a minimal burden on the publication process.
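One form of flexibility in data collection is testing the data, then collecting more observations and testing again if the first result is not significant. The simulation below is a minimal sketch of that single practice and is not the authors' simulation: the sample sizes, the known-variance z-test, and the function name `false_positive_rate` are assumptions made for illustration. Both groups are drawn from the same distribution, so every significant result is a false positive.

```python
import math
import random


def false_positive_rate(looks, n_sims=20000, seed=1):
    """Fraction of null experiments declared significant at any look.

    looks: increasing per-group sample sizes at which the data are tested,
           e.g. [20] for a fixed design or [20, 30] for one extra peek.
    """
    rng = random.Random(seed)
    z_crit = 1.959964  # two-sided 5% critical value of the standard normal
    hits = 0
    for _ in range(n_sims):
        sum_a = sum_b = 0.0
        n = 0
        for target_n in looks:
            # Collect observations until the next planned look.
            while n < target_n:
                sum_a += rng.gauss(0.0, 1.0)
                sum_b += rng.gauss(0.0, 1.0)
                n += 1
            # z-test for a difference in means with known unit variance.
            z = (sum_a / n - sum_b / n) / math.sqrt(2.0 / n)
            if abs(z) > z_crit:
                hits += 1
                break  # stop collecting and report the significant result
    return hits / n_sims


fixed = false_positive_rate([20])
flexible = false_positive_rate([20, 30])
print(f"fixed n = 20: {fixed:.3f}")
print(f"test at 20, then again at 30 if needed: {flexible:.3f}")
```

Even this one undisclosed extra look pushes the false-positive rate above the nominal 5%; combining several such choices compounds the inflation.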

4,727 citations

Journal Article
TL;DR: It is found that the most common software packages for fMRI analysis (SPM, FSL, AFNI) can result in false-positive rates of up to 70%.
Abstract: The most widely used task functional magnetic resonance imaging (fMRI) analyses use parametric statistical methods that depend on a variety of assumptions. In this work, we use real resting-state data and a total of 3 million random task group analyses to compute empirical familywise error rates for the fMRI software packages SPM, FSL, and AFNI, as well as a nonparametric permutation method. For a nominal familywise error rate of 5%, the parametric statistical methods are shown to be conservative for voxelwise inference and invalid for clusterwise inference. Our results suggest that the principal cause of the invalid cluster inferences is spatial autocorrelation functions that do not follow the assumed Gaussian shape. By comparison, the nonparametric permutation test is found to produce nominal results for voxelwise as well as clusterwise inference. These findings speak to the need of validating the statistical methods being used in the field of neuroimaging.

2,946 citations

References
Journal Article
19 Apr 2000-JAMA
TL;DR: A checklist containing specifications for reporting meta-analyses of observational studies in epidemiology, including background, search strategy, methods, results, discussion, and conclusion, should improve the usefulness of meta-analyses for authors, reviewers, editors, readers, and decision makers.
Abstract: Objective: Because of the pressure for timely, informed decisions in public health and clinical practice and the explosion of information in the scientific literature, research results must be synthesized. Meta-analyses are increasingly used to address this problem, and they often evaluate observational studies. A workshop was held in Atlanta, Ga, in April 1997, to examine the reporting of meta-analyses of observational studies and to make recommendations to aid authors, reviewers, editors, and readers. Participants: Twenty-seven participants were selected by a steering committee, based on expertise in clinical practice, trials, statistics, epidemiology, social sciences, and biomedical editing. Deliberations of the workshop were open to other interested scientists. Funding for this activity was provided by the Centers for Disease Control and Prevention. Evidence: We conducted a systematic review of the published literature on the conduct and reporting of meta-analyses in observational studies using MEDLINE, Educational Research Information Center (ERIC), PsycLIT, and the Current Index to Statistics. We also examined reference lists of the 32 studies retrieved and contacted experts in the field. Participants were assigned to small-group discussions on the subjects of bias, searching and abstracting, heterogeneity, study categorization, and statistical methods. Consensus Process: From the material presented at the workshop, the authors developed a checklist summarizing recommendations for reporting meta-analyses of observational studies. The checklist and supporting evidence were circulated to all conference attendees and additional experts. All suggestions for revisions were addressed. Conclusions: The proposed checklist contains specifications for reporting of meta-analyses of observational studies in epidemiology, including background, search strategy, methods, results, discussion, and conclusion. Use of the checklist should improve the usefulness of meta-analyses for authors, reviewers, editors, readers, and decision makers. An evaluation plan is suggested and research areas are explored.

17,663 citations

Journal Article
15 Oct 1999-Science
TL;DR: A generic approach to cancer classification based on gene expression monitoring by DNA microarrays is described and applied to human acute leukemias as a test case and suggests a general strategy for discovering and predicting cancer classes for other types of cancer, independent of previous biological knowledge.
Abstract: Although cancer classification has improved over the past 30 years, there has been no general approach for identifying new cancer classes (class discovery) or for assigning tumors to known classes (class prediction). Here, a generic approach to cancer classification based on gene expression monitoring by DNA microarrays is described and applied to human acute leukemias as a test case. A class discovery procedure automatically discovered the distinction between acute myeloid leukemia (AML) and acute lymphoblastic leukemia (ALL) without previous knowledge of these classes. An automatically derived class predictor was able to determine the class of new leukemia cases. The results demonstrate the feasibility of cancer classification based solely on gene expression monitoring and suggest a general strategy for discovering and predicting cancer classes for other types of cancer, independent of previous biological knowledge.

12,530 citations

Journal Article
TL;DR: The revised CONSORT statement is intended to improve the reporting of an RCT, enabling readers to understand a trial's conduct and to assess the validity of its results.

4,977 citations

Journal Article
TL;DR: The authors hope this report will generate further thought about ways to improve the quality of reports of meta-analyses of RCTs and that interested readers, reviewers, researchers, and editors will use the QUOROM statement and generate ideas for its improvement.

4,767 citations

Journal Article
TL;DR: The revised CONSORT statement is intended to improve the reporting of an RCT, enabling readers to understand a trial's conduct and to assess the validity of its results.
Abstract: To comprehend the results of a randomised controlled trial (RCT), readers must understand its design, conduct, analysis, and interpretation. That goal can be achieved only through total transparency from authors. Despite several decades of educational efforts, the reporting of RCTs needs improvement. Investigators and editors developed the original CONSORT (Consolidated Standards of Reporting Trials) statement to help authors improve reporting by use of a checklist and flow diagram. The revised CONSORT statement presented here incorporates new evidence and addresses some criticisms of the original statement. The checklist items pertain to the content of the Title, Abstract, Introduction, Methods, Results, and Discussion. The revised checklist includes 22 items selected because empirical evidence indicates that not reporting this information is associated with biased estimates of treatment effect, or because the information is essential to judge the reliability or relevance of the findings. We intended the flow diagram to depict the passage of participants through an RCT. The revised flow diagram depicts information from four stages of a trial (enrollment, intervention allocation, follow-up, and analysis). The diagram explicitly shows the number of participants, for each intervention group, included in the primary data analysis. Inclusion of these numbers allows the reader to judge whether the authors have done an intention-to-treat analysis. In sum, the CONSORT statement is intended to improve the reporting of an RCT, enabling readers to understand a trial's conduct and to assess the validity of its results.

2,011 citations

Trending Questions (2)
Why Most Published Research Findings Are False?

The paper argues that a published research finding is less likely to be true when studies are small, when effect sizes are small, when many relationships are tested with little pre-selection, when designs, definitions, outcomes, and analytical modes are flexible, when financial or other interests and prejudices are strong, and when many teams in a field are chasing statistical significance.

Why Most Published Research Findings Are False?

According to the paper, research findings are more likely to be false when study sizes and effect sizes are small and when bias and financial interests are present.