Journal ArticleDOI

The control of the false discovery rate in multiple testing under dependency

01 Aug 2001-Annals of Statistics (Institute of Mathematical Statistics)-Vol. 29, Iss: 4, pp 1165-1188
TL;DR: In this paper, it was shown that a simple FDR controlling procedure for independent test statistics can also control the false discovery rate when test statistics have positive regression dependency on each of the test statistics corresponding to the true null hypotheses.
Abstract: Benjamini and Hochberg suggest that the false discovery rate may be the appropriate error rate to control in many applied multiple testing problems. A simple procedure was given there as an FDR controlling procedure for independent test statistics and was shown to be much more powerful than comparable procedures which control the traditional familywise error rate. We prove that this same procedure also controls the false discovery rate when the test statistics have positive regression dependency on each of the test statistics corresponding to the true null hypotheses. This condition for positive dependency is general enough to cover many problems of practical interest, including the comparisons of many treatments with a single control, multivariate normal test statistics with positive correlation matrix and multivariate $t$. Furthermore, the test statistics may be discrete, and the tested hypotheses composite without posing special difficulties. For all other forms of dependency, a simple conservative modification of the procedure controls the false discovery rate. Thus the range of problems for which a procedure with proven FDR control can be offered is greatly increased.
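The two procedures named in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `step_up_rejections`, its `dependent` flag, and the p-values are all hypothetical. It assumes the step-up rule "reject the k smallest p-values, where k is the largest i with p_(i) <= (i/m) q", and takes the conservative modification to be the same rule with q divided by the harmonic sum 1 + 1/2 + ... + 1/m.

```python
from typing import List, Sequence


def step_up_rejections(pvals: Sequence[float], q: float,
                       dependent: bool = False) -> List[bool]:
    """Linear step-up procedure; returns one reject flag per input p-value.

    dependent=False: thresholds (i/m) * q, the procedure for independent or
    positively regression dependent test statistics.
    dependent=True: q is first divided by sum_{j=1}^m 1/j, the conservative
    modification intended for arbitrary dependency.
    """
    m = len(pvals)
    if m == 0:
        return []
    if dependent:
        q = q / sum(1.0 / j for j in range(1, m + 1))
    # Indices of the p-values in increasing order.
    order = sorted(range(m), key=lambda i: pvals[i])
    # Largest 1-based rank k with p_(k) <= (k/m) * q; 0 means no rejections.
    k = 0
    for rank, idx in enumerate(order, start=1):
        if pvals[idx] <= rank * q / m:
            k = rank
    reject = [False] * m
    for idx in order[:k]:
        reject[idx] = True
    return reject


# Made-up p-values, for illustration only.
example_pvals = [0.001, 0.008, 0.039, 0.041, 0.042,
                 0.060, 0.074, 0.205, 0.212, 0.216]
plain = step_up_rejections(example_pvals, q=0.05)
conservative = step_up_rejections(example_pvals, q=0.05, dependent=True)
print(sum(plain), sum(conservative))
```

On these made-up values the modified procedure rejects fewer hypotheses than the unmodified one, which is the price of dropping the dependency assumption.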

Citations
Journal ArticleDOI
TL;DR: This paper introduces to the neuroscience literature statistical procedures for controlling the false discovery rate (FDR) and demonstrates this approach using both simulations and functional magnetic resonance imaging data from two simple experiments.

4,838 citations

Journal ArticleDOI
TL;DR: The Biological Networks Gene Ontology tool (BiNGO) is an open-source Java tool to determine which Gene Ontology terms are significantly overrepresented in a set of genes.
Abstract: Summary: The Biological Networks Gene Ontology tool (BiNGO) is an open-source Java tool to determine which Gene Ontology (GO) terms are significantly overrepresented in a set of genes. BiNGO can be used either on a list of genes, pasted as text, or interactively on subgraphs of biological networks visualized in Cytoscape. BiNGO maps the predominant functional themes of the tested gene set on the GO hierarchy, and takes advantage of Cytoscape's versatile visualization environment to produce an intuitive and customizable visual representation of the results. Availability: http://www.psb.ugent.be/cbd/papers/BiNGO/ Contact: martin.kuiper@psb.ugent.be

3,884 citations


Cites background from "The control of the false discovery ..."

  • ...dependency of the test statistics (Benjamini and Yekutieli, 2001)....

    [...]

Journal ArticleDOI
TL;DR: The False Discovery Rate (FDR) is the expected proportion of false discoveries among the discoveries, and controlling the FDR goes a long way towards controlling the increased error from multiplicity while losing less in the ability to discover real differences.

3,504 citations


Cites background from "The control of the false discovery ..."

  • ...Which procedure is more appropriate? The first procedure requires that the tests be based on behavioral endpoints that are either statistically independent or positively dependent [6]....

    [...]

Journal ArticleDOI
01 Jul 2016-Medicine
TL;DR: According to the analysis, old men plus gastric fundus or antrum of CFB were strongly suggested to perform ESD if precancerous lesions were found and young women with low-grade intraepithelial neoplasia could select regular follow-up.

3,491 citations


Journal ArticleDOI
TL;DR: A web server, KOBAS 2.0, is reported, which annotates an input set of genes with putative pathways and disease relationships based on mapping to genes with known annotations, which allows for both ID mapping and cross-species sequence similarity mapping.
Abstract: High-throughput experimental technologies often identify dozens to hundreds of genes related to, or changed in, a biological or pathological process. From these genes one wants to identify biological pathways that may be involved and diseases that may be implicated. Here, we report a web server, KOBAS 2.0, which annotates an input set of genes with putative pathways and disease relationships based on mapping to genes with known annotations. It allows for both ID mapping and cross-species sequence similarity mapping. It then performs statistical tests to identify statistically significantly enriched pathways and diseases. KOBAS 2.0 incorporates knowledge across 1327 species from 5 pathway databases (KEGG PATHWAY, PID, BioCyc, Reactome and Panther) and 5 human disease databases (OMIM, KEGG DISEASE, FunDO, GAD and NHGRI GWAS Catalog). KOBAS 2.0 can be accessed at http://kobas.cbi.pku.edu.cn.

3,293 citations


Cites methods from "The control of the false discovery ..."

  • ...In KOBAS 2.0, we add two more popular FDR correction methods: Benjamini-Hochberg (40) and Benjamini-Yekutieli (41)....

    [...]
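Software that offers these two corrections usually reports adjusted p-values rather than reject flags. The sketch below shows one common way to compute them and is not taken from KOBAS: the function name `fdr_adjust` and the input values are hypothetical. It assumes the adjusted value for the i-th smallest p-value is the minimum over k >= i of m * p_(k) / k, capped at 1, with an extra factor 1 + 1/2 + ... + 1/m for the Benjamini-Yekutieli variant.

```python
from typing import List, Sequence


def fdr_adjust(pvals: Sequence[float], method: str = "BH") -> List[float]:
    """Return FDR-adjusted p-values in the original input order.

    method="BH": Benjamini-Hochberg adjustment.
    method="BY": Benjamini-Yekutieli adjustment (BH times the harmonic sum).
    """
    if method not in ("BH", "BY"):
        raise ValueError("method must be 'BH' or 'BY'")
    m = len(pvals)
    c = sum(1.0 / j for j in range(1, m + 1)) if method == "BY" else 1.0
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0  # also caps every adjusted value at 1
    # Walk from the largest p-value down so the result is monotone in rank.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * m * c / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up p-values, for illustration only.
raw = [0.01, 0.04, 0.03, 0.20]
bh_adjusted = fdr_adjust(raw, "BH")
by_adjusted = fdr_adjust(raw, "BY")
print(bh_adjusted)
print(by_adjusted)
```

A hypothesis is then declared significant at level q when its adjusted p-value is at most q, which gives the same decisions as running the step-up procedure directly.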

References
Journal ArticleDOI
TL;DR: In this paper, a different approach to problems of multiple significance testing is presented, which calls for controlling the expected proportion of falsely rejected hypotheses (the false discovery rate), which is equivalent to the FWER when all hypotheses are true but is smaller otherwise.
Abstract: SUMMARY The common approach to the multiplicity problem calls for controlling the familywise error rate (FWER). This approach, though, has faults, and we point out a few. A different approach to problems of multiple significance testing is presented. It calls for controlling the expected proportion of falsely rejected hypotheses: the false discovery rate. This error rate is equivalent to the FWER when all hypotheses are true but is smaller otherwise. Therefore, in problems where the control of the false discovery rate rather than that of the FWER is desired, there is potential for a gain in power. A simple sequential Bonferroni-type procedure is proved to control the false discovery rate for independent test statistics, and a simulation study shows that the gain in power is substantial. The use of the new procedure and the appropriateness of the criterion are illustrated with examples.
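The statement that the two error rates coincide when all hypotheses are true follows in one step from the definitions. The notation in this sketch is an assumption, chosen to match the citation excerpts listed for this reference: $V$ is the number of true null hypotheses rejected, $R$ the total number of rejections, and $Q$ the proportion of false rejections, taken to be 0 when nothing is rejected.

```latex
\[
Q = \begin{cases} V/R, & R > 0,\\ 0, & R = 0,\end{cases}
\qquad
\mathrm{FDR} = E(Q),
\qquad
\mathrm{FWER} = P(V \ge 1).
\]
% All hypotheses true: every rejection is a false rejection, so V = R.
\[
V = R \;\Longrightarrow\; Q = \mathbf{1}\{V \ge 1\}
\;\Longrightarrow\;
\mathrm{FDR} = E\bigl(\mathbf{1}\{V \ge 1\}\bigr) = P(V \ge 1) = \mathrm{FWER}.
\]
% In general V <= R, so Q <= 1{V >= 1} and the FDR cannot exceed the FWER.
\[
Q \le \mathbf{1}\{V \ge 1\}
\;\Longrightarrow\;
\mathrm{FDR} \le \mathrm{FWER}.
\]
```

The last inequality is why a procedure that controls the FWER also controls the FDR, while the converse need not hold.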

83,420 citations


"The control of the false discovery ..." refers background or methods in this paper

  • ...The false discovery rate (FDR), suggested by Benjamini and Hochberg (1995) is a new and different point of view for how the errors in multiple testing could be considered. The FDR is the expected proportion of erroneous rejections among all rejections. If all tested hypotheses are true, controlling the FDR controls the traditional FWE. But when many of the tested hypotheses are rejected, indicating that many hypotheses are not true, the error from a single erroneous rejection is not always as crucial for drawing conclusions from the family tested, and the proportion of errors is controlled instead. Thus we are ready to bear with more errors when many hypotheses are rejected, but with less when fewer are rejected. (This frequentist goal has a Bayesian flavor.) In many applied problems it has been argued that the control of the FDR at some specified level is the more appropriate response to the multiplicity concern: examples are given in Section 2.1 and discussed in Section 4. The practical difference between the two approaches is neither trivial nor small and the larger the problem the more dramatic the difference is. Let us demonstrate this point by comparing two specific procedures, as applied to Example 1.1. To fix notation, let us assume that of the $m$ hypotheses tested $\{H_1^0, H_2^0, \ldots, H_m^0\}$, $m_0$ are true null hypotheses, the number and identity of which are unknown. The other $m - m_0$ hypotheses are false. Denote the corresponding random vector of test statistics $(X_1, X_2, \ldots, X_m)$, and the corresponding p-values (observed significance levels) by $(P_1, P_2, \ldots, P_m)$, where $P_i = 1 - F_{H_i^0}(X_i)$. Benjamini and Hochberg (1995) showed that when the test statistics are independent the following procedure controls the FDR at level $q \cdot m_0/m \le q$....

    [...]

  • ...Otherwise, when some of the hypotheses are true and some are false, the FDR is smaller [Benjamini and Hochberg (1995)]....

    [...]

  • ...The FDR controlling multiple testing procedure [Benjamini and Hochberg (1995)], given by (1), is a step-up procedure that involves a linear set of constants on the p-value scale (step-up in terms of test statistics, not p-values). The FDR controlling procedure is related to the global test for the intersection hypothesis, which is defined in terms of the same set of constants: reject the single intersection hypothesis if there exists an $i$ s.t. $p_{(i)} \le \frac{i}{m}\alpha$. Simes (1986) showed that when the test statistics are continuous and independent, and all hypotheses are true, the level of the test is $\alpha$. The equality is referred to as Simes’ equality, and the test has been known in recent years as Simes’ global test. However the result had already been proved by Seeger (1968) [Shaffer (1995) brought this forgotten reference to the current literature....

    [...]

  • ...Formally, as in Benjamini and Hochberg (1995), let V denote the number of true null hypotheses rejected and R the total number of hypotheses rejected, and let Q be the unobservable random quotient,...

    [...]

Journal ArticleDOI
TL;DR: In this paper, a simple and widely applicable multiple test procedure of the sequentially rejective type is presented, i.e. hypotheses are rejected one at a time until no further rejections can be done.
Abstract: This paper presents a simple and widely applicable multiple test procedure of the sequentially rejective type, i.e. hypotheses are rejected one at a time until no further rejections can be done. It is shown that the test has a prescribed level of significance protection against error of the first kind for any combination of true hypotheses. The power properties of the test and a number of possible applications are also discussed.

20,459 citations


"The control of the false discovery ..." refers methods in this paper

  • ...Still, even if only a small proportion of the tested hypotheses are detected as not true [approximately $\log(m)/m$], the procedure is more powerful than the comparable FWE controlling procedure of Holm (1979)....

    [...]
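The power comparison in the excerpt above can be made concrete by counting rejections under both procedures on the same input. This is a sketch under stated assumptions, not code from either paper: the function names `holm_count` and `step_up_count` and the p-values are hypothetical. Holm's procedure is taken to be the step-down rule that compares the i-th smallest p-value with alpha / (m - i + 1) and stops at the first failure. The FDR procedure is taken to be the linear step-up rule with thresholds (i/m) q.

```python
from typing import Sequence


def holm_count(pvals: Sequence[float], alpha: float) -> int:
    """Number of rejections made by Holm's step-down procedure at level alpha."""
    m = len(pvals)
    count = 0
    for i, p in enumerate(sorted(pvals), start=1):
        # Step-down: stop at the first p-value that exceeds its threshold.
        if p > alpha / (m - i + 1):
            break
        count = i
    return count


def step_up_count(pvals: Sequence[float], q: float) -> int:
    """Number of rejections made by the linear step-up procedure at level q."""
    m = len(pvals)
    count = 0
    for i, p in enumerate(sorted(pvals), start=1):
        # Step-up: keep the largest rank that meets its threshold.
        if p <= i * q / m:
            count = i
    return count


# Made-up p-values, for illustration only.
example = [0.001, 0.008, 0.012, 0.018, 0.024, 0.2, 0.4, 0.6, 0.8, 0.9]
holm_rejections = holm_count(example, alpha=0.05)
fdr_rejections = step_up_count(example, q=0.05)
print(holm_rejections, fdr_rejections)
```

The two procedures control different error rates, so the larger rejection count of the step-up rule reflects a weaker guarantee as well as greater power.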

Book
01 Jun 1980
TL;DR: Observations; probability; sampling from a normal distribution; comparisons involving two sample means; principles of experimental design; analysis of variance.
Abstract: Observations; probability; sampling from a normal distribution; comparisons involving two sample means; principles of experimental design; analysis of variance I - the one-way classification; multiple comparisons; analysis of variance II - multiway classification; linear regression; linear correlation; matrix notation; linear regression in matrix notation; multiple and partial regression and correlation; analysis of variance III - factorial experiments; analysis of variance analysis of covariance IV analysis of covariance; analysis of variance V - unequal subclass numbers; some uses of chi-square; enumeration data I - one-way classifications; enumeration data II - contingency tables; categorical models; some discrete distributions; nonparametric statistics; sampling finite populations.

15,571 citations


"The control of the false discovery ..." refers background in this paper

  • ...The study of uterine weights of mice reported by Steel and Torrie (1980) and discussed in Westfall and Young (1993) comprised a comparison of six groups receiving different solutions to one control group....

    [...]

Journal ArticleDOI
TL;DR: Specific standards designed to maintain rigor while also promoting communication are proposed for the interpretation of linkage results in genetic studies under way for many complex traits.
Abstract: Genetic studies are under way for many complex traits, spurred by the recent feasibility of whole genome scans. Clear guidelines for the interpretation of linkage results are needed to avoid a flood of false positive claims. At the same time, an overly cautious approach runs the risk of causing true hints of linkage to be missed. We address this problem by proposing specific standards designed to maintain rigor while also promoting communication.

5,317 citations


"The control of the false discovery ..." refers background in this paper

  • ...In genetics research, the need for multiplicity control has been recognized as one of the fundamental questions, especially since entire genome scans are now common [see Lander and Botstein (1989), Barinaga (1994), Lander and Kruglyak (1995), Weller, Song, Heyen, Lewin and Ron (1998)]....

    [...]

  • ...The appropriate balance between lack of type I error control and low power [“the choice between Scylla and Charybdis” in Lander and Kruglyak (1995)] has been heavily debated....

    [...]