Estimating False Discovery Proportion Under Arbitrary Covariance Dependence

Open Access, Posted Content

Jianqing Fan, +2 more

- 28 Oct 2010 -

arXiv: Methodology

TLDR

The paper derives an approximate expression for the false discovery proportion (FDP) in large-scale multiple testing when a common threshold is used, and provides a consistent estimate of the realized FDP, with important applications in controlling the false discovery rate (FDR) and the FDP.

Abstract:

Multiple hypothesis testing is a fundamental problem in high dimensional inference, with wide applications in many scientific fields. In genome-wide association studies, tens of thousands of tests are performed simultaneously to find whether any SNPs are associated with some traits, and those tests are correlated. When test statistics are correlated, false discovery control becomes very challenging under arbitrary dependence. In the current paper, we propose a novel method based on principal factor approximation to deal with an arbitrary dependence structure; the method successfully subtracts the common dependence and significantly weakens the correlation structure. We derive an approximate expression for the false discovery proportion (FDP) in large-scale multiple testing when a common threshold is used and provide a consistent estimate of the realized FDP. This result has important applications in controlling FDR and FDP. Our estimate of the realized FDP compares favorably with the approach of Efron (2007), as demonstrated in the simulated examples. Our approach is further illustrated by some real data applications. We also propose a dependence-adjusted procedure, which is more powerful than the fixed threshold procedure.
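To make the central quantity concrete, the following is a minimal sketch of the realized FDP at a common p-value threshold t, that is, V(t)/R(t), the share of rejections that are true nulls. It is not the paper's principal factor approximation estimator. The one-factor equicorrelated model, the helper names `realized_fdp` and `simulate_p_values`, and all parameter values are assumptions chosen for illustration only.

```python
import random
from statistics import NormalDist


def realized_fdp(p_values, is_null, t):
    """Return V(t)/R(t): the fraction of rejections at threshold t that are true nulls."""
    rejected = [null for p, null in zip(p_values, is_null) if p <= t]
    if not rejected:
        return 0.0  # convention: FDP is 0 when nothing is rejected
    return sum(rejected) / len(rejected)


def simulate_p_values(n_tests, n_signals, effect, rho, rng):
    """Two-sided p-values for Z_i = mu_i + sqrt(rho) * W + sqrt(1 - rho) * e_i.

    W is a single common factor shared by all tests (an assumed toy
    dependence structure), so every pair of statistics has correlation rho.
    """
    norm = NormalDist()
    w = rng.gauss(0.0, 1.0)  # common factor, drawn once per data set
    p_values, is_null = [], []
    for i in range(n_tests):
        mu = effect if i < n_signals else 0.0  # first n_signals tests are non-null
        z = mu + rho ** 0.5 * w + (1.0 - rho) ** 0.5 * rng.gauss(0.0, 1.0)
        p_values.append(2.0 * (1.0 - norm.cdf(abs(z))))
        is_null.append(mu == 0.0)
    return p_values, is_null


# Realized FDP at the same threshold across a few simulated data sets.
rng = random.Random(0)
fdps = []
for _ in range(5):
    p, null = simulate_p_values(2000, 100, 3.0, 0.5, rng)
    fdps.append(realized_fdp(p, null, t=0.01))
print(fdps)
```

Because the common factor W changes from one data set to the next, the realized FDP at a fixed threshold can differ substantially between replications, which is why the abstract targets the realized FDP rather than only its expectation.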

Citations

Journal Article

Challenges of Big Data Analysis

Jianqing Fan, +2 more

- 07 Aug 2013 -

arXiv: Machine Learning

TL;DR: In this article, the authors give an overview of the salient features of Big Data and how these features drive a paradigm change in statistical and computational methods as well as computing architectures, and offer new perspectives on Big Data analysis and computation.

Posted Content

Large Covariance Estimation by Thresholding Principal Orthogonal Complements

Jianqing Fan, +2 more

- 30 Dec 2011 -

arXiv: Statistics Theory

TL;DR: In this article, the principal orthogonal complement thresholding (POET) method is introduced to estimate a covariance matrix under an approximate factor structure with sparsity; the POET estimator includes the sample covariance matrix, the factor-based covariance matrix, and the thresholding estimator as special cases.

Journal Article

svaseq: removing batch effects and other unwanted noise from sequencing data

Jeffrey T. Leek

- 01 Dec 2014 -

Nucleic Acids Research

TL;DR: Describes a version of the sva approach designed for count data or FPKMs from sequencing experiments, based on an appropriate data transformation, and adds supervised sva (ssva), which uses control probes to identify the part of the genomic data affected only by artifacts.

Journal Article

Estimation of high dimensional mean regression in the absence of symmetry and light tail assumptions

Jianqing Fan, +2 more

- 01 Jan 2017 -

Journal of The Royal Statistical Society...

TL;DR: A penalized Huber loss with diverging parameter is proposed to reduce the biases created by the traditional Huber loss; the resulting penalized robust approximate (RA) quadratic loss, called the RA lasso, is compared with other regularized robust estimators based on quantile regression and least absolute deviation regression.

Journal Article

Asymptotics of empirical eigenstructure for high dimensional spiked covariance.

Weichen Wang, +2 more

- 01 Jun 2017 -

Annals of Statistics

TL;DR: These results are a natural extension of those in Paul (2007) to a more general setting and resolve the rates-of-convergence problems in Shen et al. (2013). They also lead to a new covariance estimator for the approximate factor model, called shrinkage principal orthogonal complement thresholding (S-POET), that corrects the biases.

References

Journal Article

Controlling the false discovery rate: a practical and powerful approach to multiple testing

Yoav Benjamini, +1 more

- 01 Jan 1995 -

Journal of the Royal Statistical Society...

TL;DR: In this paper, a different approach to problems of multiple significance testing is presented, which calls for controlling the expected proportion of falsely rejected hypotheses, the false discovery rate (FDR). The FDR is equivalent to the familywise error rate (FWER) when all hypotheses are true but is smaller otherwise.
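As an illustration of the kind of procedure this entry refers to, here is a minimal sketch of the Benjamini-Hochberg step-up rule: sort the m p-values, find the largest rank k whose p-value is at most k*q/m, and reject the k smallest. The function name `benjamini_hochberg` and the example p-values are illustrative assumptions, not taken from the listing.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return the sorted indices of hypotheses rejected by the BH step-up rule at level q."""
    m = len(p_values)
    # Indices of the hypotheses ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0  # largest rank whose p-value falls under its threshold rank * q / m
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank * q / m:
            k = rank
    # Step-up: reject every hypothesis up to rank k, even if an
    # earlier rank missed its own threshold.
    return sorted(order[:k])


p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(benjamini_hochberg(p, q=0.05))
```

The loop deliberately keeps scanning after a failed comparison, because the step-up rule rejects everything below the last rank that passes, not the first rank that fails.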

Journal Article

The control of the false discovery rate in multiple testing under dependency

Yoav Benjamini, +1 more

- 01 Aug 2001 -

Annals of Statistics

TL;DR: In this paper, it was shown that a simple FDR controlling procedure for independent test statistics can also control the false discovery rate when test statistics have positive regression dependency on each of the test statistics corresponding to the true null hypotheses.

Journal Article

Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties

Jianqing Fan, +1 more

- 01 Dec 2001 -

Journal of the American Statistical Asso...

TL;DR: In this article, penalized likelihood approaches are proposed to handle variable selection problems, and it is shown that the newly proposed estimators perform as well as the oracle procedure in variable selection; namely, they work as well as if the correct submodel were known.

Journal Article

A direct approach to false discovery rates

John D. Storey

- 01 Aug 2002 -

Journal of The Royal Statistical Society...

TL;DR: The calculation of the q‐value, the pFDR analogue of the p‐value, is discussed; the q‐value eliminates the need to set the error rate beforehand, as is traditionally done. The proposed approach can yield an increase of over eight times in power compared with the Benjamini–Hochberg FDR method.

Journal Article

Strong control, conservative point estimation and simultaneous conservative consistency of false discovery rates: a unified approach

John D. Storey, +2 more

- 01 Feb 2004 -

Journal of The Royal Statistical Society...

TL;DR: In this article, it was shown that the goals of the two approaches are essentially equivalent, and that the FDR point estimates can be used to define valid FDR controlling procedures in both finite sample and asymptotic settings.
