Estimating False Discovery Proportion Under Arbitrary Covariance Dependence
Jianqing Fan, Xu Han, Weijie Gu
TLDR
This article proposes a method based on principal factor approximation (PFA) for false discovery control in large-scale multiple hypothesis testing. When a common threshold is used, the method provides a consistent estimate of the realized FDP.
Abstract:
Multiple hypothesis testing is a fundamental problem in high-dimensional inference, with wide applications in many scientific fields. In genome-wide association studies, tens of thousands of tests are performed simultaneously to find whether any single-nucleotide polymorphisms (SNPs) are associated with some traits, and those tests are correlated. When test statistics are correlated, false discovery control becomes very challenging under arbitrary dependence. In this article, we propose a novel method, based on principal factor approximation, that subtracts the common dependence and significantly weakens the correlation structure, so that an arbitrary dependence structure can be handled. We derive an approximate expression for the false discovery proportion (FDP) in large-scale multiple testing when a common threshold is used and provide a consistent estimate of the realized FDP. This result has important applications in controlling the false discovery rate and FDP. Our estimate of realized FDP compares favorably with Efr...
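The idea of subtracting a common factor and then estimating the realized FDP at a common threshold can be illustrated with a small sketch. It is not the authors' implementation. It assumes a one-factor model z_i = mu_i + b_i * W + K_i with known loadings b_i (|b_i| < 1), estimates the realized factor W by least squares on the smallest 90% of |z_i|, and plugs the estimate into a normal-approximation formula for the expected number of false rejections. The function name, the 90% fraction, and the formula itself are assumptions that do not appear in the abstract and should be checked against the paper.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: cdf is Phi, inv_cdf is Phi^{-1}


def estimate_fdp_one_factor(z, loadings, t, keep_frac=0.9):
    """Approximate the realized FDP at p-value threshold t (two-sided z-tests).

    Assumed model (hypothetical, not stated in the abstract):
    z_i = mu_i + b_i * W + K_i, W ~ N(0, 1), Var(K_i) = 1 - b_i**2,
    with the K_i only weakly dependent. Requires |b_i| < 1 and 0 < t < 1.
    Returns (fdp_estimate, w_hat, number_of_rejections).
    """
    p = len(z)
    # Estimate the realized factor W by least squares on the smallest |z_i|,
    # which are the statistics most likely to come from true nulls (mu_i = 0).
    order = sorted(range(p), key=lambda i: abs(z[i]))
    kept = order[: max(1, int(keep_frac * p))]
    denom = sum(loadings[i] ** 2 for i in kept)
    w_hat = sum(loadings[i] * z[i] for i in kept) / denom if denom > 0 else 0.0

    z_cut = STD_NORMAL.inv_cdf(t / 2)  # negative quantile z_{t/2}
    # Reject when the two-sided p-value 2 * Phi(-|z_i|) is at most t.
    rejections = sum(1 for zi in z if abs(zi) >= -z_cut)
    if rejections == 0:
        return 0.0, w_hat, 0

    # Conditional on W, each null statistic is N(b_i * W, 1 - b_i**2), so its
    # rejection probability is Phi(a_i (z_cut + eta_i)) + Phi(a_i (z_cut - eta_i)).
    # Summing over all hypotheses instead of only the true nulls is conservative.
    expected_false = 0.0
    for b in loadings:
        a = (1.0 - b * b) ** -0.5
        eta = b * w_hat
        expected_false += STD_NORMAL.cdf(a * (z_cut + eta))
        expected_false += STD_NORMAL.cdf(a * (z_cut - eta))
    return min(1.0, expected_false / rejections), w_hat, rejections


if __name__ == "__main__":
    # Simulated equicorrelated example: every pair of tests has correlation rho.
    rng = random.Random(0)
    p, rho, n_signals, t = 2000, 0.5, 50, 0.01
    b = rho ** 0.5
    w = rng.gauss(0, 1)
    z = [(3.0 if i < n_signals else 0.0) + b * w
         + (1 - rho) ** 0.5 * rng.gauss(0, 1) for i in range(p)]
    fdp_hat, w_hat, r = estimate_fdp_one_factor(z, [b] * p, t)
    cut = -STD_NORMAL.inv_cdf(t / 2)
    false_rej = sum(1 for i in range(n_signals, p) if abs(z[i]) >= cut)
    print("true W:", w, "estimated W:", w_hat)
    print("estimated FDP:", fdp_hat, "true FDP:", false_rej / max(r, 1))
```

With several factors, the loadings would come from the leading eigenvectors of the covariance matrix and W would be estimated by a multivariate regression; the single-factor case is used here only to keep the sketch short.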
Citations
Journal Article
Challenges of Big Data analysis
Jianqing Fan, Fang Han, Han Liu
TL;DR: In this paper, the authors give an overview of the salient features of Big Data and how these features drive a paradigm change in statistical and computational methods as well as computing architectures, and they offer new perspectives on Big Data analysis and computation.
Journal Article
Large covariance estimation by thresholding principal orthogonal complements
TL;DR: In this paper, the Principal Orthogonal complEment Thresholding (POET) method is introduced to explore an approximate factor structure for high-dimensional covariance with conditional sparsity, where the covariance is the sum of a low-rank matrix and a sparse matrix.
Journal Article
svaseq: removing batch effects and other unwanted noise from sequencing data
TL;DR: A version of the sva approach created specifically for count data or FPKMs from sequencing experiments, based on an appropriate data transformation, is described, along with the addition of supervised sva (ssva), which uses control probes to identify the part of the genomic data affected only by artifacts.
Journal Article
Cauchy combination test: a powerful test with analytic p-value calculation under arbitrary dependency structures.
Yaowu Liu, Jun Xie
TL;DR: Combining individual p-values to aggregate multiple small effects is a long-standing interest in statistics, dating back to the classic Fisher's combination test; the authors discuss this problem for modern large-scale data sets.
Journal Article
Estimation of high dimensional mean regression in the absence of symmetry and light tail assumptions
TL;DR: A penalized Huber loss with a diverging parameter is proposed to reduce the biases created by the traditional Huber loss. A penalized robust approximate (RA) quadratic loss, called the RA lasso, is compared with other regularized robust estimators based on quantile regression and least absolute deviation regression.
References
Journal Article
Controlling the false discovery rate: a practical and powerful approach to multiple testing
Yoav Benjamini, Yosef Hochberg
TL;DR: In this paper, a different approach to problems of multiple significance testing is presented, which calls for controlling the expected proportion of falsely rejected hypotheses, the false discovery rate. This error rate is equivalent to the FWER when all hypotheses are true but is smaller otherwise.
Journal Article
The control of the false discovery rate in multiple testing under dependency
Yoav Benjamini, Daniel Yekutieli
TL;DR: In this paper, it was shown that a simple FDR controlling procedure for independent test statistics can also control the false discovery rate when test statistics have positive regression dependency on each of the test statistics corresponding to the true null hypotheses.
Journal Article
Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties
Jianqing Fan, Runze Li
TL;DR: In this article, penalized likelihood approaches are proposed to handle variable selection problems, and it is shown that the newly proposed estimators perform as well as the oracle procedure in variable selection; namely, they work as well as if the correct submodel were known.
Journal Article
A direct approach to false discovery rates
TL;DR: The calculation of the q‐value is discussed, the pFDR analogue of the p‐value, which eliminates the need to set the error rate beforehand as is traditionally done, and can yield an increase of over eight times in power compared with the Benjamini–Hochberg FDR method.
Journal Article
Strong control, conservative point estimation and simultaneous conservative consistency of false discovery rates: a unified approach
TL;DR: In this article, it is shown that the goals of the two approaches are essentially equivalent, and that FDR point estimates can be used to define valid FDR-controlling procedures in both finite-sample and asymptotic settings.