Open Access Journal Article

Estimating False Discovery Proportion Under Arbitrary Covariance Dependence

TLDR
This article proposes a method based on principal factor approximation (PFA) for false discovery control in large-scale multiple hypothesis testing. When a common threshold is used, the method provides a consistent estimate of the realized false discovery proportion (FDP).
Abstract
Multiple hypothesis testing is a fundamental problem in high-dimensional inference, with wide applications in many scientific fields. In genome-wide association studies, tens of thousands of tests are performed simultaneously to find if any single-nucleotide polymorphisms (SNPs) are associated with some traits and those tests are correlated. When test statistics are correlated, false discovery control becomes very challenging under arbitrary dependence. In this article, we propose a novel method—based on principal factor approximation—that successfully subtracts the common dependence and weakens significantly the correlation structure, to deal with an arbitrary dependence structure. We derive an approximate expression for false discovery proportion (FDP) in large-scale multiple testing when a common threshold is used and provide a consistent estimate of realized FDP. This result has important applications in controlling false discovery rate and FDP. Our estimate of realized FDP compares favorably with Efr...
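The idea of estimating the realized FDP at a common threshold after accounting for a shared source of dependence can be illustrated with a small sketch. It is not the article's procedure: it assumes a toy one-factor model with known loadings, estimates the factor by plain least squares, and sums conditional null rejection probabilities over all tests. All function names and parameter values are hypothetical, chosen only for illustration.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for cdf and quantiles


def simulate_one_factor(p, n_signal, signal, loading, rng):
    """Toy model (an assumption): Z_i = mu_i + b_i*W + sqrt(1 - b_i^2)*eps_i."""
    w = rng.gauss(0.0, 1.0)  # one common factor shared by all tests
    mu = [signal if i < n_signal else 0.0 for i in range(p)]
    b = [loading] * p
    z = [m + bi * w + math.sqrt(1.0 - bi * bi) * rng.gauss(0.0, 1.0)
         for m, bi in zip(mu, b)]
    return z, mu, b


def realized_fdp(z, mu, t):
    """Realized FDP V(t)/R(t) when every two-sided p-value is compared to t."""
    cutoff = -STD_NORMAL.inv_cdf(t / 2.0)  # reject when |z| > cutoff
    rejected = [i for i, zi in enumerate(z) if abs(zi) > cutoff]
    false_rejections = sum(1 for i in rejected if mu[i] == 0.0)
    return false_rejections / max(len(rejected), 1), len(rejected)


def least_squares_factor(z, b):
    """Least-squares estimate of the common factor from z_i ~ b_i * W.

    A simplification: true signals bias this estimate, and b must not be all zero.
    """
    return sum(bi * zi for bi, zi in zip(b, z)) / sum(bi * bi for bi in b)


def factor_adjusted_fdp(z, b, t, w_hat):
    """Approximate FDP using null rejection probabilities conditional on the factor.

    Summing over all tests (not only true nulls) is a sparsity-based assumption.
    """
    z_t = STD_NORMAL.inv_cdf(t / 2.0)  # negative quantile
    n_rejected = sum(1 for zi in z if abs(zi) > -z_t)
    expected_false = 0.0
    for bi in b:
        a = 1.0 / math.sqrt(1.0 - bi * bi)
        # P(Z_i > cutoff | W) + P(Z_i < -cutoff | W) under the null
        expected_false += STD_NORMAL.cdf(a * (z_t + bi * w_hat))
        expected_false += STD_NORMAL.cdf(a * (z_t - bi * w_hat))
    return min(1.0, expected_false / max(n_rejected, 1))


if __name__ == "__main__":
    rng = random.Random(0)
    z, mu, b = simulate_one_factor(p=2000, n_signal=50, signal=3.0,
                                   loading=0.6, rng=rng)
    fdp, n_rejected = realized_fdp(z, mu, t=0.01)
    w_hat = least_squares_factor(z, b)
    print("rejections:", n_rejected)
    print("realized FDP:", fdp)
    print("factor-adjusted estimate:", factor_adjusted_fdp(z, b, 0.01, w_hat))
```

In this sketch the realized FDP depends on the particular draw of the common factor, which is why conditioning on an estimate of that factor can track it more closely than an unconditional count of expected false rejections.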


Citations

Challenges of Big Data analysis

TL;DR: In this paper, the authors give an overview of the salient features of Big Data and of how these features drive a paradigm change in statistical and computational methods as well as in computing architectures, and they offer new perspectives on Big Data analysis and computation.

Large covariance estimation by thresholding principal orthogonal complements

TL;DR: In this paper, the Principal Orthogonal complEment Thresholding (POET) method was introduced to explore an approximate factor structure for high-dimensional covariance with a conditional sparsity structure, in which the covariance is the sum of a low-rank matrix and a sparse matrix.

svaseq: removing batch effects and other unwanted noise from sequencing data

TL;DR: This paper describes a version of the sva approach created specifically for count data or FPKMs from sequencing experiments, based on an appropriate data transformation, along with supervised sva (ssva), which uses control probes to identify the part of the genomic data affected only by artifacts.

Cauchy combination test: a powerful test with analytic p-value calculation under arbitrary dependency structures.

TL;DR: Combining individual p-values to aggregate multiple small effects is of long-standing interest in statistics, dating back to the classic Fisher's combination test. The authors discuss this problem in the setting of modern large-scale data sets.

Estimation of high dimensional mean regression in the absence of symmetry and light tail assumptions

TL;DR: A penalized Huber loss with diverging parameter is proposed to reduce the biases created by the traditional Huber loss, and a penalized robust approximate (RA) quadratic loss, called the RA lasso, is compared with other regularized robust estimators based on quantile regression and least absolute deviation regression.
References

Controlling the false discovery rate: a practical and powerful approach to multiple testing

TL;DR: In this paper, a different approach to problems of multiple significance testing is presented. It calls for controlling the expected proportion of falsely rejected hypotheses, the false discovery rate (FDR), which is equivalent to the FWER when all hypotheses are true but is smaller otherwise.

The control of the false discovery rate in multiple testing under dependency

TL;DR: In this paper, it was shown that a simple FDR controlling procedure for independent test statistics can also control the false discovery rate when test statistics have positive regression dependency on each of the test statistics corresponding to the true null hypotheses.

Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties

TL;DR: In this article, penalized likelihood approaches are proposed to handle variable selection problems, and it is shown that the newly proposed estimators perform as well as the oracle procedure in variable selection; namely, they work as well as if the correct submodel were known.

A direct approach to false discovery rates

TL;DR: This paper discusses the calculation of the q-value, the pFDR analogue of the p-value, which eliminates the need to set the error rate beforehand as is traditionally done; the approach can yield an increase of over eight times in power compared with the Benjamini–Hochberg FDR method.

Strong control, conservative point estimation and simultaneous conservative consistency of false discovery rates: a unified approach

TL;DR: In this article, it was shown that the goals of the two approaches are essentially equivalent, and that the FDR point estimates can be used to define valid FDR controlling procedures in both finite sample and asymptotic settings.