Open Access, Posted Content

Estimating False Discovery Proportion Under Arbitrary Covariance Dependence

TLDR
Derives an approximate expression for the false discovery proportion (FDP) in large-scale multiple testing when a common threshold is used, and provides a consistent estimate of the realized FDP, with applications to controlling the false discovery rate (FDR) and the FDP.
Abstract
Multiple hypothesis testing is a fundamental problem in high-dimensional inference, with wide applications in many scientific fields. In genome-wide association studies, tens of thousands of tests are performed simultaneously to determine whether any SNPs are associated with some traits, and those tests are correlated. When test statistics are correlated, false discovery control becomes very challenging under arbitrary dependence. In the current paper, we propose a novel method based on principal factor approximation to deal with an arbitrary dependence structure; it subtracts the common dependence and significantly weakens the correlation structure. We derive an approximate expression for the false discovery proportion (FDP) in large-scale multiple testing when a common threshold is used and provide a consistent estimate of the realized FDP. This result has important applications in controlling the FDR and FDP. Our estimate of the realized FDP compares favorably with the approach of Efron (2007), as demonstrated in the simulated examples. Our approach is further illustrated by some real data applications. We also propose a dependence-adjusted procedure, which is more powerful than the fixed threshold procedure.
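
As a rough illustration of why subtracting a common factor helps, the sketch below simulates a one-factor (equicorrelated) special case rather than the general principal factor approximation described in the abstract. The function names, the median-based estimate of the common factor, and the choice to sum over all tests instead of only the true nulls are assumptions of this sketch, not details taken from the source.

```python
import math
import random
from statistics import NormalDist, median

ND = NormalDist()  # standard normal, used for cdf and quantiles


def simulate_z(p, n_signal, signal, rho, rng):
    """Draw Z_i = mu_i + sqrt(rho)*W + sqrt(1-rho)*eps_i with one common factor W."""
    w = rng.gauss(0.0, 1.0)
    mu = [signal if i < n_signal else 0.0 for i in range(p)]
    z = [m + math.sqrt(rho) * w + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0)
         for m in mu]
    return z, mu


def realized_fdp(z, mu, t):
    """Realized FDP = V/R when every two-sided p-value is compared with t."""
    cutoff = -ND.inv_cdf(t / 2.0)  # reject when |Z_i| >= cutoff
    rejected = [abs(zi) >= cutoff for zi in z]
    r = sum(rejected)
    v = sum(1 for rej, m in zip(rejected, mu) if rej and m == 0.0)
    return (v / r if r > 0 else 0.0), v, r


def expected_false_rejections(p, rho, eta, t):
    """Expected rejections among p null tests, conditional on eta = sqrt(rho)*W."""
    z_t = ND.inv_cdf(t / 2.0)
    a = 1.0 / math.sqrt(1.0 - rho)  # rescaling after removing the factor
    return p * (ND.cdf(a * (z_t + eta)) + ND.cdf(a * (z_t - eta)))


def factor_adjusted_fdp(z, rho, t):
    """Estimate FDP(t) without knowing which hypotheses are true nulls."""
    eta_hat = median(z)  # assumption: sparse signals, so the median tracks sqrt(rho)*W
    cutoff = -ND.inv_cdf(t / 2.0)
    r = sum(1 for zi in z if abs(zi) >= cutoff)
    if r == 0:
        return 0.0
    # Summing over all p tests is slightly conservative when signals are sparse.
    return min(1.0, expected_false_rejections(len(z), rho, eta_hat, t) / r)


if __name__ == "__main__":
    rng = random.Random(0)
    z, mu = simulate_z(p=2000, n_signal=50, signal=4.0, rho=0.5, rng=rng)
    fdp, v, r = realized_fdp(z, mu, t=0.01)
    print("realized FDP:", fdp, "V:", v, "R:", r)
    print("factor-adjusted estimate:", factor_adjusted_fdp(z, rho=0.5, t=0.01))
```

Re-running the simulation with different seeds shows the realized FDP swinging widely from one draw of the common factor to the next, which is the behaviour a conditional, factor-adjusted estimate is meant to track.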

Citations
Journal Article

Challenges of Big Data Analysis

TL;DR: In this article, the authors provide an overview of the salient features of Big Data and how these features drive a paradigm change in statistical and computational methods as well as computing architectures, and offer various new perspectives on Big Data analysis and computation.
Posted Content

Large Covariance Estimation by Thresholding Principal Orthogonal Complements

TL;DR: In this article, the Principal Orthogonal complEment Thresholding (POET) method was introduced to explore an approximate factor structure with sparsity; it includes the sample covariance matrix, the factor-based covariance matrix, and the thresholding estimator as special cases.
Journal Article

svaseq: removing batch effects and other unwanted noise from sequencing data

TL;DR: A version of the sva approach specifically created for count data or FPKMs from sequencing experiments based on appropriate data transformation is described, and the addition of supervised sva (ssva) for using control probes to identify the part of the genomic data only affected by artifacts is described.
Journal Article

Estimation of high dimensional mean regression in the absence of symmetry and light tail assumptions

TL;DR: A penalized Huber loss with diverging parameter is proposed to reduce the biases created by the traditional Huber loss, and a penalized robust approximate (RA) quadratic loss, called the RA lasso, is compared with other regularized robust estimators based on quantile regression and least absolute deviation regression.
Journal Article

Asymptotics of empirical eigenstructure for high dimensional spiked covariance.

TL;DR: These results are a natural extension of those in Paul (2007) to a more general setting and solve the rate-of-convergence problems in Shen et al. (2013). They lead to a new covariance estimator for the approximate factor model, called shrinkage principal orthogonal complement thresholding (S-POET), that corrects the biases.
References
Journal Article

Controlling the false discovery rate: a practical and powerful approach to multiple testing

TL;DR: In this paper, a different approach to problems of multiple significance testing is presented, which calls for controlling the expected proportion of falsely rejected hypotheses, the false discovery rate, which is equivalent to the FWER when all hypotheses are true but is smaller otherwise.
Journal Article

The control of the false discovery rate in multiple testing under dependency

TL;DR: In this paper, it was shown that a simple FDR controlling procedure for independent test statistics can also control the false discovery rate when test statistics have positive regression dependency on each of the test statistics corresponding to the true null hypotheses.
Journal Article

Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties

TL;DR: In this article, penalized likelihood approaches are proposed to handle variable selection problems, and it is shown that the newly proposed estimators perform as well as the oracle procedure in variable selection; namely, they work as well as if the correct submodel were known.
Journal Article

A direct approach to false discovery rates

TL;DR: The calculation of the q‐value is discussed, the pFDR analogue of the p‐value, which eliminates the need to set the error rate beforehand as is traditionally done, and can yield an increase of over eight times in power compared with the Benjamini–Hochberg FDR method.
Journal Article

Strong control, conservative point estimation and simultaneous conservative consistency of false discovery rates: a unified approach

TL;DR: In this article, it was shown that the goals of the two approaches are essentially equivalent, and that the FDR point estimates can be used to define valid FDR controlling procedures in both finite sample and asymptotic settings.