Unifying and Generalizing Methods for Removing Unwanted Variation Based on Negative Controls
David Gerard, Matthew Stephens
TLDR
This paper introduces a general framework, RUV*, that both unites and generalizes existing RUV approaches; it also provides conditions under which RUV2 and RUV4 are equivalent and implements RUVB, a version of RUV* based on Bayesian factor analysis.
Abstract:
Unwanted variation, including hidden confounding, is a well-known problem in many fields, particularly large-scale gene expression studies. Recent proposals to use control genes --- genes assumed to be unassociated with the covariates of interest --- have led to new methods to deal with this problem. Going by the moniker Removing Unwanted Variation (RUV), there are many versions --- RUV1, RUV2, RUV4, RUVinv, RUVrinv, RUVfun. In this paper, we introduce a general framework, RUV*, that both unites and generalizes these approaches. This unifying framework helps clarify connections between existing methods. In particular we provide conditions under which RUV2 and RUV4 are equivalent. The RUV* framework also preserves an advantage of RUV approaches --- their modularity --- which facilitates the development of novel methods based on existing matrix imputation algorithms. We illustrate this by implementing RUVB, a version of RUV* based on Bayesian factor analysis. In realistic simulations based on real data we found that RUVB is competitive with existing methods in terms of both power and calibration, although we also highlight the challenges of providing consistently reliable calibration among data sets.
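To make the role of control genes concrete, here is a minimal sketch of a two-step, RUV2-style adjustment. It is not the RUV* framework or RUVB from the paper. It assumes a linear model Y = X beta + Z alpha + E in which the control genes have beta equal to zero, it substitutes a truncated SVD for the factor-analysis step, and it treats the number of hidden factors q as known. The function name ruv2_sketch and all argument names are hypothetical.

```python
import numpy as np

def ruv2_sketch(Y, X, ctl, q):
    """Illustrative two-step adjustment using negative-control genes.

    Y   : (n, p) expression matrix, samples by genes
    X   : (n, k) observed covariates of interest
    ctl : (p,) boolean mask marking the control genes
    q   : number of unwanted factors (assumed known in this sketch)
    """
    # Step 1: control genes are assumed unaffected by X, so their variation
    # reflects unwanted factors only. A truncated SVD stands in for the
    # factor analysis here (an assumption of this sketch).
    U, s, _ = np.linalg.svd(Y[:, ctl], full_matrices=False)
    Z_hat = U[:, :q] * s[:q]

    # Step 2: ordinary least squares of every gene on the observed
    # covariates together with the estimated unwanted factors.
    design = np.hstack([X, Z_hat])
    coef, *_ = np.linalg.lstsq(design, Y, rcond=None)

    # The first k rows are the adjusted effects of X on each gene.
    return coef[: X.shape[1]], Z_hat
```

In a noise-free simulation where the hidden factors are correlated with X, this sketch recovers the true effects, because the control genes span the same column space as the hidden factors. With real data the choice of q, the factor-analysis method, and the variance estimates all matter, and those are the choices that distinguish the RUV variants named in the abstract.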
Citations
Posted Content
Separating measurement and expression models clarifies confusion in single cell RNA-seq analysis
Abhishek Sarkar, Matthew Stephens
TL;DR: It is argued that much of this terminology is unhelpful and confusing, and simple ideas are outlined to help reduce confusion, including: (1) observed scRNA-seq counts reflect both true gene expression levels and measurement error, and carefully distinguishing these contributions helps clarify thinking.
Posted Content
Permutation methods for factor analysis and PCA
TL;DR: It is shown that the parallel analysis permutation method consistently selects the large components in certain high-dimensional factor models; however, it does not select the smaller components.
Journal Article
Deterministic parallel analysis: An improved method for selecting factors and principal components
Edgar Dobriban, Art B. Owen
TL;DR: In this article, the authors propose deterministic parallel analysis (DPA), which is faster and more reproducible than parallel analysis (PA); they also propose countering shadowing in principal component analysis (PCA) and raising the decision threshold to improve estimation accuracy.
Related Papers
Capturing heterogeneity in gene expression studies by surrogate variable analysis.
Jeffrey T. Leek, John D. Storey