Open Access Journal Article

Unifying and Generalizing Methods for Removing Unwanted Variation Based on Negative Controls

TLDR
This paper introduces RUV*, a general framework that unites and generalizes existing RUV approaches, provides conditions under which RUV2 and RUV4 are equivalent, and implements RUVB, a version of RUV* based on Bayesian factor analysis.
Abstract
Unwanted variation, including hidden confounding, is a well-known problem in many fields, particularly large-scale gene expression studies. Recent proposals to use control genes (genes assumed to be unassociated with the covariates of interest) have led to new methods to deal with this problem. These methods go by the moniker Removing Unwanted Variation (RUV) and come in many versions: RUV1, RUV2, RUV4, RUVinv, RUVrinv, and RUVfun. In this paper, we introduce a general framework, RUV*, that both unites and generalizes these approaches. This unifying framework helps clarify connections between existing methods. In particular, we provide conditions under which RUV2 and RUV4 are equivalent. The RUV* framework also preserves an advantage of RUV approaches, their modularity, which facilitates the development of novel methods based on existing matrix imputation algorithms. We illustrate this by implementing RUVB, a version of RUV* based on Bayesian factor analysis. In realistic simulations based on real data, we found that RUVB is competitive with existing methods in terms of both power and calibration, although we also highlight the challenges of providing consistently reliable calibration among data sets.
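
To make the role of control genes concrete, the following is a minimal sketch of the two-step idea behind RUV2 as it is commonly described: estimate the unwanted factors by a factor analysis restricted to the control genes, then include those estimates as extra covariates in an ordinary per-gene regression. It is not the paper's RUV* framework or RUVB. The truncated SVD is a stand-in for a generic factor analysis, and the function name `ruv2_sketch` and its arguments are hypothetical names chosen for illustration.

```python
import numpy as np


def ruv2_sketch(Y, X, ctl, k):
    """Illustrative RUV2-style adjustment (an assumption-laden sketch).

    Y   : (n, p) expression matrix, samples in rows, genes in columns
    X   : (n, q) covariates of interest
    ctl : (p,) boolean mask marking the negative control genes
    k   : number of unwanted factors to estimate
    """
    # Step 1: control genes are assumed unaffected by X, so their variation
    # is treated as unwanted. A truncated SVD serves as the factor analysis.
    U, _, _ = np.linalg.svd(Y[:, ctl], full_matrices=False)
    Z_hat = U[:, :k]  # estimated unwanted factors, shape (n, k)

    # Step 2: regress every gene on the covariates of interest plus the
    # estimated unwanted factors, using ordinary least squares.
    design = np.hstack([X, Z_hat])
    coef, *_ = np.linalg.lstsq(design, Y, rcond=None)

    # The first q rows are the adjusted effects of the covariates of interest.
    return coef[: X.shape[1], :], Z_hat
```

Calling `ruv2_sketch(Y, X, ctl, k)` returns the adjusted coefficient estimates for the covariates of interest together with the estimated factors. The choice of `k` and the quality of the control genes are left open here, and both affect how well such an adjustment behaves.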

