Adjusting batch effects in microarray expression data using empirical Bayes methods
TL;DR: This paper proposes parametric and non-parametric empirical Bayes frameworks for adjusting data for batch effects; the frameworks are robust to outliers in small samples and perform comparably to existing methods in large samples.
Abstract:
Non-biological experimental variation, or “batch effects”, is commonly observed across multiple batches of microarray experiments, often making it difficult to combine data from these batches. Combining microarray data sets lets researchers increase statistical power to detect biological phenomena in studies where logistical considerations restrict sample size or in studies that require the sequential hybridization of arrays. In general, it is inappropriate to combine data sets without adjusting for batch effects. Methods have been proposed to filter batch effects from data, but these are often complicated and require large batch sizes (>25) to implement. Because the majority of microarray studies are conducted using much smaller sample sizes, existing methods are not sufficient. We propose parametric and non-parametric empirical Bayes frameworks for adjusting data for batch effects that are robust to outliers in small sample sizes and perform comparably to existing methods for large samples. We illustrate our methods using two example data sets and show that our methods are justifiable, easy to apply, and useful in practice. Software for our method is freely available at: http://biosun1.harvard.edu/complab/batch/.
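To make the idea of empirical Bayes batch adjustment concrete, here is a minimal sketch of the shrinkage step only. It is an illustration under stated assumptions, not the authors' procedure: it handles only an additive (location) batch offset, assumes a normal prior fitted across genes by the method of moments, and ignores the multiplicative (scale) effect. The function name `shrink_batch_offsets` and the toy data are hypothetical.

```python
from statistics import mean, variance


def shrink_batch_offsets(batch, reference_means):
    """Shrink per-gene additive batch offsets toward their across-gene mean.

    batch: list of genes, each a list of expression values from ONE batch.
    reference_means: per-gene means computed across all batches.
    Assumption (not from the paper): a normal-normal model with the prior
    mean and variance estimated from the raw offsets of all genes.
    """
    n = len(batch[0])  # samples in this batch
    # Raw offset: how far this batch's gene mean sits from the overall mean.
    raw = [mean(g) - m for g, m in zip(batch, reference_means)]
    # Prior pooled across genes (crude method-of-moments estimate).
    prior_mean = mean(raw)
    prior_var = variance(raw)
    shrunk = []
    for g, r in zip(batch, raw):
        se2 = variance(g) / n  # sampling variance of this gene's offset
        total = prior_var + se2
        # Weight on the gene's own estimate: small when the gene is noisy.
        w = prior_var / total if total > 0 else 0.0
        shrunk.append(w * r + (1 - w) * prior_mean)
    return shrunk


# Toy batch of 3 genes x 3 samples; the third gene is very noisy.
batch = [[5.0, 5.2, 4.8], [7.0, 7.1, 6.9], [3.0, 9.0, 6.0]]
reference_means = [4.5, 6.5, 4.0]
offsets = shrink_batch_offsets(batch, reference_means)
# Adjusted data: subtract each gene's shrunken offset from its values.
adjusted = [[x - d for x in g] for g, d in zip(batch, offsets)]
```

In this toy batch the noisy third gene has its offset pulled strongly toward the pooled value, while the two stable genes keep offsets close to their raw estimates. Borrowing information across genes in this way is what allows stable adjustment when each batch contains only a few samples.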
Citations
Journal Article
Integrating single-cell transcriptomic data across different conditions, technologies, and species.
TL;DR: An analytical strategy for integrating scRNA-seq data sets based on common sources of variation is introduced, enabling the identification of shared populations across data sets and downstream comparative analysis.
Journal Article
GSVA: gene set variation analysis for microarray and RNA-seq data.
TL;DR: This work introduces Gene Set Variation Analysis (GSVA), a GSE method that estimates variation of pathway activity over a sample population in an unsupervised manner and constitutes a starting point to build pathway-centric models of biology.
Journal Article
The consensus molecular subtypes of colorectal cancer
Justin Guinney, Rodrigo Dienstmann, Xingwu Wang, Aurélien de Reyniès, Andreas Schlicker, Charlotte Soneson, Laetitia Marisa, Paul Roepman, Gift Nyamundanda, Paolo Angelino, Brian M. Bot, Jeffrey S. Morris, Iris Simon, Sarah Gerster, Evelyn Fessler, Felipe De Sousa E Melo, Edoardo Missiaglia, Hena R. Ramay, David Barras, Krisztian Homicsko, Dipen M. Maru, Ganiraju C. Manyam, Bradley M. Broom, Valérie Boige, Beatriz Perez-Villamil, Ted Laderas, Ramon Salazar, Joe W. Gray, Douglas Hanahan, Josep Tabernero, René Bernards, Stephen H. Friend, Pierre Laurent-Puig, Jan Paul Medema, Anguraj Sadanandam, Lodewyk F. A. Wessels, Mauro Delorenzi, Scott Kopetz, Louis Vermeulen, Sabine Tejpar
TL;DR: An international consortium dedicated to large-scale data sharing and analytics across expert groups is formed, showing marked interconnectivity between six independent classification systems coalescing into four consensus molecular subtypes (CMSs) with distinguishing features.
Journal Article
The sva package for removing batch effects and other unwanted variation in high-throughput experiments
TL;DR: The sva package is described, which supports surrogate variable estimation with the sva function, direct adjustment for known batch effects with the ComBat function and adjustment for batch and latent variables in prediction problems with the fsva function.
Journal Article
BET Bromodomain Inhibition as a Therapeutic Strategy to Target c-Myc
Jake Delmore, Ghayas C Issa, Madeleine E. Lemieux, Peter B. Rahl, Junwei Shi, Hannah M. Jacobs, Efstathios Kastritis, Timothy Gilpatrick, Ronald M. Paranal, Jun Qi, Marta Chesi, Anna C. Schinzel, Michael R. McKeown, Timothy P. Heffernan, Christopher R. Vakoc, P. Leif Bergsagel, Irene M. Ghobrial, Paul G. Richardson, Richard A. Young, William C. Hahn, Kenneth C. Anderson, Andrew L. Kung, James E. Bradner, Constantine S. Mitsiades
TL;DR: In this paper, a small-molecule bromodomain inhibitor, JQ1, was used to identify BET proteins as regulatory factors for c-Myc oncoprotein.
References
Journal Article
Significance analysis of microarrays applied to the ionizing radiation response
TL;DR: A method that assigns a score to each gene on the basis of change in gene expression relative to the standard deviation of repeated measurements is described, suggesting that this repair pathway for UV-damaged DNA might play a previously unrecognized role in repairing DNA damaged by ionizing radiation.
Journal Article
Linear Models and Empirical Bayes Methods for Assessing Differential Expression in Microarray Experiments
TL;DR: The hierarchical model of Lonnstedt and Speed (2002) is developed into a practical approach for general microarray experiments with arbitrary numbers of treatments and RNA samples and the moderated t-statistic is shown to follow a t-distribution with augmented degrees of freedom.
Journal Article
Exploration, normalization, and summaries of high density oligonucleotide array probe level data
Rafael A. Irizarry, Bridget G. Hobbs, Francois Collin, Yasmin Beazer-Barclay, Kristen J. Antonellis, Uwe Scherf, Terence P. Speed
TL;DR: There is no obvious downside to using RMA and attaching a standard error (SE) to this quantity using a linear model which removes probe-specific affinities, and the exploratory data analyses of the probe level data motivate a new summary measure that is a robust multi-array average (RMA) of background-adjusted, normalized, and log-transformed PM values.
Journal Article
Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation
TL;DR: This article proposes normalization methods that are based on robust local regression and account for intensity and spatial dependence in dye biases for different types of cDNA microarray experiments.
Journal Article
Empirical Bayes analysis of a microarray experiment
TL;DR: A simple nonparametric empirical Bayes model is introduced, which is used to guide the efficient reduction of the data to a single summary statistic per gene, and also to make simultaneous inferences concerning which genes were affected by the radiation.
Related Papers (5)
Controlling the false discovery rate: a practical and powerful approach to multiple testing
Yoav Benjamini, Yosef Hochberg