Why weight? Modelling sample and observational level variability improves power in RNA-seq analyses
Ruijie Liu, Aliaksei Holik, Shian Su, Natasha Jansz, Kelan Chen, Huei San Leong, Marnie E. Blewitt, Marie Liesse Asselin-Labat, Gordon K. Smyth, Matthew E. Ritchie
TL;DR: A comprehensive analysis involving both simulations and experimental RNA-sequencing data demonstrates that this strategy leads to a universally more powerful analysis and fewer false discoveries when compared to conventional approaches.
Abstract:
Variations in sample quality are frequently encountered in small RNA-sequencing experiments, and pose a major challenge in a differential expression analysis. Removal of high variation samples reduces noise, but at a cost of reducing power, thus limiting our ability to detect biologically meaningful changes. Similarly, retaining these samples in the analysis may not reveal any statistically significant changes due to the higher noise level. A compromise is to use all available data, but to down-weight the observations from more variable samples. We describe a statistical approach that facilitates this by modelling heterogeneity at both the sample and observational levels as part of the differential expression analysis. At the sample level this is achieved by fitting a log-linear variance model that includes common sample-specific or group-specific parameters that are shared between genes. The estimated sample variance factors are then converted to weights and combined with observational level weights obtained from the mean–variance relationship of the log-counts-per-million using ‘voom’. A comprehensive analysis involving both simulations and experimental RNA-sequencing data demonstrates that this strategy leads to a universally more powerful analysis and fewer false discoveries when compared to conventional approaches. This methodology has wide application and is implemented in the open-source ‘limma’ package.
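The weighting idea in the abstract can be sketched in a few lines of Python. This is an illustrative simplification, not the limma implementation: it assumes a variance model of the form var(y_gj) = sigma2_g * exp(gamma_j) with the gamma_j summing to zero, estimates gamma_j with a crude moment estimator rather than the fitting procedure used in the paper, and uses a matrix of ones as a stand-in for the voom observation-level weights. The function names `sample_variance_factors` and `combine_weights` and the simulated data are hypothetical.

```python
import numpy as np

def sample_variance_factors(logcpm, design):
    """Crude moment estimate of per-sample log variance factors.

    Assumes var(y_gj) = sigma2_g * exp(gamma_j) with sum_j gamma_j = 0.
    Illustration only; this is NOT the estimator used in limma.
    Requires every sample to have leverage below 1 (no singleton groups).
    """
    # Ordinary least squares fit for all genes at once.
    beta, *_ = np.linalg.lstsq(design, logcpm.T, rcond=None)
    resid = logcpm - (design @ beta).T              # genes x samples
    # Diagonal of the hat matrix: leverage of each sample.
    hat = np.einsum("ij,ji->i", design, np.linalg.pinv(design))
    adj_sq = resid**2 / (1.0 - hat)
    # Divide by each gene's average so the gene-level variance cancels.
    rel = adj_sq / adj_sq.mean(axis=1, keepdims=True)
    gamma = np.log(rel.mean(axis=0))
    return gamma - gamma.mean()                     # enforce sum-to-zero

def combine_weights(gamma, obs_weights):
    """Multiply observation-level weights by inverse sample variance factors."""
    sample_w = np.exp(-gamma)
    return obs_weights * sample_w[np.newaxis, :]

# Simulated example: two groups of three samples, last sample three times noisier.
rng = np.random.default_rng(0)
design = np.column_stack([np.ones(6), [0, 0, 0, 1, 1, 1]])
noise_sd = np.array([1, 1, 1, 1, 1, 3.0])
logcpm = 5.0 + rng.normal(size=(2000, 6)) * noise_sd
obs_weights = np.ones_like(logcpm)  # placeholder for voom-style precision weights

gamma = sample_variance_factors(logcpm, design)
weights = combine_weights(gamma, obs_weights)
```

In this sketch the noisy sample receives the largest estimated variance factor and therefore the smallest combined weights, so it still contributes to a weighted linear model fit but with less influence than the cleaner samples.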
Citations
Journal Article
SARS-CoV-2 infection of human ACE2-transgenic mice causes severe lung inflammation and impaired function.
Emma S. Winkler, Adam L. Bailey, Natasha M. Kafai, Sharmila Nair, Broc T. McCune, Jinsheng Yu, Julie M. Fox, Rita E. Chen, James T. Earnest, Shamus P. Keeler, Jon H. Ritter, Liang I. Kang, Sarah Dort, Annette Robichaud, Richard D. Head, Michael J. Holtzman, Michael S. Diamond
TL;DR: The transgenic mice expressing the human angiotensin I-converting enzyme 2 (ACE2) receptor driven by the cytokeratin-18 (K18) gene promoter are evaluated as a model of SARS-CoV-2 infection to define the basis of lung disease and test immune and antiviral-based countermeasures.
Journal Article
Single-cell profiling of breast cancer T cells reveals a tissue-resident memory subset associated with improved prognosis
Peter Savas, Balaji Virassamy, Chengzhong Ye, Agus Salim, Christopher P. Mintoff, Franco Caramia, Roberto Salgado, David J Byrne, Zhi Ling Teo, Sathana Dushyanthen, Ann Byrne, Lironne Wein, Stephen J Luen, Catherine Poliness, Sophie Nightingale, Anita S Skandarajah, David E. Gyorki, Chantel M Thornton, Paul A. Beavis, Stephen B. Fox, Phillip K. Darcy, Terence P. Speed, Laura K. Mackay, Paul J Neeson, Sherene Loi
TL;DR: Detailed, high-dimensional characterization of T cells in breast cancer reveals an activated TRM population and a gene signature associated with improved prognosis, and suggests that CD8+ TRM cells contribute to BC immunosurveillance and are the key targets of modulation by immune checkpoint inhibition.
Journal Article
iDEP: an integrated web application for differential expression and pathway analysis of RNA-Seq data
TL;DR: iDEP helps unveil the multifaceted functions of p53 and the possible involvement of several microRNAs such as miR-92a. It draws on R/Bioconductor packages, 2 web services, and comprehensive annotation and pathway databases for 220 plant and animal species.
Journal Article
Association of respiratory allergy, asthma, and expression of the SARS-CoV-2 receptor ACE2.
Daniel J. Jackson, William W. Busse, Leonard B. Bacharier, Meyer Kattan, George T. O'Connor, Robert A. Wood, Cynthia M. Visness, Stephen R. Durham, David E. Larson, Stephane Esnault, Carole Ober, Peter J. Gergen, Patrice Becker, Alkis Togias, James E. Gern, Mathew C. Altman
TL;DR: Underlying respiratory allergy and experimental allergen exposure reduce the expression of the SARS-CoV-2 receptor, ACE2, which could lead to reduced COVID-19 susceptibility.
Journal Article
RNA-seq analysis is easy as 1-2-3 with limma, Glimma and edgeR.
Charity W. Law, Monther Alhamdoosh, Shian Su, Xueyi Dong, Luyi Tian, Gordon K. Smyth, Matthew E. Ritchie
TL;DR: This workflow article analyzes RNA-sequencing data from the mouse mammary gland, demonstrating use of the popular edgeR package to import, organise, filter and normalise the data, followed by the limma package with its voom method, linear modelling and empirical Bayes moderation to assess differential expression and perform gene set testing.
References
Journal Article
R: A language and environment for statistical computing.
TL;DR: Copyright (©) 1999–2012 R Foundation for Statistical Computing; permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and permission notice are preserved on all copies.
Journal Article
Controlling the false discovery rate: a practical and powerful approach to multiple testing
Yoav Benjamini, Yosef Hochberg
TL;DR: This paper presents a different approach to multiple significance testing: controlling the expected proportion of falsely rejected hypotheses, the false discovery rate, which is equivalent to the FWER when all hypotheses are true but is smaller otherwise.
Journal Article
edgeR: a Bioconductor package for differential expression analysis of digital gene expression data.
TL;DR: edgeR is a Bioconductor software package for examining differential expression of replicated count data. It uses an overdispersed Poisson model to account for both biological and technical variability, and empirical Bayes methods to moderate the degree of overdispersion across transcripts, improving the reliability of inference.
Journal Article
limma powers differential expression analyses for RNA-sequencing and microarray studies
Matthew E. Ritchie, Belinda Phipson, Di Wu, Yifang Hu, Charity W. Law, Wei Shi, Gordon K. Smyth
TL;DR: The philosophy and design of the limma package is reviewed, summarizing both new and historical features, with an emphasis on recent enhancements and features that have not been previously described.
Journal Article
featureCounts: an efficient general-purpose program for assigning sequence reads to genomic features
TL;DR: featureCounts is a read summarization program suitable for counting reads generated from either RNA or genomic DNA sequencing experiments. It implements highly efficient chromosome hashing and feature blocking techniques.