Why weight? Modelling sample and observational level variability improves power in RNA-seq analyses

doi:10.1093/NAR/GKV412

Open Access · Journal Article

Ruijie Liu, +13 more

- 03 Sep 2015 -

Nucleic Acids Research

- Vol. 43, Iss: 15

Abstract:

Variations in sample quality are frequently encountered in small RNA-sequencing experiments, and pose a major challenge in a differential expression analysis. Removal of high variation samples reduces noise, but at a cost of reducing power, thus limiting our ability to detect biologically meaningful changes. Similarly, retaining these samples in the analysis may not reveal any statistically significant changes due to the higher noise level. A compromise is to use all available data, but to down-weight the observations from more variable samples. We describe a statistical approach that facilitates this by modelling heterogeneity at both the sample and observational levels as part of the differential expression analysis. At the sample level this is achieved by fitting a log-linear variance model that includes common sample-specific or group-specific parameters that are shared between genes. The estimated sample variance factors are then converted to weights and combined with observational level weights obtained from the mean–variance relationship of the log-counts-per-million using ‘voom’. A comprehensive analysis involving both simulations and experimental RNA-sequencing data demonstrates that this strategy leads to a universally more powerful analysis and fewer false discoveries when compared to conventional approaches. This methodology has wide application and is implemented in the open-source ‘limma’ package.
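
The down-weighting idea in the abstract can be illustrated with a minimal Python sketch. This is not the limma implementation or its API: the function names are hypothetical, the sample weight is assumed to be the reciprocal of the sample variance factor, and "combined" is assumed to mean an element-wise product with the observation-level weights. The numbers are made up for illustration.

```python
import numpy as np


def combine_weights(obs_weights, sample_var_factors):
    """Combine observation-level weights (genes x samples) with sample-level weights.

    Assumption: each sample weight is 1 / (sample variance factor), and the two
    weight sources are combined by multiplication.
    """
    obs_weights = np.asarray(obs_weights, dtype=float)
    sample_var_factors = np.asarray(sample_var_factors, dtype=float)
    if np.any(sample_var_factors <= 0):
        raise ValueError("variance factors must be positive")
    sample_weights = 1.0 / sample_var_factors
    # Broadcast the per-sample weights across every gene (row).
    return obs_weights * sample_weights[np.newaxis, :]


def weighted_gene_means(log_cpm, weights):
    """Weighted mean of log-CPM values per gene (row)."""
    log_cpm = np.asarray(log_cpm, dtype=float)
    return np.sum(weights * log_cpm, axis=1) / np.sum(weights, axis=1)


# One gene, three samples; the third sample is assumed to be 4x more variable.
log_cpm = np.array([[5.0, 5.2, 9.0]])
obs_weights = np.ones_like(log_cpm)          # placeholder observation-level weights
var_factors = np.array([1.0, 1.0, 4.0])      # hypothetical sample variance factors

weights = combine_weights(obs_weights, var_factors)
weighted = weighted_gene_means(log_cpm, weights)
unweighted = log_cpm.mean(axis=1)
```

In this toy example the noisy third sample still contributes to the estimate, but with a quarter of the influence of the other two, so the weighted mean sits closer to the two consistent samples than the unweighted mean does.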

Citations

SARS-CoV-2 infection of human ACE2-transgenic mice causes severe lung inflammation and impaired function.

Emma S. Winkler, +16 more

- 24 Aug 2020 -

Nature Immunology

TL;DR: The transgenic mice expressing the human angiotensin I-converting enzyme 2 (ACE2) receptor driven by the cytokeratin-18 (K18) gene promoter are evaluated as a model of SARS-CoV-2 infection to define the basis of lung disease and test immune and antiviral-based countermeasures.

Single-cell profiling of breast cancer T cells reveals a tissue-resident memory subset associated with improved prognosis

Peter Savas, +32 more

- 25 Jun 2018 -

Nature Medicine

TL;DR: Detailed, high-dimensional characterization of T cells in breast cancer reveals an activated TRM population and a gene signature associated with improved prognosis, and suggests that CD8+ TRM cells contribute to breast cancer immunosurveillance and are the key targets of modulation by immune checkpoint inhibition.

iDEP: an integrated web application for differential expression and pathway analysis of RNA-Seq data

Steven Xijin Ge, +2 more

- 19 Dec 2018 -

BMC Bioinformatics

TL;DR: iDEP helps unveil the multifaceted functions of p53 and the possible involvement of several microRNAs such as miR-92a; it draws on R/Bioconductor packages, 2 web services, and comprehensive annotation and pathway databases for 220 plant and animal species.

Association of respiratory allergy, asthma, and expression of the SARS-CoV-2 receptor ACE2.

Daniel J. Jackson, +16 more

- 22 Apr 2020 -

The Journal of Allergy and Clinical Immu...

TL;DR: Underlying respiratory allergy and experimental allergen exposure reduce the expression of the SARS-CoV-2 receptor, ACE2, which could lead to reduced COVID-19 susceptibility.

RNA-seq analysis is easy as 1-2-3 with limma, Glimma and edgeR.

Charity W. Law, +9 more

- 17 Jun 2016 -

F1000Research

TL;DR: This workflow article analyzes RNA-sequencing data from the mouse mammary gland, demonstrating use of the popular edgeR package to import, organise, filter and normalise the data, followed by the limma package with its voom method, linear modelling and empirical Bayes moderation to assess differential expression and perform gene set testing.
