Open Access Journal Article

Count-based differential expression analysis of RNA sequencing data using R and Bioconductor

Abstract
RNA sequencing (RNA-seq) has been rapidly adopted for the profiling of transcriptomes in many areas of biology, including studies into gene regulation, development and disease. Of particular interest is the discovery of differentially expressed genes across different conditions (e.g., tissues, perturbations) while optionally adjusting for other systematic factors that affect the data-collection process. There are a number of subtle yet crucial aspects of these analyses, such as read counting, appropriate treatment of biological variability, quality control checks and appropriate setup of statistical modeling. Several variations have been presented in the literature, and there is a need for guidance on current best practices. This protocol presents a state-of-the-art computational and statistical RNA-seq differential expression analysis workflow largely based on the free open-source R language and Bioconductor software and, in particular, on two widely used tools, DESeq and edgeR. Hands-on time for typical small experiments (e.g., 4-10 samples) can be <1 h, with computation time <1 d using a standard desktop PC.

Citations
Journal Article

featureCounts: an efficient general-purpose program for assigning sequence reads to genomic features

TL;DR: featureCounts is a read summarization program suitable for counting reads generated from either RNA or genomic DNA sequencing experiments. It implements highly efficient chromosome hashing and feature blocking techniques.
Journal Article

Fusobacterium nucleatum Promotes Chemoresistance to Colorectal Cancer by Modulating Autophagy.

TL;DR: Fusobacterium (F.) nucleatum was found to be abundant in colorectal cancer tissues in patients with recurrence post chemotherapy and was associated with patient clinicopathological characteristics. Bioinformatic and functional studies demonstrated that F. nucleatum promoted colorectal cancer resistance to chemotherapy.
Journal Article

Qualimap 2: advanced multi-sample quality control for high-throughput sequencing data

TL;DR: Qualimap 2 represents a next step in the QC analysis of HTS data: along with comprehensive single-sample analysis of alignment data, it includes new modes that allow simultaneous processing and comparison of multiple samples.
Journal Article

Pathway enrichment analysis and visualization of omics data using g:Profiler, GSEA, Cytoscape and EnrichmentMap

TL;DR: This protocol describes pathway enrichment analysis of gene lists from RNA-seq and other genomics experiments using g:Profiler, GSEA, Cytoscape and EnrichmentMap software, and describes innovative visualization techniques.
References
Journal Article

Controlling the false discovery rate: a practical and powerful approach to multiple testing

TL;DR: In this paper, a different approach to problems of multiple significance testing is presented, which calls for controlling the expected proportion of falsely rejected hypotheses, the false discovery rate (FDR). The FDR is equivalent to the FWER when all hypotheses are true but is smaller otherwise.
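
The step-up procedure commonly associated with this paper (often called Benjamini-Hochberg) can be sketched as follows. This is a minimal illustration, assuming independent tests and a target FDR level alpha; the function name benjamini_hochberg and its parameters are hypothetical and are not taken from the paper or from any particular library.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per p-value: True if that hypothesis is rejected.

    Sketch of the step-up rule: sort the m p-values, find the largest
    rank k with p_(k) <= (k / m) * alpha, and reject the k smallest.
    """
    m = len(p_values)
    if m == 0:
        return []

    # Indices of the p-values in ascending order of p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Largest rank whose p-value falls under its own threshold.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank

    # Reject every hypothesis ranked at or below k_max, even if its own
    # p-value exceeded its threshold (this is what makes it "step-up").
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected
```

In a differential expression setting, the inputs would typically be one p-value per gene, and the output marks the genes called significant at the chosen FDR level.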
Journal Article

The Sequence Alignment/Map format and SAMtools

TL;DR: SAMtools implements various utilities for post-processing alignments in the SAM format, such as indexing, variant calling and alignment viewing, and thus provides universal tools for processing read alignments.
Journal Article

STAR: ultrafast universal RNA-seq aligner

TL;DR: The Spliced Transcripts Alignment to a Reference (STAR) software, based on a previously undescribed RNA-seq alignment algorithm that uses sequential maximum mappable seed search in uncompressed suffix arrays followed by a seed clustering and stitching procedure, outperforms other aligners by a factor of >50 in mapping speed.
Journal Article

edgeR: a Bioconductor package for differential expression analysis of digital gene expression data.

TL;DR: edgeR is a Bioconductor software package for examining differential expression of replicated count data. It uses an overdispersed Poisson model to account for both biological and technical variability, and empirical Bayes methods to moderate the degree of overdispersion across transcripts, improving the reliability of inference.
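
One common way to write an overdispersed Poisson model for counts is the negative binomial parameterization below. The symbols are assumptions introduced here for illustration (y_{gi} for the count of gene g in sample i, \mu_{gi} for its mean, \phi_g for the gene-wise dispersion) and are not quoted from the summary above.

```latex
\begin{align}
  \mathbb{E}[y_{gi}] &= \mu_{gi}, \\
  \operatorname{Var}(y_{gi}) &= \mu_{gi} + \phi_g \, \mu_{gi}^{2}.
\end{align}
```

Setting \phi_g = 0 recovers the Poisson variance, so the \phi_g \mu_{gi}^2 term is what carries the extra (largely biological) variability; moderating \phi_g across transcripts is the role the summary assigns to the empirical Bayes step.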