Open Access, Journal Article

HTSeq—a Python framework to work with high-throughput sequencing data

Simon Anders, +2 more
15 Jan 2015, Vol. 31, Iss. 2, pp. 166-169
TLDR
This work presents HTSeq, a Python library to facilitate the rapid development of custom scripts for high-throughput sequencing data analysis, and htseq-count, a tool developed with HTSeq that preprocesses RNA-Seq data for differential expression analysis by counting the overlap of reads with genes.
Abstract
Motivation: A large choice of tools exists for many standard tasks in the analysis of high-throughput sequencing (HTS) data. However, once a project deviates from standard workflows, custom scripts are needed.

Results: We present HTSeq, a Python library to facilitate the rapid development of such scripts. HTSeq offers parsers for many common data formats in HTS projects, as well as classes to represent data, such as genomic coordinates, sequences, sequencing reads, alignments, gene model information and variant calls, and provides data structures that allow for querying via genomic coordinates. We also present htseq-count, a tool developed with HTSeq that preprocesses RNA-Seq data for differential expression analysis by counting the overlap of reads with genes.

Availability and implementation: HTSeq is released as open-source software under the GNU General Public Licence and is available from http://www-huber.embl.de/HTSeq or from the Python Package Index at https://pypi.python.org/pypi/HTSeq.

Contact: sanders@fs.tum.de
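To make the idea of counting the overlap of reads with genes concrete, here is a minimal sketch in plain Python. It does not use HTSeq and is not the htseq-count implementation: the `Interval` type, the `overlaps` and `count_reads` helpers, and the `__no_feature` and `__ambiguous` bucket names are assumptions chosen for illustration. The sketch assumes 0-based, half-open coordinates and assigns a read to a gene only when it overlaps exons of exactly one gene.

```python
from collections import Counter, namedtuple

# Hypothetical minimal interval type for illustration (not an HTSeq class).
# Coordinates are assumed to be 0-based and half-open: [start, end).
Interval = namedtuple("Interval", ["chrom", "start", "end", "strand"])


def overlaps(a, b):
    """Return True if two intervals share at least one base on the same strand."""
    return (
        a.chrom == b.chrom
        and a.strand == b.strand
        and a.start < b.end
        and b.start < a.end
    )


def count_reads(genes, reads):
    """Count reads per gene.

    genes: dict mapping gene ID to a list of exon Intervals.
    reads: iterable of aligned-read Intervals.
    A read overlapping no gene or more than one gene goes into a
    separate bucket instead of being assigned to a gene.
    """
    counts = Counter({gene_id: 0 for gene_id in genes})
    for read in reads:
        hits = {
            gene_id
            for gene_id, exons in genes.items()
            if any(overlaps(read, exon) for exon in exons)
        }
        if len(hits) == 1:
            counts[hits.pop()] += 1
        elif not hits:
            counts["__no_feature"] += 1
        else:
            counts["__ambiguous"] += 1
    return counts


# Toy data: two genes whose exons overlap on chr1 between 150 and 200.
genes = {
    "geneA": [Interval("chr1", 100, 200, "+")],
    "geneB": [Interval("chr1", 150, 300, "+")],
}
reads = [
    Interval("chr1", 110, 140, "+"),  # only geneA
    Interval("chr1", 160, 190, "+"),  # geneA and geneB
    Interval("chr1", 250, 280, "+"),  # only geneB
    Interval("chr1", 400, 430, "+"),  # no gene
]
counts = count_reads(genes, reads)
for name in sorted(counts):
    print(name, counts[name])
```

The linear scan over all genes for every read keeps the sketch short but would be slow on real data. A data structure that can be queried by genomic coordinates, as the abstract describes, is what avoids that cost in practice.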


Citations

Early lineage segregation of multipotent embryonic mammary gland progenitors.

TL;DR: The timing and the mechanisms mediating early lineage segregation of multipotent progenitors during mammary gland development are identified and a developmental switch from multipotency to unipotency is demonstrated.

The Balancing Act between Cancer Immunity and Autoimmunity in Response to Immunotherapy

TL;DR: Key questions in this emerging field are explored, summarizing preclinical and clinical experiences with this new generation of cancer drugs, the growing understanding of the role of the immune response in mediating these toxicities, the relationship of CPI-induced autoimmunity to conventional autoimmune diseases, and insights into the mechanism of irAE development and treatment.

In Vivo Transcriptional Activation Using CRISPR/Cas9 in Drosophila

TL;DR: The dCas9-VPR system can be used in cell culture to upregulate a range of target genes, singly and in multiplex, and a single guide RNA upstream of the transcription start site can activate high levels of target transcription.

Allelic barley MLA immune receptors recognize sequence-unrelated avirulence effectors of the powdery mildew pathogen

TL;DR: This study reveals that the expression of a fungal avirulence effector alone is necessary and sufficient for allele-specific mildew resistance locus A receptor activation in planta, and identifies effector genes of a pathogenic powdery mildew fungus that are recognized by allelic variants of barley intracellular nucleotide-binding domain and leucine-rich repeat protein-type receptors.
References

Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2

TL;DR: This work presents DESeq2, a method for differential analysis of count data, using shrinkage estimation for dispersions and fold changes to improve stability and interpretability of estimates, which enables a more quantitative analysis focused on the strength rather than the mere presence of differential expression.

The Sequence Alignment/Map format and SAMtools

TL;DR: SAMtools implements various utilities for post-processing alignments in the SAM format, such as indexing, variant calling and alignment viewing, and thus provides universal tools for processing read alignments.

Trimmomatic: a flexible trimmer for Illumina sequence data

TL;DR: Trimmomatic is developed as a more flexible and efficient preprocessing tool that correctly handles paired-end data, and is shown to produce output that is at least competitive with, and in many cases superior to, that produced by other tools, in all scenarios tested.

edgeR: a Bioconductor package for differential expression analysis of digital gene expression data.

TL;DR: edgeR is a Bioconductor software package for examining differential expression of replicated count data. It uses an overdispersed Poisson model to account for both biological and technical variability, and empirical Bayes methods to moderate the degree of overdispersion across transcripts, improving the reliability of inference.

BEDTools: a flexible suite of utilities for comparing genomic features

TL;DR: A new software suite for the comparison, manipulation and annotation of genomic features in Browser Extensible Data (BED) and General Feature Format (GFF) format, which allows the user to compare large datasets (e.g. next-generation sequencing data) with both public and custom genome annotation tracks.