HTSeq—a Python framework to work with high-throughput sequencing data
TLDR
This work presents HTSeq, a Python library to facilitate the rapid development of custom scripts for high-throughput sequencing data analysis, and htseq-count, a tool developed with HTSeq that preprocesses RNA-Seq data for differential expression analysis by counting the overlap of reads with genes.
Abstract
Motivation: A large choice of tools exists for many standard tasks in the analysis of high-throughput sequencing (HTS) data. However, once a project deviates from standard workflows, custom scripts are needed.
Results: We present HTSeq, a Python library to facilitate the rapid development of such scripts. HTSeq offers parsers for many common data formats in HTS projects, as well as classes to represent data, such as genomic coordinates, sequences, sequencing reads, alignments, gene model information and variant calls, and provides data structures that allow for querying via genomic coordinates. We also present htseq-count, a tool developed with HTSeq that preprocesses RNA-Seq data for differential expression analysis by counting the overlap of reads with genes.
Availability and implementation: HTSeq is released as open-source software under the GNU General Public Licence and is available from http://www-huber.embl.de/HTSeq or from the Python Package Index at https://pypi.python.org/pypi/HTSeq.
Contact: sanders@fs.tum.de
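The idea of counting the overlap of reads with genes can be illustrated with a minimal, plain-Python sketch. It deliberately does not use the HTSeq API: the `Interval` class, the `overlaps` and `count_reads` functions, the toy coordinates, and the `__no_feature` and `__ambiguous` bucket names are all assumptions made for illustration. The rule used here (count a read only if it overlaps exactly one gene, ignoring strand) is a simplification, not a description of how htseq-count is implemented.

```python
from collections import Counter
from typing import Dict, List, NamedTuple


class Interval(NamedTuple):
    """A genomic interval; hypothetical stand-in for a coordinate class."""
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two intervals share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def count_reads(genes: Dict[str, Interval], reads: List[Interval]) -> Counter:
    """Assign each read to the single gene it overlaps, if there is exactly one."""
    counts = Counter({name: 0 for name in genes})
    for read in reads:
        hits = [name for name, iv in genes.items() if overlaps(read, iv)]
        if len(hits) == 1:
            counts[hits[0]] += 1
        elif not hits:
            counts["__no_feature"] += 1   # read overlaps no gene
        else:
            counts["__ambiguous"] += 1    # read overlaps more than one gene
    return counts


# Toy gene models and aligned reads (made-up coordinates).
genes = {
    "geneA": Interval("chr1", 100, 200),
    "geneB": Interval("chr1", 150, 300),
    "geneC": Interval("chr2", 0, 50),
}
reads = [
    Interval("chr1", 90, 120),   # overlaps geneA only
    Interval("chr1", 160, 190),  # overlaps geneA and geneB
    Interval("chr1", 250, 280),  # overlaps geneB only
    Interval("chr1", 400, 430),  # overlaps nothing
    Interval("chr2", 10, 40),    # overlaps geneC only
]

counts = count_reads(genes, reads)
for name in sorted(counts):
    print(name, counts[name])
```

The linear scan over all genes for every read is only acceptable for a toy example. A data structure that supports querying by genomic coordinates, as the abstract describes, is what makes this kind of lookup practical at real data sizes.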