HTSeq—a Python framework to work with high-throughput sequencing data
TLDR
This work presents HTSeq, a Python library that facilitates the rapid development of custom scripts for high-throughput sequencing data analysis, and htseq-count, a tool developed with HTSeq that preprocesses RNA-Seq data for differential expression analysis by counting the overlap of reads with genes.
Abstract
Motivation: A large choice of tools exists for many standard tasks in the analysis of high-throughput sequencing (HTS) data. However, once a project deviates from standard workflows, custom scripts are needed.
Results: We present HTSeq, a Python library to facilitate the rapid development of such scripts. HTSeq offers parsers for many common data formats in HTS projects, as well as classes to represent data, such as genomic coordinates, sequences, sequencing reads, alignments, gene model information and variant calls, and provides data structures that allow for querying via genomic coordinates. We also present htseq-count, a tool developed with HTSeq that preprocesses RNA-Seq data for differential expression analysis by counting the overlap of reads with genes.
Availability and implementation: HTSeq is released as open-source software under the GNU General Public Licence and is available from http://www-huber.embl.de/HTSeq or from the Python Package Index at https://pypi.python.org/pypi/HTSeq.
Contact: sanders@fs.tum.de
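To make "counting the overlap of reads with genes" concrete, the following is a minimal pure-Python sketch of the idea. It deliberately does not use HTSeq's API and is not htseq-count's actual algorithm. The toy annotation `GENES`, the read list `READS`, the helper names, the half-open coordinate convention and the rule for ambiguous reads (a read overlapping more than one gene is not assigned to any of them) are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical toy annotation: gene name -> (chromosome, start, end).
# Coordinates are assumed to be 0-based and half-open: [start, end).
GENES = {
    "geneA": ("chr1", 100, 200),
    "geneB": ("chr1", 180, 300),
    "geneC": ("chr2", 50, 150),
}


def overlapping_genes(chrom, start, end, genes=GENES):
    """Return the set of gene names whose interval overlaps [start, end)."""
    return {
        name
        for name, (g_chrom, g_start, g_end) in genes.items()
        if g_chrom == chrom and start < g_end and g_start < end
    }


def count_reads(reads, genes=GENES):
    """Count reads per gene; illustrative rule, not htseq-count's logic."""
    counts = Counter({name: 0 for name in genes})
    for chrom, start, end in reads:
        hits = overlapping_genes(chrom, start, end, genes)
        if len(hits) == 1:
            # Unambiguous: the read overlaps exactly one gene.
            counts[next(iter(hits))] += 1
        elif not hits:
            # Read falls outside every annotated gene.
            counts["__no_feature"] += 1
        else:
            # Read overlaps several genes; assumed here to be left unassigned.
            counts["__ambiguous"] += 1
    return counts


# Hypothetical aligned reads as (chromosome, start, end) tuples.
READS = [
    ("chr1", 110, 150),  # overlaps geneA only
    ("chr1", 185, 195),  # overlaps geneA and geneB
    ("chr1", 250, 290),  # overlaps geneB only
    ("chr2", 60, 100),   # overlaps geneC only
    ("chr2", 500, 540),  # overlaps no gene
]

counts = count_reads(READS)
for name, n in sorted(counts.items()):
    print(name, n)
```

The linear scan over all genes is only practical for toy data. The abstract's "data structures that allow for querying via genomic coordinates" address exactly this lookup step for genome-scale annotations.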
Citations
Journal Article
Transcriptome analysis of a wild bird reveals physiological responses to the urban environment
TL;DR: This is one of the first studies to reveal transcriptional differences between urban- and rural-dwelling animals and suggests an important role for epigenetics in mediating environmentally induced physiological variation.
Journal Article
Carrageenan catabolism is encoded by a complex regulon in marine heterotrophic bacteria
Elizabeth Ficko-Blean, Aurélie Préchoux, François Thomas, Tatiana Rochat, Robert Larocque, Yongtao Zhu, Mark Stam, Sabine Genicot, Murielle Jam, Alexandra Calteau, Benjamin Thomas Viart, David Ropartz, David Pérez-Pascual, Gaëlle Correc, Maria Matard-Mann, Keith A. Stubbs, Hélène Rogniaux, Alexandra Jeudy, Tristan Barbeyron, Claudine Médigue, Mirjam Czjzek, David Vallenet, Mark J. McBride, Eric Duchaud, Gurvan Michel
TL;DR: The complete catabolic pathway for carrageenans, major cell wall polysaccharides of red macroalgae, in the marine heterotrophic bacterium Zobellia galactanivorans is described, and the definition of bacterial PUL-mediated polysaccharide digestion is extended.
Journal Article
Expansion and differentiation of human hepatocyte-derived liver progenitor-like cells and their use for the study of hepatotropic pathogens
Gong-Bo Fu, Wei-Jian Huang, Min Zeng, Xu Zhou, Hong-Ping Wu, Changcheng Liu, Han Wu, Jun Weng, Hong-Dan Zhang, Yongchao Cai, Charles Ashton, Min Ding, Dan Tang, Baohua Zhang, Yi Gao, Wei-Feng Yu, Bo Zhai, Zhiying He, Hongyang Wang, He-Xin Yan
TL;DR: A protocol is described that achieves efficient conversion of human primary hepatocytes into liver progenitor-like cells (HepLPCs) through delivery of developmentally relevant cues, including NAD+-dependent deacetylase SIRT1 signaling.
Journal Article
Refined RIP-seq protocol for epitranscriptome analysis with low input materials.
Yong Zeng, Shiyan Wang, Shanshan Gao, Fraser Soares, Musadeqque Ahmed, Haiyang Guo, Miranda Wang, Junjie Tony Hua, Jiansheng Guan, Michael F. Moran, Ming-Sound Tsao, Housheng Hansen He
TL;DR: The refined m6A MeRIP-seq method is suitable for m6A epitranscriptome profiling in a limited amount of patient tumors, setting the ground for unraveling the dynamics of the m6A epitranscriptome and the underlying mechanisms in clinical settings.
Journal Article
Time-scale dynamics of proteome and transcriptome of the white-rot fungus Phlebia radiata: growth on spruce wood and decay effect on lignocellulose
Jaana Kuuskeri, Mari Häkkinen, Pia Laine, Olli-Pekka Smolander, Fitsum Tamene, Sini Miettinen, Paula Nousiainen, Marianna Kemell, Petri Auvinen, Taina Lundell
TL;DR: Significant changes in carbohydrate-active enzyme expression are indicated during the six-week surveillance of P. radiata growing on wood, allowing us to head for systems biology, development of biofuel production, and industrial applications on plant biomass utilizing wood-decay fungi.
References
Journal Article
Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2
TL;DR: This work presents DESeq2, a method for differential analysis of count data, using shrinkage estimation for dispersions and fold changes to improve stability and interpretability of estimates, which enables a more quantitative analysis focused on the strength rather than the mere presence of differential expression.
Journal Article
The Sequence Alignment/Map format and SAMtools
Heng Li, Bob Handsaker, Alec Wysoker, T. J. Fennell, Jue Ruan, Nils Homer, Gabor T. Marth, Gonçalo R. Abecasis, Richard Durbin
TL;DR: SAMtools implements various utilities for post-processing alignments in the SAM format, such as an indexer, a variant caller and an alignment viewer, and thus provides universal tools for processing read alignments.
Journal Article
Trimmomatic: a flexible trimmer for Illumina sequence data
TL;DR: Trimmomatic is developed as a more flexible and efficient preprocessing tool that correctly handles paired-end data, and is shown to produce output that is at least competitive with, and in many cases superior to, that produced by other tools in all scenarios tested.
Journal Article
edgeR: a Bioconductor package for differential expression analysis of digital gene expression data.
TL;DR: edgeR is a Bioconductor software package for examining differential expression of replicated count data. It uses an overdispersed Poisson model to account for both biological and technical variability, and empirical Bayes methods to moderate the degree of overdispersion across transcripts, improving the reliability of inference.
Journal Article
BEDTools: a flexible suite of utilities for comparing genomic features
Aaron R. Quinlan, Ira M. Hall
TL;DR: A new software suite for the comparison, manipulation and annotation of genomic features in Browser Extensible Data (BED) and General Feature Format (GFF) format, which allows the user to compare large datasets (e.g. next-generation sequencing data) with both public and custom genome annotation tracks.