HTSeq—a Python framework to work with high-throughput sequencing data
TLDR
This work presents HTSeq, a Python library that facilitates the rapid development of custom scripts for high-throughput sequencing data analysis, and htseq-count, a tool developed with HTSeq that preprocesses RNA-Seq data for differential expression analysis by counting the overlap of reads with genes.
Abstract
Motivation: A large choice of tools exists for many standard tasks in the analysis of high-throughput sequencing (HTS) data. However, once a project deviates from standard workflows, custom scripts are needed.
Results: We present HTSeq, a Python library to facilitate the rapid development of such scripts. HTSeq offers parsers for many common data formats in HTS projects, as well as classes to represent data, such as genomic coordinates, sequences, sequencing reads, alignments, gene model information and variant calls, and provides data structures that allow for querying via genomic coordinates. We also present htseq-count, a tool developed with HTSeq that preprocesses RNA-Seq data for differential expression analysis by counting the overlap of reads with genes.
Availability and implementation: HTSeq is released as open-source software under the GNU General Public Licence and is available from http://www-huber.embl.de/HTSeq or from the Python Package Index at https://pypi.python.org/pypi/HTSeq.
Contact: sanders@fs.tum.de
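The counting step described in the abstract, assigning each read to the gene it overlaps, can be illustrated with a minimal sketch. This is plain Python written for illustration only; it does not use or reproduce the HTSeq API. The Interval type, the count_reads function, the gene identifiers, the coordinates and the two special labels ("__no_feature", "__ambiguous") are all assumptions made for this example. A read is counted for a gene only when it overlaps exactly one gene; reads overlapping none or several are tallied separately.

```python
from collections import Counter, namedtuple

# Hypothetical interval type: 0-based, half-open coordinates [start, end).
Interval = namedtuple("Interval", ["chrom", "start", "end"])


def overlaps(a, b):
    """Return True if two intervals share at least one position."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def count_reads(genes, reads):
    """Count reads per gene; reads hitting zero or several genes go to special bins.

    genes: dict mapping gene id -> Interval
    reads: iterable of Interval (aligned read positions)
    """
    counts = Counter({gene_id: 0 for gene_id in genes})
    for read in reads:
        # Linear scan for clarity; a real tool would query an indexed structure.
        hits = [gene_id for gene_id, iv in genes.items() if overlaps(read, iv)]
        if len(hits) == 1:
            counts[hits[0]] += 1
        elif not hits:
            counts["__no_feature"] += 1
        else:
            counts["__ambiguous"] += 1
    return counts


# Toy data (hypothetical genes and reads).
genes = {
    "geneA": Interval("chr1", 100, 200),
    "geneB": Interval("chr1", 180, 300),
}
reads = [
    Interval("chr1", 110, 160),  # overlaps geneA only
    Interval("chr1", 185, 195),  # overlaps geneA and geneB
    Interval("chr1", 250, 290),  # overlaps geneB only
    Interval("chr2", 100, 150),  # overlaps no gene
]
counts = count_reads(genes, reads)
for name, n in sorted(counts.items()):
    print(name, n)
```

The linear scan over all genes keeps the sketch short; the coordinate-queryable data structures mentioned in the abstract exist precisely to avoid this kind of exhaustive comparison on real data.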