HTSeq—a Python framework to work with high-throughput sequencing data
TLDR
This work presents HTSeq, a Python library that facilitates the rapid development of custom scripts for high-throughput sequencing data analysis, and htseq-count, a tool developed with HTSeq that preprocesses RNA-Seq data for differential expression analysis by counting the overlap of reads with genes.
Abstract:
Motivation: A large choice of tools exists for many standard tasks in the analysis of high-throughput sequencing (HTS) data. However, once a project deviates from standard workflows, custom scripts are needed. Results: We present HTSeq, a Python library to facilitate the rapid development of such scripts. HTSeq offers parsers for many common data formats in HTS projects, as well as classes to represent data, such as genomic coordinates, sequences, sequencing reads, alignments, gene model information and variant calls, and provides data structures that allow for querying via genomic coordinates. We also present htseq-count, a tool developed with HTSeq that preprocesses RNA-Seq data for differential expression analysis by counting the overlap of reads with genes. Availability and implementation: HTSeq is released as open-source software under the GNU General Public Licence and is available from http://www-huber.embl.de/HTSeq or from the Python Package Index at https://pypi.python.org/pypi/HTSeq. Contact: sanders@fs.tum.de
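To make the two ideas in the abstract concrete (a data structure queried by genomic coordinates, and counting the overlap of reads with genes), here is a minimal pure-Python sketch. It does not use HTSeq's API: `Interval`, `FeatureIndex` and `count_reads` are hypothetical names chosen for illustration, the per-base dictionary is a deliberately naive stand-in for a real coordinate index, and the rule applied to reads that overlap no gene or more than one gene is an assumption of this sketch, not something the abstract specifies.

```python
from collections import Counter, defaultdict
from typing import NamedTuple


class Interval(NamedTuple):
    """Half-open genomic interval [start, end) on one strand of a chromosome."""
    chrom: str
    start: int
    end: int
    strand: str


class FeatureIndex:
    """Toy coordinate-queryable store (hypothetical, not an HTSeq class)."""

    def __init__(self):
        # (chrom, strand) -> {position -> set of gene IDs covering that base}
        self._cover = defaultdict(lambda: defaultdict(set))

    def add(self, iv, gene_id):
        """Register gene_id on every base of the interval."""
        positions = self._cover[(iv.chrom, iv.strand)]
        for pos in range(iv.start, iv.end):
            positions[pos].add(gene_id)

    def query(self, iv):
        """Return the set of gene IDs overlapping at least one base of iv."""
        positions = self._cover.get((iv.chrom, iv.strand))
        if positions is None:
            return set()
        hits = set()
        for pos in range(iv.start, iv.end):
            hits |= positions.get(pos, set())
        return hits


def count_reads(index, reads):
    """Count reads per gene; assumed rule: only uniquely assigned reads count."""
    counts = Counter()
    for read in reads:
        genes = index.query(read)
        if len(genes) == 1:
            counts[next(iter(genes))] += 1
        elif not genes:
            counts["__no_feature"] += 1   # label name is illustrative
        else:
            counts["__ambiguous"] += 1    # label name is illustrative
    return counts


# Two overlapping genes on the plus strand of a made-up chromosome.
index = FeatureIndex()
index.add(Interval("chr1", 100, 200, "+"), "geneA")
index.add(Interval("chr1", 150, 300, "+"), "geneB")

reads = [
    Interval("chr1", 90, 120, "+"),   # overlaps geneA only
    Interval("chr1", 160, 180, "+"),  # overlaps both genes
    Interval("chr1", 250, 280, "+"),  # overlaps geneB only
    Interval("chr1", 400, 430, "+"),  # overlaps nothing
    Interval("chr1", 110, 130, "-"),  # opposite strand, so no overlap
]
counts = count_reads(index, reads)
print(dict(counts))
```

Storing one set per base keeps the sketch short but uses memory proportional to the annotated length; a practical implementation would store runs of identical values or use an interval tree instead.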
Citations
Journal Article
Fasting activates Fatty Acid Oxidation to enhance intestinal stem cell function during homeostasis and aging
Maria M. Mihaylova, Chia-Wei Cheng, Amanda Q. Cao, Surya Tripathi, Miyeko D. Mana, Khristian E. Bauer-Rowe, Monther Abu-Remaileh, Laura Clavain, Aysegul Erdemir, Caroline A. Lewis, Elizaveta Freinkman, Audrey S. Dickey, Albert R. La Spada, Yanmei Huang, George W. Bell, Vikram Deshpande, Peter Carmeliet, Pekka Katajisto, David M. Sabatini, Ömer H. Yilmaz
TL;DR: It is shown that a 24 hr fast augments intestinal stem cell (ISC) function in young and aged mice by inducing a fatty acid oxidation (FAO) program and that pharmacological activation of this program mimics many effects of fasting.
Journal Article
OncodriveFML: a general framework to identify coding and non-coding regions with cancer driver mutations
Loris Mularoni, Radhakrishnan Sabarinathan, Jordi Deu-Pons, Abel Gonzalez-Perez, Nuria Lopez-Bigas
TL;DR: OncodriveFML is presented, a method designed to analyze the pattern of somatic mutations across tumors in both coding and non-coding genomic regions to identify signals of positive selection, and therefore, their involvement in tumorigenesis.
Journal Article
B Cell Super-Enhancers and Regulatory Clusters Recruit AID Tumorigenic Activity.
Jason Qian, Qiao Wang, Marei Dose, Nathanael Pruett, Kyong-Rim Kieffer-Kwon, Wolfgang Resch, Genqing Liang, Zhonghui Tang, Ewy Mathé, Christopher Benner, Wendy Dubois, Steevenson Nelson, Laura Vian, Thiago Y. Oliveira, Mila Jankovic, Ofir Hakim, Anna Gazumyan, Rushad Pavri, Parirokh Awasthi, Bin Song, Geng Liu, Longyun Chen, Shida Zhu, Lionel Feigenbaum, Louis M. Staudt, Cornelis Murre, Yijun Ruan, Davide F. Robbiani, Qiang Pan-Hammarström, Michel C. Nussenzweig, Rafael Casellas
TL;DR: It is shown that AID targets are not randomly distributed across the genome but are predominantly grouped within super-enhancers and regulatory clusters, and 3D-linked targets cooperate to recruit AID-mediated breaks.
Journal Article
Single-Cell Transcriptomic Atlas of Primate Ovarian Aging.
Si Wang, Yuxuan Zheng, Jingyi Li, Yang Yu, Weiqi Zhang, Moshi Song, Zunpeng Liu, Zheying Min, H.X. Hu, Ying Jing, Xiaojuan He, Liang Sun, Lifang Ma, Concepcion Rodriguez Esteban, Piu Chan, Jie Qiao, Qi Zhou, Juan Carlos Izpisua Belmonte, Jing Qu, Fuchou Tang, Guang-Hui Liu
TL;DR: A comprehensive understanding of the cell-type-specific mechanisms underlying primate ovarian aging at single-cell resolution is provided, revealing new diagnostic biomarkers and potential therapeutic targets for age-related human ovarian disorders.
Journal Article
NK cell–mediated cytotoxicity contributes to tumor control by a cytostatic drug combination
Marcus Ruscetti, Josef Leibold, Matthew J. Bott, Myles Fennell, Amanda Kulick, Nelson R. Salgado, Chi-Chao Chen, Yu-Jui Ho, Francisco J. Sánchez-Rivera, Judith Feucht, Timour Baslan, Sha Tian, Hsuan-An Chen, Paul B. Romesser, John T. Poirier, Charles M. Rudin, Elisa de Stanchina, Eusebio Manchado, Charles J. Sherr, Scott W. Lowe
TL;DR: It is shown that mitogen-activated protein kinase (MAPK) and cyclin-dependent kinase 4/6 inhibitors act in combination to suppress the proliferation of KRAS-mutant lung cancer cells while simultaneously provoking a natural killer cell surveillance program leading to tumor cell death.
References
Journal Article
Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2
TL;DR: This work presents DESeq2, a method for differential analysis of count data, using shrinkage estimation for dispersions and fold changes to improve stability and interpretability of estimates, which enables a more quantitative analysis focused on the strength rather than the mere presence of differential expression.
Journal Article
The Sequence Alignment/Map format and SAMtools
Heng Li, Bob Handsaker, Alec Wysoker, T. J. Fennell, Jue Ruan, Nils Homer, Gabor T. Marth, Gonçalo R. Abecasis, Richard Durbin
TL;DR: SAMtools implements various utilities for post-processing alignments in the SAM format, such as indexing, variant calling and alignment viewing, and thus provides universal tools for processing read alignments.
Journal Article
Trimmomatic: a flexible trimmer for Illumina sequence data
TL;DR: Trimmomatic is developed as a more flexible and efficient preprocessing tool that correctly handles paired-end data; it is shown to produce output that is at least competitive with, and in many cases superior to, that produced by other tools in all scenarios tested.
Journal Article
edgeR: a Bioconductor package for differential expression analysis of digital gene expression data.
TL;DR: edgeR is a Bioconductor software package for examining differential expression of replicated count data. It uses an overdispersed Poisson model to account for both biological and technical variability, and empirical Bayes methods to moderate the degree of overdispersion across transcripts, improving the reliability of inference.
Journal Article
BEDTools: a flexible suite of utilities for comparing genomic features
Aaron R. Quinlan, Ira M. Hall
TL;DR: BEDTools is a software suite for the comparison, manipulation and annotation of genomic features in Browser Extensible Data (BED) and General Feature Format (GFF) formats. It allows the user to compare large datasets (e.g. next-generation sequencing data) with both public and custom genome annotation tracks.