HTSeq—a Python framework to work with high-throughput sequencing data
TLDR
This work presents HTSeq, a Python library to facilitate the rapid development of custom scripts for high-throughput sequencing data analysis, and htseq-count, a tool developed with HTSeq that preprocesses RNA-Seq data for differential expression analysis by counting the overlap of reads with genes.
Abstract
Motivation: A large choice of tools exists for many standard tasks in the analysis of high-throughput sequencing (HTS) data. However, once a project deviates from standard workflows, custom scripts are needed. Results: We present HTSeq, a Python library to facilitate the rapid development of such scripts. HTSeq offers parsers for many common data formats in HTS projects, as well as classes to represent data, such as genomic coordinates, sequences, sequencing reads, alignments, gene model information and variant calls, and provides data structures that allow for querying via genomic coordinates. We also present htseq-count, a tool developed with HTSeq that preprocesses RNA-Seq data for differential expression analysis by counting the overlap of reads with genes. Availability and implementation: HTSeq is released as open-source software under the GNU General Public Licence and is available from http://www-huber.embl.de/HTSeq or from the Python Package Index at https://pypi.python.org/pypi/HTSeq. Contact: sanders@fs.tum.de
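The idea of counting the overlap of reads with genes can be illustrated with a minimal, pure-Python sketch. It does not use HTSeq and is not the htseq-count implementation: the function names `overlaps` and `count_reads`, the tuple layout `(chrom, start, end)`, the half-open coordinate convention, and the `no_feature` and `ambiguous` labels are all assumptions made for this illustration. A read is credited to a gene only when it overlaps exactly one gene.

```python
from collections import Counter


def overlaps(a_start, a_end, b_start, b_end):
    """Return True if half-open intervals [a_start, a_end) and [b_start, b_end) share a base."""
    return a_start < b_end and b_start < a_end


def count_reads(genes, reads):
    """Count reads per gene (hypothetical helper, not part of HTSeq).

    genes: dict mapping gene name -> list of (chrom, start, end) exon intervals
    reads: iterable of (chrom, start, end) aligned read intervals
    """
    counts = Counter({name: 0 for name in genes})
    for chrom, start, end in reads:
        # Collect every gene with at least one exon overlapping this read.
        hits = {
            name
            for name, intervals in genes.items()
            for g_chrom, g_start, g_end in intervals
            if g_chrom == chrom and overlaps(start, end, g_start, g_end)
        }
        if len(hits) == 1:
            counts[hits.pop()] += 1      # unambiguous assignment
        elif not hits:
            counts["no_feature"] += 1    # read overlaps no annotated gene
        else:
            counts["ambiguous"] += 1     # read overlaps more than one gene
    return counts


# Toy annotation and alignments, invented for this example.
genes = {
    "geneA": [("chr1", 100, 200), ("chr1", 300, 400)],
    "geneB": [("chr1", 350, 500)],
}
reads = [
    ("chr1", 120, 170),  # overlaps geneA only
    ("chr1", 360, 390),  # overlaps both geneA and geneB
    ("chr1", 450, 480),  # overlaps geneB only
    ("chr2", 10, 60),    # overlaps no gene
]
result = count_reads(genes, reads)
```

The linear scan over all genes for every read keeps the sketch short; a real tool would index the annotation by genomic coordinate, which is the kind of query structure the abstract describes.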
Citations
Journal Article
MYB-FL controls gain and loss of floral UV absorbance, a key trait affecting pollinator preference and reproductive isolation
Hester Sheehan, Michel Moser, Ulrich Klahre, Korinna Esfeld, Alexandre Dell’Olivo, Therese Mandel, Sabine Metzger, Michiel Vandenbussche, Loreta B. Freitas, Cris Kuhlemeier
TL;DR: The genetic basis of floral UV absorbance, a key trait for attracting nocturnal pollinators, is studied in Petunia and functional differences in MYB-FL provide insight into the process of speciation and clarify phylogenetic relationships between nascent species.
Journal Article
Autism-like phenotype and risk gene mRNA deadenylation by CPEB4 mis-splicing
Alberto Parras, Héctor Anta, María Santos-Galindo, Vivek Swarup, Ainara Elorza, Jose Luis Nieto-Gonzalez, Sara Picó, Ivó H. Hernández, Juan Ignacio Díaz-Hernández, Eulàlia Belloc, Annie Rodolosse, Neelroop N. Parikshak, Olga Peñagarikano, Rafael Fernández-Chacón, Manuel Irimia, Pilar Navarro, Daniel H. Geschwind, Raúl Méndez, José J. Lucas
TL;DR: CPEB4 binds the mRNA of genes known to be associated with autism and shows an isoform imbalance in individuals with autism, and an equivalent imbalance in mice induces an autism-like phenotype, which identifies CPEB4 as a regulator of ASD risk genes.
Journal Article
Foxc1 reinforces quiescence in self-renewing hair follicle stem cells
TL;DR: It is shown that murine hair follicle SCs induce the Foxc1 transcription factor when activated, which reveals a dynamic, cell-intrinsic mechanism used by hair follicle SCs to reinforce quiescence upon self-renewal and suggests a unique ability of SCs to maintain cell identity.
Journal Article
Regulation of Zn and Fe transporters by the GPC1 gene during early wheat monocarpic senescence
Stephen Pearce, Facundo Tabbita, Dario Cantu, Vince Buffalo, Raz Avni, Hans Vazquez-Gross, Rongrong Zhao, Christopher J. Conley, Assaf Distelfeld, Jorge Dubcovsky
TL;DR: It is demonstrated that GPC1 is a key regulator of nutrient remobilization which acts predominantly during the early stages of senescence, which can help mitigate Zn and Fe deficiencies that afflict many regions of the developing world.
Journal Article
A single gene underlies the dynamic evolution of poplar sex determination
Niels A. Müller, Birgit Kersten, Ana Paula Leite Montalvão, Niklas Mähler, Carolina Bernhardsson, Katharina Bräutigam, Zulema Carracedo Lorenzo, Hans Hoenicka, Vikash Kumar, Malte Mader, Birte Pakull, Kathryn M. Robinson, Maurizio Sabatti, Cristina Vettori, Pär K. Ingvarsson, Quentin C. B. Cronk, Nathaniel R. Street, Matthias Fladung
TL;DR: In this article, the authors show that diverse poplar species carry partial duplicates of the ARABIDOPSIS RESPONSE REGULATOR 17 (ARR17) orthologue in the male specific region of the Y chromosome.