nhmmer: DNA homology search with profile HMMs
Travis J. Wheeler, Sean R. Eddy
TL;DR: nhmmer is a tool for DNA/DNA sequence comparison built on the HMMER framework, which applies probabilistic inference methods based on hidden Markov models to the problem of homology search. It enables improved detection of remote DNA homologs.
Abstract:
Summary: Sequence database searches are an essential part of molecular biology, providing information about the function and evolutionary history of proteins, RNA molecules and DNA sequence elements. We present a tool for DNA/DNA sequence comparison that is built on the HMMER framework, which applies probabilistic inference methods based on hidden Markov models to the problem of homology search. This tool, called nhmmer, enables improved detection of remote DNA homologs, and has been used in combination with Dfam and RepeatMasker to improve annotation of transposable elements in the human genome.
Availability: nhmmer is a part of the new HMMER3.1 release. Source code and documentation can be downloaded from http://hmmer.org. HMMER3.1 is freely licensed under the GNU GPLv3 and should be portable to any POSIX-compliant operating system, including Linux and Mac OS/X.
Contact: wheelert@janelia.hhmi.org
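As a rough illustration of how a search with the tool described in the abstract might be scripted, the sketch below assembles an nhmmer command line in Python. The options `--tblout`, `-E` and `--cpu` are nhmmer options described in the HMMER documentation. The helper names (`build_nhmmer_command`, `run_if_available`) and the file names (`query.hmm`, `genome.fa`, `hits.tbl`) are assumptions made only for this example.

```python
import shutil
import subprocess
from typing import List, Optional


def build_nhmmer_command(
    query: str,
    target: str,
    tblout: Optional[str] = None,
    evalue: Optional[float] = None,
    cpu: Optional[int] = None,
) -> List[str]:
    """Assemble an argv list of the form: nhmmer [options] <query> <target>.

    `query` and `target` are hypothetical paths to a query file and a
    target sequence database; nothing is read from disk here.
    """
    cmd = ["nhmmer"]
    if tblout is not None:
        cmd += ["--tblout", tblout]   # write a tabular summary of hits
    if evalue is not None:
        cmd += ["-E", str(evalue)]    # E-value reporting threshold
    if cpu is not None:
        cmd += ["--cpu", str(cpu)]    # number of worker threads
    cmd += [query, target]
    return cmd


def run_if_available(cmd: List[str]) -> Optional[subprocess.CompletedProcess]:
    """Run the command only if its executable is found on PATH."""
    if shutil.which(cmd[0]) is None:
        return None
    return subprocess.run(cmd, check=True, capture_output=True, text=True)
```

Calling `build_nhmmer_command("query.hmm", "genome.fa", tblout="hits.tbl", evalue=1e-5, cpu=4)` returns the argument list without executing anything. Passing that list to `run_if_available` runs the search only when an `nhmmer` binary is installed.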
Citations
GeSeq - versatile and accurate annotation of organelle genomes.
Michael Tillich, Pascal Lehwark, Tommaso Pellizzer, Elena S. Ulbricht-Jones, Axel Fischer, Ralph Bock, Stephan Greiner
TL;DR: The web application GeSeq combines batch processing with a fully customizable reference sequence selection of organellar genome records from NCBI and/or references uploaded by the user to support high-quality annotations of chloroplast genomes.
RepeatModeler2 for automated genomic discovery of transposable element families.
Jullien M. Flynn, Robert Hubley, Clément Goubert, Jeb Rosen, Andrew G. Clark, Cédric Feschotte, Arian F.A. Smit
TL;DR: This program brings substantial improvements over the original version of RepeatModeler, one of the most widely used tools for TE discovery, and incorporates a module for structural discovery of complete long terminal repeat (LTR) retroelements, which are widespread in eukaryotic genomes but recalcitrant to automated identification because of their size and sequence complexity.
Rfam 12.0: updates to the RNA families database
Eric P. Nawrocki, Sarah W. Burge, Alex Bateman, Jennifer Daub, Ruth Y. Eberhardt, Sean R. Eddy, Evan Floden, Paul P. Gardner, Thomas A. Jones, John Tate, Robert D. Finn
TL;DR: The upgrade of the authors' search pipeline to use Infernal 1.1 is described, and improved homology detection ability is demonstrated by comparison with the previous version. The new pipeline is easier for users to apply to their own data sets, and its ability to annotate RNAs in genomic and metagenomic data sets of various sizes is illustrated.
An atlas of human long non-coding RNAs with accurate 5′ ends
Chung-Chau Hon, Jordan A. Ramilowski, Jayson Harshbarger, Nicolas Bertin, Owen J. L. Rackham, Julian Gough, Elena Denisenko, Sebastian Schmeier, Thomas M. Poulsen, Jessica Severin, Marina Lizio, Hideya Kawaji, Takeya Kasukawa, Masayoshi Itoh, A. Maxwell Burroughs, Shohei Noma, Sarah Djebali, Tanvir Alam, Yulia A. Medvedeva, Alison C. Testa, Leonard Lipovich, Chi Wai Yip, Imad Abugessaisa, Mickaël Mendez, Akira Hasegawa, Dave Tang, Timo Lassmann, Peter Heutink, Magda Babina, Christine A. Wells, Soichi Kojima, Yukio Nakamura, Harukazu Suzuki, Carsten O. Daub, Michiel J. L. de Hoon, Erik Arner, Yoshihide Hayashizaki, Piero Carninci, Alistair R. R. Forrest
TL;DR: This work integrates multiple transcript collections to generate a comprehensive atlas of 27,919 human lncRNA genes with high-confidence 5′ ends and expression profiles across 1,829 samples from the major human primary cell types and tissues, identifying 19,175 potentially functional lncRNAs in the human genome.
The complete sequence of a human genome
Sergey Koren, Sergey Nurk, Mikko Rautiainen, B Ren, Weijun Zhu, Richard Lawless, Саидмуродов Мамур Таирович
TL;DR: The Telomere-to-Telomere (T2T) Consortium presented a complete 3.055 billion-base-pair sequence of a human genome, T2T-CHM13, that includes gapless assemblies for all chromosomes except Y, corrects errors in the prior references, and introduces nearly 200 million base pairs of sequence containing gene predictions, 99 of which are predicted to be protein coding.
References
Basic Local Alignment Search Tool
TL;DR: A new approach to rapid sequence comparison, basic local alignment search tool (BLAST), directly approximates alignments that optimize a measure of local similarity, the maximal segment pair (MSP) score.
Fast and accurate short read alignment with Burrows–Wheeler transform
Heng Li, Richard Durbin
TL;DR: Burrows-Wheeler Alignment tool (BWA) is implemented, a new read alignment package that is based on backward search with Burrows–Wheeler Transform (BWT), to efficiently align short sequencing reads against a large reference sequence such as the human genome, allowing mismatches and gaps.
Ultrafast and memory-efficient alignment of short DNA sequences to the human genome
TL;DR: Bowtie extends previous Burrows-Wheeler techniques with a novel quality-aware backtracking algorithm that permits mismatches. Multiple processor cores can be used simultaneously to achieve even greater alignment speeds.
BLAST+: architecture and applications.
Christiam Camacho, George Coulouris, Vahram Avagyan, Ning Ma, Jason S. Papadopoulos, Kevin Bealer, Thomas L. Madden
TL;DR: The new BLAST command-line applications, compared to the current BLAST tools, demonstrate substantial speed improvements for long queries as well as chromosome length database sequences.
Identification of common molecular subsequences.
TL;DR: This letter extends the heuristic homology algorithm of Needleman & Wunsch (1970) to find a pair of segments, one from each of two long sequences, such that there is no other pair of segments with greater similarity (homology).
Related Papers (5)
Trimmomatic: a flexible trimmer for Illumina sequence data
MAFFT Multiple Sequence Alignment Software Version 7: Improvements in Performance and Usability
Kazutaka Katoh, Daron M. Standley