BLAST+: architecture and applications.
Christiam Camacho, George Coulouris, Vahram Avagyan, Ning Ma, Jason S. Papadopoulos, Kevin Bealer, Thomas L. Madden
TLDR
The new BLAST command-line applications, compared to the current BLAST tools, demonstrate substantial speed improvements for long queries as well as chromosome-length database sequences.
Abstract:
Sequence similarity searching is a very important bioinformatics task. While Basic Local Alignment Search Tool (BLAST) outperforms exact methods through its use of heuristics, the speed of the current BLAST software is suboptimal for very long queries or database sequences. There are also some shortcomings in the user interface of the current command-line applications. We describe features and improvements of rewritten BLAST software and introduce new command-line applications. Long query sequences are broken into chunks for processing, in some cases leading to dramatically shorter run times. For long database sequences, it is possible to retrieve only the relevant parts of the sequence, reducing CPU time and memory usage for searches of short queries against databases of contigs or chromosomes. The program can now retrieve masking information for database sequences from the BLAST databases. A new modular software library can now access subject sequence data from arbitrary data sources. We introduce several new features, including strategy files that allow a user to save and reuse their favorite set of options. The strategy files can be uploaded to and downloaded from the NCBI BLAST web site. The new BLAST command-line applications, compared to the current BLAST tools, demonstrate substantial speed improvements for long queries as well as chromosome-length database sequences. We have also improved the user interface of the command-line applications.
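The abstract states that long query sequences are broken into chunks for processing but gives no implementation details. The following is a minimal sketch of the general idea only, not the BLAST+ implementation: the function names `split_query` and `to_query_coords`, the chunk size, and the overlap are all hypothetical choices made for illustration. Overlapping adjacent chunks is one common way to avoid losing alignments that span a chunk boundary; whether and how BLAST+ does this is an assumption here.

```python
from typing import Iterator, Tuple


def split_query(seq: str, chunk_size: int, overlap: int) -> Iterator[Tuple[int, str]]:
    """Yield (offset, chunk) pairs that cover seq.

    Adjacent chunks share `overlap` residues (hypothetical parameter),
    so a match crossing a boundary is fully contained in at least one
    chunk as long as it is no longer than `overlap` + 1 residues.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    if not seq:
        return
    step = chunk_size - overlap
    start = 0
    while True:
        yield start, seq[start:start + chunk_size]
        if start + chunk_size >= len(seq):
            break  # this chunk reached the end of the query
        start += step


def to_query_coords(offset: int, pos_in_chunk: int) -> int:
    """Map a 0-based position inside a chunk back to the full query."""
    return offset + pos_in_chunk


# Example: a 10-residue query, chunks of 4 with 1 residue of overlap.
chunks = list(split_query("ACGTACGTAC", chunk_size=4, overlap=1))
# chunks == [(0, "ACGT"), (3, "TACG"), (6, "GTAC")]
```

Each chunk carries its offset so that hit coordinates reported per chunk can be translated back to positions in the original query with `to_query_coords`.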
Citations
Journal Article
CARD 2020: antibiotic resistome surveillance with the comprehensive antibiotic resistance database
Brian Alcock, Amogelang R. Raphenya, Tammy T. Y. Lau, Kara K. Tsang, Mégane Bouchard, Arman Edalatmand, William Huynh, Anna-Lisa V. Nguyen, Annie A. Cheng, Sihan Liu, Sally Y. Min, Anatoly Miroshnichenko, Hiu-Ki R Tran, Rafik El Werfalli, Jalees A. Nasir, Martins Oloni, David Speicher, Alexandra Florescu, Bhavya Singh, Mateusz Faltyn, Anastasia Hernández-Koutoucheva, Arjun N. Sharma, Emily Bordeleau, Andrew C. Pawlowski, Haley L. Zubyk, Damion M. Dooley, Emma Griffiths, Finlay Maguire, Geoffrey L. Winsor, Robert G. Beiko, Fiona S. L. Brinkman, William W. L. Hsiao, Gary Van Domselaar, Andrew G. McArthur
TL;DR: A new Resistomes & Variants module provides analysis and statistical summary of in silico predicted resistance variants from 82 pathogens and over 100 000 genomes; it can summarize predicted resistance using the information included in CARD, identify trends in AMR mobility, and determine previously undescribed and novel resistance variants.
Journal Article
A taxonomic note on the genus Lactobacillus: Description of 23 novel genera, emended description of the genus Lactobacillus Beijerinck 1901, and union of Lactobacillaceae and Leuconostocaceae.
Jinshui Zheng, Stijn Wittouck, Elisa Salvetti, Charles M. A. P. Franz, Hugh M. B. Harris, Paola Mattarelli, Paul W. O'Toole, Bruno Pot, Peter Vandamme, Jens Walter, Koichi Watanabe, Sander Wuyts, Giovanna E. Felis, Michael G. Gänzle, Sarah Lebeer
TL;DR: This study evaluated the taxonomy of Lactobacillaceae and Leuconostocaceae on the basis of whole genome sequences; the proposed reclassification reflects the phylogenetic position of the micro-organisms and groups lactobacilli into robust clades with shared ecological and metabolic properties.
Journal Article
antiSMASH: rapid identification, annotation and analysis of secondary metabolite biosynthesis gene clusters in bacterial and fungal genome sequences
Marnix H. Medema, Kai Blin, Peter Cimermancic, Victor de Jager, Piotr Zakrzewski, Michael A. Fischbach, Tilmann Weber, Eriko Takano, Rainer Breitling
TL;DR: This work presents the first comprehensive pipeline capable of identifying biosynthetic loci covering the whole range of known secondary metabolite compound classes, and integrates or cross-links all previously available secondary-metabolite specific gene analysis methods in one interactive view.
Journal Article
The Comprehensive Antibiotic Resistance Database
Andrew G. McArthur, Nicholas Waglechner, Fazmin Nizam, Austin Yan, Marisa A. Azad, Alison J. Baylay, Kirandeep Bhullar, Marc J. Canova, Gianfranco De Pascale, Linda Ejim, Lindsay Kalan, Andrew M. King, Kalinka Koteva, Mariya Morar, Michael R. Mulvey, Jonathan S O'Brien, Andrew C. Pawlowski, Laura J. V. Piddock, Peter Spanogiannopoulos, Arlene D. Sutherland, Irene Tang, Patricia L. Taylor, Maulik N. Thaker, Wenliang Wang, Marie Yan, Tennison Yu, Gerard D. Wright
TL;DR: The CARD integrates disparate molecular and sequence data, provides a unique organizing principle in the form of the Antibiotic Resistance Ontology (ARO), and can quickly identify putative antibiotic resistance genes in new unannotated genome sequences.
Journal Article
Bandage: interactive visualization of de novo genome assemblies
TL;DR: Bandage (a Bioinformatics Application for Navigating De novo Assembly Graphs Easily) is a tool for visualizing assembly graphs with connections that presents new possibilities for analyzing de novo assemblies that are not possible through investigation of contigs alone.
References
Journal Article
Basic Local Alignment Search Tool
TL;DR: A new approach to rapid sequence comparison, basic local alignment search tool (BLAST), directly approximates alignments that optimize a measure of local similarity, the maximal segment pair (MSP) score.
Journal Article
Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.
Stephen F. Altschul, Thomas L. Madden, Alejandro A. Schäffer, Jinghui Zhang, Zheng Zhang, Webb Miller, David J. Lipman
TL;DR: A new criterion for triggering the extension of word hits, combined with a new heuristic for generating gapped alignments, yields a gapped BLAST program that runs at approximately three times the speed of the original.
Journal Article
BLAT—The BLAST-Like Alignment Tool
TL;DR: Describes how BLAT was optimized; BLAT is more accurate and 500 times faster than popular existing tools for mRNA/DNA alignments and 50 times faster for protein alignments at sensitivity settings typically used when comparing vertebrate sequences.
Journal Article
Initial sequencing and comparative analysis of the mouse genome.
Robert H. Waterston, Kerstin Lindblad-Toh, Ewan Birney, Jane Rogers, Josep F. Abril, Pankaj K. Agarwal, Richa Agarwala, Rachel Ainscough, Marina Alexandersson, Peter An, Stylianos E. Antonarakis, John Attwood, Robert Baertsch, J Bailey, K F Barlow, Stephan Beck, Eric Berry, Bruce W. Birren, Toby Bloom, Peer Bork, Marc Botcherby, Nicolas Bray, Michael R. Brent, Daniel G. Brown, Stephen D. Brown, Carol J. Bult, John Burton, Jonathan Butler, R. D. Campbell, Piero Carninci, Simon Cawley, Francesca Chiaromonte, Asif T. Chinwalla, Deanna M. Church, Michele Clamp, C M Clee, Francis S. Collins, Lisa Cook, Richard R. Copley, Alan Coulson, Olivier Couronne, James Cuff, Val Curwen, Tim Cutts, Mark J. Daly, Robert David, Joy Davies, Kimberly D. Delehaunty, Justin Deri, Emmanouil T. Dermitzakis, Colin N. Dewey, Nicholas J. Dickens, Mark Diekhans, Sheila Dodge, Inna Dubchak, Diane M. Dunn, Sean R. Eddy, Laura Elnitski, Richard D. Emes, Pallavi Eswara, Eduardo Eyras, Adam Felsenfeld, Ginger A. Fewell, Paul Flicek, Karen Foley, Wayne N. Frankel, Lucinda Fulton, Robert S. Fulton, Terrence S. Furey, Diane Gage, Richard A. Gibbs, Gustavo Glusman, Sante Gnerre, Nick Goldman, Leo Goodstadt, Darren Grafham, Tina Graves, Eric D. Green, Simon G. Gregory, Roderic Guigó, Mark S. Guyer, Ross C. Hardison, David Haussler, Yoshihide Hayashizaki, LaDeana W. Hillier, Angela S. Hinrichs, Wratko Hlavina, Timothy Holzer, Fan Hsu, Axin Hua, Tim Hubbard, Adrienne Hunt, Ian J. Jackson, David B. Jaffe, L. Steven Johnson, Matthew Jones, Thomas A. Jones, A Joy, Michael Kamal, Elinor K. Karlsson, Donna Karolchik, Arkadiusz Kasprzyk, Jun Kawai, Evan Keibler, Cristyn Kells, W. James Kent, Andrew Kirby, Diana L. Kolbe, Ian F Korf, Raju Kucherlapati, Edward J. Kulbokas, David Kulp, Tom Landers, J. P. Leger, Steven Leonard, Ivica Letunic, Rosie Levine, Jia Li, Ming Li, Christine Lloyd, Susan Lucas, Bin Ma, Donna Maglott, Elaine R. Mardis, Lucy Matthews, Evan Mauceli, John Mayer, Megan McCarthy, W. Richard McCombie, Stuart McLaren, Kirsten McLay, John Douglas McPherson, James Meldrim, Beverley Meredith, Jill P. Mesirov, Webb Miller, Tracie L. Miner, Emmanuel Mongin, Kate Montgomery, Michael J. Morgan, Richard Mott, James C. Mullikin, Donna M. Muzny, William E. Nash, Joanne O. Nelson, Michael N. Nhan, Robert Nicol, Zemin Ning, Chad Nusbaum, Michael J. O’Connor, Yasushi Okazaki, Karen Oliver, Emma Overton-Larty, Lior Pachter, Genís Parra, Kymberlie H. Pepin, Jane Peterson, Pavel A. Pevzner, Robert W. Plumb, Craig Pohl, Alex Poliakov, Tracy C. Ponce, Chris P. Ponting, Simon C. Potter, Michael A. Quail, Alexandre Reymond, Bruce A. Roe, Krishna M. Roskin, Edward M. Rubin, Alistair G. Rust, Ralph Santos, Victor Sapojnikov, Brian Schultz, Jörg Schultz, Matthias S. Schwartz, Scott Schwartz, Carol Scott, Steven Seaman, Steve Searle, Ted Sharpe, Andrew Sheridan, Ratna Shownkeen, Sarah Sims, Jonathan Singer, Guy Slater, Arian F.A. Smit, Douglas Smith, Brian Spencer, Arne Stabenau, Nicole Stange-Thomann, Charles W. Sugnet, Mikita Suyama, Glenn Tesler, Johanna Thompson, David Torrents, Evanne Trevaskis, John Tromp, Catherine Ucla, Abel Ureta-Vidal, Jade P. Vinson, Andrew von Niederhausern, Claire M. Wade, Melanie M. Wall, R. J. Weber, Robert B. Weiss, Michael C. Wendl, Anthony P. West, Kris A. Wetterstrand, Raymond Wheeler, Simon Whelan, Jamey Wierzbowski, David Willey, Sophie Williams, Richard K. Wilson, Eitan E. Winter, Kim C. Worley, Dudley Wyman, Shan Yang, Shiaw Pyng Yang, Evgeny M. Zdobnov, Michael C. Zody, Eric S. Lander
TL;DR: The results of an international collaboration to produce a high-quality draft sequence of the mouse genome are reported, and an initial comparative analysis of the mouse and human genomes is presented, describing some of the insights that can be gleaned from the two sequences.
Journal Article
A greedy algorithm for aligning DNA sequences.
TL;DR: A new greedy alignment algorithm is introduced with particularly good performance and it is shown that it computes the same alignment as does a certain dynamic programming algorithm, while executing over 10 times faster on appropriate data.