MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform
TL;DR: A simplified scoring system is proposed that reduces CPU time and increases alignment accuracy, even for sequences having large insertions or extensions as well as distantly related sequences of similar length.
Abstract:
A multiple sequence alignment program, MAFFT, has been developed. The CPU time is drastically reduced as compared with existing methods. MAFFT includes two novel techniques. (i) Homologous regions are rapidly identified by the fast Fourier transform (FFT), in which an amino acid sequence is converted to a sequence composed of volume and polarity values of each amino acid residue. (ii) We propose a simplified scoring system that performs well for reducing CPU time and increasing the accuracy of alignments even for sequences having large insertions or extensions as well as distantly related sequences of similar length. Two different heuristics, the progressive method (FFT-NS-2) and the iterative refinement method (FFT-NS-i), are implemented in MAFFT. The performances of FFT-NS-2 and FFT-NS-i were compared with other methods by computer simulations and benchmark tests; the CPU time of FFT-NS-2 is drastically reduced as compared with CLUSTALW with comparable accuracy. FFT-NS-i is over 100 times faster than T-COFFEE, when the number of input sequences exceeds 60, without sacrificing the accuracy.
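The FFT step in (i) can be illustrated with a minimal sketch: each residue is mapped to a volume value and a polarity value, the two numeric signals are cross-correlated through an FFT, and a peak in the summed correlation suggests the lag at which two sequences may share a homologous region. This is not MAFFT's implementation. The property tables `TOY_VOLUME` and `TOY_POLARITY` are hypothetical placeholder numbers for a six-letter toy alphabet, and the function names `fft`, `correlate`, `lag_scores` and `best_lag` are invented for illustration.

```python
import cmath

# Hypothetical placeholder values for a toy alphabet.
# These are NOT the volume/polarity values used by MAFFT.
TOY_VOLUME = {"A": 0.0, "G": -1.5, "L": 1.0, "K": 1.2, "D": -0.3, "W": 2.0}
TOY_POLARITY = {"A": 0.0, "G": 0.2, "L": -1.2, "K": 1.0, "D": 1.5, "W": -1.0}


def fft(values, inverse=False):
    """Recursive radix-2 FFT; len(values) must be a power of two.

    The inverse transform is returned unnormalized (caller divides by n).
    """
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1.0 if inverse else -1.0
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def correlate(a, b):
    """Return {lag k: sum_n a[n] * b[n + k]} computed through the FFT."""
    size = 1
    while size < len(a) + len(b):  # zero-pad so lags do not wrap around
        size *= 2
    fa = fft([complex(v) for v in a] + [0j] * (size - len(a)))
    fb = fft([complex(v) for v in b] + [0j] * (size - len(b)))
    raw = fft([x.conjugate() * y for x, y in zip(fa, fb)], inverse=True)
    return {k: raw[k % size].real / size for k in range(-(len(a) - 1), len(b))}


def lag_scores(seq1, seq2):
    """Sum the volume and polarity correlations for every lag."""
    vol = correlate([TOY_VOLUME[r] for r in seq1], [TOY_VOLUME[r] for r in seq2])
    pol = correlate([TOY_POLARITY[r] for r in seq1], [TOY_POLARITY[r] for r in seq2])
    return {k: vol[k] + pol[k] for k in vol}


def best_lag(seq1, seq2):
    """Lag with the highest summed correlation (candidate homologous offset)."""
    scores = lag_scores(seq1, seq2)
    return max(scores, key=scores.get)
```

With these toy values, `best_lag("LKDW", "AAALKDWAA")` picks the lag at which the shared segment `LKDW` lines up. A correlation peak only says that some similar region exists at that lag, not where it sits, so a real aligner would still need a follow-up step to locate the segment and a dynamic-programming step to build the alignment; neither is shown here.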
Citations
Journal Article
Uncovering Earth’s virome
David Paez-Espino, Emiley A. Eloe-Fadrosh, Georgios A. Pavlopoulos, Alex D. Thomas, Marcel Huntemann, Natalia Mikhailova, Edward M. Rubin, Natalia Ivanova, Nikos C. Kyrpides
TL;DR: Analysis of viral distribution across diverse ecosystems revealed strong habitat-type specificity for the vast majority of viruses, but also identified some cosmopolitan groups, and detailed insight into viral habitat distribution and host–virus interactions is provided.
Journal Article
Reference genome sequence of the model plant Setaria
Jeffrey L. Bennetzen, Jeremy Schmutz, Hao Wang, Ryan Percifield, Jennifer S. Hawkins, Ana Clara Pontaroli, Matt C. Estep, Liang Feng, Justin N. Vaughn, Jane Grimwood, Jerry Jenkins, Kerrie Barry, Erika Lindquist, Uffe Hellsten, Shweta Deshpande, Xuewen Wang, Xiaomei Wu, Therese Mitros, Jimmy K. Triplett, Xiaohan Yang, Chu-Yu Ye, Margarita Mauro-Herrera, Lin Wang, Pinghua Li, Manoj Sharma, Rita Sharma, Pamela C. Ronald, Olivier Panaud, Elizabeth A. Kellogg, Thomas P. Brutnell, Andrew N. Doust, Gerald A. Tuskan, Daniel S. Rokhsar, Katrien M. Devos
TL;DR: A high-quality reference genome sequence for foxtail millet (Setaria italica) is generated and regions of differential single-nucleotide polymorphism density, distribution of transposable elements, small RNA content, chromosomal rearrangement and segregation distortion are identified.
Journal Article
Lifestyle transitions in plant pathogenic Colletotrichum fungi deciphered by genome and transcriptome analyses
Richard J. O'Connell, Michael R. Thon, Stéphane Hacquard, Stefan G. Amyotte, Jochen Kleemann, Maria F. Torres, Ulrike Damm, Ester Alvarenga Santos Buiate, Lynn Epstein, Noam Alkan, Janine Altmüller, Lucia Alvarado-Balderrama, Christopher Bauser, Christian Becker, Bruce W. Birren, Zehua Chen, Jaeyoung Choi, Jo Anne Crouch, Jonathan P. Duvick, Mark A. Farman, Pamela Gan, David I. Heiman, Bernard Henrissat, Richard J. Howard, Mehdi Kabbage, Christian Koch, Barbara Kracher, Yasuyuki Kubo, Audrey D. Law, Marc-Henri Lebrun, Yong-Hwan Lee, Itay Miyara, Neil Moore, Ulla Neumann, Karl Nordström, Daniel G. Panaccione, Ralph Panstruga, Michael Place, Robert H. Proctor, Dov Prusky, Gabriel E. Rech, Richard Reinhardt, Jeffrey A. Rollins, Steve Rounsley, Christopher L. Schardl, David C. Schwartz, Narmada Shenoy, Ken Shirasu, Usha Rani Sikhakolli, Kurt Stüber, Serenella A. Sukno, James A. Sweigard, Yoshitaka Takano, Hiroyuki Takahara, Frances Trail, H. Charlotte van der Does, Lars M. Voll, Isa Will, Sarah Young, Qiandong Zeng, Jingze Zhang, Shiguo Zhou, Martin B. Dickman, Paul Schulze-Lefert, Emiel Ver Loren van Themaat, Li-Jun Ma, Lisa J. Vaillancourt
TL;DR: Findings show that preinvasion perception of plant-derived signals substantially reprograms fungal gene expression and indicate previously unknown functions for particular fungal cell types.
Journal Article
Parallelization of the MAFFT multiple sequence alignment program
Kazutaka Katoh, Hiroyuki Toh
TL;DR: The three calculation stages, all-to-all comparison, progressive alignment and iterative refinement, of the MAFFT MSA program were parallelized using the POSIX Threads library to reduce the time required for large-scale sequence analyses.
Journal Article
Towards the definition of a core of microorganisms involved in anaerobic digestion of sludge
Delphine Rivière, Virginie Desvignes, Eric Pelletier, Sébastien Chaussonnerie, Sonda Guermazi, Jean Weissenbach, Tianlun Li, Patricia Camacho, Abdelghani Sghir
TL;DR: A comparison of anaerobic digester populations is a first step towards a future understanding of the relationship among biodiversity, operating conditions and digester efficiency.
References
Journal Article
Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.
Stephen F. Altschul, Thomas L. Madden, Alejandro A. Schäffer, Jinghui Zhang, Zheng Zhang, Webb Miller, David J. Lipman
TL;DR: A new criterion for triggering the extension of word hits, combined with a new heuristic for generating gapped alignments, yields a gapped BLAST program that runs at approximately three times the speed of the original.
Journal Article
CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice
TL;DR: The sensitivity of the commonly used progressive multiple sequence alignment method has been greatly improved and modifications are incorporated into a new program, CLUSTAL W, which is freely available.
Journal Article
A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences.
TL;DR: Some examples were worked out using reported globin sequences to show that synonymous substitutions occur at much higher rates than amino acid-altering substitutions in evolution.
Book
Numerical Recipes in C: The Art of Scientific Computing
TL;DR: Numerical Recipes: The Art of Scientific Computing is a complete text and reference book on scientific computing, with over 100 new routines (now well over 300 in all), upgraded versions of many of the original routines, and many new topics presented at the same accessible level.
Journal Article
Improved tools for biological sequence comparison.
TL;DR: Three computer programs for comparisons of protein and DNA sequences can be used to search sequence data bases, evaluate similarity scores, and identify periodic structures based on local sequence similarity.