Open AccessJournal ArticleDOI

Kalign – an accurate and fast multiple sequence alignment algorithm

Timo Lassmann, +1 more
12 Dec 2005, Vol. 6, Iss. 1, p. 298
TLDR
Kalign, a method employing the Wu-Manber string-matching algorithm, is developed to improve both the accuracy and speed of multiple sequence alignment and is especially well suited for the increasingly important task of aligning large numbers of sequences.
Abstract
The alignment of multiple protein sequences is a fundamental step in the analysis of biological data. It has traditionally been applied to analyzing protein families for conserved motifs, phylogeny, structural properties, and to improve sensitivity in homology searching. The availability of complete genome sequences has increased the demands on multiple sequence alignment (MSA) programs. Current MSA methods suffer from being either too inaccurate or too computationally expensive to be applied effectively in large-scale comparative genomics. We developed Kalign, a method employing the Wu-Manber string-matching algorithm, to improve both the accuracy and speed of multiple sequence alignment. We compared the speed and accuracy of Kalign to other popular methods using Balibase, Prefab, and a new large test set. Kalign was as accurate as the best other methods on small alignments, but significantly more accurate when aligning large and distantly related sets of sequences. In our comparisons, Kalign was about 10 times faster than ClustalW and, depending on the alignment size, up to 50 times faster than popular iterative methods. Kalign is a fast and robust alignment method. It is especially well suited for the increasingly important task of aligning large numbers of sequences.
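The abstract names the Wu-Manber string-matching algorithm but does not describe how Kalign applies it. As a rough illustration only, the sketch below shows the bit-parallel approximate matching scheme usually associated with Wu and Manber: it reports where a pattern first matches a text with at most k edits (substitutions, insertions, or deletions). The function name `approx_find` and the toy sequences are hypothetical, and this is not Kalign's code.

```python
def approx_find(text: str, pattern: str, k: int) -> int:
    """Return the index in `text` where the first match of `pattern`
    with at most k edits ends, or -1 if there is none.
    Illustrative bit-parallel (shift-and with errors) sketch."""
    m = len(pattern)
    if m == 0:
        return -1
    full = (1 << m) - 1          # keep only the m low bits
    accept = 1 << (m - 1)        # bit set when the whole pattern has matched

    # masks[c] has bit i set when pattern[i] == c
    masks = {}
    for i, c in enumerate(pattern):
        masks[c] = masks.get(c, 0) | (1 << i)

    # R[d] bit i set: pattern[:i+1] matches a suffix of the text read so far
    # with at most d edits. Initially d leading pattern characters can be
    # skipped at a cost of d edits.
    R = [((1 << d) - 1) & full for d in range(k + 1)]

    for j, c in enumerate(text):
        mask = masks.get(c, 0)
        old_prev = R[0]                          # R[d-1] before this character
        R[0] = ((R[0] << 1) | 1) & mask & full   # exact extension
        new_prev = R[0]                          # R[d-1] after this character
        for d in range(1, k + 1):
            old_cur = R[d]
            R[d] = (
                (((old_cur << 1) | 1) & mask)    # characters match
                | old_prev                       # extra character in the text
                | ((old_prev << 1) | 1)          # substitution
                | ((new_prev << 1) | 1)          # pattern character skipped
            ) & full
            old_prev = old_cur
            new_prev = R[d]
        if R[k] & accept:
            return j
    return -1


if __name__ == "__main__":
    seq = "ACDEFGHIK"                      # toy sequence, not real data
    print(approx_find(seq, "DEFG", 0))     # exact match
    print(approx_find(seq, "DQFG", 1))     # one substitution allowed
```

Each text character costs a handful of word-level operations per allowed error, which is why this family of methods is attractive when many sequence pairs must be compared quickly.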


Citations
Database resources of the National Center for Biotechnology Information

TL;DR: In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides data analysis and retrieval resources for the data in GenBank and other biological data made available through NCBI’s website.

Improvement of Phylogenies after Removing Divergent and Ambiguously Aligned Blocks from Protein Sequence Alignments

TL;DR: The study examines whether phylogenetic reconstruction improves after alignment cleaning. Cleaned alignments produce better topologies although, paradoxically, with lower bootstrap values, which indicates that divergent and problematic alignment regions, when present, may lead to topologies that appear better supported but are in fact more biased.

Recent developments in the MAFFT multiple sequence alignment program

TL;DR: The initial version of the MAFFT program was developed in 2002 and was updated in 2007 with two new techniques: the PartTree algorithm and the Four-way consistency objective function, which improved the scalability of progressive alignment and the accuracy of ncRNA alignment.
References
Basic Local Alignment Search Tool

TL;DR: A new approach to rapid sequence comparison, basic local alignment search tool (BLAST), directly approximates alignments that optimize a measure of local similarity, the maximal segment pair (MSP) score.

Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.

TL;DR: A new criterion for triggering the extension of word hits, combined with a new heuristic for generating gapped alignments, yields a gapped BLAST program that runs at approximately three times the speed of the original.

CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice

TL;DR: The sensitivity of the commonly used progressive multiple sequence alignment method has been greatly improved and modifications are incorporated into a new program, CLUSTAL W, which is freely available.

The neighbor-joining method: a new method for reconstructing phylogenetic trees.

TL;DR: The neighbor-joining method and Sattath and Tversky's method are shown to be generally better than the other methods for reconstructing phylogenetic trees from evolutionary distance data.

MUSCLE: multiple sequence alignment with high accuracy and high throughput

TL;DR: MUSCLE is a new computer program for creating multiple alignments of protein sequences that includes fast distance estimation using kmer counting, progressive alignment using a new profile function the authors call the log-expectation score, and refinement using tree-dependent restricted partitioning.