Improved tools for biological sequence comparison.

doi:10.1073/PNAS.85.8.2444

William R. Pearson, +1 more

- 01 Apr 1988 -

Proceedings of the National Academy of S...

- Vol. 85, Iss: 8, pp 2444-2448

TLDR

Three computer programs for comparisons of protein and DNA sequences can be used to search sequence data bases, evaluate similarity scores, and identify periodic structures based on local sequence similarity.

Abstract:

We have developed three computer programs for comparisons of protein and DNA sequences. They can be used to search sequence data bases, evaluate similarity scores, and identify periodic structures based on local sequence similarity. The FASTA program is a more sensitive derivative of the FASTP program, which can be used to search protein or DNA sequence data bases and can compare a protein sequence to a DNA sequence data base by translating the DNA data base as it is searched. FASTA includes an additional step in the calculation of the initial pairwise similarity score that allows multiple regions of similarity to be joined to increase the score of related sequences. The RDF2 program can be used to evaluate the significance of similarity scores using a shuffling method that preserves local sequence composition. The LFASTA program can display all the regions of local similarity between two sequences with scores greater than a threshold, using the same scoring parameters and a similar alignment algorithm; these local similarities can be displayed as a "graphic matrix" plot or as individual alignments. In addition, these programs have been generalized to allow comparison of DNA or protein sequences based on a variety of alternative scoring matrices.
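The shuffle-based significance test mentioned in the abstract can be illustrated with a small sketch. This is not the RDF2 implementation: the function names, the window size of 10, the identity-based match/mismatch scoring, and the ungapped diagonal scan are all assumptions chosen to keep the example short. It shuffles one sequence within fixed windows, so each window keeps its residue composition, then compares the observed score with the distribution of scores against the shuffled sequences.

```python
import random
from statistics import mean, stdev


def local_shuffle(seq, window=10, rng=None):
    """Shuffle residues within consecutive windows (hypothetical helper).

    Each window keeps its own composition, so local composition is preserved.
    """
    rng = rng or random.Random()
    out = []
    for start in range(0, len(seq), window):
        chunk = list(seq[start:start + window])
        rng.shuffle(chunk)
        out.extend(chunk)
    return "".join(out)


def best_ungapped_score(a, b, match=1, mismatch=-1):
    """Best-scoring ungapped local segment over all diagonals.

    A stand-in similarity score (assumption); the real programs use
    scoring matrices and allow gaps.
    """
    best = 0
    for offset in range(-(len(a) - 1), len(b)):
        i, j = max(0, -offset), max(0, offset)
        run = 0
        while i < len(a) and j < len(b):
            run = max(0, run + (match if a[i] == b[j] else mismatch))
            best = max(best, run)
            i += 1
            j += 1
    return best


def shuffle_z_score(query, target, n_shuffles=100, window=10, seed=0):
    """Return (observed score, z-score against locally shuffled targets)."""
    rng = random.Random(seed)
    observed = best_ungapped_score(query, target)
    shuffled_scores = [
        best_ungapped_score(query, local_shuffle(target, window, rng))
        for _ in range(n_shuffles)
    ]
    sd = stdev(shuffled_scores)
    # With zero spread the z-score is undefined; report NaN rather than guess.
    z = (observed - mean(shuffled_scores)) / sd if sd > 0 else float("nan")
    return observed, z
```

A call such as `shuffle_z_score(query, target)` returns the observed score together with its distance from the shuffled mean in standard deviations; a large positive value suggests the similarity is unlikely to arise from composition alone.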

Citations

Basic Local Alignment Search Tool

Stephen F. Altschul, +4 more

- 01 Oct 1990 -

Journal of Molecular Biology

TL;DR: A new approach to rapid sequence comparison, basic local alignment search tool (BLAST), directly approximates alignments that optimize a measure of local similarity, the maximal segment pair (MSP) score.

Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.

Stephen F. Altschul, +6 more

- 01 Sep 1997 -

Nucleic Acids Research

TL;DR: A new criterion for triggering the extension of word hits, combined with a new heuristic for generating gapped alignments, yields a gapped BLAST program that runs at approximately three times the speed of the original.

CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice

Julie D. Thompson, +2 more

- 11 Nov 1994 -

Nucleic Acids Research

TL;DR: The sensitivity of the commonly used progressive multiple sequence alignment method has been greatly improved and modifications are incorporated into a new program, CLUSTAL W, which is freely available.

Fast and accurate short read alignment with Burrows–Wheeler transform

Heng Li, +1 more

- 01 Jul 2009 -

Bioinformatics

TL;DR: Burrows-Wheeler Alignment tool (BWA) is implemented, a new read alignment package that is based on backward search with Burrows–Wheeler Transform (BWT), to efficiently align short sequencing reads against a large reference sequence such as the human genome, allowing mismatches and gaps.

The Protein Data Bank

Helen M. Berman, +7 more

- 01 Jan 2000 -

Nucleic Acids Research

TL;DR: The goals of the PDB, the systems in place for data deposition and access, how to obtain further information, and plans for the future development of the resource are described.

References

Similar Amino Acid Sequences: Chance or Common Ancestry?

Russell F. Doolittle

- 09 Oct 1981 -

Science

TL;DR: The systematic comparison of every newly determined amino acid sequence with all other known sequences may allow a complete reconstruction of the evolutionary events leading to contemporary proteins, but sometimes the surviving similarities are so vague that even computer-based sequence comparison procedures are unable to validate relationships.

Enhanced graphic matrix analysis of nucleic acid and protein sequences.

Jacob V. Maizel, +1 more

- 01 Dec 1981 -

Proceedings of the National Academy of S...

TL;DR: Computer translation of nucleic acid sequences into all possible amino acid sequences followed by graphic matrix analysis provides a way to detect the most likely protein encoding regions and can predict the correct reading frames in sequences in which splicing patterns are not defined.

Pattern recognition in nucleic acid sequences. I. A general method for finding local homologies and symmetries

Walter B. Goad, +1 more

- 11 Jan 1982 -

Nucleic Acids Research

TL;DR: An algorithm is presented (a generalization of the Needleman-Wunsch-Sellers algorithm) which finds within longer sequences all subsequences that resemble one another locally.

Efficient algorithms for folding and comparing nucleic acid sequences

Jean-Pierre Dumas, +1 more

- 11 Jan 1982 -

Nucleic Acids Research

TL;DR: The homology and secondary structure programs are illustrated, respectively, with a comparison of two phage genomes and a discussion of Drosophila melanogaster 5S RNA folding.

On the statistical significance of nucleic acid similarities.

David J. Lipman, +3 more

- 11 Jan 1984 -

Nucleic Acids Research

TL;DR: It is demonstrated that the known statistical properties of nucleic acid sequences strongly affect the statistical distribution of similarity values when calculated by standard procedures, and a series of models are proposed which account for some of these known statistical properties.
