Book Chapter

Rapid and Sensitive Sequence Comparison with FASTP and FASTA.

William R. Pearson
01 Jan 1990, Vol. 183, pp. 63-98
Abstract
The FASTA program can search the NBRF protein sequence library (2.5 million residues) in less than 20 min on an IBM-PC microcomputer and unambiguously detect proteins that shared a common ancestor billions of years in the past. FASTA is both fast and selective because it initially considers only amino acid identities. Its sensitivity is increased not only by using the PAM250 matrix to score and rescore regions with large numbers of identities but also by joining initial regions. The results of searches with FASTA compare favorably with results using NWS-based programs that are 100 times slower. FASTA is slightly less sensitive but considerably more selective. It is not clear that NWS-based programs would be more successful in finding distantly related members of the G-protein-coupled receptor family. The joining step that FASTA uses to calculate the initn score is especially useful for sequences that share regions of sequence similarity that are separated by variable-length loops. FASTP and FASTA were designed to identify protein sequences that have descended from a common ancestor, and they have proved very useful for this task. In many cases, a FASTA sequence search will result in a list of high scoring library sequences that are homologous to the query sequence, or the search will result in a list of sequences with similarity scores that cannot be distinguished from the bulk of the library. In either case, the question of whether there are sequences in the library that are clearly related to the query sequence has been answered unambiguously. Unfortunately, the results often will not be so clear-cut, and careful analysis of similarity scores, statistical significance, the actual aligned residues, and the biological context is required.
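The identity-first stage described above can be illustrated with a short sketch. This is not the FASTA source code: it is a simplified illustration that assumes a word-lookup table and counts identical words per diagonal, which is the general idea behind scanning only for identities before any PAM250 rescoring. The function names are hypothetical, and the `ktup` parameter name is borrowed from common FASTA usage as an assumption.

```python
from collections import defaultdict


def kmer_index(seq, ktup=2):
    """Map each ktup-length word to the positions where it starts in seq."""
    index = defaultdict(list)
    for i in range(len(seq) - ktup + 1):
        index[seq[i:i + ktup]].append(i)
    return index


def diagonal_hits(query, target, ktup=2):
    """Count identical ktup words on each diagonal (target_pos - query_pos)."""
    index = kmer_index(query, ktup)
    counts = defaultdict(int)
    for j in range(len(target) - ktup + 1):
        # Every query position holding the same word is an identity hit.
        for i in index.get(target[j:j + ktup], ()):
            counts[j - i] += 1
    return dict(counts)


def best_diagonals(query, target, ktup=2, top=10):
    """Return up to `top` (diagonal, hit_count) pairs, densest first."""
    hits = diagonal_hits(query, target, ktup)
    return sorted(hits.items(), key=lambda kv: (-kv[1], kv[0]))[:top]
```

In this sketch only the densest diagonals would be passed on for rescoring with a substitution matrix, which is why the initial scan is cheap: it never evaluates non-identical residue pairs.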
In the course of analyzing the G-protein-coupled receptor family, several proteins were found that, because of a high initn score and a low init1 score that increased almost 2-fold with optimization, appeared to be members of this family which were not previously recognized. RDF2 analysis showed borderline z values, and only a careful examination of the sequence alignments that focused on the conserved residues provided convincing evidence that the high scores were fortuitous. As sequence comparison methods become more powerful by becoming more sensitive, they become more likely to mislead, and even greater care is required.
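The z values mentioned above come from comparing an observed similarity score with scores obtained against shuffled sequences. A minimal sketch of that calculation follows. It assumes a uniform whole-sequence shuffle and uses a stand-in identity score rather than the init1, initn, or optimized scores; `identity_score` and `shuffle_z_value` are hypothetical names, and none of this reproduces the RDF2 program itself.

```python
import random
import statistics


def identity_score(a, b):
    """Stand-in score (assumption): most identities on any ungapped diagonal."""
    best = 0
    for offset in range(-(len(a) - 1), len(b)):
        matches = sum(
            1
            for i in range(len(a))
            if 0 <= i + offset < len(b) and a[i] == b[i + offset]
        )
        best = max(best, matches)
    return best


def shuffle_z_value(query, target, score=identity_score, n_shuffles=100, seed=0):
    """z = (observed - mean of shuffled scores) / sd of shuffled scores."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    observed = score(query, target)
    residues = list(target)
    shuffled_scores = []
    for _ in range(n_shuffles):
        rng.shuffle(residues)  # preserves composition, destroys order
        shuffled_scores.append(score(query, "".join(residues)))
    mean = statistics.mean(shuffled_scores)
    sd = statistics.stdev(shuffled_scores)
    if sd == 0:
        raise ValueError("shuffled scores have zero variance")
    return (observed - mean) / sd
```

Because shuffling preserves amino acid composition, a borderline z value indicates that much of the score could arise from composition alone, which is consistent with the abstract's advice to inspect the aligned residues directly.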


Citations
Journal Article

Amino acid substitution matrices from protein blocks

TL;DR: This work has derived substitution matrices from about 2000 blocks of aligned sequence segments characterizing more than 500 groups of related proteins, leading to marked improvements in alignments and in searches using queries from each of the groups.
Journal Article

An approach to correlate tandem mass spectral data of peptides with amino acid sequences in a protein database.

TL;DR: The approach described in this manuscript provides a convenient method to interpret tandem mass spectra with known sequences in a protein database.
Journal Article

Predicting subcellular localization of proteins based on their N-terminal amino acid sequence.

TL;DR: A neural network-based tool, TargetP, for large-scale subcellular location prediction of newly identified proteins has been developed and it is estimated that 10% of all plant proteins are mitochondrial and 14% chloroplastic, and that the abundance of secretory proteins, in both Arabidopsis and Homo, is around 10%.
Journal Article

Comparative protein structure modeling of genes and genomes

TL;DR: There is a need to develop an automated, rapid, robust, sensitive, and accurate comparative modeling pipeline applicable to whole genomes and to encourage new kinds of applications for the many resulting models, based on their large number and completeness at the level of the family, organism, or functional network.
References
Journal Article

A general method applicable to the search for similarities in the amino acid sequence of two proteins

TL;DR: A computer adaptable method for finding similarities in the amino acid sequences of two proteins has been developed, making it possible to determine whether significant homology exists between the proteins and to trace their possible evolutionary development.
Journal Article

Identification of common molecular subsequences.

TL;DR: This letter extends the heuristic homology algorithm of Needleman & Wunsch (1970) to find a pair of segments, one from each of two long sequences, such that there is no other pair of segments with greater similarity (homology).
Journal Article

Rapid and sensitive protein similarity searches

TL;DR: An algorithm was developed which facilitates the search for similarities between newly determined amino acid sequences and sequences already available in databases and increases sensitivity by giving high scores to those amino acid replacements which occur frequently in evolution.
Journal Article

Glutathione transferases--structure and catalytic activity.

TL;DR: The glutathione transferases are recognized as important catalysts in the biotransformation of xenobiotics, including drugs as well as environmental pollutants, and numerous transferases from mammalian tissues, insects, and plants have been isolated and characterized.
Journal Article

Cloning of the gene and cDNA for mammalian β-adrenergic receptor and homology with rhodopsin

TL;DR: Cloning of the gene and cDNA for the mammalian β2AR indicates significant amino-acid homology with bovine rhodopsin and suggests that, like rhodopsin, βAR possesses multiple membrane-spanning regions.