Open Access Journal Article

Improved tools for biological sequence comparison.

Abstract
We have developed three computer programs for comparisons of protein and DNA sequences. They can be used to search sequence data bases, evaluate similarity scores, and identify periodic structures based on local sequence similarity. The FASTA program is a more sensitive derivative of the FASTP program, which can be used to search protein or DNA sequence data bases and can compare a protein sequence to a DNA sequence data base by translating the DNA data base as it is searched. FASTA includes an additional step in the calculation of the initial pairwise similarity score that allows multiple regions of similarity to be joined to increase the score of related sequences. The RDF2 program can be used to evaluate the significance of similarity scores using a shuffling method that preserves local sequence composition. The LFASTA program can display all the regions of local similarity between two sequences with scores greater than a threshold, using the same scoring parameters and a similar alignment algorithm; these local similarities can be displayed as a "graphic matrix" plot or as individual alignments. In addition, these programs have been generalized to allow comparison of DNA or protein sequences based on a variety of alternative scoring matrices.
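
The shuffling idea behind significance evaluation can be illustrated with a minimal Python sketch. It is not the RDF2 implementation: the window size, the toy ungapped scoring function, and the names `windowed_shuffle`, `toy_local_score`, and `shuffle_z_score` are assumptions made for illustration only. The sketch shuffles one sequence within fixed-size windows, so each window keeps its own residue composition, and then compares the observed score with the distribution of scores against the shuffled sequences.

```python
import random
from statistics import mean, pstdev


def windowed_shuffle(seq, window=10, rng=None):
    """Shuffle residues within consecutive windows (window size is an assumption).

    Each window keeps its own residue composition, so local composition
    is preserved while residue order is randomized.
    """
    rng = rng or random.Random()
    pieces = []
    for start in range(0, len(seq), window):
        block = list(seq[start:start + window])
        rng.shuffle(block)
        pieces.append("".join(block))
    return "".join(pieces)


def toy_local_score(a, b, match=1, mismatch=-1):
    """Hypothetical stand-in score: best ungapped local run over all diagonals.

    This is NOT the FASTA scoring scheme; it only gives the sketch
    something to measure.
    """
    best = 0
    for offset in range(-(len(b) - 1), len(a)):
        i, j = max(offset, 0), max(-offset, 0)
        run = 0
        while i < len(a) and j < len(b):
            run = max(0, run + (match if a[i] == b[j] else mismatch))
            best = max(best, run)
            i += 1
            j += 1
    return best


def shuffle_z_score(query, target, n_shuffles=100, window=10, seed=0):
    """Express the observed score in standard deviations above the shuffled mean.

    Returns None when the shuffled scores have zero spread.
    """
    rng = random.Random(seed)
    observed = toy_local_score(query, target)
    shuffled = [
        toy_local_score(query, windowed_shuffle(target, window, rng))
        for _ in range(n_shuffles)
    ]
    mu, sigma = mean(shuffled), pstdev(shuffled)
    if sigma == 0:
        return None
    return (observed - mu) / sigma
```

A large positive value from `shuffle_z_score` suggests that the observed similarity depends on residue order rather than on local composition alone; in practice the toy score would be replaced by whatever similarity score is being evaluated.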


Citations
Journal Article

SNPs, protein structure, and disease

TL;DR: Ninety percent of the known disease‐causing missense mutations examined fit a model for assigning a mechanism of action to each mutation at the protein level, with the vast majority affecting protein stability through a variety of energy-related factors.
Journal Article

Structural, functional, and evolutionary relationships among extracellular solute-binding receptors of bacteria.

TL;DR: The occurrence of two distinct classes of bacterial cytoplasmic repressor proteins which are homologous to two different clusters of periplasmic binding proteins suggests that the gene-splicing events which allowed functional conversion of these proteins with retention of domain structure have occurred repeatedly during evolutionary history.
Journal Article

Complete genome sequence of Clostridium perfringens, an anaerobic flesh-eater

TL;DR: The genome analysis proved an efficient method for finding four members of the two-component VirR/VirS regulon that coordinately regulates the pathogenicity of C. perfringens, and a total of five hyaluronidase genes that will also contribute to virulence.
Journal Article

Molecular cloning of a putative receptor protein kinase gene encoded at the self-incompatibility locus of Brassica oleracea

TL;DR: The S receptor kinase (SRK) gene is described, a previously uncharacterized gene that resides at the S locus and exhibits striking homology to the secreted product of the S-locus glycoprotein (SLG) gene.
Book

Modern Multivariate Statistical Techniques: Regression, Classification, and Manifold Learning

TL;DR: Techniques covered range from traditional multivariate methods, such as multiple regression, principal components, canonical variates, linear discriminant analysis, factor analysis, clustering, multidimensional scaling, and correspondence analysis, to the newer methods of density estimation, projection pursuit, neural networks, and classification and regression trees.
References
Journal Article

A general method applicable to the search for similarities in the amino acid sequence of two proteins

TL;DR: A computer-adaptable method for finding similarities in the amino acid sequences of two proteins has been developed; with it, it is possible to determine whether significant homology exists between the proteins and to trace their possible evolutionary development.
Journal Article

Identification of common molecular subsequences.

TL;DR: This letter extends the heuristic homology algorithm of Needleman & Wunsch (1970) to find a pair of segments, one from each of two long sequences, such that there is no other pair of segments with greater similarity (homology).
Journal Article

Rapid and sensitive protein similarity searches

TL;DR: An algorithm was developed that facilitates the search for similarities between newly determined amino acid sequences and sequences already available in databases; it increases sensitivity by giving high scores to amino acid replacements that occur frequently in evolution.
Journal Article

Rapid similarity searches of nucleic acid and protein data banks.

TL;DR: An algorithm for the global comparison of sequences, based on matching k-tuples of sequence elements for a fixed k, substantially reduces the time required to search a data bank compared with prior techniques of similarity analysis, with minimal loss in sensitivity.