Improved tools for biological sequence comparison.
Abstract:
We have developed three computer programs for comparisons of protein and DNA sequences. They can be used to search sequence data bases, evaluate similarity scores, and identify periodic structures based on local sequence similarity. The FASTA program is a more sensitive derivative of the FASTP program, which can be used to search protein or DNA sequence data bases and can compare a protein sequence to a DNA sequence data base by translating the DNA data base as it is searched. FASTA includes an additional step in the calculation of the initial pairwise similarity score that allows multiple regions of similarity to be joined to increase the score of related sequences. The RDF2 program can be used to evaluate the significance of similarity scores using a shuffling method that preserves local sequence composition. The LFASTA program can display all the regions of local similarity between two sequences with scores greater than a threshold, using the same scoring parameters and a similar alignment algorithm; these local similarities can be displayed as a "graphic matrix" plot or as individual alignments. In addition, these programs have been generalized to allow comparison of DNA or protein sequences based on a variety of alternative scoring matrices.
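The shuffling idea mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the RDF2 implementation: the window size, the toy `identity_score` function, and all function names are assumptions made for illustration. The target sequence is shuffled within consecutive windows so that each window keeps its residue composition, and the observed score is then expressed as a z-score against the shuffled scores.

```python
import random
from statistics import mean, stdev


def local_shuffle(seq, window=10, rng=None):
    """Shuffle residues inside consecutive windows (window size is an assumption).

    Each window keeps exactly the residues it started with, so local
    composition is preserved while residue order is randomized.
    """
    rng = rng or random.Random()
    out = []
    for start in range(0, len(seq), window):
        block = list(seq[start:start + window])
        rng.shuffle(block)
        out.extend(block)
    return "".join(out)


def identity_score(a, b):
    """Toy stand-in score (hypothetical): longest ungapped run of identities
    found on any diagonal of the comparison matrix."""
    best = 0
    for offset in range(-(len(b) - 1), len(a)):
        run = 0
        for i in range(max(0, offset), min(len(a), len(b) + offset)):
            if a[i] == b[i - offset]:
                run += 1
                best = max(best, run)
            else:
                run = 0
    return best


def shuffle_z_score(query, target, n_shuffles=100, window=10, seed=0):
    """Compare the observed score with scores against locally shuffled targets."""
    rng = random.Random(seed)
    observed = identity_score(query, target)
    shuffled = [
        identity_score(query, local_shuffle(target, window, rng))
        for _ in range(n_shuffles)
    ]
    mu, sigma = mean(shuffled), stdev(shuffled)
    if sigma == 0:
        # Degenerate case: every shuffle gave the same score.
        z = 0.0 if observed == mu else float("inf")
    else:
        z = (observed - mu) / sigma
    return observed, mu, sigma, z
```

In a real analysis the toy score would be replaced by the same similarity score used for the original comparison, so that the observed and shuffled values are directly comparable.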
Citations
Journal Article
SNPs, protein structure, and disease
Zhen Wang, John Moult
TL;DR: Ninety percent of the known disease‐causing missense mutations examined fit a model for assigning a mechanism of action of each mutation at the protein level, with the vast majority affecting protein stability, through a variety of energy related factors.
Journal Article
Structural, functional, and evolutionary relationships among extracellular solute-binding receptors of bacteria.
R Tam, Milton H. Saier
TL;DR: The occurrence of two distinct classes of bacterial cytoplasmic repressor proteins which are homologous to two different clusters of periplasmic binding proteins suggests that the gene-splicing events which allowed functional conversion of these proteins with retention of domain structure have occurred repeatedly during evolutionary history.
Journal Article
Complete genome sequence of Clostridium perfringens, an anaerobic flesh-eater
Tohru Shimizu, Kaori Ohtani, Hideki Hirakawa, Kenshiro Ohshima, Atsushi Yamashita, Tadayoshi Shiba, Naotake Ogasawara, Masahira Hattori, Satoru Kuhara, Hideo Hayashi
TL;DR: The genome analysis proved an efficient method for finding four members of the two-component VirR/VirS regulon that coordinately regulates the pathogenicity of C. perfringens, and a total of five hyaluronidase genes that will also contribute to virulence.
Journal Article
Molecular cloning of a putative receptor protein kinase gene encoded at the self-incompatibility locus of Brassica oleracea
TL;DR: The S receptor kinase (SRK) gene is described, a previously uncharacterized gene that resides at the S locus that exhibits striking homology to the secreted product of the S-locus glycoprotein (SLG) gene.
Book
Modern Multivariate Statistical Techniques: Regression, Classification, and Manifold Learning
TL;DR: Techniques covered range from traditional multivariate methods, such as multiple regression, principal components, canonical variates, linear discriminant analysis, factor analysis, clustering, multidimensional scaling, and correspondence analysis, to the newer methods of density estimation, projection pursuit, neural networks, and classification and regression trees.
References
Journal Article
A general method applicable to the search for similarities in the amino acid sequence of two proteins
TL;DR: A computer adaptable method for finding similarities in the amino acid sequences of two proteins has been developed and it is possible to determine whether significant homology exists between the proteins to trace their possible evolutionary development.
Journal Article
Identification of common molecular subsequences.
TL;DR: This letter extends the heuristic homology algorithm of Needleman & Wunsch (1970) to find a pair of segments, one from each of two long sequences, such that there is no other pair of segments with greater similarity (homology).
Journal Article
Rapid and sensitive protein similarity searches
TL;DR: An algorithm was developed which facilitates the search for similarities between newly determined amino acid sequences and sequences already available in databases and increases sensitivity by giving high scores to those amino acid replacements which occur frequently in evolution.
Journal Article
Rapid similarity searches of nucleic acid and protein data banks.
W. J. Wilbur, David J. Lipman
TL;DR: An algorithm for the global comparison of sequences based on matching k-tuples of sequence elements for a fixed k results in substantial reduction in the time required to search a data bank when compared with prior techniques of similarity analysis, with minimal loss in sensitivity.