Showing papers by "Chris Sander" published in 1993


Journal ArticleDOI
TL;DR: A novel algorithm (DALI) is developed for optimal pairwise alignment of protein structures; it identifies structural resemblances and common structural cores accurately and sensitively, even in the presence of geometrical distortions.

4,075 citations


Journal ArticleDOI
TL;DR: A two-layered feed-forward neural network is trained on a non-redundant database to predict the secondary structure of water-soluble proteins; the key new aspect is the use of evolutionary information in the form of multiple sequence alignments, which are used as input in place of single sequences.

2,977 citations


Journal ArticleDOI
TL;DR: A substantial increase in both the accuracy and quality of secondary-structure predictions is reported, using a neural-network algorithm; the predicted structures also have a more realistic distribution of helix and strand segments.
Abstract: The explosive accumulation of protein sequences in the wake of large-scale sequencing projects is in stark contrast to the much slower experimental determination of protein structures. Improved methods of structure prediction from the gene sequence alone are therefore needed. Here, we report a substantial increase in both the accuracy and quality of secondary-structure predictions, using a neural-network algorithm. The main improvements come from the use of multiple sequence alignments (better overall accuracy), from "balanced training" (better prediction of beta-strands), and from "structure context training" (better prediction of helix and strand lengths). This method, cross-validated on seven different test sets purged of sequence similarity to learning sets, achieves a three-state prediction accuracy of 69.7%, significantly better than previous methods. In addition, the predicted structures have a more realistic distribution of helix and strand segments. The predictions may be suitable for use in practice as a first estimate of the structural type of newly sequenced proteins.

524 citations
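
The three-state prediction accuracy quoted in the abstract above is a per-residue measure: the percentage of residues whose predicted state agrees with the observed one. A minimal sketch of that calculation follows, assuming the three states are encoded as the letters H (helix), E (strand) and L (loop). The function name `q3_accuracy` and the example strings are hypothetical and are not taken from the paper.

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Return the percentage of residues whose predicted state matches the observed state."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed strings must have equal length")
    if not observed:
        raise ValueError("input strings must not be empty")
    # Count positions where the predicted and observed states agree.
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


# Hypothetical 16-residue example (H = helix, E = strand, L = loop).
observed = "LLHHHHHHLLEEEELL"
predicted = "LLHHHHLLLLEEELLL"
print(f"Three-state accuracy: {q3_accuracy(predicted, observed):.2f}%")
```

A single percentage of this kind does not capture segment lengths, which is why the abstract reports the distribution of helix and strand segments separately.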


Journal ArticleDOI
TL;DR: The three‐dimensional structure of hexokinase is known and can be used to build models of functionally important regions of other kinases in this family, which contains many prokaryotic and eukaryotic sugar kinases with diverse specificities, including a new member, rhamnokinase from Salmonella typhimurium.
Abstract: Kinases that catalyze phosphorylation of sugars, called here sugar kinases, can be divided into at least three distinct nonhomologous families. The first is the hexokinase family, which contains many prokaryotic and eukaryotic sugar kinases with diverse specificities, including a new member, rhamnokinase from Salmonella typhimurium. The three-dimensional structure of hexokinase is known and can be used to build models of functionally important regions of other kinases in this family. The second is the ribokinase family, of unknown three-dimensional structure, and comprises pro- and eukaryotic ribokinases, bacterial fructokinases, the minor 6-phosphofructokinase 2 from Escherichia coli, 6-phosphotagatokinase, 1-phosphofructokinase, and, possibly, inosine-guanosine kinase. The third family, also of unknown three-dimensional structure, contains several bacterial and yeast galactokinases and eukaryotic mevalonate and phosphomevalonate kinases and may have a substrate binding region in common with homoserine kinases. Each of the three families of sugar kinases appears to have a distinct three-dimensional fold, since conserved sequence patterns are strikingly different for the three families. Yet each catalyzes chemically equivalent reactions on similar or identical substrates. The enzymatic function of sugar phosphorylation appears to have evolved independently on the three distinct structural frameworks, by convergent evolution. In addition, evolutionary trees reveal that (1) fructokinase specificity has evolved independently in both the hexokinase and ribokinase families and (2) glucose specificity has evolved independently in different branches of the hexokinase family. These are examples of independent Darwinian adaptation of a structure to the same substrate at different evolutionary times. 
The flexible combination of active sites and three-dimensional folds observed in nature can be exploited by protein engineers in designing and optimizing enzymatic function.

385 citations


Journal ArticleDOI
TL;DR: In this article, a contact quality index is defined as a measure of the agreement between the distributions of atoms around each residue fragment in the model and equivalent distributions derived from the database of known structures solved at high resolution.
Abstract: Branden & Jones state, in Nature: `Protein crystallography is an exacting trade, and the results may contain errors that are difficult to identify. It is the crystallographer's responsibility to make sure that incorrect protein structures do not reach the literature.' [Branden & Jones. (1990). Nature (London), 343, 687–689.] One of several available methods of checking structures for correctness is the evaluation of atomic contacts. From an initial hypothesis that atom-atom interactions are the primary determinant of protein folding, any protein model can be tested for proper packing by the calculation of a contact quality index. The index is a measure of the agreement between the distributions of atoms around each residue fragment in the model and equivalent distributions derived from the database of known structures solved at high resolution. The better the agreement, the higher the contact quality index. This empirical test, which is independent of X-ray data, is applied to a series of successively refined crystal structures. In all cases, the model known or expected to be better (the one with the lower R-factor) has a better contact quality index, indicating that this type of contact analysis can be used as an independent quality criterion during crystallographic refinement. Modelled proteins and predicted mutant structures can also be evaluated.

321 citations


Journal ArticleDOI
TL;DR: In this paper, the authors describe protein-water interactions in terms of atomic solvation parameters, which represent the solvation free energy per unit of volume, for six different atom types, using experimental free energies of solvation.
Abstract: Several approaches to the treatment of solvent effects based on continuum models are reviewed and a new method based on occupied atomic volumes (occupancies) is proposed and tested. The new method describes protein-water interactions in terms of atomic solvation parameters, which represent the solvation free energy per unit of volume. These parameters were determined for six different atom types, using experimental free energies of solvation. The method was implemented in the GROMOS and PRESTO molecular simulation program suites. Simulations with the solvation term require 20-50% more CPU time than the corresponding vacuum simulations and are approximately 20 times faster than explicit water simulations. The method and parameters were tested by carrying out 200 ps simulations of BPTI in water, in vacuo, and with the solvation term. The performance of the solvation term was assessed by comparing the structures and energies from the solvation simulations with the equivalent quantities derived from...

274 citations
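
Because the abstract above defines an atomic solvation parameter as a free energy per unit of volume, a per-atom contribution can be pictured as parameter times volume, summed over atoms. The sketch below illustrates only that bookkeeping. It is not the published functional form. The parameter values, the three atom types shown (the paper uses six), the units and the name `solvation_free_energy` are all hypothetical placeholders.

```python
# Hypothetical parameters in energy per unit volume; illustrative values only,
# not the parameters determined in the paper.
SOLVATION_PARAMS = {"C": 0.01, "N": -0.02, "O": -0.03}


def solvation_free_energy(atoms):
    """Sum parameter * volume over (atom_type, volume) pairs.

    The volume stands in for whatever per-atom volume measure the method
    assigns; how that volume is computed is outside this sketch.
    """
    return sum(SOLVATION_PARAMS[atom_type] * volume for atom_type, volume in atoms)


# Hypothetical three-atom example.
atoms = [("C", 10.0), ("N", 5.0), ("O", 2.0)]
print(f"Solvation term: {solvation_free_energy(atoms):.3f}")
```

A term of this shape is cheap to evaluate, since it avoids explicit water molecules, which is consistent with the modest CPU overhead reported in the abstract.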


Journal ArticleDOI
TL;DR: HSSP is a derived database merging structural three-dimensional (3-D) and sequence one-dimensional (1-D) information; it is also a database of implied secondary and tertiary structures covering 27% of all SwissProt-stored sequences.
Abstract: HSSP is a derived database merging structural (3-D) and sequence (1-D) information. For each protein of known 3-D structure from the Protein Data Bank (PDB), the database has a multiple sequence alignment of all available homologues and a sequence profile characteristic of the family. The list of homologues is the result of a database search in SwissProt using a position-weighted dynamic programming method for sequence profile alignment (MaxHom). The database is updated frequently. The listed homologues are very likely to have the same 3-D structure as the PDB protein to which they have been aligned. As a result, the database is not only a database of aligned sequence families, but also a database of implied secondary and tertiary structures covering 29% of all SwissProt-stored sequences.

205 citations


Journal ArticleDOI
TL;DR: This model identifies Norrie disease protein (NDP) as a member of an emerging family of growth factors containing a cystine knot motif, with direct implications for the physiological role of NDP.
Abstract: The X-linked gene for Norrie disease, which is characterized by blindness, deafness and mental retardation, has been cloned recently. This gene has been thought to code for a putative extracellular factor; its predicted amino acid sequence is homologous to the C-terminal domain of diverse extracellular proteins. Sequence pattern searches and three-dimensional modelling now suggest that the Norrie disease protein (NDP) has a tertiary structure similar to that of transforming growth factor β (TGFβ). Our model identifies NDP as a member of an emerging family of growth factors containing a cystine knot motif, with direct implications for the physiological role of NDP. The model also sheds light on sequence-related domains such as the C-terminal domain of mucins and of von Willebrand factor.

183 citations


Journal ArticleDOI
TL;DR: The main results are that a native sequence can very well find its native structure among a large number of alternatives, in correct alignment and that contact interface parameters are clearly superior to classical secondary structure parameters.

138 citations


Journal ArticleDOI
TL;DR: A new approach is described for the modeling of transmembrane seven helix bundles based on statistically derived environmental preference parameters combined with experimentally determined features of the receptors to create a model for the human beta 2-adrenoreceptor.
Abstract: Transmembrane seven helix bundles form a large family of membrane inserted receptors and are responsible for a wide range of biological functions. Experimental data suggest that their overall structure is similar to bacteriorhodopsin. We describe here a new approach for the modeling of transmembrane seven helix bundles based on statistically derived environmental preference parameters combined with experimentally determined features of the receptors. The method was used to create a model for the human beta 2-adrenoreceptor. This model is physically plausible, is in reasonable agreement with experimental data and may be helpful in planning new receptor engineering experiments.

102 citations


Journal ArticleDOI
TL;DR: The structural similarities between three well-known proteins that have no readily detectable primary sequence similarities but for which X-ray crystallography has revealed very similar structures are described.

Journal ArticleDOI
TL;DR: A database search employing a novel algorithm for protein structure comparison by alignment of distance matrices has revealed a striking resemblance between the tertiary structures of the bacterial toxin colicin A and globins, suggesting that these three protein families are an example of physical convergence to a stable folding motif, the three‐on‐three helical sandwich.
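
The summary above refers to comparing protein structures by aligning distance matrices. The sketch below shows only the underlying idea: two structures are similar when corresponding intramolecular distances agree, whatever their orientation in space. It is not the published algorithm. It assumes the residue equivalences are already known (residue i in one structure matches residue i in the other), and the function names and the scoring by mean absolute deviation are hypothetical choices made for illustration.

```python
import math


def distance_matrix(coords):
    """Return the matrix of pairwise distances between 3-D points."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def mean_distance_deviation(coords_a, coords_b):
    """Mean absolute difference between corresponding intramolecular distances.

    Assumes point i in coords_a is equivalent to point i in coords_b.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("both structures must have the same number of points")
    n = len(coords_a)
    if n < 2:
        raise ValueError("at least two points are needed")
    da = distance_matrix(coords_a)
    db = distance_matrix(coords_b)
    # Compare each unordered pair of positions once.
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    return sum(abs(da[i][j] - db[i][j]) for i, j in pairs) / len(pairs)


# Hypothetical coordinates: the second set is the first one shifted in space.
structure_a = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 4.0, 2.0)]
structure_b = [(x + 10.0, y - 5.0, z + 2.0) for x, y, z in structure_a]
print(mean_distance_deviation(structure_a, structure_b))
```

Working with internal distances removes the need to superimpose the two structures first, since a distance matrix does not change under translation or rotation.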

Journal ArticleDOI
TL;DR: Protein structure prediction can be improved substantially when a family of related sequences is available, and so molecular biologists equipped with a new amino acid sequence and a multiple sequence alignment in hand may be tempted to test the new prediction methods.

Journal ArticleDOI
TL;DR: A model of the GDP state of ras‐p21 that is in agreement with all relevant experimental evidence is proposed and provides important clues about a possible molecular mechanism for signal transmission from the site of GTP hydrolysis to downstream effectors.

Journal ArticleDOI
TL;DR: It is concluded that on the level of secondary structure, there is no practical advantage in training on two states, especially given the added margin of error in identifying the structural class of a protein.
Abstract: Can secondary structure prediction be improved by prediction rules that focus on a particular structural class of proteins? To help answer this question, we have assessed the accuracy of prediction for all-helical proteins, using two conceptually different methods and two levels of description. An overall two-state single-residue accuracy of ∼80% can be obtained by a neural network, no matter whether it is trained on two states (helix and non-helix) or first trained on three states (helix, strand and loop) and then evaluated on two states. For four test proteins, this is similar to the accuracy obtained with inductive logic programming.

Journal ArticleDOI
TL;DR: A molecular architecture of RNase L is proposed, with an unusual combination, in one protein chain, of 9 ankyrin‐like repeats, a functional active protein kinase and a C‐terminal catalytic RNase similar to the yeast protein, IRE1.


Journal ArticleDOI
TL;DR: Based on the detected homology, it is predicted that NifS has also a pyridoxal phosphate‐dependent serine (or related) aminotransferase function associated with nitrogen economy and/or protection during nitrogen fixation.


Journal ArticleDOI
TL;DR: Cloning and sequence analysis of a new open reading frame from Bacillus cereus reveals the relationship to a recently identified family of putative eukaryotic transcription activators similar to the yeast SNF2 gene product, suggesting a defined subgroup of DNA helicases present in all species, with specific function in transcription activation.

01 Jan 1993
TL;DR: Using a novel protein sequence comparison method, it is found that there are similarities between different Nef sequences and the α chain of human MHC class I proteins.
Abstract: The sequence of the HIV Nef protein has no significant homology to other proteins in the SwissProt database, and experimental data concerning its function are sparse and contradictory. Using a novel protein sequence comparison method, we find similarities between different Nef sequences and the α chain of human MHC class I proteins. The possible biological implications of this finding are discussed.

Journal ArticleDOI
TL;DR: The sequence of the HIV Nef protein has no significant homology to other proteins in the SwissProt database, and experimental data concerning its function are sparse and contradictory. Using a novel protein sequence comparison method, the authors find similarities between different Nef sequences and the α chain of human MHC class I proteins.

Book ChapterDOI
01 Jan 1993
TL;DR: It is pointed out that the size of the database of known protein three-dimensional structures can be significantly increased by the use of sequence homology.
Abstract: Attempts to solve the protein folding problem by empirical methods are said to be limited by the size of the database of known protein three-dimensional structures. Here we point out that this database can be significantly increased by the use of sequence homology, based on the following observations. (1) The database of known sequences, currently at more than 15000 proteins, is two orders of magnitude larger than the database of known structures. (2) The currently most powerful method of predicting protein structures is model building by homology. (3) Structural homology can be inferred from the level of sequence similarity. (4) The threshold of sequence similarity sufficient for structural homology depends strongly on the length of the alignment.

Book ChapterDOI
15 Feb 1993
TL;DR: Prediction of three-dimensional protein structure from sequence alone is a classical problem of molecular biology and using evolutionary information in the form of sequence and structure alignments of related proteins opens up powerful new approaches that bring us closer to a solution.
Abstract: Prediction of three-dimensional protein structure from sequence alone is a classical problem of molecular biology. Progress with this problem has been slow over the last 20 years. Using evolutionary information in the form of sequence and structure alignments of related proteins opens up powerful new approaches that bring us closer to a solution. For example, prediction of secondary structure has now been advanced to the 70% three-state accuracy level using a neural network algorithm with multiple related sequences as input [1]. In another example, structure comparison of actin with its distant evolutionary cousins led to a database search pattern that identified a new class of bacterial ATPases, probably ancient relatives of actin [2]. These approaches work because mutational noise and disparate functional requirements are averaged out, leaving a clearer sequence signal for the three-dimensional fold.