
Showing papers in "Proteins in 1994"


Journal ArticleDOI
01 May 1994-Proteins
TL;DR: This work extends the previous three-level system of neural networks with additional input information derived from multiple alignments, including a position-specific conservation weight, which increases the accuracy of secondary structure prediction.
Abstract: Using evolutionary information contained in multiple sequence alignments as input to neural networks, secondary structure can be predicted at significantly increased accuracy. Here, we extend our previous three-level system of neural networks by using additional input information derived from multiple alignments. Using a position-specific conservation weight as part of the input increases performance. Using the number of insertions and deletions reduces the tendency for overprediction and increases overall accuracy. Addition of the global amino acid content yields a further improvement, mainly in predicting structural class. The final network system has sustained overall accuracy of 71.6% in a multiple cross-validation test on 126 unique protein chains. A test on a new set of 124 recently solved protein structures that have no significant sequence similarity to the learning set confirms the high level of accuracy. The average cross-validated accuracy for all 250 sequence-unique chains is above 72%. Using various data sets, the method is compared to alternative prediction methods, some of which also use multiple alignments: the performance advantage of the network system is at least 6 percentage points in three-state accuracy. In addition, the network estimates secondary structure content from multiple sequence alignments about as well as circular dichroism spectroscopy on a single protein and classifies 75% of the 250 proteins correctly into one of four protein structural classes. Of particular practical importance is the definition of a position-specific reliability index. For 40% of all residues the method has a sustained three-state accuracy of 88%, as high as the overall average for homology modelling. A further strength of the method is greatly increased accuracy in predicting the placement of secondary structure segments.

1,470 citations
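
The three-state accuracy quoted in this abstract is the fraction of residues assigned the correct state (helix, strand, or other), and the reliability index lets that accuracy be reported for a confident subset of residues. The following is a minimal sketch of both measures, not the authors' code. The state letters `H`/`E`/`L`, the function names, and the toy strings and reliability values are assumptions made for illustration.

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Fraction of residues whose predicted state equals the observed state."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("strings must be non-empty and of equal length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


def q3_above_reliability(predicted, observed, reliability, cutoff):
    """Return (accuracy, coverage) for residues with reliability >= cutoff."""
    kept = [(p, o) for p, o, r in zip(predicted, observed, reliability) if r >= cutoff]
    if not kept:
        return 0.0, 0.0
    accuracy = sum(p == o for p, o in kept) / len(kept)
    coverage = len(kept) / len(observed)
    return accuracy, coverage


# Toy data (hypothetical): H = helix, E = strand, L = other.
observed = "HHHHLLEEEL"
predicted = "HHHLLLEEHL"
reliability = [9, 8, 7, 2, 6, 5, 8, 7, 1, 6]

overall = q3_accuracy(predicted, observed)
subset_accuracy, subset_coverage = q3_above_reliability(
    predicted, observed, reliability, cutoff=5
)
print(overall, subset_accuracy, subset_coverage)
```

Raising the cutoff trades coverage for accuracy, which is the kind of trade-off the abstract describes when it reports 88% accuracy for 40% of all residues.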


Journal ArticleDOI
01 Apr 1994-Proteins
TL;DR: A simple and general method is presented to analyze correlations in mutational behavior between different positions in a multiple sequence alignment; the correlations are used to predict contact maps for each of 11 protein families, and the results are compared with the contacts determined by crystallography.
Abstract: The maintenance of protein function and structure constrains the evolution of amino acid sequences. This fact can be exploited to interpret correlated mutations observed in a sequence family as an indication of probable physical contact in three dimensions. Here we present a simple and general method to analyze correlations in mutational behavior between different positions in a multiple sequence alignment. We then use these correlations to predict contact maps for each of 11 protein families and compare the result with the contacts determined by crystallography. For the most strongly correlated residue pairs predicted to be in contact, the prediction accuracy ranges from 37 to 68% and the improvement ratio relative to a random prediction from 1.4 to 5.1. Predicted contact maps can be used as input for the calculation of protein tertiary structure, either from sequence information alone or in combination with experimental information.

876 citations


Journal ArticleDOI
01 Nov 1994-Proteins
TL;DR: A neural network system that predicts relative solvent accessibility of each residue using evolutionary profiles of amino acid substitutions derived from multiple sequence alignments is introduced, and the most reliably predicted fraction of the residues (50%) is predicted as accurately as by automatic homology modeling.
Abstract: Currently, the prediction of three-dimensional (3D) protein structure from sequence alone is an exceedingly difficult task. As an intermediate step, a much simpler task has been pursued extensively: predicting 1D strings of secondary structure. Here, we present an analysis of another 1D projection from 3D structure: the relative solvent accessibility of each residue. We show that solvent accessibility is less conserved in 3D homologues than is secondary structure, and hence is predicted less accurately from automatic homology modeling; the correlation coefficient of relative solvent accessibility between 3D homologues is only 0.77, and the average accuracy of predictions based on sequence alignments is only 0.68. The latter number provides an effective upper limit on the accuracy of predicting accessibility from sequence when homology modeling is not possible. We introduce a neural network system that predicts relative solvent accessibility (projected onto ten discrete states) using evolutionary profiles of amino acid substitutions derived from multiple sequence alignments. Evaluated in a cross-validation test on 238 unique proteins, the correlation between predicted and observed relative accessibility is 0.54. Interpreted in terms of a three-state (buried, intermediate, exposed) description of relative accessibility, the fraction of correctly predicted residue states is about 58%. In absolute terms this accuracy appears poor, but given the relatively low conservation of accessibility in 3D families, the network system is not far from its likely optimal performance. The most reliably predicted fraction of the residues (50%) is predicted as accurately as by automatic homology modeling. Prediction is best for buried residues, e.g., 86% of the completely buried sites are correctly predicted as having 0% relative accessibility. © 1994 Wiley-Liss, Inc.

623 citations


Journal ArticleDOI
01 Aug 1994-Proteins
TL;DR: Applications to crystallographic refinement show a significantly increased radius of convergence over conventional techniques, and the sampling strategy presented here combines high temperature torsion angle dynamics with repeated trajectories using different initial velocities.
Abstract: A reduced variable conformational sampling strategy for macromolecules based on molecular dynamics in torsion angle space is evaluated using crystallographic refinement as a prototypical search problem. Bae and Haug's algorithm for constrained dynamics [Bae, D.S., Haug, E.J. A recursive formulation for constrained mechanical system dynamics. Mech. Struct. Mach. 15:359-382, 1987], originally developed for robotics, was used. Their formulation solves the equations of motion exactly for arbitrary holonomic constraints, and hence differs from commonly used approximation algorithms. It uses gradients calculated in Cartesian coordinates, and thus also differs from internal coordinate formulations. Molecular dynamics can be carried out at significantly higher temperatures due to the elimination of the high frequency bond and angle vibrations. The sampling strategy presented here combines high temperature torsion angle dynamics with repeated trajectories using different initial velocities. The best solutions can be identified by the free R value, or the R value if experimental phase information is appropriately included in the refinement. Applications to crystallographic refinement show a significantly increased radius of convergence over conventional techniques. For a test system with diffraction data to 2 Å resolution, slow-cooling protocols fail to converge if the backbone atom root mean square (rms) coordinate deviation from the crystal structure is greater than 1.25 Å, but torsion angle refinement can correct backbone atom rms coordinate deviations up to approximately 1.7 Å.

369 citations


Journal ArticleDOI
01 Aug 1994-Proteins
TL;DR: The X-ray structure of an oxygenated hemocyanin molecule, subunit II of Limulus polyphemus hemocyanin, was determined at 2.4 Å resolution and refined to a crystallographic R-factor of 17.1%.
Abstract: The X-ray structure of an oxygenated hemocyanin molecule, subunit II of Limulus polyphemus hemocyanin, was determined at 2.4 Å resolution and refined to a crystallographic R-factor of 17.1%. The 73-kDa subunit crystallizes with the symmetry of the space group R32 with one subunit per asymmetric unit forming hexamers with 32 point group symmetry. Molecular oxygen is bound to a dinuclear copper center in the protein's second domain, symmetrically between and equidistant from the two copper atoms. The copper-copper distance in oxygenated Limulus hemocyanin is 3.6 ± 0.2 Å, which is surprisingly 1 Å less than that seen previously in deoxygenated Limulus polyphemus subunit II hemocyanin (Hazes et al., Protein Sci. 2:597, 1993). Away from the oxygen binding sites, the tertiary and quaternary structures of oxygenated and deoxygenated Limulus subunit II hemocyanins are quite similar. A major difference in tertiary structures is seen, however, when the Limulus structures are compared with deoxygenated Panulirus interruptus hemocyanin (Volbeda, A., Hol, W.G.J. J. Mol. Biol. 209:249, 1989) where the position of domain 1 is rotated by 8 degrees with respect to domains 2 and 3. We postulate this rotation plays an important role in cooperativity and regulation of oxygen affinity in all arthropod hemocyanins.

354 citations


Journal ArticleDOI
01 Sep 1994-Proteins
TL;DR: The hydrogen exchange (HX) rates of the slowest peptide group NH hydrogens in oxidized cytochrome c (equine) are controlled by the transient global unfolding equilibrium and can be measured by one‐dimensional nuclear magnetic resonance and used to determine the thermodynamic parameters of global unfolding at mild solution conditions well below the melting transition.
Abstract: The hydrogen exchange (HX) rates of the slowest peptide group NH hydrogens in oxidized cytochrome c (equine) are controlled by the transient global unfolding equilibrium. These rates can be measured by one-dimensional nuclear magnetic resonance and used to determine the thermodynamic parameters of global unfolding at mild solution conditions well below the melting transition. The free energy for global unfolding measured by hydrogen exchange can differ from values found by standard denaturation methods, most notably due to the slow cis-trans isomerization of the prolyl peptide bond. This difference can be quantitatively calculated from basic principles. Even with these corrections, HX experiments at low denaturant concentration measure a free energy of protein stability that rises above the usual linear extrapolation from denaturation data, as predicted by the denaturant binding model of Tanford. © 1994 Wiley-Liss, Inc.

302 citations


Journal ArticleDOI
01 Jun 1994-Proteins
TL;DR: Indices calculated with the new parameters correlated better with protein stability than those used previously; furthermore, they showed a correlation even where the old parameters failed.
Abstract: Protein structural flexibility is important for catalysis, binding, and allostery. Flexibility has been predicted from amino acid sequence with a sliding window averaging technique and applied primarily to epitope search. New prediction parameters were derived from 92 refined protein structures in an unbiased selection of the Protein Data Bank by developing further the method of Karplus and Schulz (Naturwissenschaften 72:212-213, 1985). The accuracy of four flexibility prediction techniques was studied by comparing atomic temperature factors of known three-dimensional protein structures to predictions by using correlation coefficients. The size of the prediction window was optimized for each method. Predictions made with our new parameters, using an optimized window size of 9 residues, gave the best results. The difference from another previously used technique was small, whereas two other methods were much poorer. Applicability of the predictions was also tested by searching for known epitopes from amino acid sequences. The best techniques correctly predicted 20 of 31 continuous epitopes in seven proteins. Flexibility parameters have previously been used for calculating protein average flexibility indices, which are inversely correlated to protein stability. Indices calculated with the new parameters correlated better with protein stability than those used previously; furthermore, they showed a correlation even where the old parameters failed. © 1994 Wiley-Liss, Inc.

300 citations
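
The sliding window averaging mentioned in this abstract can be sketched as follows. This is a simplified illustration, not the published method: it takes a plain unweighted mean over a centered window of 9 residues, and the per-residue values in `TOY_SCALE` are hypothetical placeholders, not the parameters derived in the paper or those of Karplus and Schulz.

```python
def sliding_window_profile(sequence, scale, window=9):
    """Average per-residue scale values over a centered window.

    Positions too close to either end to fit a full window are set to None.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    values = [scale[aa] for aa in sequence]
    profile = [None] * len(values)
    for i in range(half, len(values) - half):
        profile[i] = sum(values[i - half:i + half + 1]) / window
    return profile


# Hypothetical placeholder values, for illustration only.
TOY_SCALE = {"A": 0.9, "G": 1.1, "S": 1.2, "L": 0.8, "K": 1.0}

profile = sliding_window_profile("AGSLKAGSLKA", TOY_SCALE, window=9)
print(profile)
```

A predicted profile of this kind can then be compared with observed atomic temperature factors through a correlation coefficient, which is how the abstract describes the accuracy assessment.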


Journal ArticleDOI
01 Apr 1994-Proteins
TL;DR: A new hierarchical method for the simulation of the protein folding process and the de novo prediction of protein three‐dimensional structure is proposed, which employs lattice discretizations of increasing geometrical resolution and a single ball representation of side chain rotamers.
Abstract: A new hierarchical method for the simulation of the protein folding process and the de novo prediction of protein three-dimensional structure is proposed. The reduced representation of the protein α-carbon backbone employs lattice discretizations of increasing geometrical resolution and a single ball representation of side chain rotamers. In particular, coarser and finer lattice backbone descriptions are used. The coarser (finer) lattice represents Cα traces of native proteins with an accuracy of 1.0 (0.7) Å rms. Folding is simulated by means of very fast Monte Carlo lattice dynamics. The potential of mean force, predominantly of statistical origin, contains several novel terms that facilitate the cooperative assembly of secondary structure elements and the cooperative packing of the side chains. Particular contributions to the interaction scheme are discussed in detail. In the accompanying paper (Kolinski, A., Skolnick, J. Monte Carlo simulation of protein folding. II. Application to protein A, ROP, and crambin. Proteins 18:353-366, 1994), the method is applied to three small globular proteins. © 1994 Wiley-Liss, Inc.

289 citations


Journal ArticleDOI
01 Dec 1994-Proteins
TL;DR: By using a large enough database and a proper definition of the secondary structure propensities, it is possible to obtain α-helix and β-strand propensity scales as good as any of experimental origin.
Abstract: Today there are several different experimental scales for the intrinsic α-helix and β-strand propensities of the 20 amino acids, obtained from the thermodynamic analysis of various model systems. These scales do not compare well with those extracted from statistical analysis of three-dimensional structure databases. Possible explanations for this could be the limited size of the databases used, the definitions of intrinsic propensities, or the theoretical approach. Here we report a statistical determination of α-helix and β-strand propensities derived from the analysis of a database of 279 three-dimensional structures. Contrary to what has been generally done, we have considered a particular residue as in α-helix or β-strand conformation by looking only at its dihedral angles (ϕ–ψ matrices). Neither the identity nor the conformation of the surrounding residues in the amino acid sequence has been taken into consideration. Pseudoenergy empirical scales have been calculated from the statistical propensities. These scales agree very well with the experimental ones in relative and absolute terms. Moreover, their correlation with the average of the experimental scales for α-helix or β-strand is as good as the correlations of the individual experimental scales with the average. These results show that by using a large enough database and a proper definition for the secondary structure propensities, it is possible to obtain a scale as good as any of experimental origin. Interestingly, the ϕ–ψ analysis of the Ramachandran plot suggests that the amino acids could have different β-strand propensities in different subregions of the β-strand area. © 1994 Wiley-Liss, Inc.

285 citations
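
One common way to turn database counts into a propensity, and a propensity into a pseudoenergy, is sketched below. The abstract does not give the formulas used, so both are assumptions: the propensity is taken as the frequency of a residue type in a conformation relative to the overall frequency of that conformation, and the pseudoenergy as a Boltzmann-like `-RT ln(propensity)`. The counts are hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol K)


def propensity(n_aa_in_state, n_aa_total, n_state_total, n_total):
    """Frequency of a residue type in a state relative to the state's overall frequency."""
    return (n_aa_in_state / n_aa_total) / (n_state_total / n_total)


def pseudoenergy(p, temperature=298.15):
    """Assumed Boltzmann-like conversion: -RT ln(propensity), in kcal/mol."""
    return -R_KCAL * temperature * math.log(p)


# Hypothetical counts: 300 of 1000 residues of one type fall in the helical
# region of the dihedral-angle map, while 2000 of all 10000 residues do.
p_helix = propensity(300, 1000, 2000, 10000)
dg_helix = pseudoenergy(p_helix)
print(p_helix, dg_helix)
```

Under this convention a propensity above 1 gives a negative (favorable) pseudoenergy and a propensity of exactly 1 gives zero.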


Journal ArticleDOI
01 Jul 1994-Proteins
TL;DR: The method is illustrated by constructing ligands for the sialic acid binding site of the hemagglutinin from the influenza A virus and the active site of chloramphenicol acetyltransferase.
Abstract: A program (HOOK) is described for generating potential ligands that satisfy the chemical and steric requirements of the binding region of a macromolecule. Functional group sites with defined positions and orientations are derived from known ligand structures or the multicopy simulation search (MCSS) method (Miranker, A., Karplus, M. Proteins 11:29–34, 1991). HOOK places molecular “skeletons” from a database into the protein binding region by making bonds between sites (“hooks”) on the skeleton and functional groups. The nonpolar interactions with the binding region of candidate molecules are assessed by use of a simplified van der Waals potential. The method is illustrated by constructing ligands for the sialic acid binding site of the hemagglutinin from the influenza A virus and the active site of chloramphenicol acetyltransferase. Aspects of the HOOK program that lead to a highly efficient search of 10^5 or more skeletons for binding to 10^2 or more functional group minima are outlined. © 1994 Wiley-Liss, Inc.

258 citations


Journal ArticleDOI
31 Jan 1994-Proteins
TL;DR: Crystal structures of the Fabs from an autoantibody with specificity for single‐stranded DNA have been determined in the presence and absence of a trinucleotide of deoxythymidylic acid, d(pT)3.
Abstract: Crystal structures of the Fabs from an autoantibody (BV04-01) with specificity for single-stranded DNA have been determined in the presence and absence of a trinucleotide of deoxythymidylic acid, d(pT)3. Formation of the ligand-protein complex was accompanied by small adjustments in the orientations of the variable (VL and VH) domains. In addition, there were local conformational changes in the first hypervariable loop of the light chain and the third hypervariable loop of the heavy chain, which together with the domain shifts led to an improvement in the complementarity of nucleotide and Fab. The sugar–phosphate chain adopted an extended and “open” conformation, with the base, sugar, and phosphate components available for interactions with the protein. Nucleotide 1 (5′-end) was associated exclusively with the heavy chain, nucleotide 2 was shared by both heavy and light chains, and nucleotide 3 was bound by the light chain. The orientation of phosphate 1 was stabilized by hydrogen bonds with serine H52a and asparagine H53. Phosphate 2 formed an ion pair with arginine H52, but no other charge–charge interactions were observed. Insertion of the side chain of histidine L27d between nucleotides 2 and 3 resulted in a bend in the sugar–phosphate chain. The most dominant contacts with the protein involved the central thymine base, which was immobilized by cooperative stacking and hydrogen bonding interactions. This base was intercalated between a tryptophan ring (no. H100a) from the heavy chain and a tyrosine ring (no. L32) from the light chain. The resulting orientation of thymine was favorable for the simultaneous formation of two hydrogen bonds with the backbone carbonyl oxygen and the side chain hydroxyl group of serine L91 (the thymine atoms were the hydrogen on nitrogen 3 and keto oxygen 4).

Journal ArticleDOI
01 Jul 1994-Proteins
TL;DR: A new generation of computer algorithms has now been developed that allows routine comparison of a protein structure with the database of all known structures, and such structure database searches are beginning to rival sequence database searches as a tool for discovering biologically interesting relationships.
Abstract: The number of protein structures known in atomic detail has increased from one in 1960 [1] to more than 1000 in 1994. The rate at which new structures are being published exceeds one a day as a result of recent advances in protein engineering, crystallography, and spectroscopy. More and more frequently, a newly determined structure is similar in fold to a known one, even when no sequence similarity is detectable. A new generation of computer algorithms has now been developed that allows routine comparison of a protein structure with the database of all known structures. Such structure database searches are already used daily and they are beginning to rival sequence database searches as a tool for discovering biologically interesting relationships.

Journal ArticleDOI
01 Jul 1994-Proteins
TL;DR: An algorithm for identification of structural units by objective, quantitative criteria based on atomic interactions is proposed, which is useful for the analysis of folding principles, for modular protein design and for protein engineering.
Abstract: General patterns of protein structural organization have emerged from studies of hundreds of structures elucidated by X-ray crystallography and nuclear magnetic resonance. Structural units are commonly identified by visual inspection of molecular models using qualitative criteria. Here, we propose an algorithm for identification of structural units by objective, quantitative criteria based on atomic interactions. The underlying physical concept is maximal interactions within each unit and minimal interaction between units (domains). In a simple harmonic approximation, interdomain dynamics is determined by the strength of the interface and the distribution of masses. The most likely domain decomposition involves units with the most correlated motion, or largest interdomain fluctuation time. The decomposition of a convoluted 3-D structure is complicated by the possibility that the chain can cross over several times between units. Grouping the residues by solving an eigenvalue problem for the contact matrix reduces the problem to a one-dimensional search for all reasonable trial bisections. Recursive bisection yields a tree of putative folding units. Simple physical criteria are used to identify units that could exist by themselves. The units so defined closely correspond to crystallographers' notion of structural domains. The results are useful for the analysis of folding principles, for modular protein design and for protein engineering. © 1994 Wiley-Liss, Inc.
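
The idea of reducing domain decomposition to a one-dimensional search can be illustrated with standard spectral bisection. The sketch below is a swapped-in textbook technique and not the authors' algorithm: it assumes a symmetric, connected contact matrix, approximates the Fiedler vector of the graph Laplacian by power iteration, orders residues along that vector, and scans the cut points. The cut score (interface contacts divided by the product of unit sizes) and the toy matrix are assumptions made for illustration.

```python
def fiedler_vector(contacts, iterations=2000):
    """Approximate the Fiedler vector of the contact graph by power iteration."""
    n = len(contacts)
    degree = [sum(row) for row in contacts]
    shift = 2.0 * max(degree)  # upper bound on the largest Laplacian eigenvalue
    v = [float(i) for i in range(n)]  # deterministic, non-constant start vector
    for _ in range(iterations):
        # Remove the component along the constant vector (Laplacian eigenvalue 0).
        mean = sum(v) / n
        v = [x - mean for x in v]
        # Multiply by (shift * I - L), where L = D - C is the graph Laplacian.
        w = [
            (shift - degree[i]) * v[i] + sum(contacts[i][j] * v[j] for j in range(n))
            for i in range(n)
        ]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v


def best_bisection(contacts, min_size=2):
    """Order residues along the Fiedler vector and pick the best single cut."""
    n = len(contacts)
    v = fiedler_vector(contacts)
    order = sorted(range(n), key=lambda i: v[i])
    best = None
    for cut in range(min_size, n - min_size + 1):
        left, right = order[:cut], order[cut:]
        interface = sum(contacts[i][j] for i in left for j in right)
        score = interface / (len(left) * len(right))  # assumed scoring rule
        if best is None or score < best[0]:
            best = (score, sorted(left), sorted(right))
    return best[1], best[2]


# Toy contact matrix (hypothetical): two tightly packed groups of four
# residues joined by a single contact between residues 3 and 4.
n = 8
toy_contacts = [[0.0] * n for _ in range(n)]
for block in (range(0, 4), range(4, 8)):
    for i in block:
        for j in block:
            if i != j:
                toy_contacts[i][j] = 1.0
toy_contacts[3][4] = toy_contacts[4][3] = 1.0

unit_a, unit_b = best_bisection(toy_contacts)
print(unit_a, unit_b)
```

Applying `best_bisection` again to each resulting unit would give a tree of candidate units, in the spirit of the recursive bisection the abstract describes; the physical acceptance criteria mentioned there are not modeled here.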

Journal ArticleDOI
01 Sep 1994-Proteins
TL;DR: A comparison of calculated and experimental values of side chain conformational entropy yields an excellent agreement, validating not only the theoretical estimates but also the separability of the entropic contributions into configurational terms and solvation related terms.
Abstract: Theoretical estimations of changes in side chain configurational entropy are essential for understanding the different contributions to the overall thermodynamic behavior of important biological processes like folding and binding. The configurational entropy of any given side chain in any particular protein can be evaluated from the complete energy profile of the side chain. Calculations of the energy profiles can be performed using the side chain single bond dihedrals as the only independent variables as long as the structures at each value of the dihedrals are allowed to relax through small changes in the valence bond angles. The probabilities of different side chain conformers obtained from these energy profiles are very similar to the conformer populations obtained by analysis of side chain preferences in the proteins of the Protein Data Bank. Also, side chain conformational entropies obtained from the energy profiles agree extremely well with those obtained from the Protein Data Bank conformer populations. Changes in side chain configurational entropy in binding and folding can be computed as differences in conformational entropy because, in most cases, the frequency of the rotational oscillation around the energy minimum of any given conformer does not appear to change significantly in the reaction. Changes of side chain conformational entropy calculated in this way were compared with experimental values. The only available experimental data–the effect of side chain substitution on the stability of α-helices–were used for this comparison. The experimental values were corrected to subtract the solvent contributions. This comparison yields an excellent agreement between calculated and experimental values, validating not only the theoretical estimates but also the separability of the entropic contributions into configurational terms and solvation related terms. © 1994 Wiley-Liss, Inc.
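
The step from conformer populations to a conformational entropy can be written out as a short sketch. It assumes the standard expression S = -R Σ p_i ln p_i over conformer probabilities and Boltzmann weighting of conformer energies; the abstract does not state the formulas, and the energies and populations below are hypothetical.

```python
import math

R_CAL = 1.987  # gas constant in cal/(mol K)


def boltzmann_populations(energies_kcal, temperature=298.15):
    """Conformer probabilities from relative energies (kcal/mol)."""
    rt = R_CAL * 1e-3 * temperature
    lowest = min(energies_kcal)
    weights = [math.exp(-(e - lowest) / rt) for e in energies_kcal]
    total = sum(weights)
    return [w / total for w in weights]


def conformational_entropy(populations):
    """S = -R * sum(p ln p) in cal/(mol K); populations are normalized first."""
    total = sum(populations)
    probs = [p / total for p in populations if p > 0]
    return -R_CAL * sum(p * math.log(p) for p in probs)


# Hypothetical side chain with three rotamers: free in the unfolded state,
# restricted to essentially one rotamer when buried.
s_unfolded = conformational_entropy(boltzmann_populations([0.0, 0.0, 0.0]))
s_folded = conformational_entropy([1.0, 0.0, 0.0])
delta_s = s_folded - s_unfolded
print(s_unfolded, s_folded, delta_s)
```

For three equally populated rotamers the entropy is R ln 3, so fixing the side chain in a single rotamer costs that amount in this simplified picture.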

Journal ArticleDOI
01 Dec 1994-Proteins
TL;DR: A partial representation of the molecules based on hydrophobic groups is proposed to improve the identification of molecular recognition sites compared with a full representation; the idea is tested by applying it to an existing geometric fit procedure and comparing the results obtained with full vs. hydrophobic representations.
Abstract: In the classical procedures for predicting the structure of protein complexes two molecules are brought in contact at multiple relative positions, the extent of complementarity (geometric and/or energy) at the surface of contact is assessed at each position, and the best fits are retrieved. In view of the higher occurrence of hydrophobic groups at contact sites, their contribution results in more intermolecular atom-atom contacts per unit area for correct matches than for false positive fits. The hydrophobic groups are also potentially less flexible at the surface. Thus, from a practical point of view, a partial representation of the molecules based on hydrophobic groups should improve the quality of the results in finding molecular recognition sites, as compared to full representation. We tested this proposal by applying the idea to an existing geometric fit procedure and compared the results obtained with full vs. hydrophobic representations of molecules in known molecular complexes. The hydrophobic docking yielded distinctly higher signal-to-noise ratio so that the correct match is discriminated better from false positive fits. It appears that nonhydrophobic groups contribute more to false matches. The results are discussed in terms of their relevance to molecular recognition techniques as compared to energy calculations.

Journal ArticleDOI
01 Jun 1994-Proteins
TL;DR: The X‐ray crystal structure of a 19 kDa active fragment of human fibroblast collagenase has been determined by the multiple isomorphous replacement method and refined at 1.56 Å resolution to an R‐factor of 17.4%.
Abstract: The X-ray crystal structure of a 19 kDa active fragment of human fibroblast collagenase has been determined by the multiple isomorphous replacement method and refined at 1.56 Å resolution to an R-factor of 17.4%. The current structure includes a bound hydroxamate inhibitor, 88 waters and three metal atoms (two zincs and a calcium). The overall topology of the enzyme, comprised of a five stranded β-sheet and three α-helices, is similar to the thermolysin-like metalloproteinases. There are some important differences between the collagenase and thermolysin families of enzymes. The active site zinc ligands are all histidines (His-218, His-222, and His-228). The presence of a second zinc ion in a structural role is a unique feature of the matrix metalloproteinases. The binding properties of the active site cleft are more dependent on the main chain conformation of the enzyme (and substrate) compared with thermolysin. A mechanism of action for peptide cleavage similar to that of thermolysin is proposed for fibroblast collagenase. © 1994 Wiley-Liss, Inc.

Journal ArticleDOI
01 Mar 1994-Proteins
TL;DR: For the first time it has been shown directly that the enthalpy of protein unfolding is a nonlinear function of temperature.
Abstract: The energetics of ubiquitin unfolding have been studied using differential scanning microcalorimetry. For the first time it has been shown directly that the enthalpy of protein unfolding is a nonlinear function of temperature. Thermodynamic parameters of ubiquitin unfolding were correlated with the structure of the protein. The enthalpy of hydrogen bonding in ubiquitin was calculated and compared to that obtained for other proteins. It appears that the energy of hydrogen bonding correlates with the average length of the hydrogen bond in a given protein structure.

Journal ArticleDOI
01 Jan 1994-Proteins
TL;DR: Examination of experimental values suggests that calculation of this entropy using the Sackur–Tetrode equation produces largely overestimated values, and theoretical considerations suggest that the volumes available for the movement of a ligand in solution and in a complex are rather similar, suggesting that the cratic entropy provides the best estimate.
Abstract: The loss of translational degrees of freedom makes an important, unfavorable contribution to the free energy of binding. Examination of experimental values suggests that calculation of this entropy using the Sackur–Tetrode equation produces largely overestimated values. Better agreement is obtained using the cratic entropy. Theoretical considerations suggest that the volumes available for the movement of a ligand in solution and in a complex are rather similar, suggesting also that the cratic entropy provides the best estimate of the loss of translational entropy. © 1994 John Wiley & Sons, Inc.
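
The size of the gap between the two estimates can be seen in a rough numerical sketch. It assumes the usual textbook forms: the cratic entropy as R ln(55.5) for a 1 M standard state in water, and the Sackur–Tetrode translational entropy of an ideal gas at the same concentration. The ligand mass of 100 Da and the temperature are hypothetical choices, and the calculation is not taken from the paper.

```python
import math

R_CAL = 1.987           # gas constant, cal/(mol K)
K_B = 1.380649e-23      # Boltzmann constant, J/K
H = 6.62607015e-34      # Planck constant, J s
N_A = 6.02214076e23     # Avogadro constant, 1/mol
AMU = 1.66053907e-27    # atomic mass unit, kg


def cratic_entropy(water_molarity=55.5):
    """Magnitude of the cratic entropy for a 1 M standard state, cal/(mol K)."""
    return R_CAL * math.log(water_molarity)


def sackur_tetrode_entropy(mass_da, temperature=298.15, molarity=1.0):
    """Ideal-gas translational entropy at the given concentration, cal/(mol K)."""
    mass = mass_da * AMU
    volume_per_molecule = 1e-3 / (molarity * N_A)  # m^3 per molecule
    thermal_wavelength = H / math.sqrt(2 * math.pi * mass * K_B * temperature)
    return R_CAL * (math.log(volume_per_molecule / thermal_wavelength ** 3) + 2.5)


cratic = cratic_entropy()
gas_phase = sackur_tetrode_entropy(mass_da=100.0)
print(round(cratic, 2), round(gas_phase, 2))
```

With these inputs the Sackur–Tetrode value is several times larger than the cratic value, which is consistent with the abstract's statement that the former overestimates the translational entropy lost on binding.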

Journal ArticleDOI
01 Jun 1994-Proteins
TL;DR: The authors showed that side chain rotamer distributions and entropy losses calculated with a flexible helix model agree with earlier rigid-backbone results, and that loss of side chain entropy is a major determinant of the helix-forming tendency of residues in both peptide and protein helices.
Abstract: Much effort has been invested in seeking to understand the thermodynamic basis of helix stability in both peptides and proteins. Recently, several groups have measured the helix-forming propensities of individual residues (Lyu, P.C., Liff, M.I., Marky, L.A., Kallenbach, N.R. Science 250:669-673, 1990; O'Neil, K.T., DeGrado, W.F. Science 250:646-651, 1990; Padmanabhan, S., Marqusee, S., Ridgeway, T., Laue, T.M., Baldwin, R.L. Nature (London) 344:268-270, 1990). Using Monte Carlo computer simulations, we tested the hypothesis that these differences in measured helix-forming propensity are due primarily to loss of side chain conformational entropy upon helix formation (Creamer, T.P., Rose, G.D. Proc. Natl. Acad. Sci. U.S.A. 89:5937-5941, 1992). Our previous study employed a rigid helix backbone, which is here generalized to a completely flexible helix model in order to ensure that earlier results were not a methodological artifact. Using this flexible model, side chain rotamer distributions and entropy losses are calculated and shown to agree with those obtained earlier. We note that the side chain conformational entropy calculated for Trp in our previous study was in error; a corrected value is presented. Extending earlier work, calculated entropy losses are found to correlate strongly with recent helix propensity scales derived from substitutions made within protein helices (Horovitz, A., Matthews, J.M., Fersht, A.R. J. Mol. Biol. 227:560-568, 1992; Blaber, M., Zhang, X.-J., Matthews, B.W. Science 260:1637-1640, 1993). In contrast, little correlation is found between these helix propensity scales and the accessible surface area buried upon formation of a model polyalanyl α-helix. Taken in sum, our results indicate that loss of side chain entropy is a major determinant of the helix-forming tendency of residues in both peptide and protein helices.

Journal ArticleDOI
01 Apr 1994-Proteins
TL;DR: For two simple helical proteins and a small α/β protein, the ability to predict protein structure from sequence has been demonstrated, with native‐like packing of the side chains.
Abstract: The hierarchy of lattice Monte Carlo models described in the accompanying paper (Kolinski, A., Skolnick, J. Monte Carlo simulations of protein folding. I. Lattice model and interaction scheme. Proteins 18:338-352, 1994) is applied to the simulation of protein folding and the prediction of 3-dimensional structure. Using sequence information alone, three proteins have been successfully folded: the B domain of staphylococcal protein A, a 120 residue, monomeric version of ROP dimer, and crambin. Starting from a random expanded conformation, the model proteins fold along relatively well-defined folding pathways. These involve a collection of early intermediates, which are followed by the final (and rate-determining) transition from compact intermediates closely resembling the molten globule state to the native-like state. The predicted structures are rather unique, with native-like packing of the side chains. The accuracy of the predicted native conformations is better than that obtained in previous folding simulations. The best (but by no means atypical) folds of protein A have a coordinate rms of 2.25 Å from the native Cα trace, and the best coordinate rms from crambin is 3.18 Å. For ROP monomer, the lowest coordinate rms from equivalent Cα atoms of ROP dimer is 3.65 Å. Thus, for two simple helical proteins and a small α/β protein, the ability to predict protein structure from sequence has been demonstrated.

Journal ArticleDOI
01 Oct 1994-Proteins
TL;DR: The crystal structure of an anionic form of salmon trypsin has been determined at 1.82 Å resolution and no overall differences in intramolecular interactions are detected between the two structures, but there are differences in certain regions of the structures which may explain some of the observed differences in physical properties.
Abstract: The crystal structure of an anionic form of salmon trypsin has been determined at 1.82 Å resolution. We report the first structure of a trypsin from a poikilothermic organism in a detailed comparison to mammalian trypsins in order to look for structural rationalizations for the cold-adaptation features of salmon trypsin. This form of salmon trypsin (ST II) comprises 222 residues, and is homologous to bovine trypsin (BT) in about 65% of the primary structure. The tertiary structures are similar, with an overall displacement in main chain atomic positions between salmon trypsin and various crystal structures of bovine trypsin of about 0.8 Å. Intramolecular hydrogen bonds and hydrophobic interactions are compared and discussed in order to estimate possible differences in molecular flexibility which might explain the higher catalytic efficiency and lower thermostability of salmon trypsin compared to bovine trypsin. No overall differences in intramolecular interactions are detected between the two structures, but there are differences in certain regions of the structures which may explain some of the observed differences in physical properties. The distribution of charged residues is different in the two trypsins, and the impact this might have on substrate affinity is discussed.

Journal ArticleDOI
01 Aug 1994-Proteins
TL;DR: This work examined a large ensemble of partly folded states derived from the native structure of α‐lactalbumin in order to identify those states that satisfy the energetic criteria of the molten globule, and suggested a number of structural features that are consistent with experimental data.
Abstract: The heat-denatured state of proteins has usually been assumed to be a fully hydrated random coil. It is now evident that under certain solvent conditions or after chemical or genetic modifications, the protein molecule may exhibit a hydrophobic core and residual secondary structure after thermal denaturation. This state of the protein has been called the "compact denatured" or "molten globule" state. Recently it has been shown that α-lactalbumin at pH < 5 denatures into a molten globule state upon increasing the temperature (Griko, Y., Freire, E., Privalov, P.L. Biochemistry 33:1889-1899, 1994). This state has a lower heat capacity and a higher enthalpy than the unfolded state at low temperatures, where the stabilization of the molten globule state is of entropic origin since the enthalpy contributes unfavorably to the Gibbs free energy. Since the molten globule is more structured than the unfolded state and, therefore, is expected to have a lower configurational entropy, the net entropic gain must originate primarily from solvent related entropy arising from the hydrophobic effect, and to a lesser extent from protonation or electrostatic effects. In this work, we have examined a large ensemble of partly folded states derived from the native structure of α-lactalbumin in order to identify those states that satisfy the energetic criteria of the molten globule. It was found that only a few states satisfied the experimental constraints and that, furthermore, those states were part of the same structural family. In particular, the regions corresponding to the A, B, and C helices were found to be folded, while the β sheet and the D helix were found to be unfolded. At temperatures below 45 °C the states exhibiting those structural characteristics are enthalpically higher than the unfolded state, in agreement with the experimental data. Interestingly, those states have a heat capacity close to that observed for the acid pH compact denatured state of α-lactalbumin [980 cal (mol·K)⁻¹]. In addition, the folded regions of these states include those residues found to be highly protected by NMR hydrogen exchange experiments. This work represents an initial attempt to model the structural origin of the thermodynamic properties of partly folded states. The results suggest a number of structural features that are consistent with experimental data.
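The thermodynamic reasoning in this abstract (an unfavorable enthalpy offset by a favorable entropy) can be sketched with the standard relations ΔH(T) = ΔH_ref + ΔCp (T - T_ref), ΔS(T) = ΔS_ref + ΔCp ln(T / T_ref), and ΔG = ΔH - TΔS, assuming a temperature-independent ΔCp. The function name and all numerical values below are hypothetical and are not taken from the paper.

```python
import math

def stability_terms(dh_ref, ds_ref, dcp, temp, temp_ref=298.15):
    """Return (dH, -T*dS, dG) in kcal/mol for a state relative to the unfolded state.

    dh_ref: enthalpy difference at temp_ref, kcal/mol
    ds_ref: entropy difference at temp_ref, kcal/(mol*K)
    dcp:    heat capacity difference, kcal/(mol*K), assumed constant
    """
    dh = dh_ref + dcp * (temp - temp_ref)
    ds = ds_ref + dcp * math.log(temp / temp_ref)
    return dh, -temp * ds, dh - temp * ds

# Hypothetical partly folded state: higher enthalpy and lower heat capacity
# than the unfolded state, with a net entropic gain (illustration only).
dh, minus_t_ds, dg = stability_terms(dh_ref=5.0, ds_ref=0.020, dcp=-0.3, temp=298.15)
```

With these illustrative inputs the enthalpy term is positive while the entropic term is negative and larger in magnitude, so the state is stabilized entropically, which mirrors the qualitative argument made in the abstract.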

Journal ArticleDOI
01 Jul 1994-Proteins
TL;DR: The structure of E. coli adenylate kinase with bound AMP and AMPPNP at 2.0 Å resolution is presented and a comparison is made between the present structure and the structure of the heavy chain of muscle myosin.
Abstract: The structure of E. coli adenylate kinase with bound AMP and AMPPNP at 2.0 Å resolution is presented. The protein crystallizes in space group C2 with two molecules in the asymmetric unit, and has been refined to an R factor of 20.1% and an Rfree of 31.6%. In the present structure, the protein is in the closed (globular) form with the large flexible lid domain covering the AMPPNP molecule. Within the protein, AMP and AMPPNP, an ATP analog, occupy the AMP and ATP sites respectively, which had been suggested by the most recent crystal structure of E. coli adenylate kinase with Ap5A bound (Muller and Schulz, 1992, ref. 1) and prior fluorescence studies (Liang et al., 1991, ref. 2). The binding of substrates and the positions of the active site residues are compared between the present structure and the E. coli adenylate kinase/Ap5A structure. We failed to detect a peak in the density map corresponding to the Mg2+ ion which is required for catalysis, and its absence has been attributed to the use of ammonium sulfate in the crystallization solution. Finally, a comparison is made between the present structure and the structure of the heavy chain of muscle myosin. © 1994 Wiley-Liss, Inc.

Journal ArticleDOI
01 Dec 1994-Proteins
TL;DR: It is proposed that PAPS reductases may have evolved from ATP sulfurylases; the evolution of the new enzymatic function appears to be accompanied by a switch of the strongest functional constraint from the PP‐motif to the putative sulfate‐binding motif.
Abstract: A conserved amino acid sequence motif was identified in four distinct groups of enzymes that catalyze the hydrolysis of the α–β phosphate bond of ATP, namely GMP synthetases, argininosuccinate synthetases, asparagine synthetases, and ATP sulfurylases. The motif is also present in Rhodobacter capsulata AdgA, Escherichia coli NtrL, and Bacillus subtilis OutB, for which no enzymatic activities are currently known. The observed pattern of amino acid residue conservation and predicted secondary structures suggest that this motif may be a modified version of the P-loop of nucleotide binding domains, and that it is likely to be involved in phosphate binding. We call it the PP-motif, since it appears to be a part of a previously uncharacterized ATP pyrophosphatase domain. ATP sulfurylases, NtrL, and OutB consist of this domain alone. In other proteins, the pyrophosphatase domain is associated with amidotransferase domains (type I or type II), a putative citrulline-aspartate ligase domain or a nitrilase/amidase domain. Unexpectedly, statistically significant overall sequence similarity was found between ATP sulfurylase and 3′-phosphoadenosine 5′-phosphosulfate (PAPS) reductase, another protein of the sulfate activation pathway. The PP-motif is strongly modified in PAPS reductases, but they share with ATP sulfurylases another conserved motif which might be involved in sulfate binding. We propose that PAPS reductases may have evolved from ATP sulfurylases; the evolution of the new enzymatic function appears to be accompanied by a switch of the strongest functional constraint from the PP-motif to the putative sulfate-binding motif. © 1994 Wiley-Liss, Inc.

Journal ArticleDOI
01 Sep 1994-Proteins
TL;DR: A theoretical analysis is made of the decomposition into contributions from individual interactions of the free energy calculated by thermodynamic integration and it is shown that the path dependence can be used to determine the relation of the contribution of a given interaction to the state of the system.
Abstract: A theoretical analysis is made of the decomposition into contributions from individual interactions of the free energy calculated by thermodynamic integration. It is demonstrated that such a decomposition, often referred to as “component analysis,” is meaningful, even though it is a function of the integration path. Moreover, it is shown that the path dependence can be used to determine the relation of the contribution of a given interaction to the state of the system. To illustrate these conclusions, a simple transformation (Cl− to Br− in aqueous solution) is analyzed by use of the Reference Interaction Site Model-Hypernetted Chain Closure integral equation approach; it avoids the calculational difficulties of macromolecular simulation while retaining their conceptual complexity. The difference in the solvation free energy between chloride and bromide is calculated, and the contributions of the Lennard-Jones and electrostatic terms in the potential function are analyzed by the use of suitably chosen integration paths. The model is also used to examine the path dependence of individual contributions to the double free energy differences (ΔΔG or ΔΔA) that are often employed in free energy simulations of biological systems. The alchemical path, as contrasted with the experimental path, is shown to be appropriate for interpreting the effects of mutations on ligand binding and protein stability. The formulation is used to obtain a better understanding of the success of the Poisson-Boltzmann continuum approach for determining the solvation properties of polar and ionic systems. © 1994 Wiley-Liss, Inc.
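The component analysis described in this abstract can be illustrated with a minimal sketch of thermodynamic integration, ΔA = ∫₀¹ ⟨∂U/∂λ⟩ dλ, evaluated numerically with the trapezoidal rule. It assumes the potential splits into Lennard-Jones and electrostatic parts whose λ-derivatives have been averaged separately at each λ along one chosen path. The function name and the tabulated averages are hypothetical values for illustration, not data from the paper.

```python
def trapezoid(xs, ys):
    """Integrate tabulated ys over xs with the trapezoidal rule."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two matching sample points")
    return sum(
        0.5 * (ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i])
        for i in range(len(xs) - 1)
    )

# Coupling parameter values along one (hypothetical) integration path.
lambdas = [0.0, 0.25, 0.5, 0.75, 1.0]

# Hypothetical ensemble averages <dU/dlambda> per component, kcal/mol.
du_lj = [0.4, 0.5, 0.6, 0.7, 0.8]
du_elec = [2.0, 2.5, 3.0, 3.5, 4.0]

# Each component is integrated separately along the same path.
delta_a_lj = trapezoid(lambdas, du_lj)
delta_a_elec = trapezoid(lambdas, du_elec)

# The total uses the summed integrand; by linearity it equals the component sum.
delta_a_total = trapezoid(lambdas, [a + b for a, b in zip(du_lj, du_elec)])
```

The total is independent of the path, but the individual components are not: a different path (for example, changing the electrostatic term before the Lennard-Jones term) would generally give different `delta_a_lj` and `delta_a_elec` that still sum to the same total.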

Journal ArticleDOI
01 Jan 1994-Proteins
TL;DR: A molecular surface representation is defined that describes the complete molecular surface precisely and concisely using a limited number of critical points disposed at key locations over the surface; despite their modest number, these points adequately represent the shape and important characteristics of the surface.
Abstract: We have defined a molecular surface representation that describes precisely and concisely the complete molecular surface. The representation consists of a limited number of critical points disposed at key locations over the surface. These points adequately represent the shape and the important characteristics of the surface, despite the fact that they are modest in number. We expect the representation to be useful in areas such as molecular recognition and visualization. In particular, using this representation, we are able to achieve accurate and efficient protein–protein and protein–small molecule docking. © 1994 John Wiley & Sons, Inc.

Journal ArticleDOI
30 Apr 1994-Proteins
TL;DR: The structure of the methyl‐α‐D‐mannopyranoside–LOL I complex has been solved by the molecular replacement method using the refined saccharide‐free LOL I coordinates as starting model.
Abstract: The structure of the methyl-α-D-mannopyranoside-LOL I complex has been solved by the molecular replacement method using the refined saccharide-free LOL I coordinates as starting model. The methyl-α-D-mannopyranoside-LOL I complex was refined by simulated annealing using the program X-PLOR. The final R-factor value is 0.182 [Fo > 1σ(Fo)]. The isostructural methyl-α-D-glucopyranoside-LOL I complex was refined by X-ray coupled energy minimization using the methyl-α-D-mannopyranoside-LOL I structure as a starting model to an R factor of 0.179 (all data). In both crystal forms, each dimer binds two molecules of sugar in pockets found near the calcium ions. The two saccharide moieties, which are in the C1 chair conformation, establish the same hydrogen bond pattern with the lectin. However, the van der Waals contacts are different between the O2, C2, C6, and O6 atoms of the two molecules and the backbone atoms of residues 208-211. Mannose, due to its axial C2 conformation, encloses the backbone atoms of the protein in a clamplike way. Van der Waals energy calculations suggest that this better complementarity of the mannoside molecule with the lectin could explain its higher affinity for isolectin I.

Journal ArticleDOI
01 Feb 1994-Proteins
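TL;DR: Two free energy functions, one combining a gas-phase potential with FDPB electrostatics and a surface area-based nonpolar term and one based on atomic solvation parameters, were quite successful in selecting conformations close to the crystal structure for two of three loops in E. coli RNase H; a method to avoid precision problems associated with using the FDPB method to evaluate conformational free energies in proteins is also described.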
Abstract: In this paper we discuss the problem of including solvation free energies in evaluating the relative stabilities of loops in proteins. A conformational search based on a gas-phase potential function is used to generate a large number of trial conformations. As has been found previously, the energy minimization step in this process tends to pack charged and polar side chains against the protein surface, resulting in conformations which are unstable in the aqueous phase. Various solvation models can easily identify such structures. In order to provide a more severe test of solvation models, gas phase conformations were generated in which side chains were kept extended so as to maximize their interaction with the solvent. The free energies of these conformations were compared to that calculated for the crystal structure in three loops of the protein E. coli RNase H, with lengths of 7, 8, and 9 residues. Free energies were evaluated with a finite difference Poisson-Boltzmann (FDPB) calculation for electrostatics and a surface area-based term for nonpolar contributions. These were added to a gas-phase potential function. A free energy function based on atomic solvation parameters was also tested. Both functions were quite successful in selecting, based on a free energy criterion, conformations quite close to the crystal structure for two of the three loops. For one loop, which is involved in crystal contacts, conformations that are quite different from the crystal structure were also selected. A method to avoid precision problems associated with using the FDPB method to evaluate conformational free energies in proteins is described. © 1994 John Wiley & Sons, Inc.

Journal ArticleDOI
01 May 1994-Proteins
TL;DR: The results establish that, aside from the Cu ligands, Arg‐143 is the single most important residue in Cu, Zn superoxide dismutase both electrostatically and mechanistically, and they provide an explanation for the evolutionary selection of arginine at position 143.
Abstract: Cu, Zn superoxide dismutase protects cells from oxidative damage by removing superoxide radicals in one of the fastest enzyme reactions known. The redox reaction at the active-site Cu ion is rate-limited by diffusion and enhanced by electrostatic guidance. To quantitatively define the electrostatic and mechanistic contributions of sequence-invariant Arg-143 in human Cu, Zn superoxide dismutase, single-site mutants at this position were investigated experimentally and computationally. Rate constants for several Arg-143 mutants were determined at different pH and ionic strength conditions using pulse radiolytic methods and compared to results from Brownian dynamics simulations. At physiological pH, substitution of Arg-143 by Lys caused a 2-fold drop in rate, neutral substitutions (Ile, Ala) reduced the rate about 10-fold, while charge-reversing substitutions (Asp, Glu) caused a 100-fold decrease. Position 143 mutants showed pH dependencies not seen in other mutants. At low pH, the acidic residue mutations exhibited protonation/deprotonation effects. At high pH, all enzymes showed typical decreases in rate except the Lys mutant in which the rate dropped off at an unusually low pH. Increasing ionic strength at acidic pH decreased the rates of the wild-type enzyme and Lys mutant, while the rate of the Glu mutant was unaffected. Increasing ionic strength at higher pH (>10) increased the rates of the Lys and Glu mutants while the rate of the wild-type enzyme was unaffected. Reaction simulations with Brownian dynamics incorporating electrostatic effects tested computational predictability of ionic strength dependencies of the wild-type enzyme and the Lys, Ile, and Glu mutants. The calculated and experimental ionic strength profiles gave similar slopes in all but the Glu mutant, indicating that the electrostatic attraction of the substrate is accurately modeled. Differences between the calculated and experimental rates for the Glu and Lys mutants reflect the mechanistic contribution of Arg-143. Results from this joint analysis establish that, aside from the Cu ligands, Arg-143 is the single most important residue in Cu, Zn superoxide dismutase both electrostatically and mechanistically, and provide an explanation for the evolutionary selection of arginine at position 143. © 1994 Wiley-Liss, Inc.

Journal ArticleDOI
01 Sep 1994-Proteins
TL;DR: Pyrrolidone carboxyl peptidase (EC 3.4.11.8) is an exopeptidase commonly called PYRase, which hydrolytically removes the pGlu from pGlu‐peptides or pGlu‐proteins.
Abstract: Pyrrolidone carboxyl peptidase (EC 3.4.11.8) is an exopeptidase commonly called PYRase, which hydrolytically removes the pGlu from pGlu-peptides or pGlu-proteins. pGlu, also known as pyrrolidone carboxylic acid, may occur naturally by an enzymatic procedure or may occur as an artifact in proteins or peptides. The enzymatic synthesis of pGlu suggests that this residue may have important biological and physiological functions. Several studies are consistent with this supposition. PYRase has been found in a variety of bacteria, and in plant, animal, and human tissues. For over two decades, biochemical and enzymatic properties of PYRase have been investigated. At least two classes of PYRase have been characterized. The first one includes the bacterial and animal type I PYRases and the second one the animal type II and serum PYRases. Enzymes from these two classes present differences in their molecular weight and in their enzymatic properties. Recently, the genes of PYRases from four bacteria have been cloned and characterized, allowing the study of the primary structure of these enzymes, and their over-expression in heterologous organisms. Comparison of the primary structure of these enzymes revealed striking homologies. Type I PYRases and bacterial PYRases are generally soluble enzymes, whereas type II PYRases are membrane-bound enzymes. PYRase II appears to play as important a physiological role as other neuropeptide degrading enzymes. However, the role of type I and bacterial PYRases remains unclear. The primary application of PYRase has been its utilization for some protein or peptide sequencing. Development of chromogenic substrates for this enzyme has allowed its use in bacterial diagnosis. © 1994 Wiley-Liss, Inc.