
Showing papers in "Proteins in 1998"


Journal Article
01 Feb 1998-Proteins
TL;DR: The result indicates that the number of dynamic domains a protein possesses may not be a constant of the motion, and a clustering algorithm is used to determine clusters of rotation vectors corresponding to main‐chain segments that form possible dynamic domains.
Abstract: Methods developed originally to analyze domain motions from simulation (Proteins 27:425-437, 1997) are adapted and extended for the analysis of X-ray conformers and for proteins with more than two domains. The method can be applied as an automatic procedure to any case where more than one conformation is available. The basis of the methodology is that domains can be recognized from the difference in the parameters governing their quasi-rigid body motion, and in particular their rotation vectors. A clustering algorithm is used to determine clusters of rotation vectors corresponding to main-chain segments that form possible dynamic domains. Domains are accepted for further analysis on the basis of a ratio of interdomain to intradomain fluctuation, and Chasles' theorem is used to determine interdomain screw axes. Finally residues involved in the interdomain motion are identified. The methodology is tested on citrate synthase and the M6I mutant of T4 lysozyme. In both cases new aspects to their conformational change are revealed, as are individual residues intimately involved in their dynamics. For citrate synthase the beta sheet is identified to be part of the hinging mechanism. In the case of T4 lysozyme, one of the four transitions in the pathway from the closed to the open conformation, furnished four dynamic domains rather than the expected two. This result indicates that the number of dynamic domains a protein possesses may not be a constant of the motion. Proteins 30:144-154,

778 citations
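
The abstract above mentions using Chasles' theorem to find interdomain screw axes. As a rough illustration only (not the authors' program), the sketch below takes a rigid-body motion x' = R x + t, with R given as a rotation axis and angle, and returns a point on the screw axis plus the shift along it. The function name `screw_axis` and the input convention are assumptions made for this example.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def screw_axis(axis, angle, translation):
    """Hypothetical helper: screw-axis parameters of x' = R x + t.

    axis        -- rotation axis of R (any nonzero length)
    angle       -- rotation angle in radians (must not be a multiple of 2*pi)
    translation -- the vector t
    Returns (point_on_axis, shift_along_axis); the point is the one
    closest to the origin.
    """
    norm = math.sqrt(dot(axis, axis))
    n = tuple(x / norm for x in axis)
    # Component of t parallel to the axis is the screw translation.
    shift = dot(translation, n)
    # The perpendicular component is absorbed by moving the axis.
    t_perp = tuple(t - shift * x for t, x in zip(translation, n))
    cot_half = 1.0 / math.tan(angle / 2.0)
    n_x_t = cross(n, t_perp)
    point = tuple(0.5 * (tp + cot_half * c) for tp, c in zip(t_perp, n_x_t))
    return point, shift
```

For example, a quarter turn about a z-parallel axis through (1, 0, 0) combined with a shift of 2 along z corresponds to t = (1, -1, 2), and `screw_axis((0, 0, 1), math.pi / 2, (1, -1, 2))` recovers that axis point and shift.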


Journal Article
15 Nov 1998-Proteins
TL;DR: This article shows how physically motivated approximations permit the calculation of low‐frequency normal modes in a few minutes on standard desktop computers, and provides new insight into the relevance of normal mode calculations and the nature of the potential energy surface of proteins.
Abstract: The identification of dynamical domains in proteins and the description of the low‐frequency domain motions are one of the important applications of numerical simulation techniques. The application of these techniques to large proteins requires a substantial computational effort and therefore cannot be performed routinely, if at all. This article shows how physically motivated approximations permit the calculation of low‐frequency normal modes in a few minutes on standard desktop computers. The technique is based on the observation that the low‐frequency modes, which describe domain motions, are independent of force field details and can be obtained with simplified mechanical models. These models also provide a useful measure for rigidity in proteins, allowing the identification of quasi‐rigid domains. The methods are validated by application to three well‐studied proteins, crambin, lysozyme, and ATCase. In addition to being useful techniques for studying domain motions, the success of the approximations provides new insight into the relevance of normal mode calculations and the nature of the potential energy surface of proteins. Proteins 33:417–429, 1998. © 1998 Wiley‐Liss, Inc.

753 citations


Journal Article
01 Jan 1998-Proteins
TL;DR: Two simple models and the energy landscape perspective are used to study protein folding kinetics and it is found that unfolding is not always a direct reversal of the folding process.
Abstract: We use two simple models and the energy landscape perspective to study protein folding kinetics. A major challenge has been to use the landscape perspective to interpret experimental data, which requires ensemble averaging over the microscopic trajectories usually observed in such models. Here, because of the simplicity of the model, this can be achieved. The kinetics of protein folding falls into two classes: multiple-exponential and two-state (single-exponential) kinetics. Experiments show that two-state relaxation times have "chevron plot" dependences on denaturant and non-Arrhenius dependences on temperature. We find that HP and HP+ models can account for these behaviors. The HP model often gives bumpy landscapes with many kinetic traps and multiple-exponential behavior, whereas the HP+ model gives more smooth funnels and two-state behavior. Multiple-exponential kinetics often involves fast collapse into kinetic traps and slower barrier climbing out of the traps. Two-state kinetics often involves entropic barriers where conformational searching limits the folding speed. Transition states and activation barriers need not define a single conformation; they can involve a broad ensemble of the conformations searched on the way to the native state. We find that unfolding is not always a direct reversal of the folding process.

454 citations


Journal Article
15 Nov 1998-Proteins
TL;DR: A new docking approach using a Tabu search methodology to dock flexibly ligand molecules into rigid receptor structures using an empirical objective function with a small number of physically based terms derived from fitting experimental binding affinities for crystallographic complexes.
Abstract: This article describes the implementation of a new docking approach. The method uses a Tabu search methodology to flexibly dock ligand molecules into rigid receptor structures. It uses an empirical objective function with a small number of physically based terms derived from fitting experimental binding affinities for crystallographic complexes. This means that docking energies produced by the searching algorithm provide direct estimates of the binding affinities of the ligands. The method has been tested on 50 ligand-receptor complexes for which the experimental binding affinity and binding geometry are known. All water molecules are removed from the structures and ligand molecules are minimized in vacuo before docking. The lowest energy geometry produced by the docking protocol is within 1.5 Å root-mean-square of the experimental binding mode for 86% of the complexes. The lowest energies produced by the docking are in fair agreement with the known free energies of binding for the ligands.

371 citations


Journal Article
01 Oct 1998-Proteins
TL;DR: This article presents an analytically exact method for computing the metric properties of macromolecules based on the alpha shape theory, which uses the duality between alpha complex and the weighted Voronoi decomposition of a molecule.
Abstract: The size and shape of macromolecules such as proteins and nucleic acids play an important role in their functions. Prior efforts to quantify these properties have been based on various discretization or tessellation procedures involving analytical or numerical computations. In this article, we present an analytically exact method for computing the metric properties of macromolecules based on the alpha shape theory. This method uses the duality between alpha complex and the weighted Voronoi decomposition of a molecule. We describe the intuitive ideas and concepts behind the alpha shape theory and the algorithm for computing areas and volumes of macromolecules. We apply our method to compute areas and volumes of a number of protein systems. We also discuss several difficulties commonly encountered in molecular shape computations and outline methods to overcome these problems.

325 citations


Journal Article
01 Nov 1998-Proteins
TL;DR: Temperature dependence of buffer pH calculated by using the enthalpy and heat capacity changes obtained was in good agreement with the temperature variation of the pH values actually measured in the temperature range between 0 and 50°C for all the buffers studied.
Abstract: Enthalpy and heat capacity changes for the deprotonation of 18 buffers were calorimetrically determined in 0.1 M potassium chloride at temperatures ranging from 5 to 45 °C. The values of the dissociation constant were also determined by means of potentiometric titration. The enthalpy changes for the deprotonation of buffers, except for the phosphate and glycerol 2-phosphate buffers, were found to be characterized by a linear function of temperature. The enthalpy changes for the second dissociation of phosphate and glycerol 2-phosphate, where a divalent anion is formed on dissociation, were fitted with a second-order function of temperature rather than a first-order one. Temperature dependence of buffer pH calculated by using the enthalpy and heat capacity changes obtained was in good agreement with the temperature variation of the pH values actually measured in the temperature range between 0 and 50 °C for all the buffers studied. On the basis of the results obtained, a numeric table showing the temperature dependence of pK values for the 18 buffers is presented.

315 citations
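
The buffer abstract above describes computing the temperature dependence of pK from an enthalpy change that is linear in temperature. A minimal sketch of that calculation, assuming the standard integrated van 't Hoff relation with a constant heat capacity change, is given below. The function name `pk_at` and all numerical inputs are illustrative assumptions, not values from the paper, and the second-order case reported for phosphate and glycerol 2-phosphate is not covered.

```python
import math

R = 8.314462618  # gas constant, J/(mol K)


def pk_at(temp_k, pk_ref, dh_ref, dcp, temp_ref=298.15):
    """Hypothetical helper: pK at temp_k from values at temp_ref.

    Assumes dH(T) = dh_ref + dcp * (T - temp_ref), i.e. an enthalpy of
    deprotonation that is linear in temperature, and integrates
    d(ln K)/dT = dH(T) / (R T^2) from temp_ref to temp_k.

    dh_ref -- deprotonation enthalpy at temp_ref, J/mol
    dcp    -- heat capacity change, J/(mol K)
    """
    a = (dh_ref - dcp * temp_ref) * (1.0 / temp_ref - 1.0 / temp_k)
    b = dcp * math.log(temp_k / temp_ref)
    delta_ln_k = (a + b) / R
    # pK = -log10(K), so an increase in ln K lowers pK.
    return pk_ref - delta_ln_k / math.log(10.0)
```

With a positive deprotonation enthalpy the function returns a pK that falls as temperature rises, which is the qualitative behaviour expected for an endothermic dissociation.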


Journal Article
01 Nov 1998-Proteins
TL;DR: An algorithm is described which enables us to search the conformational space of the side chains of a protein to identify the global minimum energy combination of side chain conformations as well as all other conformations within a specified energy cutoff of the global energy minimum.
Abstract: We describe an algorithm which enables us to search the conformational space of the side chains of a protein to identify the global minimum energy combination of side chain conformations as well as all other conformations within a specified energy cutoff of the global energy minimum. The program is used to explore the side chain conformational energy surface of a number of proteins, to investigate how this surface varies with the energy model used to describe the interactions within the system and the rotamer library. Enumeration of the rotamer combinations enables us to directly evaluate the partition function, and thus calculate the side chain contribution to the conformational entropy of the folded protein. An investigation of these conformations and the relationships between them shows that most of the conformations near to the global energy minimum arise from changes in side chain conformations that are essentially independent; very few result from a concerted change in conformation of two or more residues. Some of the limitations of the approach are discussed. Proteins 33:227–239, 1998. © 1998 Wiley-Liss, Inc.

262 citations
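
The side-chain abstract above notes that enumerating rotamer combinations allows the partition function, and hence the conformational entropy, to be evaluated directly. The sketch below illustrates that step for an already enumerated list of state energies using the usual Boltzmann weighting. It is not the authors' program; `conformational_entropy` and the energy list are hypothetical names introduced for illustration.

```python
import math

R = 8.314462618e-3  # gas constant, kJ/(mol K)


def conformational_entropy(energies_kj, temp_k=298.15):
    """Hypothetical helper: entropy in kJ/(mol K) of enumerated states.

    energies_kj -- energies (kJ/mol) of every enumerated rotamer
                   combination, e.g. all states within a cutoff of the
                   global minimum
    Uses S = -R * sum(p_i * ln p_i) with Boltzmann probabilities p_i.
    """
    beta = 1.0 / (R * temp_k)
    e_min = min(energies_kj)
    # Shift by the minimum energy for numerical stability; the
    # probabilities are unchanged by a constant energy offset.
    weights = [math.exp(-beta * (e - e_min)) for e in energies_kj]
    z = sum(weights)  # partition function relative to e_min
    probs = [w / z for w in weights]
    return -R * sum(p * math.log(p) for p in probs if p > 0.0)
```

Two limiting cases are easy to check by hand: a single state gives zero entropy, and N states of equal energy give R ln N.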


Journal Article
01 Oct 1998-Proteins
TL;DR: A precise algorithm based on alpha shapes for measuring space‐filling‐based molecular models (such as van der Waals, solvent accessible, and molecular surface descriptions) is applied for accurate computation of the surface area and volume of cavities in several proteins.
Abstract: The structures of proteins are well-packed, yet they contain numerous cavities which play key roles in accommodating small molecules, or enabling conformational changes. From high-resolution structures it is possible to identify these cavities. We have developed a precise algorithm based on alpha shapes for measuring space-filling-based molecular models (such as van der Waals, solvent accessible, and molecular surface descriptions). We applied this method for accurate computation of the surface area and volume of cavities in several proteins. In addition, all of the atoms/residues lining the cavities are identified. We use this method to study the structure and the stability of proteins, as well as to locate cavities that could contain structural water molecules in the proton transport pathway in the membrane protein bacteriorhodopsin. Proteins

248 citations


Journal Article
01 Dec 1998-Proteins
TL;DR: Energy landscape of human lysozyme in its native state is investigated by using principal component analysis and a model, jumping‐among‐minima (JAM) model, which shows that energy surfaces of individual conformational substates are nearly harmonic and mutually similar.
Abstract: We have investigated the energy landscape of human lysozyme in its native state by using principal component analysis and the jumping-among-minima (JAM) model. These analyses are applied to a 1 nsec molecular dynamics trajectory of the protein in water. An assumption embodied in the JAM model allows us to divide protein motions into intra-substate and inter-substate motions. By examining intra-substate motions, it is shown that energy surfaces of individual conformational substates are nearly harmonic and mutually similar. As a result of principal component analysis and JAM model analysis, protein motions are shown to consist of three types of collective modes: multiply hierarchical modes, singly hierarchical modes, and harmonic modes. Multiply hierarchical modes, the number of which accounts for only 0.5% of all modes, dominate contributions to total mean-square atomic fluctuation. Inter-substate motions are observed only in a small-dimensional subspace spanned by the axes of multiply hierarchical and singly hierarchical modes. Inter-substate motions have two notable time components: a faster component seen within 200 psec and a slower component. The former involves transitions among the conformational substates of the low-level hierarchy, whereas the latter involves transitions of the higher level substates observed along the first four multiply hierarchical modes. We also discuss the dependence of the subspace, which contains conformational substates, on the time duration of the simulation.

229 citations


Journal Article
01 Jan 1998-Proteins
TL;DR: This work investigates the transport of the substrates, oxygen and protons, through the enzyme, and proposes a possible pumping mechanism that involves a shuttling motion of a glutamic acid side chain which could then transfer a proton to a propionate group of heme a3.
Abstract: Cytochrome c oxidase is a redox-driven proton pump, which couples the reduction of oxygen to water to the translocation of protons across the membrane. The recently solved x-ray structures of cytochrome c oxidase permit molecular dynamics simulations of the underlying transport processes. To eventually establish the proton pump mechanism, we investigate the transport of the substrates, oxygen and protons, through the enzyme. Molecular dynamics simulations of oxygen diffusion through the protein reveal a well-defined pathway to the oxygen-binding site starting at a hydrophobic cavity near the membrane-exposed surface of subunit I, close to the interface to subunit III. A large number of water sites are predicted within the protein, which could play an essential role for the transfer of protons in cytochrome c oxidase. The water molecules form two channels along which protons can enter from the cytoplasmic (matrix) side of the protein and reach the binuclear center. A possible pumping mechanism is proposed that involves a shuttling motion of a glutamic acid side chain, which could then transfer a proton to a propionate group of heme a3. Proteins 30:100-

214 citations


Journal Article
01 Jan 1998-Proteins
TL;DR: Current strategies and selected applications in molecular and cell biology, which include studying the molecular composition of multiprotein complexes and characterization of secondary modifications of proteins, are described.
Abstract: The entire genomic DNA sequences of a number of prokaryotic and eukaryotic species are now available and many more, including the human genome, will be completed in the near future. The state-of-life of a cell at any given time, however, is defined by its protein composition, i.e., its proteome. Gel electrophoresis, mass spectrometry, and bioinformatics will be important tools for protein and proteome analysis in the post-genome era. Protein identification from electrophoretic gels by mass spectrometric peptide mapping or peptide sequencing combined with sequence database searching is established and has been applied to numerous biological systems. We describe current strategies and selected applications in molecular and cell biology. The next challenges are detailed structure/function analyses, which include studying the molecular composition of multiprotein complexes and characterization of secondary modifications of proteins. The advantages and limitations of a number of mass spectrometry-based strategies designed for microcharacterization of low amounts of protein from electrophoretic gels are discussed and illustrated by examples.

Journal Article
01 May 1998-Proteins
TL;DR: The data support a model in which urea denatures proteins by decreasing the hydrophobic effect and by directly binding to the amide units via hydrogen bonds, and indicate that the enthalpy of amide hydrogen bond formation in water is considerably higher than previously estimated.
Abstract: The effects of urea on protein stability have been studied using a model system in which we have determined the energetics of dissolution of a homologous series of cyclic dipeptides into aqueous urea solutions of varying concentration at 25 °C using calorimetry. The data support a model in which urea denatures proteins by decreasing the hydrophobic effect and by directly binding to the amide units via hydrogen bonds. The data indicate also that the enthalpy of amide hydrogen bond formation in water is considerably higher than previously estimated. Previous estimates included the contribution of hydrophobic transfer of the alpha-carbon resulting in an overestimate of the binding between urea and the amide unit of the backbone and an underestimate of the binding enthalpy.


Journal Article
01 Aug 1998-Proteins
TL;DR: A simple multidimensional funnel based on two‐order parameters that measure the degree of collapse and topological order is described that leads to a classification of mechanisms totally in keeping with the one‐dimensional scheme, but a topologically distinct scenario of fast folding with traps also emerges.
Abstract: An important idea that emerges from the energy landscape theory of protein folding is that subtle global features of the protein landscape can profoundly affect the apparent mechanism of folding. The relationship between various characteristic temperatures in the phase diagrams and landmarks in the folding funnel at fixed temperatures can be used to classify different folding behaviors. The one-dimensional picture of a folding funnel classifies folding kinetics into four basic scenarios, depending on the relative location of the thermodynamic barrier and the glass transition as a function of a single-order parameter. However, the folding mechanism may not always be quantitatively described by a single-order parameter. Several other order parameters, such as degree of secondary structure formation, collapse and topological order, are needed to establish the connection between minimalist models and proteins in the laboratory. In this article we describe a simple multidimensional funnel based on two-order parameters that measure the degree of collapse and topological order. The appearance of several different "mechanisms" is illustrated by analyzing lattice models with different potentials and sequences with different degrees of design. In most cases, the two-dimensional analysis leads to a classification of mechanisms totally in keeping with the one-dimensional scheme, but a topologically distinct scenario of fast folding with traps also emerges. The nature of traps depends on the relative location of the glass transition surface and the thermodynamic barrier in the multidimensional funnel.

Journal Article
01 May 1998-Proteins
TL;DR: A comparison of a series of extended molecular dynamics simulations of bacteriophage T4 lysozyme in solvent with X‐ray data is presented, revealing that the N‐terminal helix rotates together with either of these two domains.
Abstract: A comparison of a series of extended molecular dynamics (MD) simulations of bacteriophage T4 lysozyme in solvent with X-ray data is presented. Essential dynamics analyses were used to derive collective fluctuations from both the simulated trajectories and a distribution of crystallographic conformations. In both cases the main collective fluctuations describe domain motions. The protein consists of an N- and C-terminal domain connected by a long helix. The analysis of the distribution of crystallographic conformations reveals that the N-terminal helix rotates together with either of these two domains. The main domain fluctuation describes a closure mode of the two domains in which the N-terminal helix rotates concertedly with the C-terminal domain, while the domain fluctuation with second largest amplitude corresponds to a twisting mode of the two domains, with the N-terminal helix rotating concertedly with the N-terminal domain. For the closure mode, the difference in hinge-bending angle between the most open and most closed X-ray structure along this mode is 49 degrees. In the MD simulation that shows the largest fluctuation along this mode, a rotation of 45 degrees was observed. Although the twisting mode has much less freedom than the closure mode in the distribution of crystallographic conformations, experimental results suggest that it might be functionally important. Interestingly, the twisting mode is sampled more extensively in all MD simulations than it is in the distribution of X-ray conformations. Proteins 31:116-127,

Journal Article
Liisa Holm, Chris Sander
01 Oct 1998-Proteins
TL;DR: A method for automated domain identification from protein structure atomic coordinates based on quantitative measures of compactness and recurrence is presented, which yields consistent domain definitions between remote homologs, a result difficult to achieve using compactness criteria alone.
Abstract: The rapid growth in the number of experimentally determined three-dimensional protein structures has sharpened the need for comprehensive and up-to-date surveys of known structures. Classic work on protein structure classification has made it clear that a structural survey is best carried out at the level of domains, i.e., substructures that recur in evolution as functional units in different protein contexts. We present a method for automated domain identification from protein structure atomic coordinates based on quantitative measures of compactness and, as the new element, recurrence. Compactness criteria are used to recursively divide a protein into a series of successively smaller and smaller substructures. Recurrence criteria are used to select an optimal size level of these substructures, so that many of the chosen substructures are common to different proteins at a high level of statistical significance. The joint application of these criteria automatically yields consistent domain definitions between remote homologs, a result difficult to achieve using compactness criteria alone. The method is applied to a representative set of 1,137 sequence-unique protein families covering 6,500 known structures. Clustering of the resulting set of domains (substructures) yields 594 distinct fold classes (types of substructures). The Dali Domain Dictionary (http://www.embl-ebi.ac.uk/dali) not only provides a global structural classification, but also a comprehensive description of families of protein sequences grouped around representative proteins of known structure. The classification will be continuously updated and can serve as a basis for improving our understanding of protein evolution and function and for evolving optimal strategies to complete the map of all natural protein structures. Proteins 33:88–96, 1998. © 1998 Wiley-Liss, Inc.

Journal Article
01 Jan 1998-Proteins
TL;DR: X‐ray diffraction is used to study the binding of xenon and krypton to a variety of crystallised proteins, and xenon complexes can be used to map hydrophobic sites in proteins, or as heavy‐atom derivatives in the isomorphous replacement method of structure determination.
Abstract: X-ray diffraction is used to study the binding of xenon and krypton to a variety of crystallised proteins: porcine pancreatic elastase; subtilisin Carlsberg from Bacillus licheniformis; cutinase from Fusarium solani; collagenase from Hypoderma lineatum; hen egg lysozyme; the lipoamide dehydrogenase domain from the outer membrane protein P64k from Neisseria meningitidis; urate-oxidase from Aspergillus flavus; mosquitocidal delta-endotoxin CytB from Bacillus thuringiensis; and the ligand-binding domain of the human nuclear retinoid-X receptor RXR-alpha. Under gas pressures ranging from 8 to 20 bar, xenon is able to bind to discrete sites in hydrophobic cavities, ligand and substrate binding pockets, and into the pore of channel-like structures. These xenon complexes can be used to map hydrophobic sites in proteins, or as heavy-atom derivatives in the isomorphous replacement method of structure determination. © 1998 Wiley-Liss, Inc.

Journal Article
01 Jul 1998-Proteins
TL;DR: It is shown that both the VDW and the Coulombic radii of polar atoms are needed in calculating the molecular and solvent‐accessible surfaces of proteins, which has important implications for docking, which relies on surface complementarity at the interface.
Abstract: We analyze the contact distance distributions between nonbonded atoms in known protein structures. A complete set of van der Waals (VDW) radii for 24 protein atom types and for crystal-bound water is derived from the contact distance distributions of these atoms with a selected group of apolar atoms. In addition, a set of Coulombic radii for polar atoms is derived from their contacts with water. The contact distance distributions and the two sets of radii are derived in a systematic and self-consistent manner using an iterative procedure. The Coulombic radii for polar atoms are, on average, 0.18 Å smaller than their VDW radii. The VDW radius of water is 1.7 Å, which is 0.3 Å larger than its Coulombic radius. We show that both the VDW and the Coulombic radii of polar atoms are needed in calculating the molecular and solvent-accessible surfaces of proteins. The VDW radii are needed to generate the apolar portions of the surface and the Coulombic radii for the polar portions. The fact that polar atoms have two apparent sizes implies that a hydrophobic cavity has to be larger than a polar cavity in order to accommodate the same number of water molecules. Most surface area calculations have used only one radius for each polar atom. As a result, unreal cavities, grooves, or pockets may be generated if the Coulombic radii of polar atoms are used. On the other hand, if the VDW radii of polar atoms are used, the details of the polar regions of the surface may be lost. The accuracy of the molecular and the solvent-accessible surfaces of proteins can be improved if the radii of polar atoms are allowed to change depending on the nature of their contacting neighbors. The surface of a protein at a protein-protein interface differs from that in solution in that it has to be generated using at least two kinds of probes, one representing a typical apolar atom and the other a typical polar atom. This observation has important implications for docking, which relies on surface complementarity at the interface.

Journal Article
Hiroshi Mamitsuka
01 Dec 1998-Proteins
TL;DR: This paper proposes to apply supervised learning of hidden Markov models (HMMs) to this problem, which can surpass existing methods for the problem of predicting MHC‐binding peptides, and presents new peptide sequences that are provided with high binding probabilities by the HMM and that are thus expected to bind to HLA‐A2 proteins.
Abstract: The binding of a major histocompatibility complex (MHC) molecule to a peptide originating in an antigen is essential to recognizing antigens in immune systems, and it has proved to be important to use computers to predict the peptides that will bind to an MHC molecule. The purpose of this paper is twofold: First, we propose to apply supervised learning of hidden Markov models (HMMs) to this problem, which can surpass existing methods for the problem of predicting MHC-binding peptides. Second, we generate peptides that have high probabilities to bind to a certain MHC molecule, based on our proposed method using peptides binding to MHC molecules as a set of training data. From our experiments, in a type of cross-validation test, the discrimination accuracy of our supervised learning method is usually approximately 2-15% better than those of other methods, including backpropagation neural networks, which have been regarded as the most effective approach to this problem. Furthermore, using an HMM trained for HLA-A2, we present new peptide sequences that are provided with high binding probabilities by the HMM and that are thus expected to bind to HLA-A2 proteins. Peptide sequences not shown in this paper but with rather high binding probabilities can be obtained from the

Journal Article
01 Nov 1998-Proteins
TL;DR: It is suggested that the observed improvement of pKa values in the present studies is due not to averaging over an ensemble of structures, but rather to the generation of a single properly averaged structure for the pKa calculation.
Abstract: Several methods for including the conformational flexibility of proteins in the calculation of titration curves are compared. The methods use the linearized Poisson-Boltzmann equation to calculate the electrostatic free energies of solvation and are applied to bovine pancreatic trypsin inhibitor (BPTI) and hen egg-white lysozyme (HEWL). An ensemble of conformations is generated by a molecular dynamics simulation of the proteins with explicit solvent. The average titration curve of the ensemble is calculated in three different ways: an average structure is used for the pKa calculation; the electrostatic interaction free energies are averaged and used for the pKa calculation; and the titration curve for each structure is calculated and the curves are averaged. The three averaging methods give very similar results and improve the pKa values to approximately the same degree. This suggests, in contrast to implications from other work, that the observed improvement of pKa values in the present studies is due not to averaging over an ensemble of structures, but rather to the generation of a single properly averaged structure for the pKa calculation.

Journal Article
01 Sep 1998-Proteins
TL;DR: Two recently developed methods—SIMS, for calculation of a smooth invariant molecular surface, and FAMBE, for solution of the Poisson equation via a fast adaptive multigrid boundary element—have been employed.
Abstract: A new method for calculating the total conformational free energy of proteins in water solvent is presented. The method consists of a relatively brief simulation by molecular dynamics with explicit solvent (ES) molecules to produce a set of microstates of the macroscopic conformation. Conformational energy and entropy are obtained from the simulation, the latter in the quasi-harmonic approximation by analysis of the covariance matrix. The implicit solvent (IS) dielectric continuum model is used to calculate the average solvation free energy as the sum of the free energies of creating the solute-size hydrophobic cavity, of the van der Waals solute-solvent interactions, and of the polarization of water solvent by the solute's charges. The reliability of the solvation free energy depends on a number of factors: the details of arrangement of the protein's charges, especially those near the surface; the definition of the molecular surface; and the method chosen for solving the Poisson equation. Molecular dynamics simulation in explicit solvent relaxes the protein's conformation and allows polar surface groups to assume conformations compatible with interaction with solvent, while averaging of internal energy and solvation free energy tends to enhance the precision. Two recently developed methods have been employed: SIMS, for calculation of a smooth invariant molecular surface, and FAMBE, for solution of the Poisson equation via a fast adaptive multigrid boundary element. The SIMS and FAMBE programs scale linearly with the number of atoms. SIMS is superior to Connolly's MS (molecular surface) program: it is faster, more accurate, and more stable, and it smooths singularities of the molecular surface. Solvation free energies calculated with these two programs do not depend on molecular position or orientation and are stable along a molecular dynamics trajectory. We have applied this method to calculate the conformational free energy of native and intentionally misfolded globular conformations of proteins (the EMBL set of deliberately misfolded proteins) and have obtained good discrimination in favor of the native conformations in all instances.
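
The free energy bookkeeping described in this abstract can be written compactly. The symbols below are notational assumptions chosen for illustration and are not taken from the paper: angle brackets denote averages over the explicit-solvent trajectory, and S_qh is the quasi-harmonic entropy estimated from the covariance matrix.

```latex
\begin{align}
G_{\mathrm{conf}} &\approx \langle E_{\mathrm{conf}} \rangle - T\,S_{\mathrm{qh}} + \langle \Delta G_{\mathrm{solv}} \rangle, \\
\Delta G_{\mathrm{solv}} &= \Delta G_{\mathrm{cav}} + \Delta G_{\mathrm{vdW}} + \Delta G_{\mathrm{pol}}.
\end{align}
```

Here the three solvation terms correspond to cavity creation, van der Waals solute-solvent interactions, and polarization of the solvent by the solute's charges, in the order the abstract lists them.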

Journal ArticleDOI
01 May 1998-Proteins
TL;DR: These results demonstrate on a quantitative basis that the geometry of sidechain packing is similar for left‐handed helix–helix pairs embedded in membranes and coiled coils of soluble proteins.
Abstract: Membrane-embedded protein domains frequently exist as alpha-helical bundles, as exemplified by photosynthetic reaction centers, bacteriorhodopsin, and cytochrome C oxidase. The sidechain packing between their transmembrane helices was investigated by a nearest-neighbor analysis which identified sets of interfacial residues for each analyzed helix-helix interface. For the left-handed helix-helix pairs, the interfacial residues almost exclusively occupy positions a, d, e, or g within a heptad motif (abcdefg) which is repeated two to three times for each interacting helical surface. The connectivity between the interfacial residues of adjacent helices conforms to the knobs-into-holes type of sidechain packing known from soluble coiled coils. These results demonstrate on a quantitative basis that the geometry of sidechain packing is similar for left-handed helix-helix pairs embedded in membranes and coiled coils of soluble proteins. The transmembrane helix-helix interfaces studied are somewhat less compact and regular as compared to soluble coiled coils and tolerate all hydrophobic amino acid types to similar degrees. The results are discussed with respect to previous experimental findings which demonstrate that specific interactions between transmembrane helices are important for membrane protein folding and/or oligomerization.
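
The heptad bookkeeping mentioned in this abstract can be illustrated with a minimal sketch. The function names and the `index_of_a` parameter are hypothetical, and the sketch assumes the heptad register (the index of some residue at position a) is already known; it does not reproduce the nearest-neighbor analysis used in the paper.

```python
HEPTAD = "abcdefg"
# Positions the abstract reports as interfacial for left-handed helix-helix pairs.
INTERFACIAL = set("adeg")

def heptad_position(residue_index, index_of_a):
    """Return the heptad letter of a residue, given the index of any residue at position 'a'."""
    return HEPTAD[(residue_index - index_of_a) % 7]

def interfacial_residues(first, last, index_of_a):
    """List residue indices in [first, last] that fall on positions a, d, e, or g."""
    return [
        i for i in range(first, last + 1)
        if heptad_position(i, index_of_a) in INTERFACIAL
    ]
```

For one full heptad starting at a residue in position a, four of the seven residues are flagged, which matches the a/d/e/g pattern described above.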

Journal ArticleDOI
01 Mar 1998-Proteins
TL;DR: Spectroscopic investigations of BthTX‐I in solution have correlated these conformational differences with changes in the intrinsic fluorescence emission of the single tryptophan residues located at the dimer interface, suggesting the possible relevance of this structural transition in the Ca2+‐independent membrane damaging activity.
Abstract: Bothropstoxin I (BthTX-I) from the venom of Bothrops jararacussu is a myotoxic phospholipase A2 (PLA2) homologue which, although catalytically inactive due to an Asp49→Lys substitution, disrupts the integrity of lipid membranes by a Ca2+-independent mechanism. The crystal structures of two dimeric forms of BthTX-I which diffract X-rays to resolutions of 3.1 and 2.1 Å have been determined. The monomers in both structures are related by an almost perfect twofold axis of rotation and the dimer interfaces are defined by contacts between the N-terminal alpha-helical regions and the tips of the beta-wings of partner monomers. Significant differences in the relative orientation of the monomers in the two crystal forms result in "open" and "closed" dimer conformations. Spectroscopic investigations of BthTX-I in solution have correlated these conformational differences with changes in the intrinsic fluorescence emission of the single tryptophan residues located at the dimer interface. The possible relevance of this structural transition in the Ca2+-independent membrane damaging activity is discussed.

Journal ArticleDOI
15 Aug 1998-Proteins
TL;DR: The binary complexes of HSV‐1 thymidine kinase (TK) with the drug molecules aciclovir and penciclovir, determined by X‐ray crystallography at 2.37 Å resolution, are reported for the first time.
Abstract: Antiherpes therapies are principally targeted at viral thymidine kinases and utilize nucleoside analogs, the triphosphates of which are inhibitors of viral DNA polymerase or result in toxic effects when incorporated into DNA. The most frequently used drug, aciclovir (Zovirax), is a relatively poor substrate for thymidine kinase and high-resolution structural information on drugs and other molecules binding to the target is therefore important for the design of novel and more potent chemotherapy, both in antiherpes treatment and in gene therapy systems where thymidine kinase is expressed. Here, we report for the first time the binary complexes of HSV-1 thymidine kinase (TK) with the drug molecules aciclovir and penciclovir, determined by X-ray crystallography at 2.37 Å resolution. Moreover, from new data at 2.14 Å resolution, the refined structure of the complex of TK with its substrate deoxythymidine (R = 0.209 for 96% of all data) now reveals much detail concerning substrate and solvent interactions with the enzyme. Structures of the complexes of TK with four halogen-containing substrate analogs have also been solved, to resolutions better than 2.4 Å. The various TK inhibitors broadly fall into three groups which together probe the space of the enzyme active site in a manner that no one molecule does alone, so giving a composite picture of active site interactions that can be exploited in the design of novel compounds. Proteins 32:350–361, 1998. © 1998 Wiley-Liss, Inc.

Journal ArticleDOI
01 Jun 1998-Proteins
TL;DR: The doublet X‐Pro, with Pro at C′ position and extended backbone conformation for the X residue at Ccap, appears to be a common structural motif for termination of α‐helices, in addition to the Schellman motif.
Abstract: An analysis of the amino acid distributions at 15 positions, viz., N", N', Ncap, N1, N2, N3, N4, Mid, C4, C3, C2, C1, Ccap, C', and C" in 1,131 alpha-helices reveals that each position has its own unique characteristics. In general, natural helix sequences optimize by identifying the residues to be avoided at a given position and minimizing the occurrence of these avoided residues rather than by maximizing the preferred residues at various positions. Ncap is most selective in its choice of residues, with six amino acids (S, D, T, N, G, and P) being preferred at this position and another 11 (V, I, F, A, K, L, Y, R, E, M, and Q) being strongly avoided. Ser, Asp, and Thr are all more preferred at the Ncap position than Asn, whose role at the helix N-terminus has been highlighted by earlier analyses. Furthermore, Asn is also found to be almost equally preferred at the helix C-terminus and a novel structural motif is identified, involving a hydrogen bond formed by Nδ2 of Asn at the Ccap or C1 position, with the backbone carbonyl oxygen four residues inside the helix. His also forms a similar motif at the C-terminus. Pro is the most avoided residue in the main body (N4 to C4 positions) and at the C-terminus, including Ccap, of an alpha-helix. In 1,131 alpha-helices, no helix contains Pro at the C3 or C2 positions. However, Pro is highly favoured at N1 and C'. The doublet X-Pro, with Pro at the C' position and extended backbone conformation for the X residue at Ccap, appears to be a common structural motif for termination of alpha-helices, in addition to the Schellman motif. The main body of the helix shows a high preference for aliphatic residues Ala, Leu, Val, and Ile, while these are avoided at helix termini. A propensity scale for amino acids to occur in the middle of helices has been obtained. Comparison of this scale with several previously reported scales shows that this scale correlates best with the experimentally determined values.
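
The idea of positional preference and avoidance can be made concrete with a small sketch. It uses a common ratio-style definition of propensity (frequency at a position divided by overall frequency); the abstract does not state the formula used in the paper, so this definition, the function name, and the toy sequences are all assumptions for illustration only.

```python
from collections import Counter

def positional_propensity(residues_at_position, all_residues):
    """Frequency of each amino acid at one helix position divided by its overall frequency.

    Values above 1 suggest preference at that position, values below 1 avoidance.
    """
    at_position = Counter(residues_at_position)
    overall = Counter(all_residues)
    n_position = len(residues_at_position)
    n_all = len(all_residues)
    return {
        aa: (at_position[aa] / n_position) / (overall[aa] / n_all)
        for aa in overall
    }

# Toy data, not from the paper: four hypothetical Ncap residues out of eight residues overall.
toy_ncap = list("SSDT")
toy_all = list("SSDTAAAA")
toy_scale = positional_propensity(toy_ncap, toy_all)
```

In this toy set Ser scores above 1 and Ala scores 0, mirroring the kind of preferred/avoided split the abstract describes for Ncap.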

Journal ArticleDOI
01 Aug 1998-Proteins
TL;DR: The hinge‐bending docking approach and the insight into flexibility it provides on a complex of the calmodulin with its M13 ligand are illustrated, expanding the repertoire of computational docking tools foreseen to aid in studies of recognition, conformational flexibility and drug design.
Abstract: Here we dock a ligand onto a receptor surface allowing hinge-bending domain/substructural movements. Our approach mimics and manifests induced fit in molecular recognition. All angular rotations are allowed on the one hand, while a conformational space search is avoided on the other. Rather than dock each of the molecular parts separately with subsequent reconstruction of the consistently docked molecules, all parts are docked simultaneously while still utilizing the position of the hinge from the start. Like pliers closing on a screw, the receptor automatically closes on its ligand in the best surface-matching way. Movements are allowed either in the ligand or in the larger receptor, hence reproducing induced molecular fit. Hinge bending movements are frequently observed when molecules associate. There are numerous examples of open versus closed conformations taking place upon binding. Such movements are observed when the substrate binds to its respective enzyme. In particular, such movements are of interest in allosteric enzymes. The movements can involve entire domains, subdomains, loops, (other) secondary structure elements, or any groups of atoms connected by flexible joints. We have implemented the hinges at points and at bonds. By allowing 3-dimensional (3-D) rotation at the hinge, several rotations about (consecutive or nearby) bonds are implicitly taken into account. Alternatively, if required, the point rotation can be restricted to bond rotation. Here we illustrate this hinge-bending docking approach and the insight into flexibility it provides on a complex of the calmodulin with its M13 ligand, positioning the hinges either in the ligand or in the larger receptor. This automated and efficient method is adapted from computer vision and robotics. It enables utilizing entire molecular surfaces rather than focusing a priori on active sites. Hence, it allows attaining the overall optimally matching surfaces, along with the extent and type of motions which are involved. Here we do not treat the conformational flexibility of side-chains or of very small pieces of the molecules. Therefore, currently available methods addressing these issues and the method presented here are complementary to each other, expanding the repertoire of computational docking tools foreseen to aid in studies of recognition, conformational flexibility and drug design. Proteins 32:159-174, 1998.
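
The geometric operation underlying a hinge, a rigid rotation of one molecular part about an axis passing through the hinge point, can be sketched as follows. This shows only that rotation, using Rodrigues' formula; it is not the docking search described in the abstract, and the function name and argument layout are assumptions.

```python
import math

def rotate_about_hinge(points, hinge, axis, angle_deg):
    """Rotate 3-D points by angle_deg about an axis that passes through the hinge point."""
    norm = math.sqrt(sum(a * a for a in axis))
    kx, ky, kz = (a / norm for a in axis)  # unit rotation axis
    theta = math.radians(angle_deg)
    c, s = math.cos(theta), math.sin(theta)
    rotated = []
    for p in points:
        # Coordinates relative to the hinge.
        vx, vy, vz = p[0] - hinge[0], p[1] - hinge[1], p[2] - hinge[2]
        dot = kx * vx + ky * vy + kz * vz
        # Cross product k x v.
        cx = ky * vz - kz * vy
        cy = kz * vx - kx * vz
        cz = kx * vy - ky * vx
        # Rodrigues' rotation formula, then shift back to absolute coordinates.
        rx = vx * c + cx * s + kx * dot * (1 - c)
        ry = vy * c + cy * s + ky * dot * (1 - c)
        rz = vz * c + cz * s + kz * dot * (1 - c)
        rotated.append((rx + hinge[0], ry + hinge[1], rz + hinge[2]))
    return rotated
```

Restricting the axis to the direction of a covalent bond gives the bond-rotation case mentioned in the abstract, while letting the axis vary freely gives the general point hinge.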

Journal ArticleDOI
Mark Gerstein
01 Dec 1998-Proteins
TL;DR: Eight microbial genomes are compared in terms of protein structure and patterns of fold usage (whether a given fold occurs in a particular organism); all the genomes appear to have similar usage patterns for membrane‐protein folds, according to a “Zipf‐like” law.
Abstract: Eight microbial genomes are compared in terms of protein structure. Specifically, yeast, H. influenzae, M. genitalium, M. jannaschii, Synechocystis, M. pneumoniae, H. pylori, and E. coli are compared in terms of patterns of fold usage: whether a given fold occurs in a particular organism. Of the ~340 soluble protein folds currently in the structure databank (PDB), 240 occur in at least one of the eight genomes, and 30 are shared amongst all eight. The shared folds are depleted in all-helical structure and enriched in mixed helix-sheet structure compared to the folds in the PDB. The top-10 most common of the shared 30 are enriched in superfolds, uniting many non-homologous sequence families, and are especially similar in overall architecture, eight having helices packed onto a central sheet. They are also very different from the common folds in the PDB, highlighting databank biases. Folds can be ranked in terms of expression as well as genome duplication. In yeast the top-10 most highly expressed folds are considerably different from the most highly duplicated folds. A tree can be constructed grouping genomes in terms of their shared folds. This has a remarkably similar topology to more conventional classifications, based on very different measures of relatedness. Finally, folds of membrane proteins can be analyzed through transmembrane-helix (TM) prediction. All the genomes appear to have similar usage patterns for these folds, with the occurrence of a particular fold falling off rapidly with increasing numbers of TM-elements, according to a "Zipf-like" law. This implies there are no marked preferences for proteins with particular numbers of TM-helices (e.g. 7-TM) in microbial genomes. Further information pertinent to this analysis is available at http://bioinfo.mbb.yale.edu/genome. Proteins 33:518-534, 1998. © 1998 Wiley-Liss, Inc.
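
One way to check a "Zipf-like" fall-off is to fit a power law to occurrence counts on log-log axes. The sketch below assumes the law can be read as count ≈ a · n^(−b), where n is the number of TM-helices; that reading, the function name, and the synthetic counts are assumptions and are not data or methods from the paper.

```python
import math

def fit_power_law(xs, ys):
    """Least-squares fit of log(y) = log(a) - b * log(x); returns (a, b)."""
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    slope = (
        sum((u - mean_x) * (v - mean_y) for u, v in zip(log_x, log_y))
        / sum((u - mean_x) ** 2 for u in log_x)
    )
    a = math.exp(mean_y - slope * mean_x)
    return a, -slope

# Synthetic counts generated from an exact inverse-square law, NOT genome data.
tm_numbers = [1, 2, 3, 4, 5, 6, 7]
counts = [1000.0 / n ** 2 for n in tm_numbers]
a, b = fit_power_law(tm_numbers, counts)
```

On real counts, a marked preference for a particular number of TM-helices (such as 7-TM) would show up as a point lying well above the fitted line.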

Journal ArticleDOI
01 Oct 1998-Proteins
TL;DR: The three key challenges addressed in the development of SPECITOPE, a tool for screening large structural databases for potential ligands to a protein, are to eliminate infeasible candidates early in the search, incorporate ligand and protein side‐chain flexibility upon docking, and provide an appropriate rank for potential new ligands.
Abstract: The three key challenges addressed in our development of SPECITOPE, a tool for screening large structural databases for potential ligands to a protein, are to eliminate infeasible candidates early in the search, incorporate ligand and protein side-chain flexibility upon docking, and provide an appropriate rank for potential new ligands. The protein ligand-binding site is modeled by a shell of surface atoms and by hydrogen-bonding template points for the ligand to match, conferring specificity to the interaction. SPECITOPE combinatorially matches all hydrogen-bond donors and acceptors of the screened molecules to the template points. By eliminating molecules that cannot match distance or hydrogen-bond constraints, the transformation of potential docking candidates into the ligand-binding site and the shape and hydrophobic complementarity evaluations are only required for a small subset of the database. SPECITOPE screens 140,000 peptide fragments in about an hour and has identified and docked known inhibitors and potential new ligands to the free structures of four distinct targets: a serine protease, a DNA repair enzyme, an aspartic proteinase, and a glycosyltransferase. For all four, protein side-chain rotations were critical for successful docking, emphasizing the importance of inducible complementarity for accurately modeling ligand interactions. SPECITOPE has a range of potential applications for understanding and engineering protein recognition, from inhibitor and linker design to protein docking and macromolecular assembly.
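
The early-elimination step described in this abstract, rejecting molecules whose hydrogen-bonding atoms cannot satisfy the template's distance constraints, can be sketched as a brute-force check. This is not SPECITOPE's actual algorithm or data model: the function name, the `(kind, xyz)` tuple layout, and the 0.5 Å default tolerance are all hypothetical.

```python
import math
from itertools import permutations

def matches_template(ligand_atoms, template_points, tol=0.5):
    """Return True if some assignment of ligand atoms to template points agrees in
    atom kind and preserves every pairwise distance within tol.

    Both arguments are lists of (kind, (x, y, z)); kind is e.g. "donor" or "acceptor",
    and a template point's kind is the kind of ligand atom it expects.
    """
    k = len(template_points)
    if len(ligand_atoms) < k:
        return False  # too few hydrogen-bonding atoms to cover the template
    template_dist = {
        (i, j): math.dist(template_points[i][1], template_points[j][1])
        for i in range(k) for j in range(i + 1, k)
    }
    for assignment in permutations(range(len(ligand_atoms)), k):
        # Atom kinds must agree with what each template point expects.
        if any(ligand_atoms[a][0] != template_points[i][0]
               for i, a in enumerate(assignment)):
            continue
        # All pairwise distances must agree within the tolerance.
        if all(
            abs(math.dist(ligand_atoms[assignment[i]][1],
                          ligand_atoms[assignment[j]][1]) - d) <= tol
            for (i, j), d in template_dist.items()
        ):
            return True
    return False
```

Because the check uses only internal distances, it is independent of where the candidate sits in space, so the more expensive transformation into the binding site and the complementarity scoring are needed only for molecules that pass.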

Journal ArticleDOI
01 Sep 1998-Proteins
TL;DR: Comparative simplicity, high affinity and specificity, potential to reach and interact with active sites, and ability to mimic substrate suggest that camel heavy‐chain antibodies present advantages over classic antibodies in the design, production, and application of clinically valuable compounds.
Abstract: Whereas antibodies have demonstrated the ability to mimic various compounds, classic heavy/light-chain antibodies may be limited in their applications. First, they tend not to bind enzyme active site clefts. Second, their size and complexity present problems in identifying key elements for binding and in using these elements to produce clinically valuable compounds. We have previously shown how cAb-Lys3, a single variable domain fragment derived from a lysozyme-specific camel antibody naturally lacking light chains, overcomes the first limitation to become the first antibody structure observed penetrating an enzyme active site. We now demonstrate how cAb-Lys3 mimics the oligosaccharide substrate functionally (inhibition constant for lysozyme, 50 nM) and structurally (lysozyme buried surface areas, hydrogen bond partners, and hydrophobic contacts are similar to those seen in sugar-complexed structures). Most striking is the mimicry by the antibody complementarity determining region 3 (CDR3) loop, especially Ala104, which mimics the subsite C sugar 2-acetamido group; this group has previously been identified as a key feature in binding lysozyme. Comparative simplicity, high affinity and specificity, potential to reach and interact with active sites, and ability to mimic substrate suggest that camel heavy-chain antibodies present advantages over classic antibodies in the design, production, and application of clinically valuable compounds.

Journal ArticleDOI
01 Sep 1998-Proteins
TL;DR: The reliability and robustness of the new method should enable its routine application in model building protocols based on various (very sparse) experimentally derived structural restraints, and increasing the number of tertiary restraints improves the accuracy of the assembled structures.
Abstract: A new, efficient method for the assembly of protein tertiary structure from known, loosely encoded secondary structure restraints and sparse information about exact side chain contacts is proposed and evaluated. The method is based on a new, very simple method for the reduced modeling of protein structure and dynamics, where the protein is described as a lattice chain connecting side chain centers of mass rather than Cα atoms. The model has implicit built-in multibody correlations that simulate short- and long-range packing preferences, hydrogen bonding cooperativity and a mean force potential describing hydrophobic interactions. Due to the simplicity of the protein representation and definition of the model force field, the Monte Carlo algorithm is at least an order of magnitude faster than previously published Monte Carlo algorithms for structure assembly. In contrast to existing algorithms, the new method requires a smaller number of tertiary restraints for successful fold assembly; on average, one for every seven residues as compared to one for every four residues. For example, for smaller proteins such as the B domain of protein G, the resulting structures have a coordinate root mean square deviation (cRMSD), which is about 3 Å from the experimental structure; for myoglobin, structures whose backbone cRMSD is 4.3 Å are produced, and for a 247-residue TIM barrel, the cRMSD of the resulting folds is about 6 Å. As would be expected, increasing the number of tertiary restraints improves the accuracy of the assembled structures. The reliability and robustness of the new method should enable its routine application in model building protocols based on various (very sparse) experimentally derived structural restraints.