
Showing papers in "Proteins in 2013"


Journal ArticleDOI
01 Dec 2013-Proteins
TL;DR: This article evaluates the performance of ClusPro 2.0 for targets 46–58 in Rounds 22–27 of CAPRI and confirms that ranking models based on cluster size can reliably identify the best near‐native conformations.
Abstract: The protein docking server ClusPro has been participating in critical assessment of prediction of interactions (CAPRI) since its introduction in 2004. This article evaluates the performance of ClusPro 2.0 for targets 46-58 in Rounds 22-27 of CAPRI. The analysis leads to a number of important observations. First, ClusPro reliably yields acceptable or medium accuracy models for targets of moderate difficulty that have also been successfully predicted by other groups, and fails only for targets that have few acceptable models submitted. Second, the quality of automated docking by ClusPro is very close to that of the best human predictor groups, including our own submissions. This is very important, because servers have to submit results within 48 h and the predictions should be reproducible, whereas human predictors have several weeks and can use any type of information. Third, while we refined the ClusPro results for manual submission by running computationally costly Monte Carlo minimization simulations, we observed significant improvement in accuracy only for two of the six complexes correctly predicted by ClusPro. Fourth, new developments, not seen in previous rounds of CAPRI, are that the top ranked model provided by ClusPro was acceptable or better quality for all these six targets, and that the top ranked model was also the highest quality for five of the six, confirming that ranking models based on cluster size can reliably identify the best near-native conformations.

550 citations
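
The cluster-size ranking mentioned in the ClusPro abstract above can be illustrated with a minimal sketch. It assumes a precomputed symmetric matrix of pairwise RMSD values between docked poses; the greedy clustering scheme, the 9 Å default radius, and the function name `rank_by_cluster_size` are illustrative assumptions and not taken from the abstract or from the ClusPro code.

```python
import numpy as np


def rank_by_cluster_size(rmsd, radius=9.0):
    """Greedy clustering of docked poses (illustrative sketch, hypothetical helper).

    rmsd   : symmetric (n, n) array of pairwise RMSD values, zeros on the diagonal
    radius : assumed clustering radius in the same units as rmsd

    Returns a list of (center_index, member_indices), largest cluster first.
    """
    rmsd = np.asarray(rmsd, dtype=float)
    n = rmsd.shape[0]
    unassigned = np.ones(n, dtype=bool)
    clusters = []
    while unassigned.any():
        # Neighbour relation restricted to poses not yet assigned to a cluster.
        within = (rmsd <= radius) & unassigned[:, None] & unassigned[None, :]
        counts = within.sum(axis=1)
        # The pose with the most remaining neighbours becomes the next center.
        center = int(np.argmax(counts))
        members = np.flatnonzero(within[center])
        clusters.append((center, members.tolist()))
        unassigned[members] = False
    return clusters
```

Because each step removes poses from the pool, cluster sizes can only shrink from one step to the next, so the returned list is already ordered from the most to the least populated cluster. Under this scheme the center of the first cluster would be the top-ranked model.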


Journal ArticleDOI
01 Dec 2013-Proteins
TL;DR: The fifth evaluation of docking and related scoring methods used in the community‐wide experiment on the Critical Assessment of Predicted Interactions (CAPRI) finds that automatic docking servers exhibit a significantly improved performance, with some servers now performing on par with predictions done by humans.
Abstract: We present the fifth evaluation of docking and related scoring methods used in the community-wide experiment on the Critical Assessment of Predicted Interactions (CAPRI). The evaluation examined predictions submitted for a total of 15 targets in eight CAPRI rounds held during the years 2010-2012. The targets represented one of the most diverse sets tackled by the CAPRI community so far. They included only 10 "classical" docking and scoring problems. In one of the classical targets, the new challenge was to predict the position of water molecules in the protein-protein interface. The remaining five targets represented other new challenges that involved estimating the relative binding affinity and the effect of point mutations on the stability of designed and natural protein-protein complexes. Although the 10 classical CAPRI targets included two difficult multicomponent systems, and a protein-oligosaccharide complex with which CAPRI participants had little experience, this evaluation indicates that the performance of docking and scoring methods has remained quite robust. More remarkably, we find that automatic docking servers exhibit a significantly improved performance, with some servers now performing on par with predictions done by humans. The performance of CAPRI participants in the new challenges, briefly reviewed here, was mediocre overall, but some groups did relatively well and their approaches suggested ways of improving methods for designing binders and for estimating the free energies of protein assemblies, which should impact the field of protein modeling and design as a whole.

222 citations


Journal ArticleDOI
01 Feb 2013-Proteins
TL;DR: A gapless‐threading method to generate position‐specific structure fragments is developed and it is found that the optimal fragment length for structural assembly is around 10, and at least 100 fragments at each location are needed to achieve optimal structure assembly.
Abstract: Fragment assembly using structural motifs excised from other solved proteins has been shown to be an efficient method for ab initio protein-structure prediction. However, how to construct accurate fragments, how to derive optimal restraints from fragments, and what the best fragment length is are the basic issues yet to be systematically examined. In this work, we developed a gapless-threading method to generate position-specific structure fragments. Distance profiles and torsion angle pairs are then derived from the fragments by statistical consistency analysis, which achieved accuracy comparable to that of the machine-learning-based methods although the fragments were taken from unrelated proteins. When measured by both accuracies of the derived distance profiles and torsion angle pairs, we come to a consistent conclusion that the optimal fragment length for structural assembly is around 10, and at least 100 fragments at each location are needed to achieve optimal structure assembly. The distance profiles and torsion angle pairs derived from the fragments have been successfully used in QUARK for ab initio protein structure assembly and are provided by the QUARK online server at http://zhanglab.ccmb.med.umich.edu/QUARK/. Proteins 2013. © 2012 Wiley Periodicals, Inc.

198 citations


Journal ArticleDOI
01 Nov 2013-Proteins
TL;DR: A new E. coli expression strain, LOBSTR (low background strain), eliminates the most abundant contaminants of His-tag purification; it is derived from the E. coli BL21(DE3) strain and carries genomically modified copies of arnA and slyD, whose protein products exhibit reduced affinities to Ni and Co resins.
Abstract: His-tag affinity purification is one of the most commonly used methods to purify recombinant proteins expressed in E. coli. One drawback of using the His-tag is the co-purification of contaminating histidine-rich E. coli proteins. We engineered a new E. coli expression strain, LOBSTR (low background strain), which eliminates the most abundant contaminants. LOBSTR is derived from the E. coli BL21(DE3) strain and carries genomically modified copies of arnA and slyD, whose protein products exhibit reduced affinities to Ni and Co resins, resulting in a much higher purity of the target protein. The use of LOBSTR enables the pursuit of challenging low-expressing protein targets by reducing background contamination with no additional purification steps, materials, or costs, and thus pushes the limits of standard His-tag purifications.

170 citations


Journal ArticleDOI
01 Jan 2013-Proteins
TL;DR: A two‐step refinement protocol, called 3Drefine, to consistently bring the initial model closer to the native structure, which has been evaluated on the CASP benchmark data and it exhibits consistent improvement over the initial structure in both global and local structural quality measures.
Abstract: One of the major limitations of computational protein structure prediction is the deviation of predicted models from their experimentally derived true, native structures. The limitations often hinder the possibility of applying computational protein structure prediction methods in biochemical assignment and drug design that are very sensitive to structural details. Refinement of these low-resolution predicted models to high-resolution structures close to the native state, however, has proven to be extremely challenging. Thus, protein structure refinement remains a largely unsolved problem. Critical assessment of techniques for protein structure prediction (CASP) specifically indicated that most predictors participating in the refinement category still did not consistently improve model quality. Here, we propose a two-step refinement protocol, called 3Drefine, to consistently bring the initial model closer to the native structure. The first step is based on optimization of the hydrogen bonding (HB) network and the second step applies atomic-level energy minimization on the optimized model using a composite physics- and knowledge-based force field. The approach has been evaluated on the CASP benchmark data and it exhibits consistent improvement over the initial structure in both global and local structural quality measures. The 3Drefine method is also computationally inexpensive, consuming only a few minutes of CPU time to refine a protein of typical length (300 residues). The 3Drefine web server is freely available at http://sysbio.rnet.missouri.edu/3Drefine/.

157 citations


Journal ArticleDOI
01 Jan 2013-Proteins
TL;DR: CAD‐score, a new evaluation function quantifying differences between physical contacts in a model and the reference structure, reveals a balanced assessment of domain rearrangement, removing the necessity for different treatment of single‐domain, multi‐domain, and multi‐subunit structures.
Abstract: Evaluation of protein models against the native structure is essential for the development and benchmarking of protein structure prediction methods. Although a number of evaluation scores have been proposed to date, many aspects of model assessment still lack desired robustness. In this study we present CAD-score, a new evaluation function quantifying differences between physical contacts in a model and the reference structure. The new score uses the concept of residue-residue contact area difference (CAD) introduced by Abagyan and Totrov (J Mol Biol 1997; 268:678-685). Contact areas, the underlying basis of the score, are derived using the Voronoi tessellation of protein structure. The newly introduced CAD-score is a continuous function, confined within fixed limits, free of any arbitrary thresholds or parameters. The built-in logic for treatment of missing residues allows consistent ranking of models of any degree of completeness. We tested CAD-score on a large set of diverse models and compared it to GDT-TS, a widely accepted measure of model accuracy. Similarly to GDT-TS, CAD-score showed a robust performance on single-domain proteins, but displayed a stronger preference for physically more realistic models. Unlike GDT-TS, the new score revealed a balanced assessment of domain rearrangement, removing the necessity for different treatment of single-domain, multi-domain, and multi-subunit structures. Moreover, CAD-score makes it possible to assess the accuracy of inter-domain or inter-subunit interfaces directly. In addition, the approach offers an alternative to the superposition-based model clustering. The CAD-score implementation is available both as a web server and a standalone software package at http://www.ibt.lt/bioinformatics/cad-score/.

118 citations
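
The contact-area comparison described in the CAD-score abstract above can be sketched as follows. The abstract does not give the formula, so the form used here (each per-contact difference capped at the reference area, then normalized by the total reference contact area) is an assumption about the definition and should be checked against the paper. The contact areas are taken as precomputed inputs rather than derived from a Voronoi tessellation, and the name `cad_score` is a hypothetical helper.

```python
def cad_score(ref_areas, model_areas):
    """Contact-area-difference score in [0, 1] (illustrative sketch).

    ref_areas   : dict mapping a residue pair, e.g. (12, 47), to its contact
                  area in the reference structure
    model_areas : same mapping for the model; pairs missing from the model
                  are treated as having zero contact area

    Assumed form: 1 - sum(min(|T - M|, T)) / sum(T), summed over reference contacts.
    """
    total_ref = sum(ref_areas.values())
    if total_ref <= 0:
        raise ValueError("reference structure has no contact area")
    bounded_diff = sum(
        min(abs(t - model_areas.get(pair, 0.0)), t)
        for pair, t in ref_areas.items()
    )
    return 1.0 - bounded_diff / total_ref
```

Capping each difference at the reference area is what keeps the score within fixed limits: a model that reproduces every reference contact exactly scores 1, and a model with none of the reference contacts scores 0.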


Journal ArticleDOI
01 Apr 2013-Proteins
TL;DR: The notion of surrounding hydrophobicity, which characterizes the hydrophobic behavior of residues in a protein environment, has been applied to the three‐dimensional structures of elongation factor‐Tu proteins and it is found that the thermophilic proteins are enriched with a hydrophobic environment.
Abstract: The stability of thermophilic proteins has been viewed from different perspectives and there is yet no unified principle to understand this stability. It would be valuable to reveal the most important interactions for designing thermostable proteins for such applications as industrial protein engineering. In this work, we have systematically analyzed the importance of various interactions by computing different parameters such as surrounding hydrophobicity, inter-residue interactions, ion-pairs and hydrogen bonds. The importance of each interaction has been determined by its predicted relative contribution in thermophiles versus the same contribution in mesophilic homologues based on a dataset of 373 protein families. We predict that hydrophobic environment is the major factor for the stability of thermophilic proteins and found that 80% of thermophilic proteins analyzed showed higher hydrophobicity than their mesophilic counterparts. Ion pairs, hydrogen bonds, and interaction energy are also important and favored in 68%, 50%, and 62% of thermophilic proteins, respectively. Interestingly, thermophilic proteins with decreased hydrophobic environments display a greater number of hydrogen bonds and/or ion pairs. The systematic elimination of mesophilic proteins based on surrounding hydrophobicity, interaction energy, and ion pairs/hydrogen bonds, led to correctly identifying 95% of the thermophilic proteins in our analyses. Our analysis was also applied to another, more refined set of 102 thermophilic-mesophilic pairs, which again identified hydrophobicity as a dominant property in 71% of the thermophilic proteins. Further, the notion of surrounding hydrophobicity, which characterizes the hydrophobic behavior of residues in a protein environment, has been applied to the three-dimensional structures of elongation factor-Tu proteins and we found that the thermophilic proteins are enriched with a hydrophobic environment. 
The results obtained in this work highlight the importance of hydrophobicity as the dominating characteristic in the stability of thermophilic proteins, and we anticipate this will be useful in our attempts to engineer thermostable proteins.

106 citations


Journal ArticleDOI
01 Dec 2013-Proteins
TL;DR: This work has developed a case‐based reasoning approach called KBDOCK which systematically identifies and reuses domain family binding sites from the authors' database of nonredundant DDIs and provides a near‐perfect way to model single‐domain protein complexes when full‐homology templates are available.
Abstract: Protein docking algorithms aim to calculate the three-dimensional (3D) structure of a protein complex starting from its unbound components. Although ab initio docking algorithms are improving, there is a growing need to use homology modeling techniques to exploit the rapidly increasing volumes of structural information that now exist. However, most current homology modeling approaches involve finding a pair of complete single-chain structures in a homologous protein complex to use as a 3D template, despite the fact that protein complexes are often formed from one or more domain-domain interactions (DDIs). To model 3D protein complexes by domain-domain homology, we have developed a case-based reasoning approach called KBDOCK which systematically identifies and reuses domain family binding sites from our database of nonredundant DDIs. When tested on 54 protein complexes from the Protein Docking Benchmark, our approach provides a near-perfect way to model single-domain protein complexes when full-homology templates are available, and it extends our ability to model more difficult cases when only partial or incomplete templates exist. These promising early results highlight the need for a new and diverse docking benchmark set, specifically designed to assess homology docking approaches.

104 citations


Journal ArticleDOI
01 Nov 2013-Proteins
TL;DR: A community‐wide assessment of methods to predict the effects of mutations on protein–protein interactions found that large‐scale fitness landscapes should continue to provide an excellent test bed for continued evaluation of both existing and new prediction methodologies.
Abstract: Community-wide blind prediction experiments such as CAPRI and CASP provide an objective measure of the current state of predictive methodology. Here we describe a community-wide assessment of methods to predict the effects of mutations on protein-protein interactions. Twenty-two groups predicted the effects of comprehensive saturation mutagenesis for two designed influenza hemagglutinin binders and the results were compared with experimental yeast display enrichment data obtained using deep sequencing. The most successful methods explicitly considered the effects of mutation on monomer stability in addition to binding affinity, carried out explicit side-chain sampling and backbone relaxation, evaluated packing, electrostatic, and solvation effects, and correctly identified around a third of the beneficial mutations. Much room for improvement remains for even the best techniques, and large-scale fitness landscapes should continue to provide an excellent test bed for continued evaluation of both existing and new prediction methodologies.

100 citations


Journal ArticleDOI
01 Jan 2013-Proteins
TL;DR: A new algorithm, DEEPer (Dead‐End Elimination with Perturbations), that combines the new abilities to handle arbitrarily large backbone perturbations and to generate ensembles of backbone conformations, providing significant advantages for modeling protein mutations and protein–ligand interactions.
Abstract: Computational protein and drug design generally require accurate modeling of protein conformations. This modeling typically starts with an experimentally determined protein structure and considers possible conformational changes due to mutations or new ligands. The DEE/A* algorithm provably finds the global minimum-energy conformation (GMEC) of a protein assuming that the backbone does not move and the sidechains take on conformations from a set of discrete, experimentally observed conformations called rotamers. DEE/A* can efficiently find the overall GMEC for exponentially many mutant sequences. Previous improvements to DEE/A* include modeling ensembles of sidechain conformations and either continuous sidechain or backbone flexibility. We present a new algorithm, DEEPer (Dead-End Elimination with Perturbations), that combines these advantages and can also handle much more extensive backbone flexibility and backbone ensembles. DEEPer provably finds the GMEC or, if desired by the user, all conformations and sequences within a specified energy window of the GMEC. It includes the new abilities to handle arbitrarily large backbone perturbations and to generate ensembles of backbone conformations. It also incorporates the shear, an experimentally observed local backbone motion never before used in design. Additionally, we derive a new method to accelerate DEE/A*-based calculations, indirect pruning, that is particularly useful for DEEPer. In 67 benchmark tests on 64 proteins, DEEPer consistently identified lower-energy conformations than previous methods did, indicating more accurate modeling. Additional tests demonstrated its ability to incorporate larger, experimentally observed backbone conformational changes and to model realistic conformational ensembles. These capabilities provide significant advantages for modeling protein mutations and protein–ligand interactions. Proteins 2013. © 2012 Wiley Periodicals, Inc.

96 citations
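
For readers unfamiliar with dead-end elimination, the pruning idea behind the DEE/A* family mentioned in the abstract above can be sketched on a toy fixed-backbone problem. The code below uses the classic Goldstein elimination criterion; it is not DEEPer and includes neither perturbations, A* search, nor indirect pruning. The data layout (`self_e`, `pair_e`) and the function names are hypothetical choices made for this illustration.

```python
def pair_energy(pair_e, i, r, j, s):
    """Pairwise energy between rotamer r at position i and rotamer s at position j.

    pair_e stores one table per position pair (a, b) with a < b, indexed [rot_a][rot_b].
    """
    return pair_e[(i, j)][r][s] if i < j else pair_e[(j, i)][s][r]


def goldstein_dee(self_e, pair_e):
    """Prune rotamers that provably cannot be part of the GMEC (illustrative sketch).

    self_e : self_e[i][r] is the self energy of rotamer r at position i
    pair_e : dict of pairwise energy tables, see pair_energy

    Rotamer r at position i is pruned if some competitor t satisfies
    E(i_r) - E(i_t) + sum_j min_s [E(i_r, j_s) - E(i_t, j_s)] > 0.
    Returns a list of sets with the surviving rotamer indices per position.
    """
    n = len(self_e)
    alive = [set(range(len(energies))) for energies in self_e]
    changed = True
    while changed:  # repeat until a full pass prunes nothing
        changed = False
        for i in range(n):
            for r in sorted(alive[i]):
                for t in sorted(alive[i] - {r}):
                    gap = self_e[i][r] - self_e[i][t]
                    for j in range(n):
                        if j != i:
                            gap += min(
                                pair_energy(pair_e, i, r, j, s)
                                - pair_energy(pair_e, i, t, j, s)
                                for s in alive[j]
                            )
                    if gap > 0:  # t is strictly better than r in every context
                        alive[i].discard(r)
                        changed = True
                        break
    return alive
```

Because a rotamer is removed only when a surviving competitor is strictly better in every context, at least one rotamer always remains at each position and the global minimum is never pruned.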


Journal ArticleDOI
01 Aug 2013-Proteins
TL;DR: Past studies exploring the stability, folding, and misfolding behavior of SOD1 are discussed, as well as the therapeutic possibilities of using detailed knowledge of misfolding pathways to target the molecular mechanisms underlying ALS and other neurodegenerative diseases.
Abstract: Enormous strides have been made in the last 100 years to extend human life expectancy and to combat the major infectious diseases. Today, the major challenges for medical science are age-related diseases, including cancer, heart disease, lung disease, renal disease, and late-onset neurodegenerative disease. Of these, only the neurodegenerative diseases represent a class of disease so poorly understood that no general strategies for prevention or treatment exist. These diseases, which include Alzheimer's disease, Parkinson's disease, Huntington's disease, the transmissible spongiform encephalopathies, and amyotrophic lateral sclerosis (ALS), are generally fatal and incurable. The first section of this review summarizes the diversity and common features of the late-onset neurodegenerative diseases, with a particular focus on protein misfolding and aggregation, a recurring theme in the molecular pathology. The second section focuses on the particular case of ALS, a late-onset neurodegenerative disease characterized by the death of central nervous system motor neurons, leading to paralysis and patient death. Of the 10% of ALS cases that show familial inheritance (familial ALS), the largest subset is caused by mutations in the SOD1 gene, encoding the Cu, Zn superoxide dismutase (SOD1). The unusual kinetic stability of SOD1 has provided a unique opportunity for detailed structural characterization of conformational states potentially involved in SOD1-associated ALS. This review discusses past studies exploring the stability, folding, and misfolding behavior of SOD1, as well as the therapeutic possibilities of using detailed knowledge of misfolding pathways to target the molecular mechanisms underlying ALS and other neurodegenerative diseases.

Journal ArticleDOI
01 Dec 2013-Proteins
TL;DR: The PeptiMap protocol, a protocol for the accurate mapping of peptide binding sites on protein structures, is presented, based on experimental evidence that peptide‐binding sites also bind small organic molecules of various shapes and polarity.
Abstract: Peptide-mediated interactions, in which a short linear motif binds to a globular domain, play major roles in cellular regulation. An accurate structural model of this type of interaction is an excellent starting point for the characterization of the binding specificity of a given peptide-binding domain. A number of different protocols have recently been proposed for the accurate modeling of peptide-protein complex structures, given the structure of the protein receptor and the binding site on its surface. When no information about the peptide binding site(s) is a priori available, there is a need for new approaches to locate peptide-binding sites on the protein surface. While several approaches have been proposed for the general identification of ligand binding sites, peptides show very specific binding characteristics, and therefore, there is a need for robust and accurate approaches that are optimized for the prediction of peptide-binding sites. Here, we present PeptiMap, a protocol for the accurate mapping of peptide binding sites on protein structures. Our method is based on experimental evidence that peptide-binding sites also bind small organic molecules of various shapes and polarity. Using an adaptation of ab initio ligand binding site prediction based on fragment mapping (FTmap), we optimize a protocol that specifically takes into account peptide binding site characteristics. In a high-quality curated set of peptide-protein complex structures, PeptiMap identifies the correct peptide binding site among the top-ranked predictions for most complexes. We anticipate that this protocol will significantly increase the number of accurate structural models of peptide-mediated interactions.

Journal ArticleDOI
01 Jul 2013-Proteins
TL;DR: This approach not only eliminates the necessity to create a consensus prediction from possibly contradicting outputs of several predictors but bears the potential to predict conformational switches, i.e., sequence regions that have a high probability to change for example from a coil conformation in solution to an α‐helical transmembrane state.
Abstract: Prediction of transmembrane spans and secondary structure from the protein sequence is generally the first step in the structural characterization of (membrane) proteins. Preference of a stretch of amino acids in a protein to form secondary structure and being placed in the membrane are correlated. Nevertheless, current methods predict either secondary structure or individual transmembrane states. We introduce a method that simultaneously predicts the secondary structure and transmembrane spans from the protein sequence. This approach not only eliminates the necessity to create a consensus prediction from possibly contradicting outputs of several predictors but bears the potential to predict conformational switches, i.e., sequence regions that have a high probability to change for example from a coil conformation in solution to an α-helical transmembrane state. An artificial neural network was trained on databases of 177 membrane proteins and 6048 soluble proteins. The output is a 3 × 3 dimensional probability matrix for each residue in the sequence that combines three secondary structure types (helix, strand, coil) and three environment types (membrane core, interface, solution). The prediction accuracies are 70.3% for nine possible states, 73.2% for three-state secondary structure prediction, and 94.8% for three-state transmembrane span prediction. These accuracies are comparable to state-of-the-art predictors of secondary structure (e.g., Psipred) or transmembrane placement (e.g., OCTOPUS). The method is available as web server and for download at www.meilerlab.org.
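
The per-residue 3 × 3 output described in the abstract above can be reduced to the two three-state predictions by summing over rows or columns. The sketch below assumes rows index secondary structure and columns index environment; that orientation, the label tuples, and the function name `summarize_residue` are illustrative assumptions, not details of the published method.

```python
import numpy as np

# Assumed ordering of the matrix axes (not specified in the abstract).
SS_TYPES = ("helix", "strand", "coil")
ENV_TYPES = ("membrane core", "interface", "solution")


def summarize_residue(prob_matrix):
    """Collapse a 3 x 3 joint probability matrix into per-residue calls.

    prob_matrix[i][j] is taken to be the probability of secondary structure
    SS_TYPES[i] in environment ENV_TYPES[j]; the matrix is renormalized to sum to 1.
    """
    p = np.asarray(prob_matrix, dtype=float)
    if p.shape != (3, 3):
        raise ValueError("expected a 3 x 3 matrix")
    p = p / p.sum()
    ss_marginal = p.sum(axis=1)   # sum over environments
    env_marginal = p.sum(axis=0)  # sum over secondary structure types
    i, j = np.unravel_index(int(np.argmax(p)), p.shape)
    return {
        "secondary_structure": SS_TYPES[int(np.argmax(ss_marginal))],
        "environment": ENV_TYPES[int(np.argmax(env_marginal))],
        "joint_state": (SS_TYPES[int(i)], ENV_TYPES[int(j)]),
    }
```

Keeping the joint matrix rather than two separate predictions is what makes potential conformational switches visible: a residue with comparable probability for coil in solution and helix in the membrane core would show up as two competing cells.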

Journal ArticleDOI
01 Nov 2013-Proteins
TL;DR: While decoupling sampling and scoring facilitates method development, integration of the two steps can lead to substantial improvements in docking results, and some algorithms that achieve a certain level of integration are discussed.
Abstract: Most structure prediction algorithms consist of initial sampling of the conformational space, followed by rescoring and possibly refinement of a number of selected structures. Here we focus on protein docking, and show that while decoupling sampling and scoring facilitates method development, integration of the two steps can lead to substantial improvements in docking results. Since decoupling is usually achieved by generating a decoy set containing both non-native and near-native docked structures, which can be then used for scoring function construction, we first review the roles and potential pitfalls of decoys in protein-protein docking, and show that some types of decoys are better than others for method development. We then describe three case studies showing that complete decoupling of scoring from sampling is not the best choice for solving realistic docking problems. Although some of the examples are based on our own experience, the results of the CAPRI docking and scoring experiments also show that performing both sampling and scoring generally yields better results than scoring the structures generated by all predictors. Next we investigate how the selection of training and decoy sets affects the performance of the scoring functions obtained. Finally, we discuss pathways to better alignment of the two steps, and show some algorithms that achieve a certain level of integration. Although we focus on protein-protein docking, our observations most likely also apply to other conformational search problems, including protein structure prediction and the docking of small molecules to proteins.

Journal ArticleDOI
01 Apr 2013-Proteins
TL;DR: A novel method for fast in silico mutagenesis of protein–protein complexes to calculate the effect of mutation as a function of pH and a computational strategy to search for mutations that can alter the pH‐dependent binding behavior of IgG to FcRn with the aim of improving the half‐life of therapeutic antibodies in the target organism.
Abstract: Understanding the effects of mutation on pH-dependent protein binding affinity is important in protein design, especially in the area of protein therapeutics. We propose a novel method for fast in silico mutagenesis of protein-protein complexes to calculate the effect of mutation as a function of pH. The free energy differences between the wild type and mutants are evaluated from a molecular mechanics model, combined with calculations of the equilibria of proton binding. The predicted pH-dependent energy profiles demonstrate excellent agreement with experimentally measured pH-dependency of the effect of mutations on the dissociation constants for the complex of turkey ovomucoid third domain (OMTKY3) and proteinase B. The virtual scanning mutagenesis identifies all hotspots responsible for pH-dependent binding of immunoglobulin G (IgG) to neonatal Fc receptor (FcRn) and the results support the current understanding of the salvage mechanism of the antibody by FcRn based on pH-selective binding. The method can be used to select mutations that change the pH-dependent binding profiles of proteins and guide the time consuming and expensive protein engineering experiments. As an application of this method, we propose a computational strategy to search for mutations that can alter the pH-dependent binding behavior of IgG to FcRn with the aim of improving the half-life of therapeutic antibodies in the target organism.

Journal ArticleDOI
01 Jul 2013-Proteins
TL;DR: A 2.0 Å crystal structure of the human NLRP1 CARD, solved as a fusion with the maltose‐binding protein, reveals the six‐helix bundle fold typical of the death domain superfamily and suggests potential mechanisms for its association with procaspase‐1 through electrostatic attraction.
Abstract: The NLRP1 inflammasome responds to microbial challenges such as Bacillus anthracis infection and is implicated in autoimmune diseases such as vitiligo. Human NLRP1 contains both an N-terminal pyrin domain (PYD) and a C-terminal caspase recruitment domain (CARD), with the latter being essential for its association with the downstream effector procaspase-1. Here we report a 2.0 Å crystal structure of the human NLRP1 CARD as a fusion with the maltose-binding protein. The structure reveals the six-helix bundle fold of the NLRP1 CARD, typical of the death domain superfamily. The charge surface of the NLRP1 CARD structure and a procaspase-1 CARD model suggests potential mechanisms for their association through electrostatic attraction.

Journal ArticleDOI
01 Dec 2013-Proteins
TL;DR: This work shows, using previous CAPRI targets, that out of a variety of measures, the global sequence identity between template and target is a simple but reliable predictor of the achievable quality of the docking models, indicating that a well‐defined overall fold is critical for the interaction.
Abstract: Information-driven docking is currently one of the most successful approaches to obtain structural models of protein interactions as demonstrated in the latest round of CAPRI. While various experimental and computational techniques can be used to retrieve information about the binding mode, the availability of three-dimensional structures of the interacting partners remains a limiting factor. Fortunately, the wealth of structural information gathered by large-scale initiatives allows for homology-based modeling of a significant fraction of the protein universe. Defining the limits of information-driven docking based on such homology models is therefore highly relevant. Here we show, using previous CAPRI targets, that out of a variety of measures, the global sequence identity between template and target is a simple but reliable predictor of the achievable quality of the docking models. This indicates that a well-defined overall fold is critical for the interaction. Furthermore, the quality of the data at our disposal to characterize the interaction plays a determinant role in the success of the docking. Given reliable interface information we can obtain acceptable predictions even at low global sequence identity. These results, which define the boundaries between trustworthy and unreliable predictions, should guide both experts and nonexperts in defining the limits of what is achievable by docking. This is highly relevant considering that the fraction of the interactome amenable for docking is only bound to grow as the number of experimentally solved structures increases.

Journal ArticleDOI
01 Nov 2013-Proteins
TL;DR: This review focuses on the role of disulfide bonds in the stability, structure, oligomerization, and amyloidogenicity of natively folded or unfolded amyloidogenic proteins.
Abstract: More than 20 human diseases, including Alzheimer's disease, Parkinson's disease, and prion disease, originate from the deposition of misfolded proteins. These proteins, referred to as amyloidogenic proteins, adopt a β-sheet-rich structure when transformed from a soluble state into insoluble amyloid fibrils. Amyloid formation is influenced by a number of factors that affect the intermolecular interaction, including pH, temperature, ionic strength, and chemical bonds. In this review, we focus on the role of disulfide bonds in the stability, structure, oligomerization, and amyloidogenicity of natively folded or unfolded amyloidogenic proteins. The effects of introduced disulfide bonds on the amyloidogenicity of proteins lacking native disulfides are also reviewed. Proteins 2013; 81:1862–1873. © 2013 Wiley Periodicals, Inc.

Journal ArticleDOI
01 Sep 2013-Proteins
TL;DR: The solvation properties of the DFG‐out protein conformation are calculated using an explicit solvent molecular dynamics simulation and thermodynamic analysis method implemented in WaterMap to predict the enthalpic and entropic costs of water transfer to and from bulk solvent incurred upon association and dissociation of each inhibitor.
Abstract: In our previous work, we proposed that desolvation and resolvation of the binding sites of proteins can serve as the slowest steps during ligand association and dissociation, respectively, and tested this hypothesis on two protein-ligand systems with known binding kinetics behavior. In the present work, we test this hypothesis on another kinetically determined protein-ligand system, that of p38α and eight Type II BIRB 796 inhibitor analogs. The kon values among the inhibitor analogs are narrowly distributed (10⁴ ≤ kon ≤ 10⁵ M⁻¹ s⁻¹), suggesting a common rate-determining step, whereas the koff values are widely distributed (10⁻⁶ ≤ koff ≤ 10⁻¹ s⁻¹), suggesting a spectrum of rate-determining steps. We calculated the solvation properties of the DFG-out protein conformation using an explicit solvent molecular dynamics simulation and thermodynamic analysis method implemented in WaterMap to predict the enthalpic and entropic costs of water transfer to and from bulk solvent incurred upon association and dissociation of each inhibitor. The results suggest that the rate-determining step for association consists of the transfer of a common set of enthalpically favorable solvating water molecules from the binding site to bulk solvent. The rate-determining step for inhibitor dissociation consists of the transfer of water from bulk solvent to specific binding site positions that are unfavorably solvated in the apo protein, and evacuated during ligand association. Different sets of unfavorable solvation are evacuated by each ligand, and the observed dissociation barriers are qualitatively consistent with the calculated solvation free energies of those sets.

Journal ArticleDOI
01 Apr 2013-Proteins
TL;DR: A detailed analysis shows that coarse grained potentials perform better than atomic potentials for realistic unbound docking (where the exact structures of the individual bound proteins are unknown), probably because atomic potentials are more sensitive to local errors.
Abstract: An atomically detailed potential for docking pairs of proteins is derived using mathematical programming. A refinement algorithm that builds atomically detailed models of the complex and combines coarse grained and atomic scoring is introduced. The refinement step consists of remodeling the interface side chains of the top scoring decoys from rigid docking followed by a short energy minimization. The refined models are then re-ranked using a combination of coarse grained and atomic potentials. The docking algorithm, including the refinement and re-ranking, compares favorably with other leading docking packages such as ZDOCK, ClusPro, and PATCHDOCK on the ZLAB 3.0 Benchmark and a test set of 30 novel complexes. A detailed analysis shows that coarse grained potentials perform better than atomic potentials for realistic unbound docking (where the exact structures of the individual bound proteins are unknown), probably because atomic potentials are more sensitive to local errors. Nevertheless, the atomic potential captures a different signal from the residue potential, and as a result a combination of the two scores provides a significantly better prediction than either approach alone.

Journal ArticleDOI
01 Apr 2013-Proteins
TL;DR: It is suggested that salvianolic acid B can significantly inhibit the formation of hIAPP amyloid and disaggregate hIAPP fibrils, and photo‐crosslinking based oligomerization studies suggest SalB significantly suppresses the toxic oligomerization of hIAPP monomers.
Abstract: The misfolding of human islet amyloid polypeptide (hIAPP) is regarded as one of the causative factors of type 2 diabetes mellitus (T2DM). Salvia miltiorrhiza (Danshen), one of the most commonly used traditional Chinese medicines, is often used in compound recipes for treating diabetes, although its mechanisms are unclear. Because salvianolic acid B (SalB) is the most abundant bioactive ingredient of Salvia miltiorrhiza water extract, in this study we tested whether SalB has any effect on the amyloidogenicity of hIAPP. Our results clearly suggest that SalB can significantly inhibit the formation of hIAPP amyloid and disaggregate hIAPP fibrils. Furthermore, photo-crosslinking-based oligomerization studies suggest that SalB significantly suppresses the toxic oligomerization of hIAPP monomers. Cytotoxicity protection effects of SalB on pancreatic INS-1 cells were also observed using MTT-based assays, potentially due to inhibition of the membrane disruption and attenuation of the mitochondrial impairment induced by hIAPP. These results provide evidence that SalB merits further study as a possible pharmacological treatment for T2DM.

Journal ArticleDOI
01 Jun 2013-Proteins
TL;DR: The results indicate that synergic/antagonistic anti‐amyloid effects of studied mixtures depend on the selective binding of polyphenols to the known amyloidogenic sequences in the lysozyme chain.
Abstract: The amyloidoses are diseases associated with nonnative folding of proteins and characterized by the presence of protein amyloid aggregates. The ability of quercetin, resveratrol, caffeic acid, and their equimolar mixtures to affect amyloid aggregation of hen egg white lysozyme in vitro was detected by Thioflavin T fluorescence assay. The anti-amyloid activities of the tested polyphenols were evaluated by the median depolymerization concentrations (DC50) and median inhibition concentrations (IC50). Single substances are more efficient (by at least one order of magnitude) in the depolymerization of amyloid aggregates than in the inhibition of amyloid formation, with IC50 in the 10⁻⁴ to 10⁻⁵ M range. Analyzed mixture samples showed synergic or antagonistic effects in both assays, with DC50 values ranging from 10⁻⁵ to 10⁻⁸ M and IC50 values from 10⁻⁵ to 10⁻⁹ M. We observed that certain mixtures of the studied polyphenols can synergistically inhibit production of amyloid aggregates and are also effective in depolymerization of the aggregates. Synergic or antagonistic effects of the studied mixtures were correlated with protein-small ligand docking studies and AFM results. Differences in these activities could be explained by binding of each polyphenol to a different amino acid sequence within the protein. Our results indicate that synergic/antagonistic anti-amyloid effects of the studied mixtures depend on the selective binding of polyphenols to the known amyloidogenic sequences in the lysozyme chain. Our findings of the effective reduction of amyloid aggregation of lysozyme by polyphenol mixtures in vitro are of high physiological relevance considering the bioavailability and low toxicity of the tested phenols.

Journal ArticleDOI
01 May 2013-Proteins
TL;DR: A generic set of active site residues of ω‐ATs that are associated with a strong preference for aromatic substrates are identified, thus guiding the discovery of novel promising enzymes for the biotechnological production of corresponding chiral amines.
Abstract: Apart from their crucial role in metabolism, pyridoxal 5'-phosphate (PLP)-dependent aminotransferases (ATs) constitute a class of enzymes with increasing application in industrial biotechnology. To provide better insight into the structure-function relationships of ATs with biotechnological potential we performed a fundamental bioinformatics analysis of 330 representative sequences of pro- and eukaryotic Class III ATs using a structure-guided approach. The calculated phylogenetic maximum likelihood tree revealed six distinct clades of which the first segregates with a very high bootstrap value of 92%. Most enzymes in this first clade have been functionally well characterized, whereas knowledge about the natural functions and substrates of enzymes in the other branches is sparse. Notably, in clades 2-6, members of the peculiar class of ω-ATs prevail, many of which have proven useful for the preparation of chiral amines or artificial amino acids. One representative is the ω-AT from Paracoccus denitrificans (PD ω-AT) which catalyzes, for example, the transamination in a novel biocatalytic process for the production of L-homoalanine from L-threonine. To gain structural insight into this important enzyme, its X-ray analysis was carried out at a resolution of 2.6 Å, including the covalently bound PLP as well as 5-aminopentanoate as a putative amino donor substrate. On the basis of this crystal structure in conjunction with our phylogenetic analysis, we have identified a generic set of active site residues of ω-ATs that are associated with a strong preference for aromatic substrates, thus guiding the discovery of novel promising enzymes for the biotechnological production of corresponding chiral amines.

Journal ArticleDOI
01 Sep 2013-Proteins
TL;DR: This study suggests that stabilization of the binding loop and solvation of the binding pocket are important determinants of the dissociation kinetics in mSA.
Abstract: We recently reported the engineering of monomeric streptavidin, mSA, corresponding to one subunit of wild type (wt) streptavidin tetramer. The monomer was designed by homology modeling, in which the streptavidin and rhizavidin sequences were combined to engineer a high affinity binding pocket containing residues from a single subunit only. Although mSA is stable and binds biotin with nanomolar affinity, its fast off rate (koff) creates practical challenges during applications. We obtained a 1.9 Å crystal structure of mSA bound to biotin to understand their interaction in detail, and used the structure to introduce targeted mutations to improve its binding kinetics. To this end, we compared mSA to shwanavidin, which contains a hydrophobic lid containing F43 in the binding pocket and binds biotin tightly. However, the T48F mutation in mSA, which introduces a comparable hydrophobic lid, only resulted in a modest 20-40% improvement in the measured koff. On the other hand, introducing the S25H mutation near the bicyclic ring of bound biotin increased the dissociation half-life (t½) from 11 to 83 min at 20°C. Molecular dynamics (MD) simulations suggest that H25 stabilizes the binding loop L3,4 by interacting with A47, and protects key intermolecular hydrogen bonds by limiting solvent entry into the binding pocket. A concurrent T48F or T48W mutation clashes with H25 and partially abrogates the beneficial effects of H25. Taken together, this study suggests that stabilization of the binding loop and solvation of the binding pocket are important determinants of the dissociation kinetics in mSA.
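
To relate the half-lives quoted above to rate constants, the following minimal sketch converts a dissociation half-life into an off rate using koff = ln 2 / t½. It assumes simple first-order dissociation, which the abstract does not state explicitly; the function name is illustrative and does not come from the paper.

```python
import math


def koff_from_half_life(t_half_min: float) -> float:
    """Return a first-order dissociation rate constant (s^-1).

    Assumes single-exponential decay of the complex, so that
    koff = ln(2) / t_half. The half-life is given in minutes.
    """
    if t_half_min <= 0:
        raise ValueError("half-life must be positive")
    return math.log(2) / (t_half_min * 60.0)


# Half-lives quoted in the abstract (minutes, at 20 °C).
koff_parent = koff_from_half_life(11)  # mSA before the S25H mutation
koff_s25h = koff_from_half_life(83)    # mSA carrying S25H

# Under the first-order assumption the fold change in koff
# is simply the ratio of the two half-lives (83 / 11).
fold_slower = koff_parent / koff_s25h
```

Under this assumption, the change from 11 to 83 min corresponds to a roughly 7.5-fold slower off rate.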

Journal ArticleDOI
01 Dec 2013-Proteins
TL;DR: Possible further improvements of the docking approach in particular at the scoring and the flexible refinement steps are discussed and new approaches for the rapid flexible refinement have been developed based on a combination of atomistic representation of the bonded geometry and a CG description of nonbonded interactions.
Abstract: A coarse-grained (CG) protein model implemented in the ATTRACT protein–protein docking program has been employed to predict protein–protein complex structures in CAPRI Rounds 22–27. For six targets, acceptable or better quality solutions have been submitted corresponding to ∼60% of all targets. For one target, promising results on the prediction of the hydration structure at the protein–protein interface have been achieved. New approaches for the rapid flexible refinement have been developed based on a combination of atomistic representation of the bonded geometry and a CG description of nonbonded interactions. Possible further improvements of the docking approach in particular at the scoring and the flexible refinement steps are discussed. Proteins 2013; 81:2167–2174. © 2013 Wiley Periodicals, Inc.

Journal ArticleDOI
01 Jan 2013-Proteins
TL;DR: Evidence is presented that allosteric inhibition of NS5B results from intrinsic features of the enzyme free energy landscape, suggesting a common mechanism for the action of diverse allosteric ligands.
Abstract: Hepatitis C virus (HCV) has infected almost 200 million people worldwide, typically causing chronic liver damage and severe complications such as liver failure. Currently, there are few approved treatments for viral infection. Thus, the HCV RNA-dependent RNA polymerase (gene product NS5B) has emerged as an important target for small molecule therapeutics. Potential therapeutic agents include allosteric inhibitors that bind distal to the enzyme active site. While their mechanism of action is not conclusively known, it has been suggested that certain inhibitors prevent a conformational change in NS5B that is crucial for RNA replication. To gain insight into the molecular origin of long-range allosteric inhibition of NS5B, we employed molecular dynamics simulations of the enzyme with and without an inhibitor bound to the thumb domain. These studies indicate that the presence of an inhibitor in the thumb domain alters both the structure and internal motions of NS5B. Principal components analysis identified motions that are severely attenuated by inhibitor binding. These motions may have functional relevance by facilitating interactions between NS5B and RNA template or nascent RNA duplex, with presence of the ligand leading to enzyme conformations with narrower and thus less accessible RNA binding channels. This study provides the first evidence for a mechanistic basis of allosteric inhibition in NS5B. Moreover, we present evidence that allosteric inhibition of NS5B results from intrinsic features of the enzyme free energy landscape, suggesting a common mechanism for the action of diverse allosteric ligands.

Journal ArticleDOI
David La, Misun Kong, William Hoffman, Youn Im Choi, Daisuke Kihara
01 May 2013-Proteins
TL;DR: This study constructed amino acid substitution models that capture mutation patterns at permanent and transient types of protein interfaces, which were found to be different with statistical significance, and developed a novel computational method, BindML+, that predicts permanent and transient protein binding interfaces (PBIs) in protein surfaces.
Abstract: Protein–protein interactions (PPIs) are involved in diverse functions in a cell. To optimize functional roles of interactions, proteins interact with a spectrum of binding affinities. Interactions are conventionally classified into permanent and transient, where the former denotes tight binding between proteins that results in strong complexes, whereas the latter comprises relatively weak interactions that can dissociate after binding to regulate functional activity at specific time points. Knowing the type of interactions has significant implications for understanding the nature and function of PPIs. In this study, we constructed amino acid substitution models that capture mutation patterns at permanent and transient types of protein interfaces, which were found to be different with statistical significance. Using the substitution models, we developed a novel computational method that predicts permanent and transient protein binding interfaces (PBIs) in protein surfaces. Without knowledge of the interacting partner, the method uses a single query protein structure and a multiple sequence alignment of the sequence family. Using a large dataset of permanent and transient proteins, we show that our method, BindML+, performs very well in protein interface classification. A very high area under the curve (AUC) value of 0.957 was observed when predicted protein binding sites were classified. Remarkably, near perfect accuracy was achieved with an AUC of 0.991 when actual binding sites were classified. The developed method will also be useful for protein design of permanent and transient PBIs. © Proteins 2013. © 2012 Wiley Periodicals, Inc.
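
For readers unfamiliar with the AUC figures quoted above, here is a minimal sketch of how an area under the ROC curve can be computed from classifier scores. It uses the rank-based (Mann-Whitney) definition: the probability that a randomly chosen positive example scores higher than a randomly chosen negative one. The labels and scores are made-up toy values, and coding permanent interfaces as 1 and transient as 0 is an assumption for illustration only; this is not the BindML+ implementation.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Counts, over all (positive, negative) pairs, how often the positive
    example has the higher score; ties contribute 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = permanent interface, 0 = transient interface.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two classes.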

Journal ArticleDOI
01 Jan 2013-Proteins
TL;DR: A new microscopic method is presented that provides an efficient way for simulating the energetics of water insertion using fully microscopic calculations and provides a substantial improvement over regular microscopic free energy estimates.
Abstract: Consistent description of the effect of internal water in proteins has been a major challenge for both simulation and experimental studies. Describing this effect has been particularly important and elusive in cases of charges in protein interiors. Here, we present a new microscopic method that provides an efficient way for simulating the energetics of water insertion. Instead of performing explicit Monte Carlo (MC) moves on the insertion process, which generally involves an enormous number of rejected attempts, our method is based on generating trial configurations with excess amount of internal water, estimating the relevant free energy by the linear response approximation, and then using a postprocessing MC treatment to filter out a limited number of configurations from a large possible set. Our approach is validated on particularly challenging test cases including the pKa of the V66D mutation in Staphylococcal nuclease, Glu286 in cytochrome c oxidase (CcO) and the energetics of a protonated water molecule in the D channel of CcO. The new postprocessing method allows us to reproduce the relevant energetics of highly unstable charges in protein interiors using fully microscopic calculations and provides a substantial improvement over regular microscopic free energy estimates. This advance established the effectiveness of our water insertion strategy in challenging cases that have not been addressed successfully by other microscopic methods. Furthermore, our study provides a new exciting view on the crucial effect of water penetration in key biological systems as well as a new view on the nature of the dielectric in protein interiors.

Journal ArticleDOI
01 Nov 2013-Proteins
TL;DR: It is shown that the hybrid approach can distinctly improve the performance of the individual methods for both bound and unbound structures, and significantly outperformed the state‐of‐art algorithms by around 10% in terms of Matthews's correlation coefficient.
Abstract: Accurate prediction of DNA-binding residues has become a problem of increasing importance in structural bioinformatics. Here, we presented DNABind, a novel hybrid algorithm for identifying these crucial residues by exploiting the complementarity between machine learning- and template-based methods. Our machine learning-based method was based on the probabilistic combination of a structure-based and a sequence-based predictor, both of which were implemented using support vector machine algorithms. The former included our well-designed structural features, such as solvent accessibility, local geometry, topological features, and relative positions, which can effectively quantify the difference between DNA-binding and nonbinding residues. The latter combined evolutionary conservation features with three other sequence attributes. Our template-based method depended on structural alignment and utilized the template structure from known protein-DNA complexes to infer DNA-binding residues. We showed that the template method had excellent performance when reliable templates were found for the query proteins but tended to be strongly influenced by the template quality as well as the conformational changes upon DNA binding. In contrast, the machine learning approach yielded better performance when high-quality templates were not available (about 1/3 of cases in our dataset) or the query protein was subject to intensive transformation changes upon DNA binding. Our extensive experiments indicated that the hybrid approach can distinctly improve the performance of the individual methods for both bound and unbound structures. DNABind also significantly outperformed the state-of-the-art algorithms by around 10% in terms of Matthews's correlation coefficient. The proposed methodology could also have wide application in various protein functional site annotations. DNABind is freely available at http://mleg.cse.sc.edu/DNABind/.
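
The abstract reports its improvement in terms of the Matthews correlation coefficient (MCC). As a reminder of what that metric measures, the sketch below computes the MCC from the four cells of a binary confusion matrix using the standard formula. The counts are invented for illustration and are not taken from the DNABind evaluation; returning 0 when the denominator vanishes is a common convention, adopted here as an assumption.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts.

    MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).
    Returns 0.0 when any marginal sum is zero (a common convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical residue-level counts: binding residues are the positive class.
example_mcc = matthews_corrcoef(tp=40, tn=900, fp=60, fn=20)
```

Because the MCC uses all four cells, it is often preferred over plain accuracy for imbalanced problems such as binding-residue prediction, where nonbinding residues greatly outnumber binding ones.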

Journal ArticleDOI
01 Feb 2013-Proteins
TL;DR: The crystal structure of PF1127 is reported, a Cas protein of Pyrococcus furiosus DSM 3638 that is composed of 480 amino acids and belongs to the Csx1 family, and it is demonstrated that PF1127 binds double‐stranded DNA and RNA and that this activity requires an intact β‐hairpin and involves the homodimerization of the protein.
Abstract: In many prokaryotic organisms, chromosomal loci known as clustered regularly interspaced short palindromic repeats (CRISPRs) and CRISPR-associated (CAS) genes comprise an acquired immune defense system against invading phages and plasmids. Although many different Cas protein families have been identified, the exact biochemical functions of most of their constituents remain to be determined. In this study, we report the crystal structure of PF1127, a Cas protein of Pyrococcus furiosus DSM 3638 that is composed of 480 amino acids and belongs to the Csx1 family. The C-terminal domain of PF1127 has a unique β-hairpin structure that protrudes out of an α-helix and contains several positively charged residues. We demonstrate that PF1127 binds double-stranded DNA and RNA and that this activity requires an intact β-hairpin and involves the homodimerization of the protein. In contrast, another Csx1 protein from Sulfolobus solfataricus P2 that is composed of 377 amino acids does not have the β-hairpin structure and exhibits no DNA-binding properties under the same experimental conditions. Notably, the C-terminal domain of these two Csx1 proteins is greatly diversified, in contrast to the conserved N-terminal domain, which appears to play a common role in the homodimerization of the protein. Thus, although P. furiosus Csx1 is identified as a nucleic acid-binding protein, other Csx1 proteins are predicted to exhibit different individual biochemical activities. Proteins 2013. © 2012 Wiley Periodicals, Inc.