Showing papers in "Proteins in 2016"


Journal ArticleDOI
01 Jun 2016-Proteins
TL;DR: A first assessment demonstrates that models can sometimes successfully address biological questions that motivate experimental structure determination, and there is continuing progress in accuracy of modeling regions of structure not directly available by comparative modeling, while there is marginal or no progress in some other areas.
Abstract: Modeling of protein structure from amino acid sequence now plays a major role in structural biology. Here we report new developments and progress from the CASP11 community experiment, assessing the state of the art in structure modeling. Notable points include the following: (1) New methods for predicting three dimensional contacts resulted in a few spectacular template free models in this CASP, whereas models based on sequence homology to proteins with experimental structure continue to be the most accurate. (2) Refinement of initial protein models, primarily using molecular dynamics related approaches, has now advanced to the point where the best methods can consistently (though slightly) improve nearly all models. (3) The use of relatively sparse NMR constraints dramatically improves the accuracy of models, and another type of sparse data, chemical crosslinking, introduced in this CASP, also shows promise for producing better models. (4) A new emphasis on modeling protein complexes, in collaboration with CAPRI, has produced interesting results, but also shows the need for more focus on this area. (5) Methods for estimating the accuracy of models have advanced to the point where they are of considerable practical use. (6) A first assessment demonstrates that models can sometimes successfully address biological questions that motivate experimental structure determination. (7) There is continuing progress in accuracy of modeling regions of structure not directly available by comparative modeling, while there is marginal or no progress in some other areas. Proteins 2016; 84(Suppl 1):4-14. © 2016 Wiley Periodicals, Inc.

229 citations


Journal ArticleDOI
Marc F. Lensink, Sameer Velankar, Andriy Kryshtafovych, Shen You Huang, Dina Schneidman-Duhovny, Andrej Sali, Joan Segura, Narcis Fernandez-Fuentes, Shruthi Viswanath, Ron Elber, Sergei Grudinin, Petr Popov, Emilie Neveu, Hasup Lee, Minkyung Baek, Sangwoo Park, Lim Heo, Gyu Rie Lee, Chaok Seok, Sanbo Qin, Huan-Xiang Zhou, David W. Ritchie, Bernard Maigret, Marie-Dominique Devignes, Anisah W. Ghoorah, Mieczyslaw Torchala, Raphael A. G. Chaleil, Paul A. Bates, Efrat Ben-Zeev, Miriam Eisenstein, Surendra S. Negi, Zhiping Weng, Thom Vreven, Brian G. Pierce, Tyler M. Borrman, Jinchao Yu, Françoise Ochsenbein, Raphael Guerois, Anna Vangone, João P. G. L. M. Rodrigues, Gydo C. P. van Zundert, Mehdi Nellen, Li C. Xue, Ezgi Karaca, Adrien S. J. Melquiond, Koen M. Visscher, Panagiotis L. Kastritis, Alexandre M. J. J. Bonvin, Xianjin Xu, Liming Qiu, Chengfei Yan, Jilong Li, Zhiwei Ma, Jianlin Cheng, Xiaoqin Zou, Yang Shen, Lenna X. Peterson, Hyung Rae Kim, Amit Roy, Xusi Han, Juan Esquivel-Rodríguez, Daisuke Kihara, Xiaofeng Yu, Neil J. Bruce, Jonathan C. Fuller, Rebecca C. Wade, Ivan Anishchenko, Petras J. Kundrotas, Ilya A. Vakser, Kenichiro Imai, Kazunori D. Yamada, Toshiyuki Oda, Tsukasa Nakamura, Kentaro Tomii, Chiara Pallara, Miguel Romero-Durana, Brian Jiménez-García, Iain H. Moal, Juan Fernández-Recio, Jong Young Joung, Jong Yun Kim, Keehyoung Joo, Jooyoung Lee, Dima Kozakov, Sandor Vajda, Scott E. Mottarella, David R. Hall, Dmitri Beglov, Artem B. Mamonov, Bing Xia, Tanggis Bohnuud, Carlos A. Del Carpio, Eichiro Ichiishi, Nicholas A. Marze, Daisuke Kuroda, Shourya S. Roy Burman, Jeffrey J. Gray, Edrisse Chermak, Luigi Cavallo, Romina Oliva, Andrey Tovchigrechko, Shoshana J. Wodak
01 Jun 2016-Proteins
TL;DR: Results show that the prediction of homodimer assemblies by homology modeling techniques and docking calculations is quite successful for targets featuring large enough subunit interfaces to represent stable associations, and that docking procedures tend to perform better than standard homology modeling techniques.
Abstract: We present the results for CAPRI Round 30, the first joint CASP-CAPRI experiment, which brought together experts from the protein structure prediction and protein-protein docking communities. The Round comprised 25 targets from amongst those submitted for the CASP11 prediction experiment of 2014. The targets included mostly homodimers, a few homotetramers, and two heterodimers, and comprised protein chains that could readily be modeled using templates from the Protein Data Bank. On average 24 CAPRI groups and 7 CASP groups submitted docking predictions for each target, and 12 CAPRI groups per target participated in the CAPRI scoring experiment. In total more than 9500 models were assessed against the 3D structures of the corresponding target complexes. Results show that the prediction of homodimer assemblies by homology modeling techniques and docking calculations is quite successful for targets featuring large enough subunit interfaces to represent stable associations. Targets with ambiguous or inaccurate oligomeric state assignments, often featuring crystal contact-sized interfaces, represented a confounding factor. For those, a much poorer prediction performance was achieved, while nonetheless often providing helpful clues on the correct oligomeric state of the protein. The prediction performance was very poor for genuine tetrameric targets, where the inaccuracy of the homology-built subunit models and the smaller pair-wise interfaces severely limited the ability to derive the correct assembly mode. Our analysis also shows that docking procedures tend to perform better than standard homology modeling techniques and that highly accurate models of the protein components are not always required to identify their association modes with acceptable accuracy. Proteins 2016; 84(Suppl 1):323-348. © 2016 Wiley Periodicals, Inc.

139 citations


Journal ArticleDOI
24 Feb 2016-Proteins
TL;DR: The results clearly demonstrate, in a blind prediction scenario, that coevolution derived contacts can considerably increase the accuracy of template‐free structure modeling.
Abstract: We describe CASP11 de novo blind structure predictions made using the Rosetta structure prediction methodology with both automatic and human assisted protocols. Model accuracy was generally improved using coevolution derived residue-residue contact information as restraints during Rosetta conformational sampling and refinement, particularly when the number of sequences in the family was more than three times the length of the protein. The highlight was the human assisted prediction of T0806, a large and topologically complex target with no homologs of known structure, which had unprecedented accuracy: <3.0 Å root-mean-square deviation (RMSD) from the crystal structure over 223 residues. For this target, we increased the amount of conformational sampling over our fully automated method by employing an iterative hybridization protocol. Our results clearly demonstrate, in a blind prediction scenario, that coevolution derived contacts can considerably increase the accuracy of template-free structure modeling. Proteins 2016; 84(Suppl 1):67-75. © 2015 Wiley Periodicals, Inc.

107 citations


Journal ArticleDOI
01 Sep 2016-Proteins
TL;DR: Successful prediction of contacts was shown to be practically helpful in modeling three‐dimensional structures; in particular target T0806 was modeled exceedingly well with accuracy not yet seen for ab initio targets of this size (>250 residues).
Abstract: This article provides a report on the state-of-the-art in the prediction of intra-molecular residue-residue contacts in proteins based on the assessment of the predictions submitted to the CASP11 experiment. The assessment emphasis is placed on the accuracy in predicting long-range contacts. Twenty-nine groups participated in contact prediction in CASP11. At least eight of them used the recently developed evolutionary coupling techniques, with the top group (CONSIP2) reaching precision of 27% on target proteins that could not be modeled by homology. This result indicates a breakthrough in the development of methods based on the correlated mutation approach. Successful prediction of contacts was shown to be practically helpful in modeling three-dimensional structures; in particular target T0806 was modeled exceedingly well with accuracy not yet seen for ab initio targets of this size (>250 residues). Proteins 2016; 84(Suppl 1):131-144. © 2015 Wiley Periodicals, Inc.

94 citations


Journal ArticleDOI
23 Jun 2016-Proteins
TL;DR: The updated MolProbity rotamer‐library distributions derived from an order‐of‐magnitude larger and more stringently quality‐filtered dataset of about 8000 protein chains are described, and the resulting changes and improvements to model validation as seen by users are explained.
Abstract: Here we describe the updated MolProbity rotamer-library distributions derived from an order-of-magnitude larger and more stringently quality-filtered dataset of about 8000 (vs. 500) protein chains, and we explain the resulting changes and improvements to model validation as seen by users. To include only side-chains with satisfactory justification for their given conformation, we added residue-specific filters for electron-density value and model-to-density fit. The combined new protocol retains a million residues of data, while cleaning up false-positive noise in the multi-χ datapoint distributions. It enables unambiguous characterization of conformational clusters nearly 1000-fold less frequent than the most common ones. We describe examples of local interactions that favor these rare conformations, including the role of authentic covalent bond-angle deviations in enabling presumably strained side-chain conformations. Further, along with favored and outlier, an allowed category (0.3-2.0% occurrence in reference data) has been added, analogous to Ramachandran validation categories. The new rotamer distributions are used for current rotamer validation in MolProbity and PHENIX, and for rotamer choice in PHENIX model-building and refinement. The multi-dimensional χ distributions and Top8000 reference dataset are freely available on GitHub. These rotamers are termed "ultimate" because data sampling and quality are now fully adequate for this task, and also because we believe the future of conformational validation should integrate side-chain with backbone criteria. Proteins 2016; 84:1177-1189. © 2016 Wiley Periodicals, Inc.

90 citations
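
The three validation categories named in the MolProbity rotamer abstract above can be sketched as a simple threshold rule. This is a minimal illustration, not MolProbity's implementation: the function name is hypothetical, and treating values above 2.0% as favored and below 0.3% as outlier (including how the exact boundary values are assigned) is an assumption inferred from the stated 0.3-2.0% "allowed" band.

```python
def rotamer_category(occurrence_percent: float) -> str:
    """Classify a side-chain conformation by its occurrence in reference data.

    occurrence_percent: how often this conformation is seen in the reference
    dataset, in percent. Boundary handling (>= vs >) is an assumption.
    """
    if not 0.0 <= occurrence_percent <= 100.0:
        raise ValueError("occurrence_percent must lie between 0 and 100")
    if occurrence_percent >= 2.0:
        return "favored"   # assumed: above the allowed band
    if occurrence_percent >= 0.3:
        return "allowed"   # the 0.3-2.0% band stated in the abstract
    return "outlier"       # assumed: below the allowed band


if __name__ == "__main__":
    for pct in (12.5, 1.0, 0.05):
        print(pct, rotamer_category(pct))
```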


Journal ArticleDOI
01 Sep 2016-Proteins
TL;DR: In the CASP11 refinement experiment, three different refinement methods that utilize overall structure relaxation, loop modeling, and quality assessment of multiple initial structures are tested, concluding that the overall relaxation method can consistently improve model quality.
Abstract: Protein structures predicted by state-of-the-art template-based methods may still have errors when the template proteins are not similar enough to the target protein. Overall target structure may deviate from the template structures owing to differences in sequences. Structural information for some local regions such as loops may not be available when there are sequence insertions or deletions. Those structural aspects that originate from deviations from templates can be dealt with by ab initio structure refinement methods to further improve model accuracy. In the CASP11 refinement experiment, we tested three different refinement methods that utilize overall structure relaxation, loop modeling, and quality assessment of multiple initial structures. From this experiment, we conclude that the overall relaxation method can consistently improve model quality. Loop modeling is the most useful when the initial model structure is high quality, with GDT-HA >60. The method that used multiple initial structures further refined the already refined models; the minor improvements with this method point to a problem with the current energy function. Future research directions are also discussed. Proteins 2016; 84(Suppl 1):293-301. © 2015 Wiley Periodicals, Inc.

77 citations


Journal ArticleDOI
20 Jan 2016-Proteins
TL;DR: It is suggested the use of automated scores similar to previous CASPs would provide a good system of evaluating performance, even in the absence of comprehensive manual assessment, and methodological advances enabled de novo modeling of much larger domain structures than was previously possible and allowed prediction of functional sites.
Abstract: We present an assessment of 'template-free modeling' (FM) in CASP11 and ROLL. Community-wide server performance suggested the use of automated scores similar to previous CASPs would provide a good system of evaluating performance, even in the absence of comprehensive manual assessment. The CASP11 FM category included several outstanding examples, including successful prediction by the Baker group of a 256-residue target (T0806-D1) that lacked sequence similarity to any existing template. The top server model prediction by Zhang's Quark, which was apparently selected and refined by several manual groups, encompassed the entire fold of target T0837-D1. Methods from the same two groups tended to dominate overall CASP11 FM and ROLL rankings. Comparison of top FM predictions with those from the previous CASP experiment revealed progress in the category, particularly reflected in high prediction accuracy for larger protein domains. FM prediction models for two cases were sufficient to provide functional insights that were otherwise not obtainable by traditional sequence analysis methods. Importantly, CASP11 abstracts revealed that alignment-based contact prediction methods brought about much of the CASP11 progress, producing both of the functionally relevant models as well as several of the other outstanding structure predictions. These methodological advances enabled de novo modeling of much larger domain structures than was previously possible and allowed prediction of functional sites. Proteins 2016; 84(Suppl 1):51-66. © 2015 Wiley Periodicals, Inc.

76 citations


Journal ArticleDOI
01 Sep 2016-Proteins
TL;DR: Assessment of the model accuracy estimation methods participating in CASP11 finds methods that predict accuracy on the basis of a single model perform comparably to consensus methods in picking the best models and in the estimation of how accurate is the local structure.
Abstract: The article presents an assessment of the model accuracy estimation methods participating in CASP11. The results of the assessment are expected to be useful to both developers of the methods and users who way too often are presented with structural models without annotations of accuracy. The main emphasis is placed on the ability of techniques to identify the best models from among several available. Bivariate descriptive statistics and ROC analysis are used to additionally assess the overall correctness of the predicted model accuracy scores, the correlation between the predicted and observed accuracy of models, the effectiveness in distinguishing between good and bad models, the ability to discriminate between reliable and unreliable regions in models, and the accuracy of the coordinate error self-estimates. A rigid-body measure (GDT_TS) and three local-structure-based scores (LDDT, CADaa, and SphereGrinder) are used as reference measures for evaluating methods' performance. Consensus methods, taking advantage of the availability of several models for the same target protein, perform well on the majority of tasks. Methods that predict accuracy on the basis of a single model perform comparably to consensus methods in picking the best models and in the estimation of how accurate is the local structure. More groups than in previous experiments submitted reasonable error estimates of their own models, most likely in response to a recommendation from CASP and the increasing demand from users. Proteins 2016; 84(Suppl 1):349-369. © 2015 Wiley Periodicals, Inc.

75 citations


Journal ArticleDOI
01 Sep 2016-Proteins
TL;DR: Challenges still exist in long‐range beta‐strand folding, domain parsing, and the uncertainty of secondary structure prediction; the latter of which was found to affect nearly all aspects of FM structure predictions, from fragment identification, target classification, structure assembly, to final model selection.
Abstract: We tested two pipelines developed for template-free protein structure prediction in the CASP11 experiment. First, the QUARK pipeline constructs structure models by reassembling fragments of continuously distributed lengths excised from unrelated proteins. Five free-modeling (FM) targets have the model successfully constructed by QUARK with a TM-score above 0.4, including the first model of T0837-D1, which has a TM-score = 0.736 and RMSD = 2.9 Å to the native. Detailed analysis showed that the success is partly attributed to the high-resolution contact map prediction derived from fragment-based distance-profiles, which are mainly located between regular secondary structure elements and loops/turns and help guide the orientation of secondary structure assembly. In the Zhang-Server pipeline, weakly scoring threading templates are re-ordered by the structural similarity to the ab initio folding models, which are then reassembled by I-TASSER based structure assembly simulations; 60% more domains with length up to 204 residues, compared to the QUARK pipeline, were successfully modeled by the I-TASSER pipeline with a TM-score above 0.4. The robustness of the I-TASSER pipeline can stem from the composite fragment-assembly simulations that combine structures from both ab initio folding and threading template refinements. Despite the promising cases, challenges still exist in long-range beta-strand folding, domain parsing, and the uncertainty of secondary structure prediction; the latter of which was found to affect nearly all aspects of FM structure predictions, from fragment identification, target classification, structure assembly, to final model selection. Significant efforts are needed to solve these problems before real progress on FM could be made. Proteins 2016; 84(Suppl 1):76-86. © 2015 Wiley Periodicals, Inc.

75 citations


Journal ArticleDOI
01 Apr 2016-Proteins
TL;DR: The phosphoepitope of AT8 was characterized through both peptide binding studies and costructures with phosphopeptides, and it was shown that AT8 bound to the triply phosphorylated tau peptide 30‐fold stronger than to the pS202/pT205 peptide, supporting the role of pS208 in AT8 recognition.
Abstract: Microtubule-associated protein tau becomes abnormally phosphorylated in Alzheimer's disease and other tauopathies and forms aggregates of paired helical filaments (PHF-tau). AT8 is a PHF-tau-specific monoclonal antibody that is a commonly used marker of neuropathology because of its recognition of abnormally phosphorylated tau. Previous reports described the AT8 epitope to include pS202/pT205. Our studies support and extend previous findings by also identifying pS208 as part of the binding epitope. We characterized the phosphoepitope of AT8 through both peptide binding studies and costructures with phosphopeptides. From the cocrystal structure of AT8 Fab with the diphosphorylated (pS202/pT205) peptide, it appeared that an additional phosphorylation at S208 would also be accommodated by AT8. Phosphopeptide binding studies showed that AT8 bound to the triply phosphorylated tau peptide (pS202/pT205/pS208) 30-fold stronger than to the pS202/pT205 peptide, supporting the role of pS208 in AT8 recognition. We also show that the binding kinetics of the triply phosphorylated peptide pS202/pT205/pS208 was remarkably similar to that of PHF-tau. The costructure of AT8 Fab with a pS202/pT205/pS208 peptide shows that the interaction interface involves all six CDRs and tau residues 202-209. All three phosphorylation sites are recognized by AT8, with pT205 acting as the anchor. Crystallization of the Fab/peptide complex under acidic conditions shows that CDR-L2 is prone to unfolding and precludes peptide binding, and may suggest a general instability in the antibody.

74 citations


Journal ArticleDOI
01 Sep 2016-Proteins
TL;DR: Analysis of the CASP submission from the Feig group focused on refinement success versus amount of sampling, refinement of different secondary structure elements and whether refinement varied as a function of which group provided initial models.
Abstract: Protein structure refinement during CASP11 by the Feig group is described. Molecular dynamics simulations were used in combination with an improved selection and averaging protocol. On average, modest refinement was achieved, with some targets improved significantly. Analysis of the CASP submission from our group focused on refinement success versus amount of sampling, refinement of different secondary structure elements and whether refinement varied as a function of which group provided initial models. The refinement of local stereochemical features was examined via the MolProbity score and an updated protocol was developed that can generate high-quality structures with very low MolProbity scores for most starting structures with modest computational effort. Proteins 2016; 84(Suppl 1):282-292. © 2015 Wiley Periodicals, Inc.

Journal ArticleDOI
01 Sep 2016-Proteins
TL;DR: The CONSIP2 pipeline and its results are discussed, showing that where the method underperformed, the major factor was relying on a fixed set of parameters for the initial sequence alignments and not attempting to perform domain splitting as a preprocessing step.
Abstract: Here we present the results of residue–residue contact predictions achieved in CASP11 by the CONSIP2 server, which is based around our MetaPSICOV contact prediction method. On a set of 40 target domains with a median family size of around 40 effective sequences, our server achieved an average top-L/5 long-range contact precision of 27%. The MetaPSICOV method is based on a combination of classical contact prediction features, enhanced with three distinct covariation methods embedded in a two-stage neural network predictor. Some unique features of our approach are (1) the tuning between the classical and covariation features depending on the depth of the input alignment and (2) a hybrid approach to generate the deepest possible multiple-sequence alignments by combining jackHMMer and HHblits. We discuss the CONSIP2 pipeline and our results, and show that where the method underperformed, the major factor was relying on a fixed set of parameters for the initial sequence alignments and not attempting to perform domain splitting as a preprocessing step. Proteins 2015. © 2015 The Authors. Proteins: Structure, Function, and Bioinformatics Published by Wiley Periodicals, Inc.
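
The "top-L/5 long-range contact precision" quoted in the CONSIP2 abstract above can be illustrated with a short sketch. It is not the assessors' code: the function name and input layout are hypothetical, and the minimum sequence separation of 24 residues for "long-range" follows the usual CASP convention, which this listing does not state. Here precision is the fraction of the L/5 highest-scoring long-range predictions that are true contacts, where L is the sequence length.

```python
def top_l5_long_range_precision(predictions, true_contacts, length,
                                min_separation=24):
    """Precision of the L/5 highest-scoring long-range contact predictions.

    predictions:   iterable of (i, j, score) residue-pair predictions
    true_contacts: set of (i, j) pairs with i < j observed in the structure
    length:        sequence length L of the target
    min_separation: assumed CASP-style cutoff for "long-range" pairs
    """
    # Keep only long-range pairs, normalising each pair so that i < j.
    long_range = [(min(i, j), max(i, j), score)
                  for i, j, score in predictions
                  if abs(i - j) >= min_separation]
    # Rank by predicted score, highest first, and keep the top L/5.
    long_range.sort(key=lambda pair: pair[2], reverse=True)
    top = long_range[:max(1, length // 5)]
    if not top:
        return 0.0
    hits = sum((i, j) in true_contacts for i, j, _ in top)
    # Divides by the number of predictions actually evaluated; some
    # definitions divide by L/5 even when fewer predictions exist.
    return hits / len(top)
```

Called with a ranked prediction list and the contact set derived from the experimental structure, the function returns a value between 0 and 1; the 27% reported above corresponds to 0.27 averaged over targets.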

Journal ArticleDOI
01 Oct 2016-Proteins
TL;DR: A new method of hydrating protein structures, Dowser++, is presented; it is based on a semi‐empirical modification of Dowser, a popular program for protein hydration, and on the use of the AutoDock Vina and WaterDock protocols.
Abstract: A new method of hydrating protein structures, which we call Dowser++, is presented. The method is based on a semi-empirical modification of a popular program for protein hydration Dowser, and the usage of protocols AutoDock Vina, and WaterDock. The positions of water molecules predicted by Dowser++ were compared with experimental data for a set of 14 high-resolution crystal structures of oligopeptide-binding protein (OppA) containing a large number of resolved internal water molecules, as well as for the D- and K-channels of cytochrome c oxidase, and the recent data on PSII. Comparison is also made with the predictions of the original Dowser, and its improved version, Dowser+, described in our previous publication. We also present a model for quantitative estimation of the quality of water molecules placement made by a program, which includes an assumption of possible false negative data from the crystallographic analysis. The comparison of predictions made by Dowser++, Dowser and Dowser+ demonstrates significant improvement of predictive power of the new method. Proteins 2016; 84:1347-1357. © 2016 Wiley Periodicals, Inc.

Journal ArticleDOI
01 Sep 2016-Proteins
TL;DR: It was found that the inclusion of QUARK‐TBM simulations as an intermediate modeling step could help improve the quality of the I‐TASSER models for both Easy and Hard TBM targets, and that recent progress relative to threading templates is probably attributable to integrating the extended ab initio folding simulation with the threading assembly pipeline and to the introduction of atomic‐level structure refinements following the reduced modeling simulations.
Abstract: We report the structure prediction results of a new composite pipeline for template-based modeling (TBM) in the 11th CASP experiment. Starting from multiple structure templates identified by LOMETS based meta-threading programs, the QUARK ab initio folding program is extended to generate initial full-length models under strong constraints from template alignments. The final atomic models are then constructed by I-TASSER based fragment reassembly simulations, followed by the fragment-guided molecular dynamics simulation and the MQAP-based model selection. It was found that the inclusion of QUARK-TBM simulations as an intermediate modeling step could help improve the quality of the I-TASSER models for both Easy and Hard TBM targets. Overall, the average TM-score of the first I-TASSER model is 12% higher than that of the best LOMETS templates, with the RMSD in the same threading-aligned regions reduced from 5.8 to 4.7 Å. Nevertheless, for nearly 18% of TBM domains the templates were deteriorated by the structure assembly pipeline, which may be attributed to the errors of secondary structure and domain orientation predictions that propagate through and degrade the procedures of template identification and final model selections. To examine the record of progress, we made a retrospective report of the I-TASSER pipeline in the last five CASP experiments (CASP7-11). The data show no clear progress of the LOMETS threading programs over PSI-BLAST; but obvious progress on structural improvement relative to threading templates was witnessed in recent CASP experiments, which is probably attributed to the integration of the extended ab initio folding simulation with the threading assembly pipeline and the introduction of atomic-level structure refinements following the reduced modeling simulations. Proteins 2016; 84(Suppl 1):233-246. © 2015 Wiley Periodicals, Inc.

Journal ArticleDOI
01 Sep 2016-Proteins
TL;DR: This study provides a structural foundation for improved reactivator design for the treatment of organophosphate intoxication and reveals relatively poor positioning for reactivation of paraoxon‐inhibited human acetylcholinesterase in complex with the oximes HI6 and 2‐PAM.
Abstract: Irreversible inhibition of the essential nervous system enzyme acetylcholinesterase by organophosphate nerve agents and pesticides may quickly lead to death. Oxime reactivators currently used as antidotes are generally less effective against pesticide exposure than nerve agent exposure, and pesticide exposure constitutes the majority of cases of organophosphate poisoning in the world. The current lack of published structural data specific to human acetylcholinesterase organophosphate-inhibited and oxime-bound states hinders development of effective medical treatments. We have solved structures of human acetylcholinesterase in different states in complex with the organophosphate insecticide, paraoxon, and oximes. Reaction with paraoxon results in a highly perturbed acyl loop that causes a narrowing of the gorge in the peripheral site that may impede entry of reactivators. This appears characteristic of acetylcholinesterase inhibition by organophosphate insecticides but not nerve agents. Additional changes seen at the dimer interface are novel and provide further examples of the disruptive effect of paraoxon. Ternary structures of paraoxon-inhibited human acetylcholinesterase in complex with the oximes HI6 and 2-PAM reveal relatively poor positioning for reactivation. This study provides a structural foundation for improved reactivator design for the treatment of organophosphate intoxication. Proteins 2016; 84:1246-1256. © 2016 Wiley Periodicals, Inc.

Journal ArticleDOI
01 Jun 2016-Proteins
TL;DR: Monte Carlo simulations and coarse‐grained modeling have been used to analyze Histatin 5, an unstructured short cationic salivary peptide known to have anticandidical properties, to achieve a molecular understanding and a physico‐chemical insight of the obtained SAXS results.
Abstract: Monte Carlo simulations and coarse-grained modeling have been used to analyze Histatin 5, an unstructured short cationic salivary peptide known to have anticandidical properties. The calculated scattering functions have been compared with intensity curves and the distance distribution function P(r) obtained from small angle X-ray scattering (SAXS), at both high and low salt concentrations. The aim was to achieve a molecular understanding of, and physico-chemical insight into, the obtained SAXS results and to gain information about the conformational changes of Histatin 5 due to altering salt content, charge distribution, and net charge. From a modeling perspective, the accuracy of the electrostatic interactions is of special interest. The coarse-grained model used was based on the primitive model, in which charged hard spheres differing in charge and in size represent the ionic particles, and the solvent only enters the model through its relative permittivity. The Hamiltonian of the model comprises three different contributions: (i) excluded volumes, (ii) electrostatic, and (iii) van der Waals interactions. Even though the model can be considered gross, since it omits all atomistic details, very good agreement with the experimental results is obtained. Proteins 2016; 84:777-791. © 2016 Wiley Periodicals, Inc.

Journal ArticleDOI
01 Sep 2016-Proteins
TL;DR: Fe2+ binding reduces the helix structure and increases the β‐sheet content of the Aβ1‐42 peptide, which suggests that Fe2+ promotes oligomerization by enhancing the peptide‐peptide interaction.
Abstract: The metal ions Zn(2+), Cu(2+), and Fe(2+) play a significant role in the aggregation mechanism of Aβ peptides. However, the nature of the binding between metal and peptide has remained elusive, and detailed information on it is very difficult to obtain experimentally. Density functional theory (DFT) (M06-2X/6-311++G(2df,2pd)+LANL2DZ) was employed to determine the force field arising from the metal-histidine interaction. We performed 200 ns molecular dynamics (MD) simulations on the Aβ1-42-Zn(2+), Aβ1-42-Cu(2+), and Aβ1-42-Fe(2+) systems in explicit water with different combinations of coordinating residues, including the three histidine residues in the N-terminal region. In the present investigation, the Aβ1-42-Zn(2+) system possesses three turn conformations separated by coil structure. Zn(2+) binding caused the loss of the helical structure of the N-terminal residues, which transformed into an S-shaped conformation. Zn(2+) reduced the coil content and increased the turn content of the peptide compared with experimental study. On the other hand, when Cu(2+) binds to the peptide, β-sheet formation is observed at the N-terminal residues of the peptide. Fe(2+) binding promotes the formation of the Glu22-Lys28 salt bridge, which stabilizes the turn conformation in the Phe19-Gly25 residues; subsequently, β sheets were observed at the His13-Lys18 and Gly29-Gly37 residues. The turn conformation allows the β sheets to be arranged in parallel by enhancing the hydrophobic contacts between Gly25 and Met35, Lys16 and Met35, Leu17 and Leu34, and Val18 and Leu34. Fe(2+) binding reduced the helix structure and increased the β-sheet content of the peptide, which suggests that Fe(2+) promotes oligomerization by enhancing the peptide-peptide interaction. Proteins 2016; 84:1257-1274. © 2016 Wiley Periodicals, Inc.

Journal ArticleDOI
16 Sep 2016-Proteins
TL;DR: Testing it on a non‐redundant dataset of 41 TMPs and 285 soluble proteins, and applying strict performance measures, TMSEG outperformed the state‐of‐the‐art in the authors' hands and provides an add‐on improvement for any existing method to benefit from.
Abstract: Transmembrane proteins (TMPs) are important drug targets because they are essential for signaling, regulation, and transport. Despite important breakthroughs, experimental structure determination remains challenging for TMPs. Various methods have bridged the gap by predicting transmembrane helices (TMHs), but room for improvement remains. Here, we present TMSEG, a novel method identifying TMPs and accurately predicting their TMHs and their topology. The method combines machine learning with empirical filters. Testing it on a non-redundant dataset of 41 TMPs and 285 soluble proteins, and applying strict performance measures, TMSEG outperformed the state-of-the-art in our hands. TMSEG correctly distinguished helical TMPs from other proteins with a sensitivity of 98 ± 2% and a false positive rate as low as 3 ± 1%. Individual TMHs were predicted with a precision of 87 ± 3% and recall of 84 ± 3%. Furthermore, in 63 ± 6% of helical TMPs the placement of all TMHs and their inside/outside topology was correctly predicted. There are two main features that distinguish TMSEG from other methods. First, the errors in finding all helical TMPs in an organism are significantly reduced. For example, in human this leads to 200 and 1600 fewer misclassifications compared to the second and third best method available, and 4400 fewer mistakes than by a simple hydrophobicity-based method. Second, TMSEG provides an add-on improvement for any existing method to benefit from. Proteins 2016; 84:1706-1716. © 2016 Wiley Periodicals, Inc.
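The performance measures quoted in this abstract (sensitivity, false positive rate, precision, recall) all derive from the four cells of a confusion matrix. The following is a minimal sketch of those definitions; the function name and the counts are illustrative assumptions and are not taken from the TMSEG evaluation.

```python
def classification_rates(tp, fp, tn, fn):
    """Compute standard rates from confusion-matrix counts.

    tp/fn: true TMPs predicted as TMP / as soluble
    fp/tn: soluble proteins predicted as TMP / as soluble
    """
    # Sensitivity (also called recall): fraction of real positives found.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # False positive rate: fraction of real negatives wrongly flagged.
    false_positive_rate = fp / (fp + tn) if (fp + tn) else 0.0
    # Precision: fraction of positive predictions that are correct.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return {
        "sensitivity": sensitivity,
        "false_positive_rate": false_positive_rate,
        "precision": precision,
    }


# Toy counts, chosen only to make the arithmetic easy to follow.
rates = classification_rates(tp=8, fp=1, tn=9, fn=2)
print(rates)
```

With a strongly imbalanced dataset (many more soluble proteins than TMPs), a small false positive rate can still correspond to many misclassified proteins, which is why the abstract also reports absolute misclassification counts.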

Journal ArticleDOI
11 Jan 2016-Proteins
TL;DR: It is demonstrated that relative, rather than the absolute, change of the folding and binding free energy serves as a good indicator for SAV association with disease.
Abstract: Single amino acid variations (SAVs) occurring in the human population either result in natural differences between individuals or cause disease. It is well understood that the molecular effect of an SAV can be manifested as changes in the wild-type characteristics of the corresponding protein, among which are protein stability and protein interactions. Typically, the effect of SAVs on protein stability and interactions has been assessed via changes in the wild-type folding and binding free energies. However, in terms of SAVs affecting protein functionality and disease susceptibility, one wants to know to what extent the wild-type function is perturbed by the SAV. Here it is demonstrated that the relative, rather than the absolute, change of the folding and binding free energy serves as a good indicator of SAV association with disease. Using HumVar as a source of disease-causing SAVs and experimentally determined free energy changes from the ProTherm and SKEMPI databases, correlation coefficients (CC) between the disease index (Pd) and the relative folding (Ppr,f) and binding (Ppr,b) probability indexes, respectively, were obtained. The obtained CCs demonstrate the applicability of the proposed approach, which serves as a good indicator of SAV association with disease.

Journal ArticleDOI
01 Jan 2016-Proteins
TL;DR: Three notable findings are presented as steps toward designing more thermophilic cutinase: surface salt bridge optimization produced enthalpic stabilization, mutations to proline reduced the entropy loss upon folding, and the lack of a correlative increase in the temperature optimum of catalytic activity with thermodynamic stability suggests that the active site is locally denatured at a temperature below the Tm of the global structure.
Abstract: Cutinases are powerful hydrolases that can cleave ester bonds of polyesters such as poly(ethylene terephthalate) (PET), opening up new options for enzymatic routes for polymer recycling and surface modification reactions. Cutinase from Aspergillus oryzae (AoC) is promising owing to the presence of an extended groove near the catalytic triad which is important for the orientation of polymeric chains. However, the catalytic efficiency of AoC on rigid polymers like PET is limited by its low thermostability, because it is essential to work at or above the glass transition temperature (Tg) of PET, that is, 70 °C. Consequently, in this study we worked toward the thermostabilization of AoC. Use of Rosetta computational protein design software in conjunction with rational design led to a 6 °C improvement in the thermal unfolding temperature (Tm) and a 10-fold increase in the half-life of the enzyme activity at 60 °C. Surprisingly, thermostabilization did not improve the rate or temperature optimum of enzyme activity. Three notable findings are presented as steps toward designing more thermophilic cutinase: (a) surface salt bridge optimization produced enthalpic stabilization, (b) mutations to proline reduced the entropy loss upon folding, and (c) the lack of a correlative increase in the temperature optimum of catalytic activity with thermodynamic stability suggests that the active site is locally denatured at a temperature below the Tm of the global structure.

Journal ArticleDOI
09 Mar 2016-Proteins
TL;DR: The role of the Protein Structure Prediction Center (predictioncenter.org) is outlined in conducting the CASP11 and CASP ROLL experiments, the experiment statistics are discussed, and an overview of the present CASP infrastructure is provided.
Abstract: We outline the role of the Protein Structure Prediction Center (predictioncenter.org) in conducting the CASP11 and CASP ROLL experiments, discuss the experiment statistics, and provide an overview of the present CASP infrastructure. The biggest changes compared to the previous CASPs are the implementation of the evaluation system incorporating practically all evaluation measures, statistical tests, and visualization tools historically used by the CASP assessors, the expansion of the infrastructure to incorporate new categories of contact-assisted and multimeric predictions, and the redesign of the assessors' web-workspace enabling assessments based on multiple measures for different group categories and target sets. Proteins 2016; 84(Suppl 1):15-19. © 2016 Wiley Periodicals, Inc.

Journal ArticleDOI
15 Jun 2016-Proteins
TL;DR: A blind experiment in the refinement of protein structure predictions, the fourth such experiment since CASP8, where the best groups were able to improve more than 70% of the targets from the starting models, and by an average of 3–5% in the standard CASP measures.
Abstract: CASP11 (the 11th Meeting on the Critical Assessment of Protein Structure Prediction) ran a blind experiment in the refinement of protein structure predictions, the fourth such experiment since CASP8. As with the previous experiments, the predictors were provided with one starting structure from the server models of each of a selected set of template-based modeling targets and asked to refine the coordinates of the starting structure toward native. We assessed the refined structures with the Z-scores of the standard CASP measures, which compare the model-target similarities of the models from all the predictors. Furthermore, we assessed the refined structures with "relative measures," which compare the improvement in accuracy of each model with respect to the starting structure. The latter provides an assessment of the extent to which each predictor group is able to improve the starting structures toward native. We utilized heat maps to display improvements in the Calpha-Calpha distance matrix for each model. The heat maps labeled with each element of secondary structure helped us to identify regions of refinement toward native in each model. Most positively scoring models show modest improvements in multiple regions of the structure, while in some models we were able to identify significant repositioning of N/C-terminal segments and internal elements of secondary structure. The best groups were able to improve more than 70% of the targets from the starting models, and by an average of 3-5% in the standard CASP measures. Proteins 2016; 84(Suppl 1):260-281. © 2016 Wiley Periodicals, Inc.
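The "relative measures" in this abstract compare each refined model with its starting structure, and the heat maps show where the Cα-Cα distance matrix moved toward native. A minimal sketch of that idea follows, assuming toy coordinates and a simple per-pair difference in absolute distance error; the function names and the exact formula are illustrative assumptions, not the assessors' implementation.

```python
import math


def distance_matrix(coords):
    """All-against-all Euclidean distances between C-alpha coordinates."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def distance_error_change(start, refined, native):
    """Per-pair change in absolute distance error relative to native.

    Negative entries mean the refined model is closer to native than the
    starting model for that residue pair; positive entries mean it got worse.
    """
    d_start = distance_matrix(start)
    d_refined = distance_matrix(refined)
    d_native = distance_matrix(native)
    n = len(native)
    return [
        [
            abs(d_refined[i][j] - d_native[i][j])
            - abs(d_start[i][j] - d_native[i][j])
            for j in range(n)
        ]
        for i in range(n)
    ]


# Toy three-residue example (coordinates in angstroms, made up for illustration).
native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
start = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (9.6, 0.0, 0.0)]
refined = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (8.6, 0.0, 0.0)]

for row in distance_error_change(start, refined, native):
    print(["%+.1f" % value for value in row])
```

Rendering such a matrix as a heat map, with secondary structure elements marked on the axes, makes it visible which segments were repositioned toward native.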

Journal ArticleDOI
01 Oct 2016-Proteins
TL;DR: A multi‐scale pipeline for an accurate, yet efficient prediction of peptides binding to the Hsp70 chaperone BiP by combining sequence‐based prediction with molecular docking and MMPBSA calculations is presented.
Abstract: Substrate binding to Hsp70 chaperones is involved in many biological processes, and the identification of potential substrates is important for a comprehensive understanding of these events. We present a multi-scale pipeline for an accurate, yet efficient prediction of peptides binding to the Hsp70 chaperone BiP by combining sequence-based prediction with molecular docking and MMPBSA calculations. First, we measured the binding of 15mer peptides from known substrate proteins of BiP by peptide array (PA) experiments and performed an accuracy assessment of the PA data by fluorescence anisotropy studies. Several sequence-based prediction models were fitted using this and other peptide binding data. A structure-based position-specific scoring matrix (SB-PSSM) derived solely from structural modeling data forms the core of all models. The matrix elements are based on a combination of binding energy estimations, molecular dynamics simulations, and analysis of the BiP binding site, which led to new insights into the peptide binding specificities of the chaperone. Using this SB-PSSM, peptide binders could be predicted with high selectivity even without training of the model on experimental data. Additional training further increased the prediction accuracies. Subsequent molecular docking (DynaDock) and MMGBSA/MMPBSA-based binding affinity estimations for predicted binders allowed the identification of the correct binding mode of the peptides as well as the calculation of nearly quantitative binding affinities. The general concept behind the developed multi-scale pipeline can readily be applied to other protein-peptide complexes with linearly bound peptides, for which sufficient experimental binding data for the training of classical sequence-based prediction models is not available. Proteins 2016; 84:1390-1407. © 2016 Wiley Periodicals, Inc.
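The sequence-based stage of such a pipeline rests on scoring peptides with a position-specific scoring matrix. As a rough illustration of how a PSSM ranks candidate windows in a sequence, here is a minimal sketch; the matrix values, the three-residue width, and the convention that a higher score means a better predicted binder are all assumptions made for this example and do not reflect the published SB-PSSM.

```python
def score_peptide(peptide, pssm, default=0.0):
    """Sum the per-position scores of a peptide under a PSSM.

    pssm is a list with one dict per position mapping residue -> score;
    residues missing from a column receive the default score.
    """
    if len(peptide) != len(pssm):
        raise ValueError("peptide length must equal PSSM width")
    return sum(column.get(aa, default) for aa, column in zip(peptide, pssm))


def best_windows(sequence, pssm, top=3):
    """Score every window of PSSM width and return the top hits.

    Each hit is (score, start_index, window); ties are broken by position.
    """
    width = len(pssm)
    scored = [
        (score_peptide(sequence[i:i + width], pssm), i, sequence[i:i + width])
        for i in range(len(sequence) - width + 1)
    ]
    return sorted(scored, key=lambda hit: (-hit[0], hit[1]))[:top]


# Hypothetical three-position matrix and sequence, for illustration only.
toy_pssm = [
    {"L": 1.0, "A": 0.2},
    {"K": -0.5, "L": 0.8},
    {"P": 0.5, "L": 0.6},
]
for score, start, window in best_windows("ALLPK", toy_pssm):
    print(f"{window} at {start}: {score:.1f}")
```

In a multi-scale setup, only the top-ranked windows would be passed on to the more expensive docking and free-energy steps.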

Journal ArticleDOI
01 Nov 2016-Proteins
TL;DR: The first computational analysis of 13 somatic missense PTEN mutations associated with endometriosis, endometrial cancer and ovarian cancer finds that a majority of the mutations are located at conserved positions within the active site and are clustered within the signature motif.
Abstract: The phosphatase and tensin homolog deleted on chromosome ten (PTEN) gene encodes a tumor suppressor phosphatase that has recently been found to be frequently mutated in patients with endometriosis, endometrial cancer and ovarian cancer. Here, we present the first computational analysis of 13 somatic missense PTEN mutations associated with these phenotypes. We found that a majority of the mutations are located at conserved positions within the active site and are clustered within the signature motif, which contains residues that play a crucial role in loop conformation and are essential for catalysis. In silico analyses were utilized to identify the putative effects of these mutations. In addition, coarse-grained models of both wild-type (WT) PTEN and mutants were constructed using elastic network models to explore the interplay of the structural and global dynamic effects that the mutations have on the relationship between genotype and phenotype. The effects of the mutations reveal that the local structure and interactions affect polarity, protein structure stability, electrostatic surface potential, and global dynamics of the protein. Our results offer new insight into the way in which PTEN missense mutations contribute to the molecular mechanism and genotypic-phenotypic correlation of endometriosis, endometrial cancer, and ovarian cancer. Proteins 2016; 84:1625-1643. © 2016 Wiley Periodicals, Inc.

Journal ArticleDOI
Yuan Ping Pang1
21 Jul 2016-Proteins
TL;DR: Results suggest that FF12MC may be used for protein simulations to study kinetics and thermodynamics of miniprotein folding as well as protein structure and dynamics.
Abstract: Specialized to simulate proteins in molecular dynamics (MD) simulations with explicit solvation, FF12MC is a combination of a new protein simulation protocol employing uniformly reduced atomic masses by tenfold and a revised AMBER forcefield FF99 with (i) shortened CH bonds, (ii) removal of torsions involving a nonperipheral sp(3) atom, and (iii) reduced 1-4 interaction scaling factors of torsions ϕ and ψ. This article reports that in multiple, distinct, independent, unrestricted, unbiased, isobaric-isothermal, and classical MD simulations FF12MC can (i) simulate the experimentally observed flipping between left- and right-handed configurations for C14-C38 of BPTI in solution, (ii) autonomously fold chignolin, CLN025, and Trp-cage with folding times that agree with the experimental values, (iii) simulate subsequent unfolding and refolding of these miniproteins, and (iv) achieve a robust Z score of 1.33 for refining protein models TMR01, TMR04, and TMR07. By comparison, the latest general-purpose AMBER forcefield FF14SB locks the C14-C38 bond to the right-handed configuration in solution under the same protein simulation conditions. Statistical survival analysis shows that FF12MC folds chignolin and CLN025 in isobaric-isothermal MD simulations 2-4 times faster than FF14SB under the same protein simulation conditions. These results suggest that FF12MC may be used for protein simulations to study kinetics and thermodynamics of miniprotein folding as well as protein structure and dynamics. Proteins 2016; 84:1490-1516. © 2016 The Authors Proteins: Structure, Function, and Bioinformatics Published by Wiley Periodicals, Inc.

Journal ArticleDOI
01 Mar 2016-Proteins
TL;DR: Site‐directed mutagenesis of BaiE from Clostridium scindens VPI 12708 confirms that these residues are essential for catalysis and also demonstrates the importance of other conserved residues, Tyr54 and Arg146, which are involved in substrate binding and affect catalytic turnover.
Abstract: Conversion of the primary bile acids cholic acid (CA) and chenodeoxycholic acid (CDCA) to the secondary bile acids deoxycholic acid (DCA) and lithocholic acid (LCA) is performed by a few species of intestinal bacteria in the genus Clostridium through a multistep biochemical pathway that removes a 7α-hydroxyl group. The rate-determining enzyme in this pathway is bile acid 7α-dehydratase (BaiE). In this study, crystal structures of apo-BaiE and its putative product-bound [3-oxo-Δ(4,6)-lithocholyl-Coenzyme A (CoA)] complex are reported. BaiE is a trimer with a twisted α + β barrel fold with similarity to the Nuclear Transport Factor 2 (NTF2) superfamily. Tyr30, Asp35, and His83 form a catalytic triad that is conserved across this family. Site-directed mutagenesis of BaiE from Clostridium scindens VPI 12708 confirms that these residues are essential for catalysis and also demonstrates the importance of other conserved residues, Tyr54 and Arg146, which are involved in substrate binding and affect catalytic turnover. Steady-state kinetic studies reveal that the BaiE homologs are able to turn over 3-oxo-Δ(4)-bile acid and CoA-conjugated 3-oxo-Δ(4)-bile acid substrates with comparable efficiency, questioning the role of CoA-conjugation in the bile acid metabolism pathway.

Journal ArticleDOI
27 Jan 2016-Proteins
TL;DR: Protein target structures for the Critical Assessment of Structure Prediction round 11 (CASP11) and CASP ROLL were split into domains and classified into categories suitable for assessment of template‐based modeling (TBM) and free modeling (FM) based on their evolutionary relatedness to existing structures classified by the ECOD database.
Abstract: Protein target structures for the Critical Assessment of Structure Prediction round 11 (CASP11) and CASP ROLL were split into domains and classified into categories suitable for assessment of template-based modeling (TBM) and free modeling (FM) based on their evolutionary relatedness to existing structures classified by the Evolutionary Classification of Protein Domains (ECOD) database. First, target structures were divided into domain-based evaluation units. Target splits were based on the domain organization of available templates as well as the performance of servers on whole targets compared to split target domains. Second, evaluation units were classified into TBM and FM categories using a combination of measures that evaluate prediction quality and template detectability. Generally, target domains with sequence-related templates and good server prediction performance were classified as TBM, whereas targets without sequence-identifiable templates and low server performance were classified as FM. As in previous CASP experiments, the boundaries for classification were blurred due to the presence of significant insertions and deteriorations in the targets with respect to homologous templates, as well as the presence of templates with partial coverage of new folds. The FM category included 45 target domains, which represents an unprecedented number of difficult CASP targets provided for modeling. Proteins 2016; 84(Suppl 1):20-33. © 2016 Wiley Periodicals, Inc.

Journal ArticleDOI
01 Jan 2016-Proteins
TL;DR: The consideration and ranking of different rotameric states for a mutated residue was found to be essential to achieve satisfactory agreement with the reference data, and this new model provides reasonable agreement with experiment for absolute folding free energies of several β‐barrel membrane proteins.
Abstract: Obtaining a quantitative description of membrane protein stability is crucial for understanding many biological processes. However, advancing in this direction has remained a major challenge for both experimental studies and molecular modeling. One possible direction is the use of coarse-grained models, but such models must be carefully calibrated and validated. Here we use recent progress in benchmark studies on the energetics of amino acid residue and peptide membrane insertion and on membrane protein stability to refine our previously developed coarse-grained model (Vicatos et al., Proteins 2014;82:1168). Our refined model parameters were fitted and/or tested to reproduce the water/membrane partitioning energetics of amino acid side chains and a couple of model peptides. This new model provides reasonable agreement with experiment for the absolute folding free energies of several β-barrel membrane proteins as well as for the effects of point mutations on the relative stability of one of those proteins, OmpLA. The consideration and ranking of different rotameric states for a mutated residue was found to be essential to achieve satisfactory agreement with the reference data.

Journal ArticleDOI
01 May 2016-Proteins
TL;DR: The results indicate a preference toward NADPH for all IREDs and explain why, despite their sequence similarity to β‐hydroxyacid dehydrogenases (β‐HADs), no conversion of β‐hydroxyacids has been observed.
Abstract: Chiral amines are valuable building blocks for the production of a variety of pharmaceuticals, agrochemicals and other specialty chemicals. Only recently, imine reductases (IREDs) were discovered which catalyze the stereoselective reduction of imines to chiral amines. Although several IREDs were biochemically characterized in the last few years, knowledge of the reaction mechanism and the molecular basis of substrate specificity and stereoselectivity is limited. To gain further insights into the sequence-function relationships, the Imine Reductase Engineering Database (www.IRED.BioCatNet.de) was established and a systematic analysis of 530 putative IREDs was performed. A standard numbering scheme based on R-IRED-Sk was introduced to facilitate the identification and communication of structurally equivalent positions in different proteins. A conservation analysis revealed a highly conserved cofactor binding region and a predominantly hydrophobic substrate binding cleft. Two IRED-specific motifs were identified, the cofactor binding motif GLGxMGx(5)[ATS]x(4)Gx(4)[VIL]WNR[TS]x(2)[KR] and the active site motif Gx[DE]x[GDA]x[APS]x(3){K}x[ASL]x[LMVIAG]. Our results indicate a preference toward NADPH for all IREDs and explain why, despite their sequence similarity to β-hydroxyacid dehydrogenases (β-HADs), no conversion of β-hydroxyacids has been observed. Superfamily-specific conservations were investigated to explore the molecular basis of their stereopreference. Based on our analysis and previous experimental results on IRED mutants, an exclusive role of standard position 187 for stereoselectivity is excluded. Alternatively, two standard positions 139 and 194 were identified which are superfamily-specifically conserved and differ in R- and S-selective enzymes.
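The two motifs in this abstract are written in a PROSITE-like notation, which can be translated mechanically into a regular expression for scanning sequences. The sketch below assumes the usual reading of that notation (`x` is any residue, `x(n)` is n arbitrary residues, `[ABC]` is any of the listed residues, `{K}` is any residue except K); the function name and the test sequence are hypothetical.

```python
import re

# One motif element: x, a single residue letter, an allowed set [..] or an
# excluded set {..}, optionally followed by a repeat count such as (3).
ELEMENT = re.compile(r"(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)\))?")


def motif_to_regex(motif):
    """Translate a PROSITE-like motif string into a Python regex string."""
    motif = motif.replace(" ", "").replace("-", "")
    parts = []
    pos = 0
    while pos < len(motif):
        match = ELEMENT.match(motif, pos)
        if match is None:
            raise ValueError(f"cannot parse motif at position {pos}: {motif[pos:]!r}")
        element, count = match.groups()
        if element == "x":
            piece = "."                           # any residue
        elif element.startswith("{"):
            piece = "[^" + element[1:-1] + "]"    # any residue except these
        else:
            piece = element                       # literal residue or [..] set
        if count:
            piece += "{" + count + "}"            # fixed repeat count
        parts.append(piece)
        pos = match.end()
    return "".join(parts)


# Active site motif as quoted in the abstract.
active_site = motif_to_regex("Gx[DE]x[GDA]x[APS]x(3){K}x[ASL]x[LMVIAG]")
print(active_site)

# Made-up sequence fragment that satisfies the motif, for illustration only.
print(bool(re.search(active_site, "MSGADAGAAAAARAAALKE")))
```

Because the converter strips spaces and hyphens, it also accepts motifs that were typeset with stray whitespace between elements.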

Journal ArticleDOI
01 Sep 2016-Proteins
TL;DR: The improved performance of the nns server method is attributed to successful fold recognition achieved by combining several methods including CRFalign and to the current modeling formulation that can incorporate native‐like structural aspects present in multiple templates.
Abstract: For the template-based modeling (TBM) of CASP11 targets, we have developed three new protein modeling protocols (nns for server prediction and LEE and LEER for human prediction) by improving upon our previous CASP protocols (CASP7 through CASP10). We applied the powerful global optimization method of conformational space annealing to three stages of optimization, including multiple sequence-structure alignment, three-dimensional (3D) chain building, and side-chain remodeling. For more successful fold recognition, a new alignment method called CRFalign was developed. It can incorporate sensitive positional and environmental dependence in alignment scores as well as strong nonlinear correlations among various features. Modifications and adjustments were made to the form of the energy function and weight parameters pertaining to the chain building procedure. For the side-chain remodeling step, residue-type dependence was introduced to the cutoff value that determines the entry of a rotamer to the side-chain modeling library. The improved performance of the nns server method is attributed to successful fold recognition achieved by combining several methods including CRFalign and to the current modeling formulation that can incorporate native-like structural aspects present in multiple templates. The LEE protocol is identical to the nns one except that CASP11-released server models are used as templates. The success of LEE in utilizing CASP11 server models indicates that proper template screening and template clustering assisted by appropriate cluster ranking promises a new direction to enhance protein 3D modeling. Proteins 2016; 84(Suppl 1):221-232. © 2015 Wiley Periodicals, Inc.