Showing papers in "BMC Structural Biology in 2009"


Journal ArticleDOI
TL;DR: This work implements a method that predicts the relative surface accessibility of an amino acid and simultaneously predicts the reliability of each prediction in the form of a Z-score; its performance is comparable to that of the currently best publicly available method, Real-SPINE.
Abstract: Background: Estimation of the reliability of specific real-value predictions is nontrivial and the efficacy of this is often questionable. It is important to know if you can trust a given prediction and therefore the best methods associate a prediction with a reliability score or index. For discrete qualitative predictions, the reliability is conventionally estimated as the difference between output scores of selected classes. Such an approach is not feasible for methods that predict a biological feature as a single real value rather than a classification. As a solution to this challenge, we have implemented a method that predicts the relative surface accessibility of an amino acid and simultaneously predicts the reliability for each prediction, in the form of a Z-score.
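The conventional approach for discrete predictions described in this abstract can be sketched as follows: the reliability index is the gap between the two highest class scores. This is a minimal illustration only; the function name and the example scores are hypothetical and are not taken from the paper.

```python
def reliability_index(class_scores):
    """Return the gap between the best and second-best class scores.

    A larger gap means the predictor separates its top choice more clearly.
    """
    if len(class_scores) < 2:
        raise ValueError("need at least two class scores")
    top, runner_up = sorted(class_scores.values(), reverse=True)[:2]
    return top - runner_up


# Hypothetical output scores of a three-state (helix/strand/coil) predictor.
confident = {"H": 0.90, "E": 0.05, "C": 0.05}
ambiguous = {"H": 0.40, "E": 0.35, "C": 0.25}
```

A real-valued predictor has no competing classes to compare, which is why a separately predicted reliability score is needed in that setting.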

618 citations


Journal ArticleDOI
TL;DR: Improved model selection is obtained by using a composite scoring function operating on single models in order to enrich higher quality models which are subsequently used to calculate the structural consensus.
Abstract: The selection of the most accurate protein model from a set of alternatives is a crucial step in protein structure prediction both in template-based and ab initio approaches. Scoring functions have been developed which can either return a quality estimate for a single model or derive a score from the information contained in the ensemble of models for a given sequence. Local structural features occurring more frequently in the ensemble have a greater probability of being correct. Within the context of the CASP experiment, these so-called consensus methods have been shown to perform considerably better in selecting good candidate models, but tend to fail if the best models are far from the dominant structural cluster. In this paper we show that model selection can be improved if both approaches are combined by pre-filtering the models used during the calculation of the structural consensus. Our recently published QMEAN composite scoring function has been improved by including an all-atom interaction potential term. The preliminary model ranking based on the new QMEAN score is used to select a subset of reliable models against which the structural consensus score is calculated. This scoring function, called QMEANclust, achieves a correlation coefficient of predicted quality score and GDT_TS of 0.9 averaged over the 98 CASP7 targets and performs significantly better in selecting good models from the ensemble of server models than any other group participating in the quality estimation category of CASP7. Both scoring functions are also benchmarked on the MOULDER test set consisting of 20 target proteins, each with 300 alternative models generated by MODELLER. QMEAN outperforms all other tested scoring functions operating on individual models, while the consensus method QMEANclust only works properly on decoy sets containing a certain fraction of near-native conformations.
We also present a local version of QMEAN for the per-residue estimation of model quality (QMEANlocal) and compare it to a new local consensus-based approach. Improved model selection is obtained by using a composite scoring function operating on single models in order to enrich higher quality models which are subsequently used to calculate the structural consensus. The performance of consensus-based methods such as QMEANclust highly depends on the composition and quality of the model ensemble to be analysed. Therefore, performance estimates for consensus methods based on large meta-datasets (e.g. CASP) might overrate their applicability in more realistic modelling situations with smaller sets of models based on individual methods.
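The two-stage idea in this abstract (rank models with a single-model score, then compute a consensus score against the top-ranked subset only) can be sketched as below. This is an illustration under stated assumptions, not the QMEANclust implementation: the `keep_fraction` parameter, the use of a plain mean, and the toy similarity function are all hypothetical.

```python
def consensus_scores(models, single_model_score, similarity, keep_fraction=0.5):
    """Score each model by its mean similarity to a pre-filtered reference set.

    models: list of hashable model identifiers.
    single_model_score: callable giving a per-model quality estimate.
    similarity: callable giving a pairwise similarity in [0, 1].
    keep_fraction: assumed fraction of top-ranked models used as references.
    """
    ranked = sorted(models, key=single_model_score, reverse=True)
    n_keep = max(1, int(len(ranked) * keep_fraction))
    reference = ranked[:n_keep]

    scores = {}
    for model in models:
        # A model is never compared with itself.
        others = [ref for ref in reference if ref != model]
        if others:
            scores[model] = sum(similarity(model, ref) for ref in others) / len(others)
        else:
            scores[model] = 0.0
    return scores


# Toy data: a 1-D "position" stands in for a real structural comparison.
quality = {"m1": 0.9, "m2": 0.8, "m3": 0.2, "m4": 0.1}
position = {"m1": 0.0, "m2": 0.1, "m3": 1.0, "m4": 1.05}


def toy_similarity(a, b):
    return 1.0 - min(1.0, abs(position[a] - position[b]))


scores = consensus_scores(list(quality), quality.get, toy_similarity)
```

In the toy data the two high-quality models form the reference set, so the low-quality cluster (`m3`, `m4`) cannot dominate the consensus even though it is equally large.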

147 citations


Journal ArticleDOI
TL;DR: It appears that the consensus prediction tool is slightly more objective than individual prediction methods alone and suggests several previously not identified amino acid stretches as potential amyloidogenic determinants, which (although several of them may be overpredictions) require further experimental studies.
Abstract: Amyloidoses are a group of usually fatal diseases, probably caused by protein misfolding and subsequent aggregation into amyloid fibrillar deposits. The mechanisms involved in amyloid fibril formation are largely unknown and are the subject of current, intensive research. In an attempt to identify possible amyloidogenic regions in proteins for further experimental investigation, we have developed and present here a publicly available online tool that utilizes five different and independently published methods, to form a consensus prediction of amyloidogenic regions in proteins, using only protein primary structure data. It appears that the consensus prediction tool is slightly more objective than individual prediction methods alone and suggests several previously not identified amino acid stretches as potential amyloidogenic determinants, which (although several of them may be overpredictions) require further experimental studies. The tool is available at: http://biophysics.biol.uoa.gr/AMYLPRED . Utilizing molecular graphics programs, like O and PyMOL, as well as the algorithm DSSP, it was found that nearly all experimentally verified amyloidogenic determinants (short peptide stretches favouring aggregation and subsequent amyloid formation), and several predicted, with the aid of the tool AMYLPRED, but not experimentally verified amyloidogenic determinants, are located on the surface of the relevant amyloidogenic proteins. This finding may be important in efforts directed towards inhibiting amyloid fibril formation. The most significant result of this work is the observation that virtually all, to date, experimentally determined amyloidogenic determinants and the majority of predicted, but not yet experimentally verified short amyloidogenic stretches, lie 'exposed' on the surface of the relevant amyloidogenic proteins, and also several of them have the ability to act as conformational 'switches'. 
Experiments, focused on these fragments, should be performed to test this idea.
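One simple way to combine several independent per-residue predictors into a consensus is majority-style voting, sketched below. The abstract does not state how the five methods are actually combined, so the voting rule, the `min_votes` threshold, and the method names here are assumptions for illustration only.

```python
def consensus_regions(predictions, seq_length, min_votes=2):
    """Return residue indices flagged by at least `min_votes` methods.

    predictions: dict mapping a method name to the set of 0-based residue
    indices that the method flags as amyloidogenic.
    """
    votes = [0] * seq_length
    for positions in predictions.values():
        for index in set(positions):
            if 0 <= index < seq_length:
                votes[index] += 1
    return [index for index, count in enumerate(votes) if count >= min_votes]


# Hypothetical output of five methods on a 12-residue sequence.
predictions = {
    "method_a": {3, 4, 5, 6},
    "method_b": {4, 5, 6, 7},
    "method_c": {5, 6},
    "method_d": {10},
    "method_e": set(),
}

loose = consensus_regions(predictions, seq_length=12, min_votes=2)
strict = consensus_regions(predictions, seq_length=12, min_votes=3)
```

Raising `min_votes` trades sensitivity for fewer overpredictions, which is the kind of trade-off the abstract alludes to.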

129 citations


Journal ArticleDOI
TL;DR: The mesothelin superfamily of proteins, which includes mesothelin, mesothelin precursor, megakaryocyte potentiating factor, MPFL, stereocilin and otoancorin, is predicted to have superhelical structures with ARM-type repeats, and it is suggested that all of these function as superhelical lectins to bind the carbohydrate moieties of extracellular glycoproteins.
Abstract: Mesothelin is a 40 kDa protein present on the surface of normal mesothelial cells and overexpressed in many human tumours, including mesothelioma and ovarian and pancreatic adenocarcinoma. It forms a strong and specific complex with MUC16, which is also highly expressed on the surface of mesothelioma and ovarian cancer cells. This binding has been suggested to be the basis of ovarian cancer metastasis. Knowledge of the structure of this protein will be useful, for example, in building a structural model of the MUC16-mesothelin complex. Mesothelin is produced as a precursor, which is cleaved by furin to produce the N-terminal half, which is called the megakaryocyte potentiating factor (MPF), and the C-terminal half, which is mesothelin. Little is known about the function of mesothelin and there is no information on its possible three-dimensional structure. Mesothelin has been reported to be homologous to the deafness-related inner ear proteins otoancorin and stereocilin, for neither of which the three-dimensional structure is known. The BLAST and PSI-BLAST searches confirmed that mesothelin and mesothelin precursor proteins are remotely homologous to stereocilin and otoancorin and more closely homologous to the hypothetical protein MPFL (MPF-like). Secondary structure prediction servers predicted a predominantly helical structure for both mesothelin and mesothelin precursor proteins and also for stereocilin and otoancorin. Three-dimensional structure prediction servers INHUB and I-TASSER produced structural models for mesothelin, which consisted of superhelical structures with ARM-type repeats in conformity with the secondary structure predictions. Similar ARM-type superhelical repeat structures were predicted by 3D-PSSM server for mesothelin precursor and for stereocilin and otoancorin proteins. 
The mesothelin superfamily of proteins, which includes mesothelin, mesothelin precursor, megakaryocyte potentiating factor, MPFL, stereocilin and otoancorin, are predicted to have superhelical structures with ARM-type repeats. We suggest that all of these function as superhelical lectins to bind the carbohydrate moieties of extracellular glycoproteins.

101 citations


Journal ArticleDOI
TL;DR: Comparison of informational and structural properties of the hemagglutinin (HA) of H5N1 virus and human influenza virus subtypes may help to better understand the interaction of influenza virus with its receptor(s) and to identify new therapeutic targets for drug development.
Abstract: Background: Epidemics caused by highly pathogenic avian influenza virus (HPAIV) are a continuing threat to human health and to the world's economy. The development of approaches that help to understand the significance of structural changes resulting from the alarming mutational propensity for human-to-human transmission of HPAIV is of particular interest. Here we compare informational and structural properties of the hemagglutinin (HA) of H5N1 virus and human influenza virus subtypes, which are important for the receptor/virus interaction.

84 citations


Journal ArticleDOI
TL;DR: Insight is provided into the mechanism of NQO2 inhibition by imatinib, with potential implications for drug design and treatment of chronic myelogenous leukemia in patients.
Abstract: Imatinib represents the first in a class of drugs targeted against chronic myelogenous leukemia to enter the clinic, showing excellent efficacy and specificity for Abl, Kit, and PDGFR kinases. Recent screens carried out to find off-target proteins that bind to imatinib identified the oxidoreductase NQO2, a flavoprotein that is phosphorylated in a chronic myelogenous leukemia cell line. We examined the inhibition of NQO2 activity by the Abl kinase inhibitors imatinib, nilotinib, and dasatinib, and obtained IC50 values of 80 nM, 380 nM, and >100 μM, respectively. Using electronic absorption spectroscopy, we show that imatinib binding results in a perturbation of the protein environment around the flavin prosthetic group in NQO2. We have determined the crystal structure of the complex of imatinib with human NQO2 at 1.75 Å resolution, which reveals that imatinib binds in the enzyme active site, adjacent to the flavin isoalloxazine ring. We find that phosphorylation of NQO2 has little effect on enzyme activity and is therefore likely to regulate other aspects of NQO2 function. The structure of the imatinib-NQO2 complex demonstrates that imatinib inhibits NQO2 activity by competing with substrate for the active site. The overall conformation of imatinib when bound to NQO2 resembles the folded conformation observed in some kinase complexes. Interactions made by imatinib with residues at the rim of the active site provide an explanation for the binding selectivity of NQO2 for imatinib, nilotinib, and dasatinib. These interactions also provide a rationale for the lack of inhibition of the related oxidoreductase NQO1 by these compounds. Taken together, these studies provide insight into the mechanism of NQO2 inhibition by imatinib, with potential implications for drug design and treatment of chronic myelogenous leukemia in patients.

81 citations


Journal ArticleDOI
Lingling Wu, Hong-Wen Gao, Nai-Yun Gao, Fang-Fang Chen, Ling Chen
TL;DR: A characterization method for the intermolecular weak interaction of PFOA with HSA is suggested, potentially useful for elucidating the toxigenicity of perfluorochemicals when combined with biomolecular function effect, transmembrane transport, toxicological testing and the other experiments.
Abstract: Recently, perfluorooctanoic acid (PFOA) has become a significant issue in many aspects of environmental ecology, toxicology, pathology and life sciences because it may have serious effects on the endocrine, immune and nervous systems and can lead to embryonic deformities and other diseases. Human serum albumin (HSA) is the major protein component of blood plasma and is called a multifunctional plasma carrier protein because of its ability to bind an unusually broad spectrum of ligands. The interaction of PFOA with HSA was investigated under normal physiological conditions by equilibrium dialysis, fluorospectrometry, isothermal titration calorimetry (ITC) and circular dichroism (CD). The non-covalent interaction results from hydrogen bonding, van der Waals forces and hydrophobic stacking. PFOA binding to HSA followed a two-step binding model, with saturation binding numbers of only 1 in the hydrophobic intracavity of HSA and 12 on the exposed outer surface. The interaction of PFOA with HSA is spontaneous and results in a change of HSA conformation. Possible binding sites are proposed. The present work suggests a characterization method for the intermolecular weak interaction. It is potentially useful for elucidating the toxigenicity of perfluorochemicals when combined with biomolecular function effect, transmembrane transport, toxicological testing and other experiments.

77 citations


Journal ArticleDOI
TL;DR: Three-dimensional models of two different catalytic states of P-glycoprotein that were developed based on the crystal structures of two bacterial multidrug transporters suggest that the protein has multiple binding sites in agreement with experimental evidence.
Abstract: P-glycoprotein belongs to the family of ATP-binding cassette proteins which hydrolyze ATP to catalyse the translocation of their substrates through membranes. This protein extrudes a large range of components out of cells, especially therapeutic agents causing a phenomenon known as multidrug resistance. Because of its clinical interest, its activity and transport function have been largely characterized by various biochemical studies. In the absence of a high-resolution structure of P-glycoprotein, homology modeling is a useful tool to help interpretation of experimental data and potentially guide experimental studies. We present here three-dimensional models of two different catalytic states of P-glycoprotein that were developed based on the crystal structures of two bacterial multidrug transporters. Our models are supported by a large body of biochemical data. Measured inter-residue distances correlate well with distances derived from cross-linking data. The nucleotide-free model features a large cavity detected in the protein core into which ligands of different size were successfully docked. The locations of docked ligands compare favorably with those suggested by drug binding site mutants. Our models can interpret the effects of several mutants in the nucleotide-binding domains (NBDs), within the transmembrane domains (TMDs) or at the NBD:TMD interface. The docking results suggest that the protein has multiple binding sites in agreement with experimental evidence. The nucleotide-bound models are exploited to propose different pathways of signal transmission upon ATP binding/hydrolysis which could lead to the elaboration of conformational changes needed for substrate translocation. We identified a cluster of aromatic residues located at the interface between the NBD and the TMD in opposite halves of the molecule which may contribute to this signal transmission. 
Our models may characterize different steps in the catalytic cycle and may be important tools to understand the structure-function relationship of P-glycoprotein.

76 citations


Journal ArticleDOI
TL;DR: A new mapping is created between SCOP and CATH and a consistent benchmark set is defined which is shown to largely reduce errors made by structure comparison methods such as TM-Align and has useful further applications, e.g. for machine learning methods being trained for protein structure classification.
Abstract: Background: SCOP and CATH are widely used as gold standards to benchmark novel protein structure comparison methods as well as to train machine learning approaches for protein structure classification and prediction. The two hierarchies result from different protocols which may result in differing classifications of the same protein. Ignoring such differences leads to problems when the hierarchies are used to train or benchmark automatic structure classification methods. Here, we propose a method to compare SCOP and CATH in detail and discuss possible applications of this analysis.

75 citations


Journal ArticleDOI
TL;DR: CRYSTALP2 provides relatively accurate crystallization propensity predictions for a given protein chain that either outperform or complement existing approaches; in particular, they are complementary to the predictions of the most recent ParCrys and XtalPred methods.
Abstract: Background: Current protocols yield crystals for <30% of known proteins, indicating that automatically identifying crystallizable proteins may improve high-throughput structural genomics efforts. We introduce CRYSTALP2, a kernel-based method that predicts the propensity of a given protein sequence to produce diffraction-quality crystals. This method utilizes the composition and collocation of amino acids, isoelectric point, and hydrophobicity, as estimated from the primary sequence, to generate predictions. CRYSTALP2 extends its predecessor, CRYSTALP, by enabling predictions for sequences of unrestricted size and provides improved prediction quality.

74 citations


Journal ArticleDOI
TL;DR: The results contribute to a better understanding of the origin of the novel A/H1N1 influenza virus, provide a tool for monitoring its molecular evolution, predict hotspots associated with enhanced infectivity in humans, and identify therapeutic and diagnostic targets for prevention and treatment of A/H1N1 infection.
Abstract: The novel A/H1N1 influenza virus, which recently emerged in North America, is most closely related to North American H1N1/N2 swine viruses. Until the beginning of 2009, North American swine H1N1/N2 viruses have only sporadically infected humans as dead-end hosts. In 2009 the A/H1N1 virus acquired the capacity to spread efficiently by human to human transmission. The novel A/H1N1 influenza virus has struck thousands of people in more than 70 countries and killed more than 140, representing a public health emergency of international concern. Here we have studied properties of hemagglutinin of A/H1N1 which may modulate virus/receptor interaction. Analyses by ISM bioinformatics platform of the HA1 protein of North American swine H1N1/N2 viruses and the new A/H1N1 showed that both groups of viruses differed in conserved characteristics that reflect a distinct propensity of these viruses to undergo a specific interaction with swine or human host proteins or receptors. Swine H1N1/N2 viruses that sporadically infected humans featured both the swine and the human interaction pattern. Substitutions F71S, T128S, E302K, M314L in HA1 of swine H1N1 viruses from North America are identified as critical for the human interaction pattern of A/H1N1 and residues D94, D196 and D274 are predicted to be "hot-spots" for polymorphisms which could increase infectivity of A/H1N1 virus. At least one of these residues has already emerged in the A/H1N1 isolates from Spain, Italy and USA. The domain 286-326 was identified to be involved in virus/receptor interaction. Our results (i) contribute to better understanding of the origin of the novel A/H1N1 influenza virus, (ii) provide a tool for monitoring its molecular evolution, (iii) predict hotspots associated with enhanced infectivity in humans and (iv) identify therapeutic and diagnostic targets for prevention and treatment of A/H1N1 infection.

Journal ArticleDOI
TL;DR: Molecular dynamics simulations of a closed homology model and an open crystal structure of Burkholderia cepacia lipase in water and toluene were performed to investigate the influence of solvents on structure, dynamics, and the conformational transition of the lid.
Abstract: Background: A characteristic of most lipases is interfacial activation at a lipid interface or in non-polar solvents. Interfacial activation is linked to a large conformational change of a lid, from a closed to an open conformation which makes the active site accessible for substrates. While for many lipases crystal structures of the closed and open conformation have been determined, the pathway of the conformational transition and possible bottlenecks are unknown. Therefore, molecular dynamics simulations of a closed homology model and an open crystal structure of Burkholderia cepacia lipase in water and toluene were performed to investigate the influence of solvents on structure, dynamics, and the conformational transition of the lid.

Journal ArticleDOI
TL;DR: By modeling how the protein structural fluctuations respond to residue-position-specific perturbations, the highly efficient perturbation and correlation analysis can be used to dissect the functional conformational changes in various proteins with a residue level of detail.
Abstract: Background: It is increasingly recognized that protein functions often require intricate conformational dynamics, which involve a network of key amino acid residues that couple spatially separated functional sites. Tremendous efforts have been made to identify these key residues by experimental and computational means. Results: We have performed a large-scale evaluation of the predictions of dynamically important residues by a variety of computational protocols including three based on the perturbation and correlation analysis of a coarse-grained elastic model. This study is performed for two lists of test cases with >500 pairs of protein structures. The dynamically important residues predicted by the perturbation and correlation analysis are found to be strongly or moderately conserved in >67% of test cases. They form a sparse network of residues which are clustered both in 3D space and along protein sequence. Their overall conservation is attributed to their dynamic role rather than ligand binding or high network connectivity. Conclusion: By modeling how the protein structural fluctuations respond to residue-position-specific perturbations, our highly efficient perturbation and correlation analysis can be used to dissect the functional conformational changes in various proteins with a residue level of detail. The predictions of dynamically important residues serve as promising targets for mutational and functional studies.

Journal ArticleDOI
TL;DR: Accurate predictions of multi-class maps may provide valuable constraints for improved ab initio and template-based prediction of protein structures, naturally incorporate multiple templates, and yield state-of-the-art binary maps.
Abstract: Background: Prediction of protein structures from their sequences is still one of the open grand challenges of computational biology. Some approaches to protein structure prediction, especially ab initio ones, rely to some extent on the prediction of residue contact maps. Residue contact map predictions have been assessed at the CASP competition for several years now. Although it has been shown that exact contact maps generally yield correct three-dimensional structures, this is true only at a relatively low resolution (3–4 Å from the native structure). Another known weakness of contact maps is that they are generally predicted ab initio, that is, without exploiting information about potential homologues of known structure.
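For readers unfamiliar with the objects discussed here, the sketch below derives a binary contact map and a multi-class distance map from residue coordinates. It only illustrates the data structures, not the prediction method of the paper; the 8 Å contact cutoff and the class boundaries are illustrative assumptions, and the coordinates are toy values.

```python
import math


def binary_contact_map(coords, cutoff=8.0):
    """Mark residue pairs closer than `cutoff` (assumed value, in angstroms)."""
    return [[math.dist(a, b) < cutoff for b in coords] for a in coords]


def distance_class_map(coords, thresholds=(8.0, 13.0, 19.0)):
    """Bin each pairwise distance into a class index.

    Class 0 is the closest bin; the class index equals the number of
    thresholds (assumed values, in angstroms) that the distance reaches.
    """
    return [
        [sum(math.dist(a, b) >= t for t in thresholds) for b in coords]
        for a in coords
    ]


# Toy coordinates for four residues placed along the x axis.
coords = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (12.0, 0.0, 0.0), (30.0, 0.0, 0.0)]

contacts = binary_contact_map(coords)
classes = distance_class_map(coords)
```

With the assumed first threshold equal to the contact cutoff, class 0 of the multi-class map coincides with the binary contact map, which shows how a multi-class map carries strictly more information than a binary one.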

Journal ArticleDOI
TL;DR: Though the basic architecture of the fold is well conserved in these proteins, significant differences exist in their sequence, nature of substrate and oligomerization, which led to predictions regarding the functional classification and identification of possible catalytic residues of a number of hot dog fold-containing hypothetical proteins whose structures were determined in high throughput structural genomics projects.
Abstract: Background: The hot dog fold has been found in more than sixty proteins since the first report of its existence about a decade ago. The fold appears to have a strong association with fatty acid biosynthesis, its regulation and metabolism, as the proteins with this fold are predominantly coenzyme A-binding enzymes with a variety of substrates located at their active sites.

Journal ArticleDOI
TL;DR: Detailed structural description of the S. aureus PrsA-PPIase lays the foundation for structure-based design of enzyme inhibitors and findings on the role of the conserved active site histidines help in designing further experiments to solve the detailed catalytic mechanism.
Abstract: Staphylococcus aureus is a Gram-positive pathogenic bacterium causing many kinds of infections, from mild respiratory tract infections to life-threatening states such as sepsis. Recent emergence of S. aureus strains resistant to numerous antibiotics has created a need for new antimicrobial agents and novel drug targets. S. aureus PrsA is a membrane associated extra-cytoplasmic lipoprotein which contains a parvulin-type peptidyl-prolyl cis-trans isomerase domain. PrsA is known to act as an essential folding factor for secreted proteins in Gram-positive bacteria and thus it is a potential target for antimicrobial drugs against S. aureus. We have solved a high-resolution solution structure of the parvulin-type peptidyl-prolyl cis-trans isomerase domain of S. aureus PrsA (PrsA-PPIase). The results of substrate peptide titrations pinpoint the active site and demonstrate the substrate preference of the enzyme. With detailed NMR spectroscopic investigation of the orientation and tautomeric state of the active site histidines we are able to give further insight into the structure of the catalytic site. NMR relaxation analysis gives information on the dynamic behaviour of PrsA-PPIase. Detailed structural description of the S. aureus PrsA-PPIase lays the foundation for structure-based design of enzyme inhibitors. The structure resembles hPin1-type parvulins both structurally and regarding substrate preference. Even though a wealth of structural data is available on parvulins, the catalytic mechanism has yet to be resolved. The structure of S. aureus PrsA-PPIase and our findings on the role of the conserved active site histidines help in designing further experiments to solve the detailed catalytic mechanism.

Journal ArticleDOI
TL;DR: The method was robust toward small differences in initial structures (different crystallisation conditions or a co-crystallised ligand), although large displacements of catalytic residues often resulted in substrate poses that did not pass the geometric filter criteria.
Abstract: Background: Previously, ways to adapt docking programs that were developed for modelling inhibitor-receptor interaction have been explored. Two main issues were discussed. First, when trying to model catalysis, a reaction intermediate of the substrate is expected to provide more valid information than the ground state of the substrate. Second, the incorporation of protein flexibility is essential for reliable predictions.

Journal ArticleDOI
TL;DR: The present system verifies that sequence divergence including information of unaligned regions is a good indicator of ID regions, and for the first time estimates the complete fractioning of structured/un-structured regions in human TFs, also revealing structural domains without homology to known structures.
Abstract: In addition to structural domains, most eukaryotic proteins possess intrinsically disordered (ID) regions. Although ID regions often play important functional roles, their accurate identification is difficult. As human transcription factors (TFs) constitute a typical group of proteins with long ID regions, we regarded them as a model of all proteins and attempted to accurately classify TFs into structural domains and ID regions. Although an extremely high fraction of ID regions besides DNA binding and/or other domains was detected in human TFs in our previous investigation, 20% of the residues were left unassigned. In this report, we exploit the generally higher sequence divergence in ID regions than in structural regions to completely divide proteins into structural domains and ID regions. The new dichotomic system first identifies domains of known structures, followed by assignment of structural domains and ID regions with a combination of pre-existing tools and a newly developed program based on sequence divergence, taking un-aligned regions into consideration. The system was found to be highly accurate: its application to a set of proteins with experimentally verified ID regions had an error rate as low as 2%. Application of this system to human TFs (401 proteins) showed that 38% of the residues were in structural domains, while 62% were in ID regions. The preponderance of ID regions makes a sharp contrast to TFs of Escherichia coli (229 proteins), in which only 5% fell in ID regions. The method also revealed that 4.0% and 11.8% of the total length in human and E. coli TFs, respectively, are comprised of structural domains whose structures have not been determined. The present system verifies that sequence divergence including information of unaligned regions is a good indicator of ID regions. 
The system for the first time estimates the complete fractioning of structured/un-structured regions in human TFs, also revealing structural domains without homology to known structures. These predicted novel structural domains are good targets of structural genomics. When applied to other proteins, the system is expected to uncover more novel structural domains.

Journal ArticleDOI
TL;DR: Strikingly, the positive surface charge considered key to PCNA's role as a sliding clamp is dramatically reduced in the halophilic protein, and bound cations within the solvation shell of Hv PCNA may permit sliding along negatively charged DNA by reducing electrostatic repulsion effects.
Abstract: The high intracellular salt concentration required to maintain a halophilic lifestyle poses challenges to haloarchaeal proteins that must stay soluble, stable and functional in this extreme environment. Proliferating cell nuclear antigen (PCNA) is a fundamental protein involved in maintaining genome integrity, with roles in both DNA replication and repair. To investigate the halophilic adaptation of such a key protein we have crystallised and solved the structure of Haloferax volcanii PCNA (Hv PCNA) to a resolution of 2.0 Å. The overall architecture of Hv PCNA is very similar to other known PCNAs, which are highly structurally conserved. Three commonly observed adaptations in halophilic proteins are higher surface acidity, bound ions and increased numbers of intermolecular ion pairs (in oligomeric proteins). Hv PCNA possesses the former two adaptations but not the latter, despite functioning as a homotrimer. Strikingly, the positive surface charge considered key to PCNA's role as a sliding clamp is dramatically reduced in the halophilic protein. Instead, bound cations within the solvation shell of Hv PCNA may permit sliding along negatively charged DNA by reducing electrostatic repulsion effects. The extent to which individual proteins adapt to halophilic conditions varies, presumably due to their diverse characteristics and roles within the cell. The number of ion pairs observed in the Hv PCNA monomer-monomer interface was unexpectedly low. This may reflect the fact that the trimer is intrinsically stable over a wide range of salt concentrations and therefore additional modifications for trimer maintenance in high salt conditions are not required. Halophilic proteins frequently bind anions and cations and in Hv PCNA cation binding may compensate for the remarkable reduction in positive charge in the pore region, to facilitate functional interactions with DNA.
In this way, Hv PCNA may harness its environment as opposed to simply surviving in extreme halophilic conditions.

Journal ArticleDOI
TL;DR: This contribution compared two different approaches to explore the flexibility space of protein domains: i) molecular dynamics (MD-space), and ii) the study of the structural changes within superfamily (SF-space).
Abstract: The strong relationship between protein structure and flexibility, on the one hand, and biological protein function, on the other, is well known. Technically, protein flexibility exploration is an essential task in many applications, such as protein structure prediction and modeling. In this contribution we have compared two different approaches to explore the flexibility space of protein domains: i) molecular dynamics (MD-space), and ii) the study of the structural changes within a superfamily (SF-space). Our analysis indicates that the MD-space and the SF-space display a significant overlap, but are still different enough to be considered complementary. The SF-space is wider but less complex than the MD-space, irrespective of the number of members in the superfamily. Also, the SF-space does not sample all possibilities offered by the MD-space, but often introduces very large changes along just a few deformation modes, whose number tends to a plateau as the number of related folds in the superfamily increases. Theoretically, we obtained two conclusions. First, function restricts the access of evolution to some flexibility patterns, as we observe that when a superfamily member changes to become another, the path does not completely overlap with the physical deformability. Second, conformational changes from variation in a superfamily are larger and much simpler than those allowed by physical deformability. Methodologically, the conclusion is that both spaces studied are complementary, and have different size and complexity. We expect this fact to have application in fields such as 3D-EM/X-ray hybrid models or ab initio protein folding.

Journal ArticleDOI
TL;DR: It is found that all conserved residues of the Asp-box support its structure, whereas the residues in variable positions are generally used for other purposes, which strongly suggests the existence of a novel class of hybrid β-propellers.
Abstract: The Asp-box is a short sequence and structure motif that folds as a well-defined β-hairpin. It is present in different folds, but occurs most prominently as repeats in β-propellers. Asp-box β-propellers are known to be characteristically irregular and to occur in many medically important proteins, most of which are glycosidase enzymes, but they are otherwise not well characterized and are only rarely treated as a distinct β-propeller family. We have analyzed the sequence, structure, function and occurrence of the Asp-box and the s-Asp-box, a related shorter variant, and provide a comprehensive classification and computational analysis of the Asp-box β-propeller family. We find that all conserved residues of the Asp-box support its structure, whereas the residues in variable positions are generally used for other purposes. The Asp-box clearly has a structural role in β-propellers and is highly unlikely to be involved in ligand binding. Sequence analysis of the Asp-box β-propeller family reveals it to be very widespread, especially in bacteria, and suggests a wide functional range. Disregarding the Asp-boxes, sequence conservation of the propeller blades is very low, but a distinct pattern of residues with specific properties has been identified. Interestingly, Asp-boxes are occasionally found very close to other propeller-associated repeats in extensive mixed-motif stretches, which strongly suggests the existence of a novel class of hybrid β-propellers. Structural analysis reveals that the top and bottom faces of Asp-box β-propellers have striking and consistently different loop properties; the bottom is structurally conserved whereas the top shows great structural variation. Interestingly, only the top face is used for functional purposes in known structures.
A structural analysis of the 10-bladed β-propeller fold, which has so far only been observed in the Asp-box family, reveals that the inner strands of the blades are unusually far apart, which explains the surprisingly large diameter of the central tunnel of sortilin. We have provided new insight into the structure and function of the Asp-box motif and of Asp-box β-propellers, and expect that the classification and analysis presented here will prove helpful in interpreting future data on Asp-box proteins in general and on Asp-box β-propellers in particular.

Journal ArticleDOI
TL;DR: The detailed comparison between the DNA-free Pfu Hjm structure and the structure of Hel308 complexed with DNA suggests similar DNA unwinding and translocation mechanisms, which could be generalized to all of the members in the same family.
Abstract: Pyrococcus furiosus Hjm (Pfu Hjm) is a structure-specific DNA helicase that was originally identified by in vitro screening for Holliday junction migration activity. It belongs to helicase superfamily 2, and shares homology with the human DNA polymerase Θ (PolΘ), HEL308, and Drosophila Mus308 proteins, which are involved in DNA repair. Previous biochemical and genetic analyses revealed that Pfu Hjm preferentially binds to fork-related Y-structured DNAs and unwinds their double-stranded regions, suggesting that this helicase is a functional counterpart of the bacterial RecQ helicase, which is essential for genome maintenance. Elucidation of the DNA unwinding and translocation mechanisms by Pfu Hjm will require its three-dimensional structure at atomic resolution. We determined the crystal structures of Pfu Hjm, in two apo-states and two nucleotide bound forms, at resolutions of 2.0–2.7 Å. The overall structures and the local conformations around the nucleotide binding sites are almost the same, including the side-chain conformations, irrespective of the nucleotide-binding states. The architecture of Hjm was similar to that of Archaeoglobus fulgidus Hel308 complexed with DNA. An Hjm-DNA complex model, constructed by fitting the five domains of Hjm onto the corresponding Hel308 domains, indicated that the interaction of Hjm with DNA is similar to that of Hel308. Notably, sulphate ions bound to Hjm lie on the putative DNA binding surfaces. Electron microscopic analysis of an Hjm-DNA complex revealed substantial flexibility of the double stranded region of DNA, presumably due to particularly weak protein-DNA interactions. Our present structures allowed reasonable homology model building of the helicase region of human PolΘ, indicating the strong conformational conservation between archaea and eukarya.
The detailed comparison between our DNA-free Pfu Hjm structure and the structure of Hel308 complexed with DNA suggests similar DNA unwinding and translocation mechanisms, which could be generalized to all of the members in the same family. Structural comparison also implied a minor rearrangement of the five domains during the DNA unwinding reaction. The unexpectedly small contact between the DNA duplex region and the enzyme appears to be advantageous for processive helicase activity.

Journal ArticleDOI
TL;DR: Examination of the SBDS protein structure and domain movements together with its possible interaction with large ribosomal subunit proteins suggest that these proteins could participate in ribosome function.
Abstract: Background Defects in the human Shwachman-Bodian-Diamond syndrome (SBDS) protein-coding gene lead to the autosomal recessive disorder characterised by bone marrow dysfunction, exocrine pancreatic insufficiency and skeletal abnormalities. This protein is highly conserved in eukaryotes and archaea but is not found in bacteria. Although genomic and biophysical studies have suggested involvement of this protein in RNA metabolism and in ribosome biogenesis, its interacting partners remain largely unknown.

Journal ArticleDOI
TL;DR: An object-oriented Python/C++ library to help the development of new docking methods for bioinformatics, which can handle molecules at coarse-grained or atomic resolution and allows users to rapidly develop new software.
Abstract: Macromolecular docking is a challenging field of bioinformatics. Developing new algorithms is a slow process generally involving routine tasks that should be found in a robust library and not programmed from scratch for every new software application. We present an object-oriented Python/C++ library to help the development of new docking methods. This library contains low-level routines like PDB-format manipulation functions as well as high-level tools for docking and analyzing results. We also illustrate the ease of use of this library with the detailed implementation of a 3-body docking procedure. The PTools library can handle molecules at coarse-grained or atomic resolution and allows users to rapidly develop new software. The library is already in use for protein-protein and protein-DNA docking with the ATTRACT program and for simulation analysis. This library is freely available under the GNU GPL license, together with detailed documentation.

Journal ArticleDOI
TL;DR: The problem of predicting specific DNA-binding sites in terms of contacts between the residue environments of proteins and the identity of a mononucleotide or a dinucleotide step in DNA is formulated and most residue-nucleotide contacts can be predicted with high accuracy using only sequence and evolutionary information.
Abstract: DNA recognition by proteins is one of the most important processes in living systems. Therefore, understanding the recognition process in general, and identifying mutual recognition sites in proteins and DNA in particular, carries great significance. The sequence and structural dependence of DNA-binding sites in proteins has led to the development of successful machine learning methods for their prediction. However, all existing machine learning methods predict DNA-binding sites, irrespective of their target sequence and hence, none of them is helpful in identifying specific protein-DNA contacts. In this work, we formulate the problem of predicting specific DNA-binding sites in terms of contacts between the residue environments of proteins and the identity of a mononucleotide or a dinucleotide step in DNA. The aim of this work is to take a protein sequence or structural features as inputs and predict for each amino acid residue if it binds to DNA at locations identified by one of the four possible mononucleotides or one of the 10 unique dinucleotide steps. Contact predictions are made at various levels of resolution viz. in terms of side chain, backbone and major or minor groove atoms of DNA. Significant differences in residue preferences for specific contacts are observed, which combined with other features, lead to promising levels of prediction. In general, PSSM-based predictions, supported by secondary structure and solvent accessibility, achieve a good predictability of ~70–80%, measured by the area under the curve (AUC) of ROC graphs. The major and minor groove contact predictions stood out in terms of their poor predictability from sequences or PSSM, which was very strongly (>20 percentage points) compensated by the addition of secondary structure and solvent accessibility information, revealing a predominant role of local protein structure in the major/minor groove DNA-recognition. 
Following a detailed analysis of results, a web server to predict mononucleotide and dinucleotide-step contacts using PSSM was developed and made available at http://sdcpred.netasa.org/ or http://tardis.nibio.go.jp/netasa/sdcpred/ . Most residue-nucleotide contacts can be predicted with high accuracy using only sequence and evolutionary information. Major and minor groove contacts, however, depend profoundly on the local structure. Overall, this study takes us a step closer to the ultimate goal of predicting mutual recognition sites in protein and DNA sequences.
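The abstract above reports predictability as the area under the ROC curve (AUC). As a minimal sketch of what that number measures, the snippet below computes AUC as the fraction of (contact, non-contact) residue pairs in which the contact residue receives the higher score, with ties counted as half. The function name `roc_auc` and the toy labels and scores are illustrative assumptions and are not taken from the paper or its web server.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Return the ROC AUC using the pairwise (Mann-Whitney) formulation.

    labels: 1 for a residue in contact with the nucleotide, 0 otherwise.
    scores: predicted contact scores, higher meaning more likely a contact.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical toy data: two contact residues, two non-contact residues.
    toy_labels = [1, 0, 1, 0]
    toy_scores = [0.9, 0.8, 0.3, 0.1]
    print(f"AUC = {roc_auc(toy_labels, toy_scores):.2f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so the ~70–80% range quoted above sits between the two.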

Journal ArticleDOI
TL;DR: It is suggested that molecular testing/diagnostics of JAK2 should extend beyond V617F and exon 12 mutations, and perhaps should encompass most of the pseudo-kinase domain-coding region.
Abstract: Background The functional relevance of many of the recently detected JAK2 mutations, except V617F and exon 12 mutants, in patients with chronic myeloproliferative neoplasia (MPN) has been significantly overlooked. To explore atomic-level explanations of the possible mutational effects from those overlooked mutants, we performed a set of molecular dynamics simulations on clinically observed mutants, including newly discovered mutations (K539L, R564L, L579F, H587N, S591L, H606Q, V617I, V617F, C618R, L624P, whole exon 14-deletion) and control mutants (V617C, V617Y, K603Q/N667K).

Journal ArticleDOI
TL;DR: The per-atom ASA, although not the determinant of the chemical shift, provides a way to directly correlate chemical shift information to the atomic coordinates.
Abstract: Chemical shifts obtained from NMR experiments are an important tool in determining secondary, even tertiary, protein structure. The main repository for chemical shift data is the BioMagResBank, which provides NMR-STAR files with this type of information. However, it is not trivial to link this information to available coordinate data from the PDB for non-backbone atoms due to atom and chain naming differences, as well as sequence numbering changes. We here describe the analysis of a consistent set of chemical shift and coordinate data, in which we focus on the relationship between the per-atom solvent accessible surface area (ASA) in the reported coordinates and their reported chemical shift value. The data are available online at http://www.ebi.ac.uk/pdbe/docs/NMR/shiftAnalysis/index.html . Atoms with zero per-atom ASA have a significantly larger chemical shift dispersion and often have a different chemical shift distribution compared to those that are solvent accessible. With higher per-atom ASA, the chemical shift values also tend towards random coil values. The per-atom ASA, although not the determinant of the chemical shift, thus provides a way to directly correlate chemical shift information to the atomic coordinates.
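The comparison of buried and solvent-accessible atoms described above can be illustrated with a short sketch. It assumes that each atom has already been reduced to an (ASA, chemical shift) pair and uses the population standard deviation as a stand-in for "dispersion"; both the function name `shift_dispersion_by_exposure` and the toy values are hypothetical and are not drawn from the BioMagResBank or PDB data.

```python
from statistics import pstdev
from typing import Dict, Iterable, Tuple


def shift_dispersion_by_exposure(
    records: Iterable[Tuple[float, float]],
) -> Dict[str, float]:
    """Return the chemical shift spread for buried and exposed atoms.

    records: (per-atom ASA, chemical shift in ppm) pairs for one atom type.
    Atoms with ASA == 0 are treated as buried, all others as exposed.
    The spread is the population standard deviation (an assumption here).
    """
    buried = []
    exposed = []
    for asa, shift in records:
        (buried if asa == 0.0 else exposed).append(shift)
    if not buried or not exposed:
        raise ValueError("need at least one buried and one exposed atom")
    return {"buried": pstdev(buried), "exposed": pstdev(exposed)}


if __name__ == "__main__":
    # Hypothetical toy values, chosen only to show the calculation.
    toy = [(0.0, 1.0), (0.0, 3.0), (5.0, 2.0), (7.5, 2.0)]
    print(shift_dispersion_by_exposure(toy))
```

In practice the grouping would be done per atom type, since shifts of different nuclei are not directly comparable.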

Journal ArticleDOI
TL;DR: The presentation at a single website of data on interactions between a ligand and specific residues on the enzyme alongside data on the movement that these interactions induce, should lead to new insights into the mechanisms of these enzymes in particular, and help in trying to understand the general process of ligand-induced domain closure in enzymes.
Abstract: Background Conformational change induced by the binding of a substrate or coenzyme is a poorly understood stage in the process of enzyme catalysed reactions. For enzymes that exhibit a domain movement, the conformational change can be clearly characterized and therefore the opportunity exists to gain an understanding of the mechanisms involved. The non-redundant database of protein domain movements contains examples of ligand-induced domain movements in enzymes, but these valuable data have remained unexploited.

Journal ArticleDOI
TL;DR: This work structurally and functionally characterise a novel avidin named xenavidin, which is to the authors' knowledge the first reported avidin from a frog, and provides information about the biochemically and structurally important determinants of biotin binding.
Abstract: Avidins are proteins with extraordinarily high ligand-binding affinity, a property which is used in a wide array of life science applications. Even though useful for biotechnology and nanotechnology, the biological function of avidins is not fully understood. Here we structurally and functionally characterise a novel avidin named xenavidin, which is to our knowledge the first reported avidin from a frog. Xenavidin was identified from an EST sequence database for Xenopus tropicalis and produced in insect cells using a baculovirus expression system. The recombinant xenavidin was found to be homotetrameric based on gel filtration analysis. Biacore sensor analysis, fluorescently labelled biotin and radioactive biotin were used to evaluate the biotin-binding properties of xenavidin; it binds biotin with high affinity, though less tightly than do chicken avidin and bacterial streptavidin. X-ray crystallography revealed structural conservation around the ligand-binding site, while some of the loop regions have a unique design. The location of structural water molecules at the entrance and/or within the ligand-binding site may have a role in determining the characteristic biotin-binding properties of xenavidin. The novel data reported here provide information about the biochemically and structurally important determinants of biotin binding. This information may facilitate the discovery of novel tools for biotechnology.

Journal ArticleDOI
TL;DR: It is proposed that the active site of GAPDH can accommodate the substrate in multiple conformations at multiple locations during the initial encounter and demonstrate the plasticity of the substrate binding site.
Abstract: The structure, function and reaction mechanism of glyceraldehyde 3-phosphate dehydrogenase (GAPDH) have been extensively studied. Based on these studies, three anion binding sites have been identified, one 'Ps' site (for binding the C-3 phosphate of the substrate) and two sites, 'Pi' and 'new Pi', for inorganic phosphate. According to the original flip-flop model, the substrate phosphate group switches from the 'Pi' to the 'Ps' site during the multistep reaction. In light of the discovery of the 'new Pi' site, a modified flip-flop mechanism, in which the C-3 phosphate of the substrate binds to the 'new Pi' site and flips to the 'Ps' site before the hydride transfer, was proposed. An alternative model based on a number of structures of B. stearothermophilus GAPDH ternary complexes (non-covalent and thioacyl intermediate) proposes that in the ternary Michaelis complex the C-3 phosphate binds to the 'Ps' site and flips from the 'Ps' to the 'new Pi' site during or after the redox step. We determined the crystal structure of Cryptosporidium parvum GAPDH in the apo and holo (enzyme + NAD) state and the structure of the ternary enzyme-cofactor-substrate complex using an active site mutant enzyme. The C. parvum GAPDH complex was prepared by pre-incubating the enzyme with substrate and cofactor, thereby allowing free movement of the protein structure and substrate molecules during their initial encounter. Sulfate and phosphate ions were excluded from purification and crystallization steps. The quality of the electron density map at 2 Å resolution allowed unambiguous positioning of the substrate. In three subunits of the homotetramer the C-3 phosphate group of the non-covalently bound substrate is in the 'new Pi' site. A concomitant movement of the phosphate binding loop is observed in these three subunits. In the fourth subunit the C-3 phosphate occupies an unexpected site not seen before and the phosphate binding loop remains in the substrate-free conformation.
Orientation of the substrate with respect to the active site histidine and serine (in the mutant enzyme) also varies in different subunits. The structures of the C. parvum GAPDH ternary complex and other GAPDH complexes demonstrate the plasticity of the substrate binding site. We propose that the active site of GAPDH can accommodate the substrate in multiple conformations at multiple locations during the initial encounter. However, the C-3 phosphate group clearly prefers the 'new Pi' site for initial binding in the active site.