
Showing papers in "BMC Structural Biology in 2007"


Journal ArticleDOI
TL;DR: It may be possible to improve epitope prediction methods through training on datasets which include only immune epitopes and through utilizing more features characterizing epitopes, for example, the evolutionary conservation score.
Abstract: The ability to predict antibody binding sites (aka antigenic determinants or B-cell epitopes) for a given protein is a precursor to new vaccine design and diagnostics. Among the various methods of B-cell epitope identification X-ray crystallography is one of the most reliable methods. Using these experimental data computational methods exist for B-cell epitope prediction. As the number of structures of antibody-protein complexes grows, further interest in prediction methods using 3D structure is anticipated. This work aims to establish a benchmark for 3D structure-based epitope prediction methods. Two B-cell epitope benchmark datasets inferred from the 3D structures of antibody-protein complexes were defined. The first is a dataset of 62 representative 3D structures of protein antigens with inferred structural epitopes. The second is a dataset of 82 structures of antibody-protein complexes containing different structural epitopes. Using these datasets, eight web-servers developed for antibody and protein binding sites prediction have been evaluated. In no method did performance exceed a 40% precision and 46% recall. The values of the area under the receiver operating characteristic curve for the evaluated methods were about 0.6 for ConSurf, DiscoTope, and PPI-PRED methods and above 0.65 but not exceeding 0.70 for protein-protein docking methods when the best of the top ten models for the bound docking were considered; the remaining methods performed close to random. The benchmark datasets are included as a supplement to this paper. It may be possible to improve epitope prediction methods through training on datasets which include only immune epitopes and through utilizing more features characterizing epitopes, for example, the evolutionary conservation score. Notwithstanding, overall poor performance may reflect the generality of antigenicity and hence the inability to decipher B-cell epitopes as an intrinsic feature of the protein. 
It is an open question as to whether ultimately discriminatory features can be found.
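The precision and recall figures quoted above can be illustrated with a minimal sketch. It assumes that a predicted epitope and a structurally inferred epitope are each given as a set of residue numbers; the function name `precision_recall` and the residue numbers are hypothetical and are not taken from the benchmark.

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for two collections of residue identifiers."""
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    # Precision: fraction of predicted residues that are in the real epitope.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: fraction of real epitope residues that were predicted.
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical residue numbers, made up purely for illustration.
predicted_epitope = {10, 11, 12, 13, 14}
structural_epitope = {12, 13, 14, 15, 16, 17}
p, r = precision_recall(predicted_epitope, structural_epitope)
```

In this toy case three residues overlap, so precision is 3/5 and recall is 3/6.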

208 citations


Journal ArticleDOI
TL;DR: Statistics of amino acid compositions in binding versus non-binding regions, both in general and within each secondary structure conformation, are compiled; neural networks give moderate prediction success, which is expected to improve as structures of more protein-carbohydrate complexes become available.
Abstract: Background: Protein-carbohydrate interactions are crucial in many biological processes, with implications for drug targeting and gene expression. The nature of protein-carbohydrate interactions may be studied at the individual residue level by analyzing local sequence and structure environments in binding regions in comparison to non-binding regions, which provide an inherent control for such analyses. With the ultimate aim of predicting binding sites from sequence and structure, overall statistics of binding regions need to be compiled. Sequence-based predictions of binding sites have been successfully applied to DNA-binding proteins in our earlier work. We aim to apply a similar analysis to carbohydrate-binding proteins. However, because a much smaller region of the protein takes part in such interactions, the methodology and results are significantly different. A comparison of protein-carbohydrate complexes with other protein-ligand complexes has also been made.

177 citations


Journal ArticleDOI
TL;DR: The structure-function studies presented here revealed a new mechanism, in which the energy contribution of a conserved H-bond is modulated by surrounding intramolecular interactions to achieve a switch between low- and high-affinity binding.
Abstract: Bone morphogenetic proteins (BMPs) are key regulators of embryonic development and postnatal tissue homeostasis in all animals. Loss of function or dysregulation of BMPs results in severe diseases or even lethality. Like transforming growth factors β (TGF-βs), activins, growth and differentiation factors (GDFs) and other members of the TGF-β superfamily, BMPs signal by assembling two types of serine/threonine-kinase receptor chains to form a hetero-oligomeric ligand-receptor complex. BMP ligand-receptor interaction is highly promiscuous, i.e. BMPs bind more than one receptor of each subtype, and a receptor binds various ligands. The activin type II receptors are of particular interest, since they bind a large number of diverse ligands. In addition, they act as high-affinity receptors for activins but are also low-affinity receptors for BMPs. ActR-II and ActR-IIB therefore represent an interesting example of how affinity and specificity might be generated in a promiscuous background. Here we present the high-resolution structures of the ternary complexes of wildtype and a variant BMP-2 bound to its high-affinity type I receptor BMPR-IA and its low-affinity type II receptor ActR-IIB and compare them with the known structures of binary and ternary ligand-receptor complexes of BMP-2. In contrast to activin or TGF-β3, no changes in the dimer architecture of the BMP-2 ligand occur upon complex formation. Functional analysis of the ActR-IIB binding epitope shows that hydrophobic interactions dominate in low-affinity binding of BMPs; polar interactions contribute little to binding affinity. However, a conserved H-bond in the center of the type II ligand-receptor interface, which does not contribute to binding in the BMP-2 – ActR-IIB interaction, can be mutationally activated, resulting in a BMP-2 variant with high affinity for ActR-IIB.
Further mutagenesis studies were performed to elucidate the binding mechanism, allowing us to construct BMP-2 variants with defined type II receptor binding properties. Binding specificity of BMP-2 for its three type II receptors BMPR-II, ActR-II and ActR-IIB is encoded at the single amino acid level. Exchange of only one or two residues results in BMP-2 variants with a dramatically altered type II receptor specificity profile, possibly allowing construction of BMP-2 variants that address a single type II receptor. The structure-function studies presented here revealed a new mechanism, in which the energy contribution of a conserved H-bond is modulated by surrounding intramolecular interactions to achieve a switch between low- and high-affinity binding.

175 citations


Journal ArticleDOI
TL;DR: Positions of diverse peripheral proteins and peptides in the lipid bilayer can be accurately predicted using their 3D structures that represent a proper membrane-bound conformation and oligomeric state, and have membrane binding elements present.
Abstract: Three-dimensional (3D) structures of numerous peripheral membrane proteins have been determined. Biological activity, stability, and conformations of these proteins depend on their spatial positions with respect to the lipid bilayer. However, these positions are usually undetermined. We report the first large-scale computational study of monotopic/peripheral proteins with known 3D structures. The optimal translational and rotational positions of 476 proteins are determined by minimizing energy of protein transfer from water to the lipid bilayer, which is approximated by a hydrocarbon slab with a decadiene-like polarity and interfacial regions characterized by water-permeation profiles. Predicted membrane-binding sites, protein tilt angles and membrane penetration depths are consistent with spin-labeling, chemical modification, fluorescence, NMR, mutagenesis, and other experimental studies of 53 peripheral proteins and peptides. Experimental membrane binding affinities of peripheral proteins were reproduced in cases that did not involve a helix-coil transition, specific binding of lipids, or a predominantly electrostatic association. Coordinates of all examined peripheral proteins and peptides with the calculated hydrophobic membrane boundaries, subcellular localization, topology, structural classification, and experimental references are available through the Orientations of Proteins in Membranes (OPM) database. Positions of diverse peripheral proteins and peptides in the lipid bilayer can be accurately predicted using their 3D structures that represent a proper membrane-bound conformation and oligomeric state, and have membrane binding elements present. The success of the implicit solvation model suggests that hydrophobic interactions are usually sufficient to determine the spatial position of a protein in the membrane, even when electrostatic interactions or specific binding of lipids are substantial. 
Our results demonstrate that most peripheral proteins not only interact with the membrane surface, but penetrate through the interfacial region and reach the hydrocarbon interior, which is consistent with published experimental studies.

126 citations


Journal ArticleDOI
TL;DR: The neohesperidin dihydrochalcone binding site at the human sweet taste receptor is identified, which overlaps with those for the sweetener cyclamate and the sweet taste inhibitor lactisole, suggesting a general role of these amino acid positions in allosterism and pointing to a common architecture of the heptahelical domains of class C GPCRs.
Abstract: Background: Differences in sweet taste perception among species depend on structural variations of the sweet taste receptor. The commercially used isovanillyl sweetener neohesperidin dihydrochalcone activates the human but not the rat sweet receptor TAS1R2+TAS1R3. Analysis of interspecies combinations and chimeras of rat and human TAS1R2+TAS1R3 suggested that the heptahelical domain of human TAS1R3 is crucial for the activation of the sweet receptor by neohesperidin dihydrochalcone.

115 citations


Journal ArticleDOI
TL;DR: This is the first structure of a multicopper oxidase which allowed the detection of two intermediates in the molecular oxygen reduction and splitting, providing general insights into the reductive cleavage of the O-O bonds, a leading problem in many areas of biology.
Abstract: Background: Laccases belong to multicopper oxidases, a widespread class of enzymes implicated in many oxidative functions in pathogenesis, immunogenesis and morphogenesis of organisms and in the metabolic turnover of complex organic substances. They catalyze the coupling between the four one-electron oxidations of a broad range of substrates with the four-electron reduction of dioxygen to water. These catalytic processes are made possible by the contemporaneous presence of at least four copper ion sites, classified according to their spectroscopic properties: one type 1 (T1) site where the electrons from the reducing substrates are accepted, one type 2 (T2), and a coupled binuclear type 3 pair (T3) which are assembled in a T2/T3 trinuclear cluster where the electrons are transferred to perform the O2 reduction to H2O.

111 citations


Journal ArticleDOI
Ozlem Keskin
TL;DR: Results suggest that the pre-equilibrium concept holds for antibodies and that the promiscuity of antibodies can also be explained by this hypothesis: a limited number of conformational states driven by intrinsic motions of an antibody might be adequate to bind to different antigens.
Abstract: How antibodies recognize and bind to antigens cannot be fully explained by rigid shape and electrostatic complementarity models. Alternatively, the pre-existing equilibrium hypothesis states that the native state of an antibody is defined not by a single rigid conformation but by an ensemble of similar conformations that co-exist at equilibrium. Antigens bind to one of the preferred conformations, making this conformation more abundant and shifting the equilibrium. Here, two antibodies, a germline antibody of 36–65 Fab and a monoclonal antibody, SPE7, are studied in detail to elucidate the mechanism of antibody-antigen recognition and to understand how a single antibody recognizes different antigens. An elastic network model, the Anisotropic Network Model (ANM), is used in the calculations. The pre-existing equilibrium concept is not restricted to antibodies. Intrinsic fluctuations of eight proteins from different classes, such as enzymes, binding and transport proteins, are investigated to test the suitability of the method. The intrinsic fluctuations are compared with the experimentally observed ligand-induced conformational changes of these proteins. The results show that the intrinsic fluctuations obtained by theoretical methods correlate with structural changes observed when a ligand is bound to the protein. The decomposition of the total fluctuations serves to identify the different individual modes of motion, ranging from the most cooperative ones involving the overall structure to the most localized ones. Results suggest that the pre-equilibrium concept holds for antibodies and that the promiscuity of antibodies can also be explained by this hypothesis: a limited number of conformational states driven by intrinsic motions of an antibody might be adequate to bind to different antigens.

104 citations


Journal ArticleDOI
TL;DR: Analysis and comparison of ar/R selectivity filters suggest that rice and maize MIPs could transport more diverse solutes than Arabidopsis MIPs; small residues are group-conserved in the helix-helix interface of MIP structures, which might help to preserve the hour-glass fold in MIP structures.
Abstract: The major intrinsic proteins (MIPs) facilitate the transport of water and neutral solutes across lipid bilayers. Plant MIPs are believed to be important in cell division and expansion and in water transport properties in response to environmental conditions. More than 30 MIP sequences have been identified in Arabidopsis thaliana, maize and rice. Plasma membrane intrinsic proteins (PIPs), tonoplast intrinsic proteins (TIPs), Nod26-like intrinsic proteins (NIPs) and small and basic intrinsic proteins (SIPs) are subfamilies of plant MIPs. Despite sequence diversity, all the experimentally determined structures belonging to the MIP superfamily have the same "hour-glass" fold. We have structurally characterized 39 rice and 31 maize MIPs and compared them with those of Arabidopsis. Homology models of 105 MIPs from all three plant species were built. Structure-based sequence alignments were generated and the residues in the helix-helix interfaces were analyzed. Small residues (Gly/Ala/Ser/Thr) are found to be highly conserved as a group in the helix-helix interface of MIP structures. Individual families sometimes prefer one or another of the residues from this group. The narrow aromatic/arginine (ar/R) selectivity filter in MIPs has been shown to provide an important constriction for solute permeability. Ar/R regions were analyzed and compared between the three plant species. Seventeen TIP, NIP and SIP members from rice and maize have ar/R signatures that are not found in Arabidopsis. A subgroup of rice and maize NIPs has small residues in three of the four positions in the ar/R tetrad, resulting in a wider constriction. These MIP members could transport larger solute molecules. Small residues are group-conserved in the helix-helix interface of MIP structures and they seem to be important for close helix-helix interactions. Such conservation might help to preserve the hour-glass fold in MIP structures.
Analysis and comparison of ar/R selectivity filters suggest that rice and maize MIPs could transport more diverse solutes than Arabidopsis MIPs. Thus the MIP members show conservation in helix-helix interfaces and diversity in aromatic/arginine selectivity filters. The former is related to structural stability and the latter can be linked to functional diversity.

104 citations


Journal ArticleDOI
TL;DR: A new sequence representation that uses k-spaced amino acid pairs is shown to be the most efficient in the prediction of the flexible/rigid regions of protein sequences.
Abstract: Traditionally, it is believed that the native structure of a protein corresponds to a global minimum of its free energy. However, with the growing number of known tertiary (3D) protein structures, researchers have discovered that some proteins can alter their structures in response to a change in their surroundings or with the help of other proteins or ligands. Such structural shifts play a crucial role with respect to the protein function. To this end, we propose a machine learning method for the prediction of the flexible/rigid regions of proteins (referred to as FlexRP); the method is based on a novel sequence representation and feature selection. Knowledge of the flexible/rigid regions may provide insights into the protein folding process and the 3D structure prediction. The flexible/rigid regions were defined based on a dataset, which includes protein sequences that have multiple experimental structures, and which was previously used to study the structural conservation of proteins. Sequences drawn from this dataset were represented based on feature sets that were proposed in prior research, such as PSI-BLAST profiles, composition vector and binary sequence encoding, and a newly proposed representation based on frequencies of k-spaced amino acid pairs. These representations were processed by feature selection to reduce the dimensionality. Several machine learning methods for the prediction of flexible/rigid regions and two recently proposed methods for the prediction of conformational changes and unstructured regions were compared with the proposed method. The FlexRP method, which applies Logistic Regression and collocation-based representation with 95 features, obtained 79.5% accuracy. The two runner-up methods, which apply the same sequence representation and Support Vector Machines (SVM) and Naive Bayes classifiers, obtained 79.2% and 78.4% accuracy, respectively. The remaining considered methods are characterized by accuracies below 70%. 
Finally, the Naive Bayes method is shown to provide the highest sensitivity for the prediction of flexible regions, while FlexRP and SVM give the highest sensitivity for rigid regions. A new sequence representation that uses k-spaced amino acid pairs is shown to be the most efficient in the prediction of the flexible/rigid regions of protein sequences. The proposed FlexRP method provides the highest prediction accuracy of about 80%. The experimental tests show that the FlexRP and SVM methods achieved high overall accuracy and the highest sensitivity for rigid regions, while the best quality of the predictions for flexible regions is achieved by the Naive Bayes method.
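The k-spaced amino acid pair representation mentioned above can be sketched as follows. This is a minimal illustration, not the FlexRP implementation: it assumes that a k-spaced pair means two residues separated by exactly k other residues and that frequencies are normalized by the number of such pairs in the sequence. The function name and the toy sequence are hypothetical.

```python
from collections import Counter


def k_spaced_pair_frequencies(sequence, k):
    """Frequencies of residue pairs separated by exactly k other residues."""
    gap = k + 1  # distance between the two positions of a pair
    pairs = [sequence[i] + sequence[i + gap] for i in range(len(sequence) - gap)]
    counts = Counter(pairs)
    total = len(pairs)
    # Normalize counts so the frequencies of all observed pairs sum to 1.
    return {pair: n / total for pair, n in counts.items()} if total else {}


# Toy sequence, chosen only to make the counts easy to check by hand.
freqs = k_spaced_pair_frequencies("ACDACD", 1)
```

For the toy sequence with k = 1 the pairs are AD, CA, DC and AD, so AD has frequency 0.5 and the other two 0.25 each. A full feature vector would concatenate such frequencies over all 400 residue pairs and several values of k before feature selection.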

100 citations


Journal ArticleDOI
TL;DR: The alignments produced by different methods tend to agree to a considerable extent, but the agreement is lower for the more challenging pairs, and the results for the comparison to reference alignments are encouraging, but indicate that there is still room for improvement.
Abstract: Background: Several methods are currently available for the comparison of protein structures. These methods have been analysed regarding their performance in the identification of structurally/evolutionarily related proteins, but so far there has been less focus on the objective comparison between the alignments produced by different methods.

94 citations


Journal ArticleDOI
TL;DR: A comprehensive map of the differences in mutation frequencies, location and contact energies, and the changes in residue volume and charge is obtained – both in the mutated (original) amino acids and in the mutant amino acids in the different secondary structure types.
Abstract: Most genetic disorders are linked to missense mutations as even minor changes in the size or properties of an amino acid can alter or prevent the function of the protein. Further, the effect of a mutation is also dependent on the sequence and structure context of the alteration. We investigated the spectrum of disease-causing missense mutations in secondary structure elements in proteins with numerous known mutations and for which an experimentally defined three-dimensional structure is available. We obtained a comprehensive map of the differences in mutation frequencies, location and contact energies, and the changes in residue volume and charge – both in the mutated (original) amino acids and in the mutant amino acids in the different secondary structure types. We collected information for 44 different proteins involved in a large number of diseases. The studied proteins contained a total of 2413 mutations of which 1935 (80%) appeared in secondary structures. Differences in mutation patterns between secondary structures and whole proteins were generally not statistically significant whereas within the secondary structural elements numerous highly significant features were observed. Numerous trends in mutated and mutant amino acids are apparent. Among the original residues, arginine clearly has the highest relative mutability. The overall relative mutability among mutant residues is highest for cysteine and tryptophan. The mutability values are higher for mutated residues than for mutant residues. Arginine and glycine are among the most mutated residues in all secondary structures whereas the other amino acids have large variations in mutability between structure types. Statistical analysis was used to reveal trends in different secondary structural elements, residue types as well as for the charge and volume changes.

Journal ArticleDOI
TL;DR: Overall, it is suggested and discussed that the assembly of protein-protein complexes is enabled and probably promoted by protein disorder.
Abstract: Background: The idea that the assembly of protein complexes is linked with protein disorder has been inferred from only a few large complexes, such as the viral capsid or the bacterial flagellar system. The relationship, which suggests that larger complexes have more disorder, has never been systematically tested. The recent high-throughput analyses of protein-protein interactions and protein complexes in the cell generated data that make it possible to address this issue by bioinformatic means.

Journal ArticleDOI
TL;DR: The results suggest that intermolecular hydrophobic interactions are essential for the hyperthermostability of EstE1 and will provide a guideline for the rational design of a thermostable esterase/lipase using the lipolytic enzymes showing structural similarity to EstE1.
Abstract: EstE1 is a hyperthermophilic esterase belonging to the hormone-sensitive lipase family and was originally isolated by functional screening of a metagenomic library constructed from a thermal environmental sample. Dimers and oligomers may have been evolutionarily selected in thermophiles because intersubunit interactions can confer thermostability on the proteins. The molecular mechanisms of thermostabilization of this extremely thermostable esterase are not well understood due to the lack of structural information. Here we report for the first time the 2.1-Å resolution crystal structure of EstE1. The three-dimensional structure of EstE1 exhibits a classic α/β hydrolase fold with a central parallel-stranded beta sheet surrounded by alpha helices on both sides. The residues Ser154, Asp251, and His281 form the catalytic triad motif commonly found in other α/β hydrolases. EstE1 exists as a dimer that is formed by hydrophobic interactions and salt bridges. Circular dichroism spectroscopy and heat inactivation kinetic analysis of EstE1 mutants, which were generated by structure-based site-directed mutagenesis of amino acid residues participating in EstE1 dimerization, revealed that hydrophobic interactions through Val274 and Phe276 on the β8 strand of each monomer play a major role in the dimerization of EstE1. In contrast, the intermolecular salt bridges contribute less significantly to the dimerization and thermostability of EstE1. Our results suggest that intermolecular hydrophobic interactions are essential for the hyperthermostability of EstE1. The molecular mechanism that allows EstE1 to endure high temperature will provide a guideline for the rational design of a thermostable esterase/lipase using the lipolytic enzymes showing structural similarity to EstE1.

Journal ArticleDOI
TL;DR: The presented data suggest that the mollusc chitin synthase fulfils an important enzymatic role in the coordinated formation of larval bivalve shells with the potential to contribute via signal transduction pathways to the implementation of hierarchical patterns into chitin mineral-composites such as prismatic, nacre, and crossed-lamellar shell types.
Abstract: Chitin self-assembly provides a dynamic extracellular biomineralization interface. The insoluble matrix of larval shells of the marine bivalve mollusc Mytilus galloprovincialis consists of chitinous material that is distributed and structured in relation to characteristic shell features. Mollusc shell chitin is synthesized via a complex transmembrane chitin synthase with an intracellular myosin motor domain. Enzymatic mollusc chitin synthesis was investigated in vivo by using the small-molecule drug NikkomycinZ, a structural analogue to the sugar donor substrate UDP-N-acetyl-D-glucosamine (UDP-GlcNAc). The impact on mollusc shell formation was analyzed by binocular microscopy, polarized light video microscopy in vivo, and scanning electron microscopy data obtained from shell material formed in the presence of NikkomycinZ. The partial inhibition of chitin synthesis in vivo during larval development by NikkomycinZ (5 μM – 10 μM) dramatically alters the structure and thus the functionality of the larval shell at various growth fronts, such as the bivalve hinge and the shell's edges. Provided that NikkomycinZ mainly affects chitin synthesis in molluscs, the presented data suggest that the mollusc chitin synthase fulfils an important enzymatic role in the coordinated formation of larval bivalve shells. It can be speculated that chitin synthesis bears the potential to contribute via signal transduction pathways to the implementation of hierarchical patterns into chitin mineral-composites such as prismatic, nacre, and crossed-lamellar shell types.

Journal ArticleDOI
TL;DR: Evaluation of protein structures for which there are X-ray and NMR models shows that the same disulfide bond can exist in different configurations in different models, and it is an open question which form of the disulfides is the functional configuration.
Abstract: Allosteric disulfide bonds regulate protein function when they break and/or form. They typically have a -RHStaple configuration, which is defined by the sign of the five chi angles that make up the disulfide bond. All disulfides in NMR and X-ray protein structures as well as in refined structure datasets were compared and contrasted for configuration and strain energy. The mean dihedral strain energy of 55,005 NMR structure disulfides was twice that of 42,690 X-ray structure disulfides. Moreover, the energies of all twenty types of disulfide bond were higher in NMR structures than in X-ray structures, where there was an exponential decrease in the mean strain energy as the incidence of the disulfide type increased. Evaluation of protein structures for which there are X-ray and NMR models shows that the same disulfide bond can exist in different configurations in different models. A disulfide bond configuration that is rare in X-ray structures is the -LHStaple. In NMR structures, this disulfide is characterised by a particularly high potential energy and very short α-carbon distance. The HIV envelope glycoprotein gp120, for example, is regulated by thiol/disulfide exchange and contains allosteric -RHStaple bonds that can exist in the -LHStaple configuration. It is an open question which form of the disulfide is the functional configuration.
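A short sketch may help show how a classification by the sign of five chi angles can yield twenty types. It rests on one assumption that the abstract does not state: that the two half-cystines are interchangeable, so a sign pattern and its reverse describe the same configuration. The function name `sign_pattern` and the angle order (chi1, chi2, chi3, chi2', chi1') are illustrative choices, and no named configuration such as -RHStaple is assigned here.

```python
from itertools import product


def sign_pattern(chi_angles):
    """Map five dihedral angles (degrees) to a canonical tuple of '+'/'-' signs."""
    if len(chi_angles) != 5:
        raise ValueError("expected five chi angles")
    signs = tuple("+" if angle >= 0 else "-" for angle in chi_angles)
    # Reading the bond from the other cysteine reverses the order, so the
    # pattern and its reverse are treated as one configuration (assumption).
    return min(signs, signs[::-1])


# Enumerate every combination of signs and count the distinct patterns.
distinct = {sign_pattern(angles) for angles in product((-60.0, 60.0), repeat=5)}
n_types = len(distinct)
```

Of the 32 possible sign combinations, 8 read the same in both directions and the remaining 24 pair up, which gives 8 + 12 = 20 distinct patterns under this assumption.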

Journal ArticleDOI
TL;DR: It is hypothesised that charged group propensity is important in the context of protein solubility and the prevention of aggregation in order to facilitate the reduction of folding fluctuations in proteins of the higher growth temperature organisms.
Abstract: The database of protein structures contains representatives from organisms with a range of growth temperatures. Various properties have been studied in a search for the molecular basis of protein adaptation to higher growth temperature. Charged groups have emerged as key distinguishing factors for proteins from thermophiles and mesophiles. A dataset of 291 thermophile-derived protein structures is compared with mesophile proteins. Calculations of electrostatic interactions support the importance of charges, but indicate that increases in charge contribution to folded state stabilisation do not generally correlate with the numbers of charged groups. Relative propensities of charged groups vary, such as the substitution of glutamic for aspartic acid sidechains. Calculations suggest an energetic basis, with less dehydration for longer sidechains. Most other properties studied show weak or insignificant separation of proteins from moderate thermophiles or hyperthermophiles and mesophiles, including an estimate of the difference in sidechain rotameric entropy upon protein folding. An exception is increased burial of alanine and proline residues and decreased burial of phenylalanine, methionine, tyrosine and tryptophan in hyperthermophile proteins compared to those from mesophiles. Since an increase in the number of charged groups for hyperthermophile proteins is separable from charged group contribution to folded state stability, we hypothesise that charged group propensity is important in the context of protein solubility and the prevention of aggregation. Accordingly we find some separation between mesophile and hyperthermophile proteins when looking at the largest surface patch that does not contain a charged sidechain. 
With regard to our observation that aromatic sidechains are less buried in hyperthermophile proteins, further analysis indicates that the placement of some of these groups may facilitate the reduction of folding fluctuations in proteins of the higher growth temperature organisms.

Journal ArticleDOI
TL;DR: This is the first structure of a Rieske oxygenase that oxidizes substrates with five aromatic rings to be reported; its ferredoxin partner BPDO-FB1 has large sequence identity to other bacterial Rieske ferredoxins whose structures are known and demonstrates a high structural homology.
Abstract: The initial step involved in oxidative hydroxylation of monoaromatic and polyaromatic compounds by the microorganism Sphingobium yanoikuyae strain B1 (B1), previously known as Sphingomonas yanoikuyae strain B1 and Beijerinckia sp. strain B1, is performed by a set of multiple terminal Rieske non-heme iron oxygenases. These enzymes share a single electron donor system consisting of a reductase and a ferredoxin (BPDO-FB1). One of the terminal Rieske oxygenases, biphenyl 2,3-dioxygenase (BPDO-OB1), is responsible for B1's ability to dihydroxylate large aromatic compounds, such as chrysene and benzo[a]pyrene. In this study, crystal structures of BPDO-OB1 in both native and biphenyl bound forms are described. Sequence and structural comparisons to other Rieske oxygenases show this enzyme to be most similar, with 43.5 % sequence identity, to naphthalene dioxygenase from Pseudomonas sp. strain NCIB 9816-4. While structurally similar to naphthalene 1,2-dioxygenase, the active site entrance is significantly larger than the entrance for naphthalene 1,2-dioxygenase. Differences in active site residues also allow the binding of large aromatic substrates. There are no major structural changes observed upon binding of the substrate. BPDO-FB1 has large sequence identity to other bacterial Rieske ferredoxins whose structures are known and demonstrates a high structural homology; however, differences in side chain composition and conformation around the Rieske cluster binding site are noted. This is the first structure of a Rieske oxygenase that oxidizes substrates with five aromatic rings to be reported. This ability to catalyze the oxidation of larger substrates is a result of both a larger entrance to the active site as well as the ability of the active site to accommodate larger substrates. While the biphenyl ferredoxin is structurally similar to other Rieske ferredoxins, there are distinct changes in the amino acids near the iron-sulfur cluster. 
Because this ferredoxin is used by multiple oxygenases present in the B1 organism, this ferredoxin-oxygenase system provides the structural platform to dissect the balance between promiscuity and selectivity in protein-protein electron transport systems.

Journal ArticleDOI
TL;DR: The concerted recruitment by TakP of the substrate group with a cation could represent a first step in the coupled transport of both partners, providing the driving force for solute import.
Abstract: The import of solutes into the bacterial cytoplasm involves several types of membrane transporters, which may be driven by ATP hydrolysis (ABC transporters) or by an ion or H+ electrochemical membrane potential, as in the tripartite ATP-independent periplasmic system (TRAP). In both the ABC and TRAP systems, a specific periplasmic protein from the ESR family (Extracytoplasmic Solute Receptors) is often involved in the recruitment of the solute and its presentation to the membrane complex. In Rhodobacter sphaeroides, TakP (previously named SmoM) is an ESR from a TRAP transporter and binds α-keto acids in vitro. We describe the high-resolution crystal structures of TakP in its unliganded form and as a complex with sodium pyruvate. The results show a limited "Venus flytrap" conformational change induced by substrate binding. In the liganded structure, a cation (most probably a sodium ion) is present and plays a key role in the association of the pyruvate with the protein. The structure of the binding pocket gives a rationale for the relative affinities of various ligands that were tested in a fluorescence assay. The protein appears to be dimeric in solution and in the crystals, with a helix-swapping structure contributing largely to dimer formation. A 30 Å-long water channel buried at the dimer interface connects the two ligand binding cavities of the dimer. The concerted recruitment by TakP of the substrate group with a cation could represent a first step in the coupled transport of both partners, providing the driving force for solute import. Furthermore, the unexpected dimeric structure of TakP suggests a molecular mechanism of solute uptake by the dimeric ESR via a channel that connects the binding sites of the two monomers.

Journal ArticleDOI
TL;DR: The crystal structure of a lectin isolated from Canavalia gladiata seeds is reported, describing a new binding pocket, which may be related to pathogen resistance activity in ConA-like lectins; a site where a non-protein amino-acid, α-aminobutyric acid (Abu), is bound.
Abstract: Lectins are mainly described as simple carbohydrate-binding proteins. Previous studies have tried to identify other binding sites, which possibly recognize plant hormones, secondary metabolites, and isolated amino acid residues. We report the crystal structure of a lectin isolated from Canavalia gladiata seeds (CGL), describing a new binding pocket, which may be related to pathogen resistance activity in ConA-like lectins; a site where a non-protein amino acid, α-aminobutyric acid (Abu), is bound. The overall structures of native CGL and of CGL complexed with α-methyl-mannoside and Abu have been refined at 2.3 Å and 2.31 Å resolution, respectively. Analysis of the electron density maps of the CGL structure clearly shows the presence of Abu, which was confirmed by mass spectrometry. The presence of Abu in a plant lectin structure strongly indicates the ability of lectins to carry secondary metabolites. Comparison of the amino acids composing the site with other legume lectins revealed that this site is conserved, providing evidence of its biological relevance. This new action of lectins strengthens their role in defense mechanisms in plants.

Journal ArticleDOI
TL;DR: The relevant role of hydrophobic interactions and entropy in driving protein-DNA association is indicated by analyses of interaction character showing that, together, the favorable polar and unfavorable polar/hydrophobic-polar interactions mostly cancel.
Abstract: To understand the energetics of the interaction between protein and DNA we analyzed 39 crystallographically characterized complexes with the HINT (Hydropathic INTeractions) computational model. HINT is an empirical free energy force field based on solvent partitioning of small molecules between water and 1-octanol. Our previous studies on protein-ligand complexes demonstrated that free energy predictions were significantly improved by taking into account the energetic contribution of water molecules that form at least one hydrogen bond with each interacting species. An initial correlation between the calculated HINT scores and the experimentally determined binding free energies in the protein-DNA system exhibited a relatively poor r² of 0.21 and standard error of ± 1.71 kcal mol⁻¹. However, the inclusion of 261 waters that bridge protein and DNA improved the HINT score-free energy correlation to an r² of 0.56 and standard error of ± 1.28 kcal mol⁻¹. Analysis of the water role and energy contributions indicates that 46% of the bridging waters act as linkers between amino acids and nucleotide bases at the protein-DNA interface, while the remaining 54% are largely involved in screening unfavorable electrostatic contacts. This study quantifies the key energetic role of bridging waters in protein-DNA associations. In addition, the relevant role of hydrophobic interactions and entropy in driving protein-DNA association is indicated by analyses of interaction character showing that, together, the favorable polar and unfavorable polar/hydrophobic-polar interactions (i.e., desolvation) mostly cancel.

Journal ArticleDOI
TL;DR: Detailed analysis of below-threshold Meta-BASIC hits may push the limits of distant homology detection further into the 'midnight zone' of homology.
Abstract: PD-(D/E)XK nucleases constitute a large and highly diverse superfamily of enzymes that display little sequence similarity despite retaining a common core fold and a few critical active site residues. This makes identification of new PD-(D/E)XK nuclease families a challenging task, as they usually escape detection with standard sequence-based methods. We developed a modified transitive meta-profile search approach and, to consider the structural diversity of the PD-(D/E)XK nuclease fold more thoroughly, also analyzed below-threshold Meta-BASIC hits to select potentially correct predictions placed among unreliable or incorrect ones.

Journal ArticleDOI
TL;DR: Brd2 BD2 is monomeric in solution and dynamically interacts with H4-AcK12; it is suggested that Brd2 BD1 and BD2 may possess distinctive roles and cooperate to regulate Brd2 functions.
Abstract: Brd2 is a transcriptional regulator and belongs to the BET family, a less characterized novel class of bromodomain-containing proteins. Brd2 contains two tandem bromodomains (BD1 and BD2, 46% sequence identity) in the N-terminus and a conserved motif named the ET (extra C-terminal) domain at the C-terminus that is also present in some other bromodomain proteins. The two bromodomains have been shown to bind the acetylated histone H4 and to be responsible for mitotic retention on chromosomes, which is probably a distinctive feature of BET family proteins. Although the crystal structure of Brd2 BD1 has been reported, no structural features have been characterized for Brd2 BD2 and its interaction with acetylated histones. Here we report the solution structure of human Brd2 BD2 determined by NMR. Although the overall fold resembles the bromodomains from other proteins, significant differences can be found in loop regions, especially in the ZA loop, in which a two-amino-acid insertion is involved in an uncommon π-helix, termed πD. The helix πD forms a portion of the acetyl-lysine binding site, which could be a structural characteristic of Brd2 BD2 and other BET bromodomains. Unlike Brd2 BD1, BD2 is monomeric in solution. With NMR perturbation studies, we have mapped the H4-AcK12 peptide binding interface on Brd2 BD2 and shown that the binding is of low affinity (2.9 mM) and in fast exchange. Using NMR and mutational analysis, we identified several residues important for the Brd2 BD2-H4-AcK12 peptide interaction and probed the potential mechanism for the specific recognition of acetylated histone codes by Brd2 BD2. Brd2 BD2 is monomeric in solution and dynamically interacts with H4-AcK12. The additional secondary elements in the long ZA loop may be a common characteristic of BET bromodomains. Surrounding the ligand-binding cavity, five aspartate residues form a negatively charged collar that serves as a secondary binding site for H4-AcK12.
We suggest that Brd2 BD1 and BD2 may possess distinctive roles and cooperate to regulate Brd2 functions. The structural basis of Brd2 BD2 will help to further characterize the functions of Brd2 and other BET family members.

Journal ArticleDOI
TL;DR: The helix-strand-helix motif common to these three folds provides support for the theory of an 'ancient peptide world' by demonstrating how an ancestral fragment can give rise to 3 different folds.
Abstract: Histones organize the genomic DNA of eukaryotes into chromatin. The four core histone subunits consist of two consecutive helix-strand-helix motifs and are interleaved into heterodimers with a unique fold. We have searched for the evolutionary origin of this fold using sequence and structure comparisons, based on the hypothesis that folded proteins evolved by combination of an ancestral set of peptides, the antecedent domain segments. Our results suggest that an antecedent domain segment, corresponding to one helix-strand-helix motif, gave rise divergently to the N-terminal substrate recognition domain of Clp/Hsp100 proteins and to the helical part of the extended ATPase domain found in AAA+ proteins. The histone fold arose subsequently from the latter through a 3D domain-swapping event. To our knowledge, this is the first example of a genetically fixed 3D domain swap that led to the emergence of a protein family with novel properties, establishing domain swapping as a mechanism for protein evolution. The helix-strand-helix motif common to these three folds provides support for our theory of an 'ancient peptide world' by demonstrating how an ancestral fragment can give rise to 3 different folds.

Journal ArticleDOI
TL;DR: The structural preferences of hydrophobic cluster species, which are frequently encountered in globular domains of proteins, are reported here to help the analysis of HCA plots and provide an original fundamental insight into the structural bricks of protein folds.
Abstract: Hydrophobic Cluster Analysis (HCA) is an efficient way to compare highly divergent sequences through the implicit secondary structure information directly derived from hydrophobic clusters. However, its efficiency and application are currently limited by the need for user expertise. In order to help the analysis of HCA plots, we report here the structural preferences of hydrophobic cluster species, which are frequently encountered in globular domains of proteins. These species are characterized only by their hydrophobic/non-hydrophobic dichotomy. This analysis has been extended to loop-forming clusters, using an appropriate loop alphabet.

Journal ArticleDOI
TL;DR: Five atom classification systems are selected and compared for their efficiency in developing amino acid atom potentials, and torsion angle potentials are optimized to include the orientation of amino acids so that altered backbone conformation in different secondary structural regions can be included in the prediction model.
Abstract: Understanding and predicting protein stability upon point mutations has widespread importance in molecular biology. Several prediction models have been developed in the past with various algorithms. Statistical potentials are among the most widely used algorithms for predicting changes in stability upon point mutations. Although these methods provide flexibility and the capability to develop an accurate and reliable prediction model, this can be achieved only by the right selection of the structural factors and optimization of their parameters for the statistical potentials. In this work, we have selected five atom classification systems and compared their efficiency for the development of amino acid atom potentials. Additionally, torsion angle potentials have been optimized to include the orientation of amino acids in such a way that altered backbone conformation in different secondary structural regions can be included in the prediction model. This study also elaborates on the importance of classifying the mutations according to their solvent accessibility and secondary structure specificity. The prediction efficiency has been calculated individually for the mutations in different secondary structural regions and compared. Results show that, in addition to using an advanced atom description, stepwise regression and selection of atoms are necessary to avoid redundancy in the atom distribution and improve the reliability of the prediction model validation. Compared to the other atom classification models, the Melo-Feytmans model shows better prediction efficiency, giving a high correlation of 0.85 between experimental and theoretical ΔΔG with 84.06% of the mutations correctly predicted out of 1538 mutations. The theoretical ΔΔG values for the mutations in partially buried β-strands generated by the structural training dataset from PISCES gave a correlation of 0.84 without performing the Gaussian apodization of the torsion angle distribution.
After the Gaussian apodization, the correlation increased to 0.92 and the prediction accuracy increased from 80% to 88.89%. These findings were useful for the optimization of the Melo-Feytmans atom classification system and for implementing it to develop the statistical potentials. It was also significant that the prediction efficiency for mutations in the partially buried β-strands improves with the help of Gaussian apodization of the torsion angle distribution. All these comparisons and optimization techniques demonstrate their advantages as well as their restrictions for the development of the prediction model. These findings will be helpful not only for protein stability prediction, but also for various structure solutions in the future.

Journal ArticleDOI
TL;DR: These test computations show that a large scale high resolution protein structure prediction is possible, not only for small but also for large protein domains, and that it should be based on a hierarchical approach to the modeling protocol.
Abstract: Although experimental methods for determining protein structure are providing high resolution structures, they cannot keep pace with the rate at which amino acid sequences are resolved on the scale of entire genomes. For a considerable fraction of proteins whose structures will not be determined experimentally, computational methods can provide valuable information. The value of structural models in biological research depends critically on their quality. Development of high-accuracy computational methods that reliably generate near-experimental quality structural models is an important, unsolved problem in protein structure modeling. Large sets of structural decoys have been generated using the reduced conformational space protein modeling tool CABS. Subsequently, the reduced models were subjected to all-atom reconstruction. Then, the resulting detailed models were energy-minimized using a state-of-the-art all-atom force field, assuming fixed positions of the alpha carbons. It has been shown that a very short minimization leads to the proper ranking of the quality of the models (distance from the native structure), when the all-atom energy is used as the ranking criterion. Additionally, we performed tests on medium and low accuracy decoys built via classical methods of comparative modeling. The tests placed our model evaluation procedure among the state-of-the-art protein model assessment methods. These test computations show that large-scale, high-resolution protein structure prediction is possible, not only for small but also for large protein domains, and that it should be based on a hierarchical approach to the modeling protocol. We employed Molecular Mechanics with fixed alpha carbons to rank-order the all-atom models built on the scaffolds of the reduced models. Our tests show that a physics-based approach, usually considered computationally too demanding for large-scale applications, can be effectively used in such studies.

Journal ArticleDOI
TL;DR: The crystal structure of E. coli TdcF is determined and it is shown that TdcF is capable of binding several low molecular weight metabolites bearing a carboxylate group, although the interaction with 2-ketobutyrate appears to be the most well defined.
Abstract: The YjgF/YER057c/UK114 family of proteins is widespread in nature, but has as yet no clearly defined biological role. Members of the family exist as homotrimers and are characterised by intersubunit clefts that are delineated by well-conserved residues; these sites are likely to be of functional significance, yet catalytic activity has never been detected for any member of this family. The gene encoding the TdcF protein of E. coli, a YjgF/YER057c/UK114 family member, resides in an operon that strongly suggests a role in the metabolism of 2-ketobutyrate for this protein. We have determined the crystal structure of E. coli TdcF by molecular replacement to a maximum resolution of 1.6 Å. Structures are also presented of TdcF complexed with a variety of ligands. The TdcF structure closely resembles those of all YjgF/YER057c/UK114 family members determined thus far. It has the trimeric quaternary structure and intersubunit cavities characteristic of this family of proteins. We show that TdcF is capable of binding several low molecular weight metabolites bearing a carboxylate group, although the interaction with 2-ketobutyrate appears to be the most well defined. These observations may be indicative of a role for TdcF in sensing this potentially toxic metabolite.

Journal ArticleDOI
TL;DR: The detailed statistical analysis of diverse proteins links protein evolution to the biophysics of protein thermodynamic stability and folding and the basic structural features of conserved sequence regions are identified.
Abstract: Conserved protein sequence regions are extremely useful for identifying and studying functionally and structurally important regions. By means of an integrated analysis of large-scale protein structure and sequence data, structural features of conserved protein sequence regions were identified. Helices and turns were found to be underrepresented in conserved regions, while strands were found to be overrepresented. Similar numbers of loops were found in conserved and random regions. These results can be understood in light of the structural constraints on different secondary structure elements, and their role in protein structural stabilization and topology. Strands can tolerate fewer sequence changes and nonetheless keep their specific shape and function. They thus tend to be more conserved than helices, which can keep their shape and function with more changes. Loop behavior can be explained by the presence of both constrained and freely changing loops in proteins. Our detailed statistical analysis of diverse proteins links protein evolution to the biophysics of protein thermodynamic stability and folding. The basic structural features of conserved sequence regions are also important determinants of protein structure motifs and their function.

Journal ArticleDOI
TL;DR: This finding opens a gate towards protein engineering and subsequent protein design to refine the desired binding properties and preferences of lectins derived from PA-IIL, an approach that could have strong potential for drug design.
Abstract: Lectins are proteins of non-immune origin capable of binding saccharide structures with high specificity and affinity. Considering the high encoding capacity of oligosaccharides, this makes lectins important for adhesion and recognition. The present study is devoted to the PA-IIL lectin from Pseudomonas aeruginosa, an opportunistic human pathogen capable of causing lethal complications in cystic fibrosis patients. The lectin may play an important role in the process of virulence, recognizing specific saccharide structures and subsequently allowing the bacteria to adhere to the host cells. It displays high affinity towards monosaccharides, especially fucose, a feature caused by an unusual binding mode in which two calcium ions participate in the interaction with the saccharide. Investigating and understanding the nature of lectin-saccharide interactions holds great potential for use in the field of drug design, namely the targeting and delivery of active compounds to the proper site of action. In vitro site-directed mutagenesis of the PA-IIL lectin yielded three single point mutants that were investigated both structurally (by X-ray crystallography) and functionally (by isothermal titration calorimetry). The mutated amino acids (22–23–24 triad) belong to the so-called specificity binding loop responsible for the monosaccharide specificity of the lectin. The mutation of the amino acids resulted in changes to the thermodynamic behaviour of the mutants and subsequently in their relative preference towards monosaccharides. Correlation of the measured data with X-ray structures provided the molecular basis for rationalizing the affinity changes. The mutations either prevent certain interactions from forming or allow the formation of new interactions; both effects strongly influence the saccharide preferences.
Mutagenesis of amino acids forming the specificity binding loop allowed identification of one amino acid that is crucial for defining the lectin's sugar preference. Altering specificity loop amino acids causes changes in the saccharide-binding preferences of lectins derived from PA-IIL, by creating or blocking possible binding interactions. This finding opens a gate towards protein engineering and subsequent protein design to refine the desired binding properties and preferences, an approach that could have strong potential for drug design.

Journal ArticleDOI
TL;DR: The data allow the authors to propose a model of the Grb7 SH2 domain/G7-18NATE interaction and to rationalize the basis for the observed binding specificity and affinity; the study is expected to assist with the development of second generation Grb7 SH2 domain inhibitors.
Abstract: Human growth factor receptor bound protein 7 (Grb7) is an adapter protein that mediates the coupling of tyrosine kinases with their downstream signaling pathways. Grb7 is frequently overexpressed in invasive and metastatic human cancers and is implicated in cancer progression via its interaction with the ErbB2 receptor and focal adhesion kinase (FAK), which play critical roles in cell proliferation and migration. It is thus a prime target for the development of novel anti-cancer therapies. Recently, an inhibitory peptide (G7-18NATE) has been developed which binds specifically to the Grb7 SH2 domain and is able to attenuate cancer cell proliferation and migration in various cancer cell lines. As a first step towards understanding how Grb7 may be inhibited by G7-18NATE, we solved the crystal structure of the Grb7 SH2 domain to 2.1 Å resolution. We describe the details of the peptide binding site underlying target specificity, as well as the dimer interface of Grb7 SH2. The affinity of Grb7 dimer formation was determined to be in the μM range using analytical ultracentrifugation for both full-length Grb7 and the SH2 domain alone, suggesting the SH2 domain forms the basis of a physiological dimer. ITC measurements of the interaction of the G7-18NATE peptide with the Grb7 SH2 domain revealed that it binds with a binding affinity of Kd = ~35.7 μM, and NMR spectroscopy titration experiments revealed that peptide binding causes perturbations to both the ligand binding surface of the Grb7 SH2 domain and the dimer interface, suggesting that dimerisation of Grb7 is affected by peptide binding. Together the data allow us to propose a model of the Grb7 SH2 domain/G7-18NATE interaction and to rationalize the basis for the observed binding specificity and affinity. We propose that the current study will assist with the development of second generation Grb7 SH2 domain inhibitors, potentially leading to novel inhibitors of cancer cell migration and invasion.
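For readers who want to relate the reported dissociation constant to a binding free energy, the standard relation ΔG° = RT ln(Kd) can be sketched as below. This is an illustrative calculation, not part of the cited study: the function name `binding_free_energy`, the 1 M standard state, and the temperature of 298.15 K are assumptions, since the abstract does not state the ITC temperature.

```python
import math

# Gas constant in kcal mol^-1 K^-1.
R_KCAL = 1.987204e-3


def binding_free_energy(kd_molar, temperature_k=298.15):
    """Standard binding free energy in kcal/mol from a dissociation constant in M.

    Uses dG = R * T * ln(Kd / 1 M). The default temperature is an assumption,
    not a value taken from the abstract.
    """
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL * temperature_k * math.log(kd_molar)


# Kd of about 35.7 uM, as reported for G7-18NATE binding to the Grb7 SH2 domain.
dg = binding_free_energy(35.7e-6)
print(f"{dg:.2f} kcal/mol")
```

Under these assumptions a Kd in the tens of micromolar corresponds to a binding free energy of roughly -6 kcal/mol; a different measurement temperature would shift the value slightly.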