Journal ArticleDOI

ProSA-web: interactive web service for the recognition of errors in three-dimensional structures of proteins.

01 Jul 2007-Nucleic Acids Research (Oxford University Press)-Vol. 35, pp 407-410
TL;DR: The quality scores of a protein are displayed in the context of all known protein structures and problematic parts of a structure are shown and highlighted in a 3D molecule viewer in the ProSA-web service.
Abstract: A major problem in structural biology is the recognition of errors in experimental and theoretical models of protein structures. The ProSA program (Protein Structure Analysis) is an established tool which has a large user base and is frequently employed in the refinement and validation of experimental protein structures and in structure prediction and modeling. The analysis of protein structures is generally a difficult and cumbersome exercise. The new service presented here is a straightforward and easy to use extension of the classic ProSA program which exploits the advantages of interactive web-based applications for the display of scores and energy plots that highlight potential problems spotted in protein structures. In particular, the quality scores of a protein are displayed in the context of all known protein structures and problematic parts of a structure are shown and highlighted in a 3D molecule viewer. The service specifically addresses the needs encountered in the validation of protein structures obtained from X-ray analysis, NMR spectroscopy and theoretical calculations. ProSA-web is accessible at https://prosa.services.came.sbg.ac.at.
Citations
Journal ArticleDOI
TL;DR: This unit describes how to calculate comparative models using the program MODELLER and discusses all four steps of comparative modeling, frequently observed errors, and some applications.
Abstract: Functional characterization of a protein sequence is a common goal in biology, and is usually facilitated by having an accurate three-dimensional (3-D) structure of the studied protein. In the absence of an experimentally determined structure, comparative or homology modeling can sometimes provide a useful 3-D model for a protein that is related to at least one known protein structure. Comparative modeling predicts the 3-D structure of a given protein sequence (target) based primarily on its alignment to one or more proteins of known structure (templates). The prediction process consists of fold assignment, target-template alignment, model building, and model evaluation. This unit describes how to calculate comparative models using the program MODELLER and discusses all four steps of comparative modeling, frequently observed errors, and some applications. Modeling lactate dehydrogenase from Trichomonas vaginalis (TvLDH) is described as an example. The download and installation of the MODELLER software is also described.

3,495 citations

Journal ArticleDOI
TL;DR: This unit describes how to calculate comparative models using the program MODELLER and discusses all four steps of comparative modeling, frequently observed errors, and some applications.
Abstract: Functional characterization of a protein sequence is one of the most frequent problems in biology. This task is usually facilitated by accurate three-dimensional (3-D) structure of the studied protein. In the absence of an experimentally determined structure, comparative or homology modeling can sometimes provide a useful 3-D model for a protein that is related to at least one known protein structure. Comparative modeling predicts the 3-D structure of a given protein sequence (target) based primarily on its alignment to one or more proteins of known structure (templates). The prediction process consists of fold assignment, target-template alignment, model building, and model evaluation. This unit describes how to calculate comparative models using the program MODELLER and discusses all four steps of comparative modeling, frequently observed errors, and some applications. Modeling lactate dehydrogenase from Trichomonas vaginalis (TvLDH) is described as an example. The download and installation of the MODELLER software is also described.

3,006 citations

Journal ArticleDOI
TL;DR: The ability of the newly introduced QMEAN Z-score to detect experimentally solved protein structures containing significant errors, as well as to evaluate theoretical protein models is demonstrated.
Abstract: Motivation: Quality assessment of protein structures is an important part of experimental structure validation and plays a crucial role in protein structure prediction, where the predicted models may contain substantial errors. Most current scoring functions are primarily designed to rank alternative models of the same sequence supporting model selection, whereas the prediction of the absolute quality of an individual protein model has received little attention in the field. However, reliable absolute quality estimates are crucial to assess the suitability of a model for specific biomedical applications. Results: In this work, we present a new absolute measure for the quality of protein models, which provides an estimate of the ‘degree of nativeness’ of the structural features observed in a model and describes the likelihood that a given model is of comparable quality to experimental structures. Model quality estimates based on the QMEAN scoring function were normalized with respect to the number of interactions. The resulting scoring function is independent of the size of the protein and may therefore be used to assess both monomers and entire oligomeric assemblies. Model quality scores for individual models are then expressed as ‘Z-scores’ in comparison to scores obtained for high-resolution crystal structures. We demonstrate the ability of the newly introduced QMEAN Z-score to detect experimentally solved protein structures containing significant errors, as well as to evaluate theoretical protein models. In a comprehensive QMEAN Z-score analysis of all experimental structures in the PDB, membrane proteins accumulate on one side of the score spectrum and thermostable proteins on the other. Proteins from the thermophilic organism Thermotoga maritima received significantly higher QMEAN Z-scores in a pairwise comparison with their homologous mesophilic counterparts, underlining the significance of the QMEAN Z-score as an estimate of protein stability.
Availability: The Z-score calculation has been integrated in the QMEAN server available at: http://swissmodel.expasy.org/qmean. Contact: torsten.schwede@unibas.ch Supplementary information: Supplementary data are available at Bioinformatics online.

1,844 citations


Cites background from "ProSA-web: interactive web service ..."

  • ...The performance of QMEAN with respect to other state-of-the-art methods such as ProSA (Sippl, 1993) and DFIRE (Zhou and Zhou, 2002) has also been recently assessed in an independent study (Rykunov and Fiser, 2010)....

    [...]

  • ...In contrast to QMEAN, the ProSA Z-score shows a clear correlation with protein size which limits its application as an absolute quality measure....

    [...]

  • ...The prediction of absolute model quality has rarely been addressed in the literature: the pioneering tool ProSA (Sippl, 1993) has primarily been developed to evaluate experimental structures and estimates the statistical significance of a structure by comparing its knowledge-based score to random structures with the same sequence....

    [...]

  • ...The ProSA (Wiederstein and Sippl, 2007) analysis of the two structures can be found in Supplementary Figure S6....

    [...]

  • ...The ProSA Z-score can hardly be used as a measure of absolute model quality as it is highly dependent on the protein size (i.e. the energy gap between the native fold and random decoy structures increases with protein size)....

    [...]
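The excerpts above describe the ProSA Z-score as a comparison between a structure's knowledge-based score and the scores of random (decoy) structures with the same sequence. A minimal sketch of that idea follows. It is a generic z-score, not the ProSA implementation: the function name, the use of the population standard deviation, and the example energies are all assumptions made for illustration.

```python
import statistics


def z_score(model_energy, decoy_energies):
    """Return how many standard deviations model_energy lies from the decoy mean.

    Generic z-score sketch; the exact normalisation used by ProSA is not
    given in the text above, so this is an assumption.
    """
    if len(decoy_energies) < 2:
        raise ValueError("need at least two decoy energies")
    mean = statistics.fmean(decoy_energies)
    spread = statistics.pstdev(decoy_energies)  # population standard deviation
    if spread == 0:
        raise ValueError("decoy energies have zero spread")
    return (model_energy - mean) / spread


# Hypothetical energies in arbitrary units, not taken from any real structure.
decoys = [-10.0, -12.0, -8.0, -11.0, -9.0]
native_like = z_score(-20.0, decoys)   # energy far below the decoy mean
decoy_like = z_score(-10.0, decoys)    # energy equal to the decoy mean
```

With energies where lower is better, a strongly negative value means the model scores far better than the decoys. The excerpts note that the gap between native fold and decoys grows with protein size, which is why such a value is hard to compare across proteins of different lengths.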

Journal ArticleDOI
TL;DR: A new computer program, called SHIFTX2, is described which is capable of rapidly and accurately calculating diamagnetic 1H, 13C and 15N chemical shifts from protein coordinate data and will open the door to many long-anticipated applications of chemical shift prediction to protein structure determination, refinement and validation.
Abstract: A new computer program, called SHIFTX2, is described which is capable of rapidly and accurately calculating diamagnetic 1H, 13C and 15N chemical shifts from protein coordinate data. Compared to its predecessor (SHIFTX) and to other existing protein chemical shift prediction programs, SHIFTX2 is substantially more accurate (up to 26% better by correlation coefficient with an RMS error that is up to 3.3× smaller) than the next best performing program. It also provides significantly more coverage (up to 10% more), is significantly faster (up to 8.5×) and capable of calculating a wider variety of backbone and side chain chemical shifts (up to 6×) than many other shift predictors. In particular, SHIFTX2 is able to attain correlation coefficients between experimentally observed and predicted backbone chemical shifts of 0.9800 (15N), 0.9959 (13Cα), 0.9992 (13Cβ), 0.9676 (13C′), 0.9714 (1HN), 0.9744 (1Hα) and RMS errors of 1.1169, 0.4412, 0.5163, 0.5330, 0.1711, and 0.1231 ppm, respectively. The correlation between SHIFTX2’s predicted and observed side chain chemical shifts is 0.9787 (13C) and 0.9482 (1H) with RMS errors of 0.9754 and 0.1723 ppm, respectively. SHIFTX2 is able to achieve such a high level of accuracy by using a large, high quality database of training proteins (>190), by utilizing advanced machine learning techniques, by incorporating many more features (χ2 and χ3 angles, solvent accessibility, H-bond geometry, pH, temperature), and by combining sequence-based with structure-based chemical shift prediction techniques. With this substantial improvement in accuracy we believe that SHIFTX2 will open the door to many long-anticipated applications of chemical shift prediction to protein structure determination, refinement and validation. SHIFTX2 is available both as a standalone program and as a web server (http://www.shiftx2.ca).

578 citations


Cites methods from "ProSA-web: interactive web service ..."

  • ...This collection of ~250 high resolution X-ray structures was then analyzed for structural defects using a number of structure validation programs including VADAR (Willard et al. 2003), PROSA (Wiederstein and Sippl 2007), and WHAT_CHECK (Hooft et al. 1996)....

    [...]


Journal ArticleDOI
29 Sep 2016-Nature
TL;DR: A global map of abundant, double-stranded DNA viruses, complete with genomic and ecological contexts, is presented, providing a necessary foundation for the meaningful integration of viruses into ecosystem models where they act as key players in nutrient cycling and trophic networks.
Abstract: Ocean microbes drive biogeochemical cycling on a global scale. However, this cycling is constrained by viruses that affect community composition, metabolic activity, and evolutionary trajectories. Owing to challenges with the sampling and cultivation of viruses, genome-level viral diversity remains poorly described and grossly understudied, with less than 1% of observed surface-ocean viruses known. Here we assemble complete genomes and large genomic fragments from both surface- and deep-ocean viruses sampled during the Tara Oceans and Malaspina research expeditions, and analyse the resulting 'global ocean virome' dataset to present a global map of abundant, double-stranded DNA viruses complete with genomic and ecological contexts. A total of 15,222 epipelagic and mesopelagic viral populations were identified, comprising 867 viral clusters (defined as approximately genus-level groups). This roughly triples the number of known ocean viral populations and doubles the number of candidate bacterial and archaeal virus genera, providing a near-complete sampling of epipelagic communities at both the population and viral-cluster level. We found that 38 of the 867 viral clusters were locally or globally abundant, together accounting for nearly half of the viral populations in any global ocean virome sample. While two-thirds of these clusters represent newly described viruses lacking any cultivated representative, most could be computationally linked to dominant, ecologically relevant microbial hosts. Moreover, we identified 243 viral-encoded auxiliary metabolic genes, of which only 95 were previously known. Deeper analyses of four of these auxiliary metabolic genes (dsrC, soxYZ, P-II (also known as glnB) and amoC) revealed that abundant viruses may directly manipulate sulfur and nitrogen cycling throughout the epipelagic ocean. 
This viral catalog and functional analyses provide a necessary foundation for the meaningful integration of viruses into ecosystem models where they act as key players in nutrient cycling and trophic networks.

557 citations

References
Journal ArticleDOI
TL;DR: This paper describes the goals of the PDB, the systems in place for data deposition and access, how to obtain further information, and plans for the future development of the resource.
Abstract: The Protein Data Bank (PDB; http://www.rcsb.org/pdb/ ) is the single worldwide archive of structural data of biological macromolecules. This paper describes the goals of the PDB, the systems in place for data deposition and access, how to obtain further information, and near-term plans for the future development of the resource.

34,239 citations

Journal ArticleDOI
01 Dec 1993-Proteins
TL;DR: Techniques based on knowledge-based mean fields are presented that can be used to judge the quality of protein folds and to identify misfolded structures as well as faulty parts of structural models.
Abstract: A major problem in the determination of the three-dimensional structure of proteins concerns the quality of the structural models obtained from the interpretation of experimental data. New developments in X-ray crystallography and nuclear magnetic resonance spectroscopy have accelerated the process of structure determination and the biological community is confronted with a steadily increasing number of experimentally determined protein folds. However, in the recent past several experimentally determined protein structures have been proven to contain major errors, indicating that in some cases the interpretation of experimental data is difficult and may yield incorrect models. Such problems can be avoided when computational methods are employed which complement experimental structure determinations. A prerequisite of such computational tools is that they are independent of the parameters obtained from a particular experiment. In addition such techniques are able to support and accelerate experimental structure determinations. Here we present techniques based on knowledge based mean fields which can be used to judge the quality of protein folds. The methods can be used to identify misfolded structures as well as faulty parts of structural models. The techniques are even applicable in cases where only the C alpha trace of a protein conformation is available. The capabilities of the technique are demonstrated using correct and incorrect protein folds.

1,980 citations

Journal ArticleDOI
14 Sep 2006-Nature
TL;DR: The observed, outward-facing conformation reflects the ATP-bound state, with the two nucleotide-binding domains in close contact and the two transmembrane domains forming a central cavity—presumably the drug translocation pathway—that is shielded from the inner leaflet of the lipid bilayer and from the cytoplasm, but exposed to the outer leaflet and the extracellular space.
Abstract: Multidrug transporters of the ABC family facilitate the export of diverse cytotoxic drugs across cell membranes. This is clinically relevant, as tumour cells may become resistant to agents used in chemotherapy. To understand the molecular basis of this process, we have determined the 3.0 Å crystal structure of a bacterial ABC transporter (Sav1866) from Staphylococcus aureus. The homodimeric protein consists of 12 transmembrane helices in an arrangement that is consistent with cross-linking studies and electron microscopic imaging of the human multidrug resistance protein MDR1, but critically different from that reported for the bacterial lipid flippase MsbA. The observed, outward-facing conformation reflects the ATP-bound state, with the two nucleotide-binding domains in close contact and the two transmembrane domains forming a central cavity, presumably the drug translocation pathway, that is shielded from the inner leaflet of the lipid bilayer and from the cytoplasm, but exposed to the outer leaflet and the extracellular space. Multidrug efflux transporters cause serious problems in cancer chemotherapy and in the treatment of bacterial infections. A puzzling aspect of their biology is how a single transporter can recognize and transport such a wide variety of structurally dissimilar compounds. The publication of the crystal structures of two quite different multidrug efflux transporters will help to solve the mystery. In the first study, the structure of AcrB, a multidrug efflux transporter from E. coli, was determined. Its three constituent subunits were captured at different steps in the transport cycle: prior to substrate binding, substrate-bound, and post-extrusion. The voluminous multidrug binding pocket handles multiple substrates via multi-site binding. The second study determined the structure of an ATP-driven multidrug transporter from S. aureus.
The clinical relevance of this 'ABC' family of transporters derives from the fact that they catalyse the extrusion of various cytotoxic compounds used in cancer therapy. The structure, with the transporter in the outward-facing conformation, is a useful model of human homologues and may initiate the rational design of drugs aimed at interfering with the extrusion of agents used in chemotherapy.

1,244 citations

Journal ArticleDOI
TL;DR: A prototype of a new approach to the folding problem of polypeptide chains based on the analysis of known protein structures, which derives the energy potentials for the atomic interactions of all amino acid residue pairs as a function of the distance between the involved atoms is presented.

1,086 citations
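The entry above summarises an approach that derives distance-dependent energy potentials for residue pairs from known protein structures. One common way to illustrate this idea is the inverse Boltzmann relation, E(r) = -kT ln(f_pair(r) / f_ref(r)), applied to binned distance frequencies. The sketch below assumes that relation and reports energies in units of kT. The function names, binning scheme, and handling of empty bins are hypothetical, and any corrections for sparse data used in the cited work are not reproduced here.

```python
import math
from collections import Counter


def distance_histogram(distances, bin_width, n_bins):
    """Return relative frequencies of distances in n_bins bins of width bin_width."""
    counts = Counter()
    for d in distances:
        idx = int(d // bin_width)
        if 0 <= idx < n_bins:  # ignore distances outside the binned range
            counts[idx] += 1
    total = sum(counts.values())
    return [counts[i] / total if total else 0.0 for i in range(n_bins)]


def inverse_boltzmann(pair_freq, reference_freq, kt=1.0):
    """Convert pair-specific and reference frequencies into energies per bin.

    Bins with no observations in either distribution get None, because the
    logarithm is undefined there (a simplifying assumption of this sketch).
    """
    potential = []
    for f_pair, f_ref in zip(pair_freq, reference_freq):
        if f_pair > 0 and f_ref > 0:
            potential.append(-kt * math.log(f_pair / f_ref))
        else:
            potential.append(None)
    return potential


# Hypothetical frequencies: the pair is enriched in the first distance bin.
pair = [0.5, 0.25, 0.25, 0.0]
reference = [0.25, 0.25, 0.25, 0.25]
energies = inverse_boltzmann(pair, reference)
```

Bins where the pair is observed more often than the reference come out negative (favourable), and bins that match the reference come out at zero.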

Journal ArticleDOI
07 Sep 2001-Science
TL;DR: The structure of MsbA can serve as a model for the MDR-ABC transporters that confer multidrug resistance to cancer cells and infectious microorganisms.
Abstract: Multidrug resistance (MDR) is a serious medical problem and presents a major challenge to the treatment of disease and the development of novel therapeutics. ABC transporters that are associated with multidrug resistance (MDR-ABC transporters) translocate hydrophobic drugs and lipids from the inner to the outer leaflet of the cell membrane. To better elucidate the structural basis for the “flip-flop” mechanism of substrate movement across the lipid bilayer, we have determined the structure of the lipid flippase MsbA from Escherichia coli by x-ray crystallography to a resolution of 4.5 angstroms. MsbA is organized as a homodimer with each subunit containing six transmembrane α-helices and a nucleotide-binding domain. The asymmetric distribution of charged residues lining a central chamber suggests a general mechanism for the translocation of substrate by MsbA and other MDR-ABC transporters. The structure of MsbA can serve as a model for the MDR-ABC transporters that confer multidrug resistance to cancer cells and infectious microorganisms.

643 citations


"ProSA-web: interactive web service ..." refers background in this paper

  • ...Subfigures (A–C) show the results for a monomer of MsbA (PDB code 1JSQ, chain A (17))....

    [...]