
Showing papers on "Protein Data Bank" published in 2003


Journal ArticleDOI
TL;DR: PISCES is a public server for culling sets of protein sequences from the Protein Data Bank by sequence identity and structural quality criteria and provides better lists than servers that use BLAST, which is unable to identify many relationships below 40% sequence identity.
Abstract: PISCES is a public server for culling sets of protein sequences from the Protein Data Bank (PDB) by sequence identity and structural quality criteria. PISCES can provide lists culled from the entire PDB or from lists of PDB entries or chains provided by the user. The sequence identities are obtained from PSI-BLAST alignments with position-specific substitution matrices derived from the non-redundant protein sequence database. PISCES therefore provides better lists than servers that use BLAST, which is unable to identify many relationships below 40% sequence identity and often overestimates sequence identity by aligning only well-conserved fragments. PDB sequences are updated weekly. PISCES can also cull non-PDB sequences provided by the user as a list of GenBank identifiers, a FASTA format file, or BLAST/PSI-BLAST output.
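The culling idea can be sketched as a greedy filter: visit chains from best to worst structural quality and keep a chain only if its sequence identity to every chain already kept stays at or below a cutoff. This is a minimal sketch under stated assumptions, not the PISCES implementation, which the abstract does not describe. The function name, the use of resolution as the only quality criterion, the 25% cutoff, and the chain identifiers are all hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple


def cull_by_identity(
    chains: List[Tuple[str, float]],
    identity: Dict[FrozenSet[str], float],
    max_identity: float = 25.0,
) -> List[str]:
    """Greedy culling by pairwise sequence identity.

    chains: (chain_id, resolution in angstroms); a lower value means better quality.
    identity: percent identity keyed by unordered chain pair; pairs that are
              missing are treated as 0% (assumed unrelated).
    """
    kept: List[str] = []
    # Visit the best-resolved chains first so they are preferred over homologs.
    for chain_id, _resolution in sorted(chains, key=lambda c: c[1]):
        pairs = (frozenset((chain_id, k)) for k in kept)
        if all(identity.get(pair, 0.0) <= max_identity for pair in pairs):
            kept.append(chain_id)
    return kept


# Hypothetical input: three chains, one pair of close homologs.
chains = [("1ABC_A", 1.8), ("2XYZ_A", 2.5), ("3DEF_B", 1.2)]
identity = {
    frozenset(("1ABC_A", "2XYZ_A")): 90.0,
    frozenset(("1ABC_A", "3DEF_B")): 15.0,
}
print(cull_by_identity(chains, identity))
```

Sorting by quality before filtering means that, of two close homologs, the better-resolved one is the chain that survives.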

1,649 citations


Journal ArticleDOI
TL;DR: A comprehensive software package for the analysis, reconstruction and visualization of three-dimensional nucleic acid structures that can handle antiparallel and parallel double helices, single-stranded structures, triplexes, quadruplexes and other complex tertiary folding motifs found in both DNA and RNA structures is presented.
Abstract: We present a comprehensive software package, 3DNA, for the analysis, reconstruction and visualization of three-dimensional nucleic acid structures. Starting from a coordinate file in Protein Data Bank (PDB) format, 3DNA can handle antiparallel and parallel double helices, single-stranded structures, triplexes, quadruplexes and other complex tertiary folding motifs found in both DNA and RNA structures. The analysis routines identify and categorize all base interactions and classify the double helical character of appropriate base pair steps. The program makes use of a recently recommended reference frame for the description of nucleic acid base pair geometry and a rigorous matrix-based scheme to calculate local conformational parameters and rebuild the structure from these parameters. The rebuilding routines produce rectangular block representations of nucleic acids as well as full atomic models with the sugar-phosphate backbone and publication quality 'standardized' base stacking diagrams. Utilities are provided to locate the base pairs and helical regions in a structure and to reorient structures for effective visualization. Regular helical models based on X-ray diffraction measurements of various repeating sequences can also be generated within the program.

1,598 citations


Book ChapterDOI
TL;DR: This work states that functional characterization of a protein sequence is one of the most frequent problems in biology and comparative or homology modeling can sometimes provide a useful 3D model for a protein (target) that is related to at least one known protein structure (template).
Abstract: Publisher Summary Functional characterization of a protein sequence is one of the most frequent problems in biology. This task is usually facilitated by accurate three-dimensional (3D) structure of the studied protein. In the absence of an experimentally determined structure, comparative or homology modeling can sometimes provide a useful 3D model for a protein (target) that is related to at least one known protein structure (template). A 3D structure of proteins from the same family is more conserved than their primary sequences. Therefore, if similarity between two proteins is detectable at the sequence level, structural similarity can usually be assumed. Comparative modeling usually starts by searching the Protein Data Bank (PDB) of known protein structures using the target sequence as the query. This search is generally done by comparing the target sequence with the sequence of each of the structures in the database. Comparative modeling consists of five steps: (1) search for related protein structures, (2) selection of one or more templates, (3) target–template alignment, (4) model building, and (5) model evaluation. If the model is not satisfactory, some or all of the steps can be repeated. There are several computer programs and Web servers that automate the comparative modeling process. The first Web server for automated comparative modeling was the Swiss-Model server, followed by CPHModels and ModWeb. These servers accept a sequence from a user and return an all-atom comparative model when possible.

1,559 citations


Journal ArticleDOI
TL;DR: The Fortran program ESPript was created in 1993 to display, on a PostScript figure, multiple sequence alignments adorned with secondary structure elements of each sequence of known 3D structure.
Abstract: The Fortran program ESPript was created in 1993 to display, on a PostScript figure, multiple sequence alignments adorned with secondary structure elements. A web server was made available in 1999 and ESPript has been linked to three major web tools: ProDom, which identifies protein domains; PredictProtein, which predicts secondary structure elements; and NPS@, which runs sequence alignment programs. A web server named ENDscript was created in 2002 to facilitate the generation of ESPript figures containing a large amount of information. ENDscript uses programs such as BLAST, Clustal and PHYLODENDRON to work on protein sequences, and programs such as DSSP, CNS and MOLSCRIPT to work on protein coordinates. It enables the creation, from a single Protein Data Bank identifier, of a multiple sequence alignment figure adorned with secondary structure elements of each sequence of known 3D structure. Similar 3D structures are superimposed in turn with the program PROFIT and a final figure is drawn with BOBSCRIPT, which shows sequence and structure conservation along the Cα trace of the query. ESPript and ENDscript are available at http://genopole.toulouse.inra.fr/ESPript.

1,308 citations


Journal ArticleDOI
TL;DR: VADAR (Volume Area Dihedral Angle Reporter) is a comprehensive web server for quantitative protein structure evaluation that calculates, identifies, graphs, reports and/or evaluates a large number of key structural parameters both for individual residues and for the entire protein.
Abstract: VADAR (Volume Area Dihedral Angle Reporter) is a comprehensive web server for quantitative protein structure evaluation. It accepts Protein Data Bank (PDB) formatted files or PDB accession numbers as input and calculates, identifies, graphs, reports and/or evaluates a large number (>30) of key structural parameters both for individual residues and for the entire protein. These include excluded volume, accessible surface area, backbone and side chain dihedral angles, secondary structure, hydrogen bonding partners, hydrogen bond energies, steric quality, solvation free energy as well as local and overall fold quality. These derived parameters can be used to rapidly identify both general and residue-specific problems within newly determined protein structures. The VADAR web server is freely accessible at http://redpoll.pharmacy.ualberta.ca/vadar.
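Backbone and side chain dihedral angles are among the parameters listed above. A minimal sketch of how one such angle can be computed from four atomic positions, assuming Cartesian coordinates. The function names and sample coordinates are illustrative and are not taken from VADAR.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four points."""
    b1 = _sub(p1, p0)          # first bond vector
    b2 = _sub(p2, p1)          # central bond (rotation axis)
    b3 = _sub(p3, p2)          # last bond vector
    n1 = _cross(b1, b2)        # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)        # normal of the plane through p1, p2, p3
    x = _dot(n1, n2)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    # atan2 keeps the sign and stays well-behaved near 0 and 180 degrees.
    return math.degrees(math.atan2(y, x))


# Illustrative coordinates: the last point is rotated a quarter turn about the central bond.
print(dihedral_deg((0.0, 1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 0.0, 1.0)))
```

For the backbone angle phi of residue i, the four points would be C(i-1), N(i), CA(i), C(i); for psi, N(i), CA(i), C(i), N(i+1).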

780 citations


Journal ArticleDOI
TL;DR: An improved estimator for the probabilities of the number of molecules in the crystallographic asymmetric unit has been implemented, using resolution as additional information.
Abstract: Estimating the number of molecules in the crystallographic asymmetric unit is one of the first steps in a macromolecular structure determination. Based on a survey of 15,641 crystallographic Protein Data Bank (PDB) entries, the distribution of VM, the crystal volume per unit of protein molecular weight, known as the Matthews coefficient, has been reanalyzed. The range of values and frequencies has changed in the 30 years since Matthews' first analysis of protein crystal solvent content. In the statistical analysis, complexes of proteins and nucleic acids have been treated as a separate group. In addition, the VM distribution for nucleic acid crystals has been examined for the first time. Observing that resolution is a significant discriminator of VM, an improved estimator for the probabilities of the number of molecules in the crystallographic asymmetric unit has been implemented, using resolution as additional information.
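As a worked illustration of the quantity involved, the classical Matthews relation is VM = V / (Z · n · MW), where V is the unit-cell volume in cubic angstroms, Z the number of asymmetric units per cell, n the number of molecules per asymmetric unit, and MW the molecular weight in daltons. For proteins the solvent fraction is commonly approximated as 1 − 1.23 / VM. The sketch below tabulates both for candidate values of n. It does not reproduce the resolution-dependent probability estimator described in the abstract, and the cell volume, Z, and molecular weight are made-up inputs.

```python
def matthews_coefficient(cell_volume_a3, z, n_mol, mw_da):
    """VM in cubic angstroms per dalton for n_mol molecules per asymmetric unit."""
    return cell_volume_a3 / (z * n_mol * mw_da)


def solvent_fraction(vm):
    """Approximate solvent fraction for a protein crystal (classical 1.23 / VM relation)."""
    return 1.0 - 1.23 / vm


# Hypothetical crystal: 240,000 cubic angstrom cell, Z = 4, 25 kDa protein.
for n_mol in (1, 2, 3):
    vm = matthews_coefficient(240000.0, 4, n_mol, 25000.0)
    frac = solvent_fraction(vm)
    # A negative solvent fraction means that many molecules cannot fit in the cell.
    note = "" if frac > 0 else "  (physically impossible)"
    print(f"n={n_mol}  VM={vm:.2f}  solvent={frac:.2f}{note}")
```

Candidate values of n that give a negative or implausibly small solvent fraction can be discarded before any probabilistic ranking is applied.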

728 citations


Journal ArticleDOI
TL;DR: ModLoop is a web server for automated modeling of loops in protein structures that predicts the loop conformations by satisfaction of spatial restraints, without relying on a database of known protein structures.
Abstract: Summary: ModLoop is a web server for automated modeling of loops in protein structures. The input is the atomic coordinates of the protein structure in the Protein Data Bank format, and the specification of the starting and ending residues of one or more segments to be modeled, containing no more than 20 residues in total. The output is the coordinates of the nonhydrogen atoms in the modeled segments. A user provides the input to the server via a simple web interface, and receives the output by e-mail. The server relies on the loop modeling routine in MODELLER that predicts the loop conformations by satisfaction of spatial restraints, without relying on a database of known protein structures. For a rapid response, ModLoop runs on a cluster of Linux PC computers. Availability: The server is freely accessible to academic users at http://salilab.org/modloop

680 citations


Journal ArticleDOI
TL;DR: It is shown that protein function can be predicted as enzymatic or not without resorting to alignments, and the method is compared to sequence-based methods that also avoid calculating alignments and predict a recently released set of unrelated proteins.

553 citations


Journal ArticleDOI
TL;DR: Statistical analysis of structural data on glycosidic linkages is extended to the glycan-protein linkage and to the peptide primary, secondary, and tertiary structures around N-glycosylation sites; hydrophobic protein-glycan interactions and the low accessibility of glycosylated asparagine sites in folded proteins are found to be common features that may be critical in mediating protein fold stabilization and quality control of protein folding.
Abstract: We recently reported statistical analysis of structural data on glycosidic linkages. Here we extend this analysis to the glycan-protein linkage, and the peptide primary, secondary, and tertiary structures around N-glycosylation sites. We surveyed 506 glycoproteins in the Protein Data Bank crystallographic database, giving 2592 glycosylation sequons (1683 occupied) and generated a database of 626 nonredundant sequons with 386 occupied. Deviations in the expected amino acid composition were seen around occupied asparagines, particularly an increased occurrence of aromatic residues before the asparagine and threonine at position +2. Glycosylation alters the asparagine side chain torsion angle distribution and reduces its flexibility. There is an elevated probability of finding glycosylation sites in which secondary structure changes. An 11-class taxonomy was developed to describe protein surface geometry around glycosylation sites. Thirty-three percent of the occupied sites are on exposed convex surfaces, 10% in deep recesses and 20% on the edge of grooves with the glycan filling the cleft. A surprisingly large number of glycosylated asparagine residues have a low accessibility. The incidence of aromatic amino acids brought into close contact with the glycan by the folding process is higher than their normal levels on the surface or in the protein core. These data have significant implications for control of sequon occupancy and evolutionary selection of glycosylation sites and are discussed in relation to mechanisms of protein fold stabilization and regional quality control of protein folding. Hydrophobic protein-glycan interactions and the low accessibility of glycosylation sites in folded proteins are common features and may be critical in mediating these functions.

449 citations


Journal ArticleDOI
TL;DR: A shape-based Gaussian docking function is constructed which uses Gaussian functions to represent the shapes of individual atoms and it is found that by employing this docking function, quasi-Newton optimization is capable of moving ligands great distances to locate the correctly docked structure.
Abstract: A shape-based Gaussian docking function is constructed which uses Gaussian functions to represent the shapes of individual atoms. A set of 20 trypsin ligand-protein complexes are drawn from the Protein Data Bank (PDB), the ligands are separated from the proteins, and then are docked back into the active sites using numerical optimization of this function. It is found that by employing this docking function, quasi-Newton optimization is capable of moving ligands great distances [on average 7 Å root mean square distance (RMSD)] to locate the correctly docked structure. It is also found that a ligand drawn from one PDB file can be docked into a trypsin structure drawn from any of the trypsin PDB files. This implies that this scoring function is not limited to more accurate X-ray structures, as is the case for many of the conventional docking methods, but could be extended to homology models.
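One reason Gaussian atom shapes suit gradient-based optimization is that the overlap of two spherical Gaussians has a smooth closed form. For densities p_a·exp(−α_a·|r − A|²) and p_b·exp(−α_b·|r − B|²), the overlap integral is p_a·p_b·exp(−α_a·α_b·R²/(α_a + α_b))·(π/(α_a + α_b))^(3/2), with R the distance between centers. The sketch below evaluates that building block only. How the paper combines such terms into its docking function is not stated in the abstract, and the function names, default exponents, and coordinates are assumptions.

```python
import math


def gaussian_overlap(center_a, center_b, alpha_a=1.0, alpha_b=1.0, p_a=1.0, p_b=1.0):
    """Closed-form overlap integral of two spherical Gaussians in 3D."""
    r2 = sum((a - b) ** 2 for a, b in zip(center_a, center_b))
    s = alpha_a + alpha_b
    return p_a * p_b * math.exp(-alpha_a * alpha_b * r2 / s) * (math.pi / s) ** 1.5


def total_overlap(atoms_a, atoms_b, alpha=1.0):
    """Sum of pairwise overlaps between two sets of atom centers."""
    return sum(gaussian_overlap(a, b, alpha, alpha) for a in atoms_a for b in atoms_b)


# Hypothetical two-atom "ligand" and two-atom "pocket".
ligand = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
pocket = [(0.0, 3.0, 0.0), (1.5, 3.0, 0.0)]
print(total_overlap(ligand, pocket))
```

Because the overlap decays smoothly with distance rather than cutting off sharply, it still provides a usable gradient when atoms are far apart, which is consistent with the abstract's observation that ligands can be moved over large distances.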

433 citations


Journal ArticleDOI
John D. Westbrook, Zukang Feng, Li Chen, Huanwang Yang, Helen M. Berman
TL;DR: The Protein Data Bank (PDB) continues to be actively involved in various aspects of the informatics of structural genomics projects: developing and maintaining the Target Registration Database (TargetDB), organizing data dictionaries that will define the specification for the exchange and deposition of data with the structural genomics centers, and creating software tools to capture data from standard structure determination applications.
Abstract: The Protein Data Bank (PDB; http://www.pdb.org/) continues to be actively involved in various aspects of the informatics of structural genomics projects: developing and maintaining the Target Registration Database (TargetDB), organizing data dictionaries that will define the specification for the exchange and deposition of data with the structural genomics centers, and creating software tools to capture data from standard structure determination applications.

Journal ArticleDOI
TL;DR: The results suggest that the performance of the docking calculation is affected by the particular representation of the receptor used in the screen, and that the holo structure is the one most likely to yield the best discrimination between known ligands and decoy molecules, but important exceptions to this rule also emerge.
Abstract: Molecular docking uses the three-dimensional structure of a receptor to screen a small molecule database for potential ligands. The dependence of docking screens on the conformation of the binding site remains an open question. To evaluate the information loss that occurs as the active site conformation becomes less defined, a small molecule database was docked against the holo (ligand bound), apo, and homology modeled structures of 10 different enzyme binding sites. The holo and apo representations were crystallographic structures taken from the Protein Data Bank (PDB), and the homology-modeled structures were taken from the publicly available resource ModBase. The database docked was the MDL Drug Data Report (MDDR), a functionally annotated database of 95000 small molecules that contained at least 35 ligands for each of the 10 systems. In all sites, at least 99% of the molecules in the MDDR were treated as nonbinding decoys. For each system, the holo, apo, and modeled structures were used to screen the MDDR, and the ability of each structure to enrich the known ligands for that system over random selection was evaluated. The best overall enrichment was produced by the holo structure in seven systems, the apo structure in two systems, and the modeled structure in one system. These results suggest that the performance of the docking calculation is affected by the particular representation of the receptor used in the screen, and that the holo structure is the one most likely to yield the best discrimination between known ligands and decoy molecules, but important exceptions to this rule also emerge from this study. Although each of the holo, apo, and modeled conformations led to enrichment of known ligands in all systems, the enrichment did not always rise to a level judged to be sufficient to justify the effort of a docking screen. 
Using a 20-fold enrichment of known ligands over random selection as a rough guideline for what might be enough to justify a docking screen, the holo conformation of the enzyme met this criterion in eight of 10 sites, whereas the apo conformation met this criterion in only two sites and the modeled conformation in three.
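Enrichment over random selection, as used in the guideline above, can be sketched as the hit rate among the top-ranked molecules divided by the hit rate expected from the whole database. The function name, the choice of a fixed top-N cutoff, and the synthetic ranking below are assumptions for illustration. The study's exact evaluation protocol is not given in the abstract.

```python
def enrichment_factor(ranked_ids, known_ligands, top_n):
    """Fold enrichment of known ligands in the top_n docked molecules.

    ranked_ids: molecule identifiers ordered from best to worst docking score.
    known_ligands: set of identifiers annotated as ligands for this target.
    """
    n_total = len(ranked_ids)
    n_ligands = sum(1 for mol in ranked_ids if mol in known_ligands)
    if n_total == 0 or n_ligands == 0 or top_n <= 0:
        raise ValueError("need a non-empty ranking, at least one ligand, and top_n > 0")
    top_n = min(top_n, n_total)
    hits_top = sum(1 for mol in ranked_ids[:top_n] if mol in known_ligands)
    # (hits_top / top_n) / (n_ligands / n_total), arranged to divide only once.
    return (hits_top * n_total) / (top_n * n_ligands)


# Synthetic example: 1000 molecules, 10 known ligands, 2 of them ranked in the top 10.
ranked = [f"m{i}" for i in range(1000)]
ligands = {"m0", "m5"} | {f"m{i}" for i in range(100, 108)}
print(enrichment_factor(ranked, ligands, top_n=10))
```

In this synthetic case the top 1% of the ranking holds 20% of the ligands, which corresponds to the 20-fold threshold mentioned in the abstract.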

Journal ArticleDOI
TL;DR: Shotgun NMR short‐circuits the laborious and time‐consuming process of obtaining complete sequential assignments prior to the calculation of a protein structure from the NMR data by taking advantage of the orientational information inherent to the spectra of aligned proteins.
Abstract: A solid-state NMR approach for simultaneous resonance assignment and three-dimensional structure determination of a membrane protein in lipid bilayers is described. The approach is based on the scattering, hence the descriptor “shotgun,” of 15N-labeled amino acids throughout the protein sequence (and the resulting NMR spectra). The samples are obtained by protein expression in bacteria grown on media in which one type of amino acid is labeled and the others are not. Shotgun NMR short-circuits the laborious and time-consuming process of obtaining complete sequential assignments prior to the calculation of a protein structure from the NMR data by taking advantage of the orientational information inherent to the spectra of aligned proteins. As a result, it is possible to simultaneously assign resonances and measure orientational restraints for structure determination. A total of five two-dimensional 1H/15N PISEMA (polarization inversion spin exchange at the magic angle) spectra, from one uniformly and four selectively 15N-labeled samples, were sufficient to determine the structure of the membrane-bound form of the 50-residue major pVIII coat protein of fd filamentous bacteriophage. PISA (polarity index slant angle) wheels are an essential element in the process, which starts with the simultaneous assignment of resonances and the assembly of isolated polypeptide segments, and culminates in the complete three-dimensional structure of the protein with atomic resolution. The principles are also applicable to weakly aligned proteins studied by solution NMR spectroscopy. [The structure we determined for the membrane-bound form of the fd bacteriophage pVIII coat protein has been deposited in the Protein Data Bank as PDB file 1MZT.]

Journal ArticleDOI
TL;DR: A novel approach for inferring functional relationship of proteins by detecting sequence and spatial patterns of protein surfaces, which can detect functional relationship with specificity for members of the same protein family and superfamily, as well as remotely related functional surfaces from proteins of different fold structures.

Journal ArticleDOI
TL;DR: A new view of the nature of protein structure space is given, and its implications for protein structure prediction are discussed.

Journal ArticleDOI
TL;DR: An investigation of two proposed structures of the gramicidin A channel, using molecular dynamics simulation with an explicit lipid bilayer membrane similar to the system used for the solid-state NMR experiments, underscores the utility of molecular dynamics simulations in the analysis and interpretation of structural information from solid-state NMR.
Abstract: Two different high-resolution structures recently have been proposed for the membrane-spanning gramicidin A channel: one based on solid-state NMR experiments in oriented phospholipid bilayers (Ketchem, R. R.; Roux, B.; Cross, T. A. Structure 1997, 5, 1655−1669; Protein Data Bank, PDB:1MAG); and one based on two-dimensional NMR in detergent micelles (Townsley, L. E.; Tucker, W. A.; Sham, S.; Hinton, J. F. Biochemistry 2001, 40, 11676−11686; PDB:1JNO). Despite overall agreement, the two structures differ in peptide backbone pitch and the orientation of several side chains; in particular that of the Trp at position 9. Given the importance of the peptide backbone and Trp side chains for ion permeation, we undertook an investigation of the two structures using molecular dynamics simulation with an explicit lipid bilayer membrane, similar to the system used for the solid-state NMR experiments. Based on 0.1 μs of simulation, both backbone structures converge to a structure with 6.25 residues per turn, in agreem...

Journal ArticleDOI
TL;DR: Application of the knowledge-based distance-dependent pair potentials proved efficient to restrain the homology modelling process and to score and optimise the modelled protein-ligand complexes.

Journal ArticleDOI
TL;DR: The E-MSD macromolecular structure relational database is designed to be a single access point for protein and nucleic acid structures and related information, derived from Protein Data Bank entries.
Abstract: The E-MSD macromolecular structure relational database (http://www.ebi.ac.uk/msd) is designed to be a single access point for protein and nucleic acid structures and related information. The database is derived from Protein Data Bank (PDB) entries. Relational database technologies are used in a comprehensive cleaning procedure to ensure data uniformity across the whole archive. The search database contains an extensive set of derived properties, goodness-of-fit indicators, and links to other EBI databases including InterPro, GO, and SWISS-PROT, together with links to SCOP, CATH, PFAM and PROSITE. A generic search interface is available, coupled with a fast secondary structure domain search tool.

Journal ArticleDOI
TL;DR: In this paper, an automatic procedure is proposed for the inference of assembly structures that are likely to be physiologically relevant by scoring crystal contacts by their contact size and chemical complementarity, and the subunit assembly is then inferred from these scored contacts by a clustering procedure involving a single adjustable parameter.
Abstract: The arrangement of the subunits in an oligomeric protein often cannot be inferred without ambiguity from crystallographic studies. The annotation of the functional assembly of protein structures in the Protein Data Bank (PDB) is incomplete and frequently inconsistent. Instructions for the reconstruction, by symmetry, of the functional assembly from the deposited coordinates are often absent. An automatic procedure is proposed for the inference of assembly structures that are likely to be physiologically relevant. The method scores crystal contacts by their contact size and chemical complementarity. The subunit assembly is then inferred from these scored contacts by a clustering procedure involving a single adjustable parameter. When predicting the oligomeric state for a non-redundant set of 55 monomeric and 163 oligomeric proteins from dimers up to hexamers, a classification error rate of 16% was observed.
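The clustering step can be sketched as single-linkage grouping: subunits joined by a contact whose score reaches a threshold end up in the same assembly. This is a simplified stand-in, not the published procedure. The abstract does not specify the scoring function or the clustering algorithm, so the scores, the threshold (playing the role of the single adjustable parameter), and the function name are hypothetical.

```python
def infer_assemblies(subunits, scored_contacts, threshold):
    """Group subunits connected by contacts scoring at or above the threshold.

    scored_contacts: iterable of (subunit_a, subunit_b, score) tuples.
    Returns a sorted list of assemblies, each a sorted list of subunit names.
    """
    parent = {s: s for s in subunits}

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, score in scored_contacts:
        if score >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for s in subunits:
        groups.setdefault(find(s), []).append(s)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical crystal with four copies of a chain and three scored contacts.
contacts = [("A", "B", 0.9), ("B", "C", 0.2), ("C", "D", 0.8)]
print(infer_assemblies(["A", "B", "C", "D"], contacts, threshold=0.5))
```

Lowering the threshold merges more contacts into one assembly, and raising it splits the crystal into smaller units, so the predicted oligomeric state depends directly on that one parameter.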

Journal ArticleDOI
TL;DR: Since many protein structure studies must address globular and membrane proteins separately, this new elimination factor, which excludes membrane protein chains, is introduced in the PDB-REPRDB system.
Abstract: PDB-REPRDB is a database of representative protein chains from the Protein Data Bank (PDB). Started at the Real World Computing Partnership (RWCP) in August 1997, it developed to the present system of PDB-REPRDB. In April 2001, the system was moved to the Computational Biology Research Center (CBRC), National Institute of Advanced Industrial Science and Technology (AIST) (http://www.cbrc.jp/); it is available at http://www.cbrc.jp/pdbreprdb/. The current database includes 33 368 protein chains from 16 682 PDB entries (1 September, 2002), from which are excluded (a) DNA and RNA data, (b) theoretically modeled data, (c) short chains (length < 40 residues), or (d) data with non-standard amino acid residues at all residues. The number of PDB entries that include membrane protein structures has increased rapidly, as improved X-ray crystallography, NMR, and electron microscopy techniques have allowed many membrane protein structures to be determined. Since many protein structure studies must address globular and membrane proteins separately, this new elimination factor, which excludes membrane protein chains, is introduced in the PDB-REPRDB system. Moreover, the PDB-REPRDB system for membrane protein chains begins at the same URL. The current membrane database includes 551 protein chains, including membrane domains in the SCOP database of release 1.59 (15 May, 2002).

Journal ArticleDOI
TL;DR: WebFEATURE is a web-accessible structural analysis tool that allows users to scan query structures for functional sites in both proteins and nucleic acids and is the public interface to the scanning algorithm of the FEATURE package, a supervised learning algorithm for creating and identifying 3D, physicochemical motifs in molecular structures.
Abstract: WebFEATURE (http://feature.stanford.edu/webfeature/) is a web-accessible structural analysis tool that allows users to scan query structures for functional sites in both proteins and nucleic acids. WebFEATURE is the public interface to the scanning algorithm of the FEATURE package, a supervised learning algorithm for creating and identifying 3D, physicochemical motifs in molecular structures. Given an input structure or Protein Data Bank identifier (PDB ID), and a statistical model of a functional site, WebFEATURE will return rank-scored ‘hits’ in 3D space that identify regions in the structure where similar distributions of physicochemical properties occur relative to the site model. Users can visualize and interactively manipulate scored hits and the query structure in web browsers that support the Chime plug-in. Alternatively, results can be downloaded and visualized through other freely available molecular modeling tools, like RasMol, PyMOL and Chimera. A major application of WebFEATURE is in rapid annotation of function to structures in the context of structural genomics.

Journal ArticleDOI
TL;DR: This work describes a server called NCI, which allows the user to either upload protein/peptide coordinates in Protein Data Bank (PDB) format or enter a Structural Classification of Proteins database (SCOP)/PDB identifier for which NCI identifies the different non-canonical interactions, based purely on geometric criteria.
Abstract: NCI is a server for the identification of non-canonical interactions in protein structures. These interactions, which include N-H...pi, C(alpha)-H...pi, C(alpha)-H...O=C and variants of them, were first observed in small molecules and subsequently in high-resolution protein structures. Such interactions have been subjected to extensive structural analysis to elucidate the different geometric criteria required to identify them. These interactions have also recently been shown to be important for the stability of protein structures. In this work, I describe a server called NCI, which allows the user to either upload protein/peptide coordinates in Protein Data Bank (PDB) format or enter a Structural Classification of Proteins database (SCOP)/PDB identifier for which NCI identifies the different non-canonical interactions, based purely on geometric criteria. Results are presented as an HTML table, as a parseable text file and as a color-coded interaction matrix. In addition, the user can view the RasMol image highlighting the interactions in the protein structure and download the RasMol script. The NCI server is available at: http://www.mrc-lmb.cam.ac.uk/genomes/nci/.

Journal ArticleDOI
TL;DR: The program ASSAM, which has been developed to search for patterns of amino acid side-chains in the 3D structures in the Protein Data Bank, calculates a graph representation of a protein in which the individual side-chain vectors are the nodes and the intervector distances are the edges.
Abstract: This paper describes the program ASSAM, which has been developed to search for patterns of amino acid side-chains in the 3D structures in the Protein Data Bank. ASSAM represents an amino acid by a vector drawn from the main chain towards the functional part of the amino acid and then computes a graph representation of a protein in which the individual side-chain vectors are the nodes and the intervector distances are the edges. The presence of a query pattern in a Protein Data Bank structure can then be searched for by means of a subgraph isomorphism algorithm. Recent enhancements to ASSAM allow searches to include the following: the main-chain structure in addition to the side-chains; the secondary structure and solvent accessibility of side-chains; allowable distances from a known binding-site; disulfide bridges; and improved generic and wild-card queries. The effectiveness of these approaches is demonstrated by extensive searches of the Protein Data Bank for typical 3D query patterns.

Journal ArticleDOI
TL;DR: The first atomic resolution (<1.20 Å) structure of a copper protein, nitrite reductase, and of a mutant of the catalytically important Asp92 residue (D92E) is reported, providing the basis from which to build a detailed mechanism of this important enzyme.

Journal ArticleDOI
TL;DR: The Saccharomyces Genome Database (SGD) has recently developed new resources to provide more complete information about proteins from the budding yeast Saccharomyces cerevisiae, including the Protein Information page, which contains protein physical and chemical properties, predicted from the translated ORF sequence.
Abstract: The Saccharomyces Genome Database (SGD: http://genome-www.stanford.edu/Saccharomyces/) has recently developed new resources to provide more complete information about proteins from the budding yeast Saccharomyces cerevisiae. The PDB Homologs page provides structural information from the Protein Data Bank (PDB) about yeast proteins and/or their homologs. SGD has also created a resource that utilizes the eMOTIF database for motif information about a given protein. A third new resource is the Protein Information page, which contains protein physical and chemical properties, such as molecular weight and hydropathicity scores, predicted from the translated ORF sequence.

Journal ArticleDOI
TL;DR: The results, which update previous studies, show that there exists sufficient coverage to model even a novel fold using fragments from the Protein Data Bank, as the current database of known structures has increased enormously in the last few years.
Abstract: Assembling short fragments from known structures has been a widely used approach to construct novel protein structures. To what extent there exist structurally similar fragments in the database of known structures for short fragments of a novel protein is a question that is fundamental to this approach. This work addresses that question for seven-, nine- and 15-residue fragments. For each fragment size, two databases, a query database and a template database of fragments from high-quality protein structures in SCOP20 and SCOP90, respectively, were constructed. For each fragment in the query database, the template database was scanned to find the lowest r.m.s.d. fragment among non-homologous structures. For seven-residue fragments, there is a 99% probability that there exists such a fragment within 0.7 Å r.m.s.d. for each loop fragment. For nine-residue fragments there is a 96% probability of a fragment within 1 Å r.m.s.d., while for 15-residue fragments there is a 91% probability of a fragment within 2 Å r.m.s.d. These results, which update previous studies, show that there exists sufficient coverage to model even a novel fold using fragments from the Protein Data Bank, as the current database of known structures has increased enormously in the last few years. We have also explored the use of a grid search method for loop homology modeling and make some observations about the use of a grid search compared with a database search for the loop modeling problem.

Journal ArticleDOI
TL;DR: A new application, described here, is the visualization of 75 interfaces in structures of protein-DNA and protein-RNA complexes, and the MolSurfer web server is now able to compute and map Poisson-Boltzmann electrostatic potentials of macromolecules onto interfaces.
Abstract: We describe the current status of the Java molecular graphics tool, MolSurfer. MolSurfer has been designed to assist the analysis of the structures and physico-chemical properties of macromolecular interfaces. MolSurfer provides a coupled display of two-dimensional (2D) maps of the interfaces generated with the ADS software and a three-dimensional (3D) view of the macromolecular structure in the Java PDB viewer, WebMol. The interfaces are analytically defined and properties such as electrostatic potential or hydrophobicity are projected on to them. MolSurfer has been applied previously to analyze a set of 39 protein-protein complexes, with structures available from the Protein Data Bank (PDB). A new application, described here, is the visualization of 75 interfaces in structures of protein-DNA and protein-RNA complexes. Another new feature is that the MolSurfer web server is now able to compute and map Poisson-Boltzmann electrostatic potentials of macromolecules onto interfaces. The MolSurfer web server is available at http://projects.villa-bosch.de/mcm/software/molsurfer.

Book ChapterDOI
TL;DR: This chapter evaluates the crystallographic data for bacteriorhodopsin and its photointermediates and attempts to correlate the structural with the nonstructural data in order to explore the various and often contradictory mechanistic conclusions drawn.
Abstract: This chapter evaluates the crystallographic data for bacteriorhodopsin and its photointermediates. It attempts to correlate the structural with the nonstructural data in order to explore the various and often contradictory mechanistic conclusions drawn. Bacteriorhodopsin, a light-driven ion pump in halobacteria, is a simpler system than others because in this small seven-helical protein transport is driven not by a chemical reaction but by the free energy gained upon photoisomerization of the retinal to 13-cis,15-anti. The Protein Data Bank (PDB) contains 33 atomic coordinate entries of bacteriorhodopsin structures. This large number of models attests to the fact that bacteriorhodopsin is one of the most-studied and best-understood integral membrane proteins. Three of these coordinate entries are theoretical models (1BAC, 1BAD, and 1115) and three are NMR structures of fragments (1BCT, 1BHA, and 1BHB).

Journal ArticleDOI
TL;DR: Some of the more common methods and algorithms used to solve the docking problem are described, and recent applications in cancer research are reviewed.
Abstract: In recent years there has been a growing interest in computer-based screening. One of the driving forces has been the increased efficiency of protein crystallography, leading to the real possibility of using structure-based design as a significant contributor to the discovery of novel ligands. In 1957, after 22 years of work, the first protein structure determined by x-ray crystallography was produced [1]. Now the process has become increasingly automated and nearly 20,000 protein structures are available in the Protein Data Bank (PDB) [2]. Equally, progress in genomics will result in a great expansion of validated targets for cancer therapy. The understanding of the relationships between structure and function of gene products will be one of the key routes to new therapeutic advances. The challenge now is to use this data in the discovery of novel therapeutics. One approach is obviously to synthesize molecules and co-crystallize or soak them into the protein crystal and so determine the position and interaction of the molecule with the protein. The structural information obtained (where does the molecule bind; what are the ligand/protein/solvent interactions?) can be invaluable in the generation of novel molecules or in the re-design of existing molecules whose drug properties are not optimal. However, when dealing with large numbers (millions) of molecules, when crystallization is difficult or in testing hypotheses, a significant contribution can be made using computer-based screening methods. In order to use the structural information derived from x-ray crystallography (or other sources, for example NMR or homology modelling) when evaluating the utility of a novel ligand, we need to understand where in the protein (or other macromolecule such as RNA) the ligand is likely to bind and also, if possible, the strength of the binding interactions. This problem is known as the 'docking problem'.
There have been many approaches to the solution of this problem over the last ten years. For example, some methods rely on complex molecular dynamics simulations while others use less costly graph matching approaches. There is generally a compromise between speed and accuracy, with some methods giving much more information and insight into the nature of the protein/ligand interactions and other methods optimised for speed of docking thousands of putative ligands. We will describe some of the more common methods and algorithms used to solve the docking problem and in particular, we will review recent applications in cancer research.

Journal ArticleDOI
TL;DR: RNABase is a unified database of all three-dimensional structures containing RNA deposited in either the Protein Data Bank (PDB) or Nucleic Acid Data Base (NDB).
Abstract: RNABase is a unified database of all three-dimensional structures containing RNA deposited in either the Protein Data Bank (PDB) or Nucleic Acid Data Base (NDB). For each structure, RNABase contains a brief summary as well as annotation of conformational parameters, identification of possible model errors, Ramachandran-style conformational maps and classification of ribonucleotides into conformers. These same analyses can also be performed on structures submitted by users. To facilitate access, structures are automatically placed into a variety of functional and structural categories, including: ribozymes, pseudoknots, etc. RNABase can be freely accessed on the web at http://www.rnabase.org. We are committed to maintaining this database indefinitely.