The Protein Data Bank

doi:10.1093/NAR/28.1.235

Journal Article

Helen M. Berman¹, John D. Westbrook, Zukang Feng, Gary L. Gilliland, Talapady N. Bhat, Helge Weissig, Ilya N. Shindyalov, Philip E. Bourne

Rutgers University¹

01 Jan 2000-Nucleic Acids Research (Oxford University Press)-Vol. 28, Iss: 1, pp 235-242

TL;DR: This paper describes the goals of the PDB, the systems in place for data deposition and access, how to obtain further information, and plans for the future development of the resource.

Abstract: The Protein Data Bank (PDB; http://www.rcsb.org/pdb/ ) is the single worldwide archive of structural data of biological macromolecules. This paper describes the goals of the PDB, the systems in place for data deposition and access, how to obtain further information, and near-term plans for the future development of the resource.

Citations

Journal Article

UCSF Chimera--a visualization system for exploratory research and analysis.

Eric F. Pettersen¹, Thomas D. Goddard¹, Conrad C. Huang¹, Gregory S. Couch¹, Daniel M. Greenblatt¹, Elaine C. Meng¹, Thomas E. Ferrin¹

University of California, San Francisco¹

01 Oct 2004-Journal of Computational Chemistry

TL;DR: Two unusual extensions are presented: Multiscale, which adds the ability to visualize large‐scale molecular assemblies such as viral coats, and Collaboratory, which allows researchers to share a Chimera session interactively despite being at separate locales.

Abstract: The design, implementation, and capabilities of an extensible visualization system, UCSF Chimera, are discussed. Chimera is segmented into a core that provides basic services and visualization, and extensions that provide most higher level functionality. This architecture ensures that the extension mechanism satisfies the demands of outside developers who wish to incorporate new features. Two unusual extensions are presented: Multiscale, which adds the ability to visualize large-scale molecular assemblies such as viral coats, and Collaboratory, which allows researchers to share a Chimera session interactively despite being at separate locales. Other extensions include Multalign Viewer, for showing multiple sequence alignments and associated structures; ViewDock, for screening docked ligand orientations; Movie, for replaying molecular dynamics trajectories; and Volume Viewer, for display and analysis of volumetric data. A discussion of the usage of Chimera in real-world situations is given, along with anticipated future directions. Chimera includes full user documentation, is free to academic and nonprofit users, and is available for Microsoft Windows, Linux, Apple Mac OS X, SGI IRIX, and HP Tru64 Unix from http://www.cgl.ucsf.edu/chimera/.

35,698 citations

Journal Article

Coot: model-building tools for molecular graphics.

Paul Emsley¹, Kevin Cowtan¹

University of York¹

01 Dec 2004-Acta Crystallographica Section D-biological Crystallography

TL;DR: CCP4mg is a project that aims to provide a general-purpose tool for structural biologists, providing tools for X-ray structure solution, structure comparison and analysis, and publication-quality graphics.

Abstract: CCP4mg is a project that aims to provide a general-purpose tool for structural biologists, providing tools for X-ray structure solution, structure comparison and analysis, and publication-quality graphics. The map-fitting tools are available as a stand-alone package, distributed as `Coot'.

27,505 citations

Additional excerpts

...(Morris et al., 2002; Oldfield & Hubbard, 1994; Oldfield, 2001), a likelihood distribution for the pseudo-torsion angle Cα(n)–Cα(n+1)–Cα(n+2)–Cα(n+3) versus the angle Cα(n+1)–Cα(n+2)–Cα(n+3) has been generated from high-resolution structures in the PDB (Berman et al., 2000) (Fig....

Journal Article

PHENIX: a comprehensive Python-based system for macromolecular structure solution

Paul D. Adams¹,², Pavel V. Afonine¹, Gábor Bunkóczi³, Vincent B. Chen⁴, Ian W. Davis⁴, Nathaniel Echols¹, Jeffrey J. Headd⁴, Li-Wei Hung⁵, Gary J. Kapral⁴, Ralf W. Grosse-Kunstleve¹, Airlie J. McCoy³, Nigel W. Moriarty¹, Robert D. Oeffner³, Randy J. Read³, David S. Richardson⁴, Jane S. Richardson⁴, Thomas C. Terwilliger⁵, Peter H. Zwart¹

Lawrence Berkeley National Laboratory¹, University of California, Berkeley², University of Cambridge³, Duke University⁴, Los Alamos National Laboratory⁵

01 Feb 2010-Acta Crystallographica Section D-biological Crystallography

TL;DR: The PHENIX software for macromolecular structure determination is described, along with its uses and benefits.

Abstract: Macromolecular X-ray crystallography is routinely applied to understand biological processes at a molecular level. However, significant time and effort are still required to solve and complete many of these structures because of the need for manual interpretation of complex numerical data using many software packages and the repeated use of interactive three-dimensional graphics. PHENIX has been developed to provide a comprehensive system for macromolecular crystallographic structure solution with an emphasis on the automation of all procedures. This has relied on the development of algorithms that minimize or eliminate subjective input, the development of algorithms that automate procedures that are traditionally performed by hand and, finally, the development of a framework that allows a tight integration between the algorithms.

18,531 citations

Cites background from "The Protein Data Bank"

...This task can be complicated for several reasons: the presence of novel ligands or nonstandard residues in the PDB-format (Berman et al., 2000) coordinate file, data collected from twinned crystals, various reflection datafile formats, different representation of atomic displacement parameters in the presence of TLS (Schomaker & Trueblood, 1968), experimental data type (X-ray and/or neutron), files with multiple models and various formatting issues....

Journal Article

The Pfam protein families database

Marco Punta¹, Penny Coggill¹, Ruth Y. Eberhardt¹, Jaina Mistry¹, John Tate¹, Chris Boursnell¹, Ningze Pang¹, Kristoffer Forslund¹, Goran Ceric¹, Jody Clements¹, Andreas Heger¹, Liisa Holm¹, Erik L. L. Sonnhammer¹, Sean R. Eddy¹, Alex Bateman¹, Robert D. Finn¹

Wellcome Trust Sanger Institute¹

01 Jan 2000-Nucleic Acids Research

TL;DR: The definition and use of family-specific, manually curated gathering thresholds are explained and some of the features of domains of unknown function (also known as DUFs) are discussed, which constitute a rapidly growing class of families within Pfam.

Abstract: Pfam is a widely used database of protein families and domains. This article describes a set of major updates that we have implemented in the latest release (version 24.0). The most important change is that we now use HMMER3, the latest version of the popular profile hidden Markov model package. This software is approximately 100 times faster than HMMER2 and is more sensitive due to the routine use of the forward algorithm. The move to HMMER3 has necessitated numerous changes to Pfam that are described in detail. Pfam release 24.0 contains 11,912 families, of which a large number have been significantly updated during the past two years. Pfam is available via servers in the UK (http://pfam.sanger.ac.uk/), the USA (http://pfam.janelia.org/) and Sweden (http://pfam.sbc.su.se/).

14,075 citations

Journal Article

MolProbity: all-atom structure validation for macromolecular crystallography

Vincent B. Chen¹, W. Bryan Arendall¹, Jeffrey J. Headd¹, Daniel A. Keedy¹, R.M. Immormino¹, Gary J. Kapral¹, Laura Weston Murray¹, Jane S. Richardson¹, David S. Richardson¹

Duke University¹

01 Jan 2010-Acta Crystallographica Section D-biological Crystallography

TL;DR: MolProbity structure validation will diagnose most local errors in macromolecular crystal structures and help to guide their correction.

Abstract: MolProbity is a structure-validation web service that provides broad-spectrum solidly based evaluation of model quality at both the global and local levels for both proteins and nucleic acids. It relies heavily on the power and sensitivity provided by optimized hydrogen placement and all-atom contact analysis, complemented by updated versions of covalent-geometry and torsion-angle criteria. Some of the local corrections can be performed automatically in MolProbity and all of the diagnostics are presented in chart and graphical forms that help guide manual rebuilding. X-ray crystallography provides a wealth of biologically important molecular data in the form of atomic three-dimensional structures of proteins, nucleic acids and increasingly large complexes in multiple forms and states. Advances in automation, in everything from crystallization to data collection to phasing to model building to refinement, have made solving a structure using crystallography easier than ever. However, despite these improvements, local errors that can affect biological interpretation are widespread at low resolution and even high-resolution structures nearly all contain at least a few local errors such as Ramachandran outliers, flipped branched protein side chains and incorrect sugar puckers. It is critical both for the crystallographer and for the end user that there are easy and reliable methods to diagnose and correct these sorts of errors in structures. MolProbity is the authors' contribution to helping solve this problem and this article reviews its general capabilities, reports on recent enhancements and usage, and presents evidence that the resulting improvements are now beneficially affecting the global database.

12,206 citations

Cites methods from "The Protein Data Bank"

...A typical MolProbity session starts with the user uploading a coordinate file of their own or fetching one from the PDB or NDB databases (Berman et al., 1992, 2000) in new or old PDB format or in mmCIF format....

References

Journal Article

Basic Local Alignment Search Tool

Stephen F. Altschul¹, Warren Gish¹, Webb Miller², Eugene W. Myers³, David J. Lipman¹

National Institutes of Health¹, Pennsylvania State University², University of Arizona³

01 Oct 1990-Journal of Molecular Biology

TL;DR: A new approach to rapid sequence comparison, basic local alignment search tool (BLAST), directly approximates alignments that optimize a measure of local similarity, the maximal segment pair (MSP) score.

88,255 citations

Journal Article

PROCHECK: a program to check the stereochemical quality of protein structures

Roman A. Laskowski, Malcolm W. MacArthur, David S. Moss, Janet M. Thornton

01 Apr 1993-Journal of Applied Crystallography

TL;DR: The PROCHECK suite of programs provides a detailed check on the stereochemistry of a protein structure and assesses its overall quality as compared with well-refined structures of the same resolution.

Abstract: The PROCHECK suite of programs provides a detailed check on the stereochemistry of a protein structure. Its outputs comprise a number of plots in PostScript format and a comprehensive residue-by-residue listing. These give an assessment of the overall quality of the structure as compared with well-refined structures of the same resolution and also highlight regions that may need further investigation. The PROCHECK programs are useful for assessing the quality not only of protein structures in the process of being solved but also of existing structures and of those being modelled on known structures.

22,829 citations

Journal Article

Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features

Wolfgang Kabsch¹, Chris Sander¹

Max Planck Society¹

01 Dec 1983-Biopolymers

TL;DR: A set of simple and physically motivated criteria for secondary structure is developed, programmed as a pattern-recognition process of hydrogen-bonded and geometrical features extracted from x-ray coordinates.

Abstract: For a successful analysis of the relation between amino acid sequence and protein structure, an unambiguous and physically meaningful definition of secondary structure is essential. We have developed a set of simple and physically motivated criteria for secondary structure, programmed as a pattern-recognition process of hydrogen-bonded and geometrical features extracted from x-ray coordinates. Cooperative secondary structure is recognized as repeats of the elementary hydrogen-bonding patterns “turn” and “bridge.” Repeating turns are “helices,” repeating bridges are “ladders,” connected ladders are “sheets.” Geometric structure is defined in terms of the concepts torsion and curvature of differential geometry. Local chain “chirality” is the torsional handedness of four consecutive Cα positions and is positive for right-handed helices and negative for ideal twisted β-sheets. Curved pieces are defined as “bends.” Solvent “exposure” is given as the number of water molecules in possible contact with a residue. The end result is a compilation of the primary structure, including SS bonds, secondary structure, and solvent exposure of 62 different globular proteins. The presentation is in linear form: strip graphs for an overall view and strip tables for the details of each of 10,925 residues. The dictionary is also available in computer-readable form for protein structure prediction work.
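The "chirality" measure in this abstract, the torsional handedness of four consecutive Cα positions, can be illustrated with a short sketch. This is not the authors' program: the function names are hypothetical, the torsion is computed with the standard atan2 formula, and the ideal-helix parameters (radius 2.3 Å, rise 1.5 Å, 100° per residue) are assumed textbook values used only to generate test coordinates.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def torsion_deg(p0, p1, p2, p3):
    """Signed torsion angle (degrees) defined by four points."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)  # normal of the plane through p1, p2, p3
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

def chain_torsions(ca_coords):
    """Torsion for every window of four consecutive C-alpha positions."""
    return [torsion_deg(*ca_coords[i:i + 4])
            for i in range(len(ca_coords) - 3)]

def ideal_helix(n, radius=2.3, rise=1.5, turn_deg=100.0):
    """C-alpha trace of an idealized right-handed helix (assumed geometry)."""
    step = math.radians(turn_deg)
    return [(radius * math.cos(i * step),
             radius * math.sin(i * step),
             rise * i) for i in range(n)]

if __name__ == "__main__":
    for angle in chain_torsions(ideal_helix(8)):
        print(f"{angle:+.1f}")
```

For the idealized right-handed helix every window gives a positive torsion, which agrees with the sign convention stated in the abstract.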

14,077 citations

Journal Article

MOLSCRIPT: a program to produce both detailed and schematic plots of protein structures

P. J. Kraulis

01 Oct 1991-Journal of Applied Crystallography

TL;DR: The MOLSCRIPT program produces plots of protein structures using several different kinds of representations, including schematic drawings, simple wire models, ball-and-stick models, CPK models and text labels.

Abstract: The MOLSCRIPT program produces plots of protein structures using several different kinds of representations. Schematic drawings, simple wire models, ball-and-stick models, CPK models and text labels can be mixed freely. The schematic drawings are shaded to improve the illusion of three dimensionality. A number of parameters affecting various aspects of the objects drawn can be changed by the user. The output from the program is in PostScript format.

13,971 citations

Journal Article

Improved tools for biological sequence comparison.

William R. Pearson¹, David J. Lipman

University of Virginia¹

01 Apr 1988-Proceedings of the National Academy of Sciences of the United States of America

TL;DR: Three computer programs for comparisons of protein and DNA sequences can be used to search sequence data bases, evaluate similarity scores, and identify periodic structures based on local sequence similarity.

Abstract: We have developed three computer programs for comparisons of protein and DNA sequences. They can be used to search sequence data bases, evaluate similarity scores, and identify periodic structures based on local sequence similarity. The FASTA program is a more sensitive derivative of the FASTP program, which can be used to search protein or DNA sequence data bases and can compare a protein sequence to a DNA sequence data base by translating the DNA data base as it is searched. FASTA includes an additional step in the calculation of the initial pairwise similarity score that allows multiple regions of similarity to be joined to increase the score of related sequences. The RDF2 program can be used to evaluate the significance of similarity scores using a shuffling method that preserves local sequence composition. The LFASTA program can display all the regions of local similarity between two sequences with scores greater than a threshold, using the same scoring parameters and a similar alignment algorithm; these local similarities can be displayed as a "graphic matrix" plot or as individual alignments. In addition, these programs have been generalized to allow comparison of DNA or protein sequences based on a variety of alternative scoring matrices.
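The idea of judging a similarity score against shuffled sequences that preserve local composition can be sketched as follows. This is only an illustration under stated assumptions, not the RDF2 program: shuffling within fixed windows is one simple way to keep local composition, `identity_score` is a toy stand-in for a real alignment score, and all function names and default parameters are hypothetical.

```python
import random
import statistics

def window_shuffle(seq, window, rng):
    """Shuffle residues only within consecutive windows, so each
    window keeps its own composition."""
    out = []
    for start in range(0, len(seq), window):
        chunk = list(seq[start:start + window])
        rng.shuffle(chunk)
        out.extend(chunk)
    return "".join(out)

def identity_score(a, b):
    """Toy score (assumption): identical residues at equal positions."""
    return sum(x == y for x, y in zip(a, b))

def shuffle_z_score(query, target, score, n_shuffles=200, window=10, seed=0):
    """Z-score of the observed score against scores obtained with
    window-shuffled versions of the target."""
    rng = random.Random(seed)
    observed = score(query, target)
    null = [score(query, window_shuffle(target, window, rng))
            for _ in range(n_shuffles)]
    mean = statistics.mean(null)
    sd = statistics.pstdev(null)
    if sd == 0:
        return 0.0  # degenerate null distribution; no meaningful z-score
    return (observed - mean) / sd

if __name__ == "__main__":
    seq = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQAPILSRV"
    print(shuffle_z_score(seq, seq, identity_score))
```

A large positive z-score suggests the observed similarity is unlikely to arise from local composition alone; any alignment scorer with the same two-argument signature could be passed in place of the toy score.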

12,432 citations