
Showing papers by "Philip E. Bourne" published in 1998


Journal ArticleDOI
TL;DR: A new algorithm is reported that builds an alignment between two protein structures by combinatorial extension of an alignment path defined by aligned fragment pairs, rather than by the more conventional dynamic programming and Monte Carlo optimization.
Abstract: A new algorithm is reported which builds an alignment between two protein structures. The algorithm involves a combinatorial extension (CE) of an alignment path defined by aligned fragment pairs (AFPs) rather than the more conventional techniques using dynamic programming and Monte Carlo optimization. AFPs, as the name suggests, are pairs of fragments, one from each protein, which confer structure similarity. AFPs are based on local geometry, rather than global features such as orientation of secondary structures and overall topology. Combinations of AFPs that represent possible continuous alignment paths are selectively extended or discarded thereby leading to a single optimal alignment. The algorithm is fast and accurate in finding an optimal structure alignment and hence suitable for database scanning and detailed analysis of large protein families. The method has been tested and compared with results from Dali and VAST using a representative sample of similar structures. Several new structural similarities not detected by these other methods are reported. Specific one-on-one alignments and searches against all structures as found in the Protein Data Bank (PDB) can be performed via the Web at http://cl.sdsc.edu/ce.html.

2,100 citations
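
The aligned-fragment-pair idea in the abstract above can be illustrated with a minimal sketch. This is not the published CE implementation: the fragment length, the cutoff, the scoring function, and the names `distance_matrix`, `fragment_distance`, and `find_afps` are all assumptions chosen for illustration. It only shows how two fragments can be compared by local geometry (their internal C-alpha distance matrices), so no superposition or global orientation is needed; the combinatorial extension of paths is not shown.

```python
import math


def distance_matrix(coords):
    """Pairwise Euclidean distances within one fragment of C-alpha coordinates."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def fragment_distance(frag_a, frag_b):
    """Mean absolute difference between the fragments' internal distance matrices.

    Hypothetical similarity measure: small values mean similar local geometry.
    """
    da, db = distance_matrix(frag_a), distance_matrix(frag_b)
    m = len(frag_a)
    total = sum(abs(da[i][j] - db[i][j]) for i in range(m) for j in range(m))
    return total / (m * m)


def find_afps(chain_a, chain_b, frag_len=8, cutoff=1.0):
    """Return (i, j) start indices of candidate aligned fragment pairs.

    frag_len and cutoff are illustrative assumptions, not values from the source.
    """
    afps = []
    for i in range(len(chain_a) - frag_len + 1):
        for j in range(len(chain_b) - frag_len + 1):
            frag_a = chain_a[i:i + frag_len]
            frag_b = chain_b[j:j + frag_len]
            if fragment_distance(frag_a, frag_b) < cutoff:
                afps.append((i, j))
    return afps
```

Because the comparison uses only distances inside each fragment, a rigidly translated or rotated copy of a chain yields the same candidate pairs as the original.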


Proceedings Article
01 Jan 1998
TL;DR: MICE defines a Molecular Scene Description Language (MSDL) which allows scenes to be stored in a relational database (a molecular scene gallery) and queried; scenes retrieved from the gallery are rendered in Virtual Reality Modeling Language (VRML) and currently displayed in WebView.
Abstract: Illustrations of macromolecular structure in the scientific literature contain a high level of semantic content through which the authors convey, among other features, the biological function of that macromolecule. We refer to these illustrations as molecular scenes. Such scenes, if available electronically, are not readily accessible for further interactive interrogation. The basic PDB format does not retain features of the scene; formats like PostScript retain the scene but are not interactive; and the many formats used by individual graphics programs, while capable of reproducing the scene, are neither interchangeable nor can they be stored in a database and queried for features of the scene. MICE defines a Molecular Scene Description Language (MSDL) which allows scenes to be stored in a relational database (a molecular scene gallery) and queried. Scenes retrieved from the gallery are rendered in Virtual Reality Modeling Language (VRML) and currently displayed in WebView, a VRML browser modified to support the Virtual Reality Behavior System (VRBS) protocol. VRBS provides communication between multiple client browsers, each capable of manipulating the scene. This level of collaboration works well over standard Internet connections and holds promise for collaborative research at a distance and distance learning. Further, via VRBS, the VRML world can be used as a visual cue to trigger an application such as a remote MEME search. MICE is very much a work in progress. Current work seeks to replace WebView with Netscape, Cosmoplayer (a standard VRML plug-in), and a Java-based console. The console consists of a generic kernel suitable for multiple collaborative applications and additional application-specific controls. Further details of the MICE project are available at http://mice.sdsc.edu.

25 citations


Book ChapterDOI
01 Jan 1998
TL;DR: The resulting database of automated pairwise alignments is reported and contains acetylcholinesterase, lipase, haloalkane dehalogenase, and cholesterol esterase structures for which more than 200 residues could be aligned with an RMSD of less than 4.0 Å.
Abstract: The comparison of results from 3-D structure alignments obtained by a variety of different algorithms and those determined by domain experts reveals significant differences. Simply finding the best RMSD between the corresponding Cα positions in two structures is not enough to match biologically meaningful features, and hence to provide the most meaningful structure alignment. Yet an accurate comparative analysis of structure reveals a great deal about the biological function of related proteins. A recently described combinatorial extension algorithm has been further refined by the use of protein properties relevant to structural and functional features to provide good structure alignments. The resulting database of automated pairwise alignments is reported and contains acetylcholinesterase (9), lipase (15), haloalkane dehalogenase (11), and cholesterol esterase (2) structures for which more than 200 residues could be aligned with an RMSD of less than 4.0 Å. Results for the alignment of a specific acetylcholinesterase to a lipase are compared to the previously performed manual alignment. The database and associated tools for visualizing the structure alignments are available via the Web at the URL http://cl.sdsc.edu/align_db.html.

3 citations
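
For reference, the RMSD criterion quoted in the abstract above can be sketched as follows. This is a minimal illustration, assuming the two lists of Cα coordinates are already superimposed and paired residue by residue; the optimal superposition step is deliberately left out. The function names `rmsd` and `meets_inclusion_criteria` are hypothetical; the thresholds (more than 200 aligned residues, RMSD below 4.0 Å) are the ones stated in the abstract.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired, pre-superimposed coordinates."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


def meets_inclusion_criteria(n_aligned, rmsd_value, min_residues=200, max_rmsd=4.0):
    """Apply the abstract's stated cutoffs: > 200 aligned residues, RMSD < 4.0 Å."""
    return n_aligned > min_residues and rmsd_value < max_rmsd
```

As the abstract itself notes, a low RMSD alone does not guarantee a biologically meaningful alignment, so a filter like this is only a first screen.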



Proceedings ArticleDOI
01 Dec 1998
TL;DR: A protein documentary is defined as a description of the relationship between structure and function in a single protein or in a related family of proteins, using text and images and further enhanced by sound and interactive graphics.
Abstract: Computer-based multimedia technology for distance learning and research has come of age: the price point is acceptable, domain experts using off-the-shelf software can prepare compelling materials, and the material can be efficiently delivered via the Internet to a large audience. While not presenting any new scientific results, this paper outlines experiences with a variety of commercial and free software tools and the associated protocols we have used to prepare protein documentaries and other multimedia presentations relevant to molecular biology. A protein documentary is defined here as a description of the relationship between structure and function in a single protein or in a related family of proteins, using text and images and further enhanced by sound and interactive graphics. Examples of documentaries prepared to describe cAMP-dependent protein kinase, the founding structural member of the protein kinase family, for which there are now over 40 structures, can be found at http://franklin.burnham-inst.org/rcsb. A variety of other prototype multimedia presentations for molecular biology described in this paper can be found at http://franklin.burnham-inst.org.

3 citations


Journal ArticleDOI
TL;DR: Past progress is summarized by outlining the features of the significant number of relevant databases developed to date, and trends indicate that challenges exist if crystallographers are to provide the community with complete and consistent structural results in the future.
Abstract: Databases containing macromolecular structure data provide a crystallographer with important tools for use in solving, refining and understanding the functional significance of their protein structures. Given this importance, this paper briefly summarizes past progress by outlining the features of the significant number of relevant databases developed to date. One recent database, PDB+, containing all current and obsolete structures deposited with the Protein Data Bank (PDB), is discussed in more detail. PDB+ has been used to analyze the self-consistency of the current (1 January 1998) corpus of over 7000 structures. A summary of those findings is presented (a full discussion will appear elsewhere) in the form of global and temporal trends within the data. These trends indicate that challenges exist if crystallographers are to provide the community with complete and consistent structural results in the future. It is argued that better information management practices are required to meet these challenges.

3 citations


Journal ArticleDOI
TL;DR: pdb2cif is a new version of an awk script originally written by P. E. Bourne in 1993 to translate from the 1992 Protein Data Bank (PDB) format to the then-emerging macromolecular Crystallographic Information File (mmCIF) definition.
Abstract: pdb2cif is a new version of an awk script originally written by P. E. Bourne in 1993 to translate from the 1992 Protein Data Bank (PDB) format to the then-emerging macromolecular Crystallographic Information File (mmCIF) definition. This new version of pdb2cif translates from all current PDB formats, including the 1992 PDB format and the 1996 PDB Atomic Coordinate Entry Format, Version 2.0, to the 1997 mmCIF format as defined in the mmCIF dictionary 1.0.00. The program is provided as an m4 script from which both perl and awk versions can be produced. The program identifies mmCIF entities implicitly by sequence homology among PDB SEQRES records. With minor additions to the dictionary, the resultant mmCIF data-sets are substantially compliant with the mmCIF 1.0.00 dictionary.

3 citations
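
The kind of translation pdb2cif performs can be illustrated with a small sketch. This is not the pdb2cif code (which the abstract describes as an m4 script producing perl and awk versions), and it does not attempt the entity identification from SEQRES records. It only shows, under stated assumptions, how one fixed-column PDB ATOM/HETATM record might be mapped onto `_atom_site` items. The function names are hypothetical, the column positions follow the standard PDB coordinate record layout, and the choice of `auth_` items for chain and residue number is an illustrative assumption.

```python
def parse_atom_record(line):
    """Split one fixed-column PDB ATOM/HETATM record into mmCIF-style items."""
    return {
        "_atom_site.group_PDB": line[0:6].strip(),        # columns 1-6
        "_atom_site.id": int(line[6:11]),                 # columns 7-11
        "_atom_site.label_atom_id": line[12:16].strip(),  # columns 13-16
        "_atom_site.label_comp_id": line[17:20].strip(),  # columns 18-20
        # mmCIF writes "." for a value that is not applicable (blank chain ID).
        "_atom_site.auth_asym_id": line[21].strip() or ".",
        "_atom_site.auth_seq_id": int(line[22:26]),       # columns 23-26
        "_atom_site.Cartn_x": float(line[30:38]),         # columns 31-38
        "_atom_site.Cartn_y": float(line[38:46]),         # columns 39-46
        "_atom_site.Cartn_z": float(line[46:54]),         # columns 47-54
        "_atom_site.occupancy": float(line[54:60]),       # columns 55-60
        "_atom_site.B_iso_or_equiv": float(line[60:66]),  # columns 61-66
    }


def format_atom_site_loop(records):
    """Emit parsed records as a loop_ block: item names first, then one row each."""
    if not records:
        return ""
    keys = list(records[0])
    rows = [" ".join(str(rec[key]) for key in keys) for rec in records]
    return "\n".join(["loop_"] + keys + rows)
```

A real converter also has to handle alternate locations, insertion codes, atom names containing quotes, and the non-coordinate records, none of which are covered here.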