Showing papers by "Philip E. Bourne" published in 2011


Journal ArticleDOI
TL;DR: The next generation of the RCSB PDB web site, as described here, provides a rich resource for research and education and enables a range of new possibilities to analyze and understand structure data.
Abstract: The RCSB Protein Data Bank (RCSB PDB) web site (http://www.pdb.org) has been redesigned to increase usability and to cater to a larger and more diverse user base. This article describes key enhancements and new features that fall into the following categories: (i) query and analysis tools for chemical structure searching, query refinement, tabulation and export of query results; (ii) web site customization and new structure alerts; (iii) pair-wise and representative protein structure alignments; (iv) visualization of large assemblies; (v) integration of structural data with the open access literature and binding affinity data; and (vi) web services and web widgets to facilitate integration of PDB data and tools with other resources. These improvements enable a range of new possibilities to analyze and understand structure data. The next generation of the RCSB PDB web site, as described here, provides a rich resource for research and education.

598 citations


Journal ArticleDOI
TL;DR: An exhaustive set of drugs, including withdrawn or experimental drugs, annotated with drug–protein and protein–protein relationships compiled from public resources via text and data mining including manual curation to provide a starting point for drug-repositioning.
Abstract: The procedure of drug approval is time-consuming, costly and risky. Accidental findings regarding multi-specificity of approved drugs led to block-busters in new indication areas. Therefore, the interest in systematically elucidating new areas of application for known drugs is rising. Furthermore, the knowledge, understanding and prediction of so-called off-target effects allow a rational approach to the understanding of side-effects. With PROMISCUOUS we provide an exhaustive set of drugs (25,000), including withdrawn or experimental drugs, annotated with drug-protein and protein-protein relationships (21,500/104,000) compiled from public resources via text and data mining including manual curation. Measures of structural similarity for drugs as well as known side-effects can be easily connected to protein-protein interactions to establish and analyse networks responsible for multi-pharmacology. This network-based approach can provide a starting point for drug-repositioning. PROMISCUOUS is publicly available at http://bioinformatics.charite.de/promiscuous.

209 citations


Journal ArticleDOI
TL;DR: This paper shows how the use of support vector machines (SVMs), trained by associating sets of individual energy terms retrieved from molecular docking with the known binding affinity of each compound from high-throughput screening experiments, can be used to improve the correlation between known binding affinities and those predicted by the docking program eHiTS.
Abstract: Docking scoring functions are notoriously weak predictors of binding affinity. They typically assign a common set of weights to the individual energy terms that contribute to the overall energy score; however, these weights should be gene family dependent. In addition, they incorrectly assume that individual interactions contribute toward the total binding affinity in an additive manner. In reality, noncovalent interactions often depend on one another in a nonlinear manner. In this paper, we show how the use of support vector machines (SVMs), trained by associating sets of individual energy terms retrieved from molecular docking with the known binding affinity of each compound from high-throughput screening experiments, can be used to improve the correlation between known binding affinities and those predicted by the docking program eHiTS. We construct two prediction models: a regression model trained using IC50 values from BindingDB, and a classification model trained using active and decoy compounds from the Directory of Useful Decoys (DUD). Moreover, to address the issue of overrepresentation of negative data in high-throughput screening data sets, we have designed a multiple-planar SVM training procedure for the classification model. The increased performance that both SVMs give when compared with the original eHiTS scoring function highlights the potential for using nonlinear methods when deriving overall energy scores from their individual components. We apply the above methodology to train a new scoring function for direct inhibitors of Mycobacterium tuberculosis (M.tb) InhA. By combining ligand binding site comparison with the new scoring function, we propose that phosphodiesterase inhibitors can potentially be repurposed to target M.tb InhA. Our methodology may be applied to other gene families for which target structures and activity data are available, as demonstrated in the work presented here.
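The abstract's central point, that energy terms may not combine additively, can be illustrated with a toy sketch. This is not the eHiTS scoring function and not the authors' multiple-planar training procedure. It is a plain linear SVM trained by subgradient descent on a class-weighted hinge loss, run once on two raw energy terms and once with their product added as an explicit interaction feature. The energy values, labels, function names, and hyperparameters are all hypothetical, and up-weighting the actives is just one simple stand-in for handling the excess of decoys.

```python
# Toy illustration only: hypothetical energy terms, not eHiTS output.

def features(e1, e2, with_interaction):
    """Map two energy terms to a feature vector, optionally adding their product."""
    return [e1, e2, e1 * e2] if with_interaction else [e1, e2]


def train_linear_svm(X, y, class_weight, lam=0.01, lr=0.1, epochs=500):
    """Full-batch subgradient descent on the class-weighted hinge loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]   # gradient of the L2 penalty
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:              # point violates the margin
                c = class_weight[yi]
                for j in range(d):
                    grad_w[j] -= c * yi * xi[j] / n
                grad_b -= c * yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Return +1 (active) or -1 (decoy) from the sign of the linear score."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


def accuracy(w, b, X, y):
    """Fraction of compounds whose predicted label matches the known label."""
    return sum(predict(w, b, xi) == yi for xi, yi in zip(X, y)) / len(y)


# (term_1, term_2, label): +1 = active, -1 = decoy; decoys outnumber actives 3:1.
compounds = [
    (1.0, 1.0, 1), (-1.0, -0.9, 1),
    (1.0, -1.0, -1), (-1.0, 1.0, -1), (0.9, -1.1, -1),
    (-1.1, 0.8, -1), (1.2, -0.8, -1), (-0.8, 1.1, -1),
]
labels = [c[2] for c in compounds]
n_pos = labels.count(1)
n_neg = labels.count(-1)
weights = {1: n_neg / n_pos, -1: 1.0}   # up-weight the rarer actives

results = {}
for name, flag in (("additive", False), ("interaction", True)):
    X = [features(e1, e2, flag) for e1, e2, _ in compounds]
    w, b = train_linear_svm(X, labels, weights)
    results[name] = accuracy(w, b, X, labels)

print(results)
```

In this toy data the actives are the compounds whose two terms share a sign, so no weighting of the raw terms alone can separate them from the decoys, while the model that sees the product term can. A kernel SVM would reach the same effect without writing the interaction feature by hand.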

177 citations


Journal ArticleDOI
TL;DR: The results suggest that Nelfinavir is able to inhibit multiple members of the protein kinase-like superfamily, which are involved in the regulation of cellular processes vital for carcinogenesis and metastasis.
Abstract: Nelfinavir is a potent HIV-protease inhibitor with pleiotropic effects in cancer cells. Experimental studies connect its anti-cancer effects to the suppression of the Akt signaling pathway, but the actual molecular targets remain unknown. Using a structural proteome-wide off-target pipeline, which integrates molecular dynamics simulation and MM/GBSA free energy calculations with ligand binding site comparison and biological network analysis, we identified putative human off-targets of Nelfinavir and analyzed the impact on the associated biological processes. Our results suggest that Nelfinavir is able to inhibit multiple members of the protein kinase-like superfamily, which are involved in the regulation of cellular processes vital for carcinogenesis and metastasis. The computational predictions are supported by kinase activity assays and are consistent with existing experimental and clinical evidence. This finding provides a molecular basis to explain the broad-spectrum anti-cancer effect of Nelfinavir and presents opportunities to optimize the drug as a targeted polypharmacology agent.

165 citations


Journal ArticleDOI
TL;DR: There is significant interest in determining a priori what off-targets exist on a proteome-wide scale, and the need to understand the impact of such binding on the complete biological system, with the ultimate goal of being able to predict the phenotypic outcome.

138 citations


Journal ArticleDOI
TL;DR: IEDB-3D is fully embedded within IEDB, thus allowing structural data, both curated and calculated, and all accompanying information to be queried using multiple search interfaces, including queries for epitopes recognized in different pathogens, eliciting different functional immune responses, and recognized by different components of the immune system.
Abstract: IEDB-3D is the 3D structural component of the Immune Epitope Database (IEDB) available via the ‘Browse by 3D Structure’ page at http://www.iedb.org. IEDB-3D catalogs B- and T-cell epitopes and Major Histocompatibility Complex (MHC) ligands for which 3D structures of complexes with antibodies, T-cell receptors or MHC molecules are available in the Protein Data Bank (PDB). Journal articles that are primary citations of PDB structures and that define immune epitopes are curated within IEDB as any other reference along with accompanying functional assays and immunologically relevant information. For each curated structure, IEDB-3D provides calculated data on intermolecular contacts and interface areas and includes an application, EpitopeViewer, to visualize the structures. IEDB-3D is fully embedded within IEDB, thus allowing structural data, both curated and calculated, and all accompanying information to be queried using multiple search interfaces. These include queries for epitopes recognized in different pathogens, eliciting different functional immune responses, and recognized by different components of the immune system. The query results can be downloaded in Microsoft Excel format, or the entire database, together with structural data both curated and calculated, can be downloaded in either XML or MySQL formats.

59 citations


Journal ArticleDOI
TL;DR: Virginia Barbour is paid a salary by the Public Library of Science, and she wrote this editorial during her salaried time.
Abstract: While we cannot articulate exactly what defines the less quantitative side of a scientific reputation, we might be able to seed a discussion. We invite you to crowdsource a better description and path to achieving such a reputation by using the comments feature associated with this article. Consider yourself challenged to contribute.

41 citations


Journal ArticleDOI
TL;DR: This work proposes a new database search tool (MixDB) that is able to identify mixture tandem mass spectra from more than one peptide, and shows that peptides can be reliably identified with up to 95% accuracy from mixture spectra.

34 citations


Journal ArticleDOI
TL;DR: A synthesis of the hypotheses of Lake, Gupta, and Cavalier-Smith is possible where a combination of antibiotic warfare and viral endosymbiosis in the bacilli led to dramatic changes in a bacterium that resulted in the birth of archaea and eukaryotes.
Abstract: The tree of life is usually rooted between archaea and bacteria. We have previously presented three arguments that support placing the root of the tree of life in bacteria. The data have been dismissed because those who support the canonical rooting between the prokaryotic superkingdoms cannot imagine how the vast divide between the prokaryotic superkingdoms could be crossed. We review the evidence that archaea are derived, as well as their biggest differences with bacteria. We argue, using novel data, that the gap between the superkingdoms is not insurmountable. We consider whether archaea are holophyletic or paraphyletic, a question essential to understanding their origin. Finally, we review several hypotheses on the origins of archaea and, where possible, evaluate each hypothesis using bioinformatics tools. As a result we argue for a firmicute ancestry for archaea over proposals for an actinobacterial ancestry. We believe a synthesis of the hypotheses of Lake, Gupta, and Cavalier-Smith is possible where a combination of antibiotic warfare and viral endosymbiosis in the bacilli led to dramatic changes in a bacterium that resulted in the birth of archaea and eukaryotes. This article was reviewed by Patrick Forterre, Eugene Koonin, and Gaspar Jekely.

33 citations


Journal ArticleDOI
TL;DR: PharmGenEd is an evidence-based pharmacogenomics education program developed at the University of California San Diego Skaggs School of Pharmacy and Pharmaceutical Sciences and the School of Medicine with funding support from the Centers for Disease Control and Prevention.
Abstract: Clinical application of evidence-based pharmacogenomics information has the potential to help healthcare professionals provide safe and effective medication management to patients. However, there is a gap between the advances of pharmacogenomics discovery and the health professionals’ knowledge regarding pharmacogenomics testing and therapeutic uses. Furthermore, pharmacogenomics education materials for healthcare professionals have not been readily available or accessible. Pharmacogenomics Education Program (PharmGenEd™) is an evidence-based pharmacogenomics education program developed at the University of California San Diego Skaggs School of Pharmacy and Pharmaceutical Sciences and the School of Medicine (CA, USA), with funding support from the Centers for Disease Control and Prevention. Program components include continuing education modules, train-the-trainer materials and shared curriculum modules based on therapeutic topics, and virtual communities with online resources.

27 citations


Journal ArticleDOI
TL;DR: A subsection of the Education section of PLoS Computational Biology is introduced with articles devoted to teaching bioinformatics in secondary schools that is derived from the work of the Education committee of the International Society for Computational Biology (ISCB), who identified a need to address the issue of incorporating bioinformatics into secondary school biology classes.
Abstract: Bioinformatics is now an integral part of biology and biological research. The field began with a few people from other disciplines teaching themselves and each other the techniques that are now considered commonplace. These pioneers then began graduate programs [1]–[3] to educate the next generation. Those early graduate students typically came as bench biologists or as computer scientists, both groups requiring significant time to “hybridize”. Not surprisingly, this then led to undergraduate majors in bioinformatics to better prepare students for graduate school and research careers in bioinformatics. In addition, teaching bioinformatics in undergraduate biology classes is also a priority [4], [5]. Through the Education section of PLoS Computational Biology we have tried to support this evolution through a collection of educational articles pertinent to the undergraduate level and beyond. It is only natural that we would take the next step [6]. We now introduce a subsection of the Education section with articles devoted to teaching bioinformatics in secondary schools that is derived from the work of the Education committee of the International Society for Computational Biology (ISCB), who identified a need to address the issue of incorporating bioinformatics into secondary school biology classes. They also recognized the interest among researchers to build and participate in outreach programs at the secondary school level given that many funding agencies worldwide encourage such a component in grant applications. To move the ball forward on secondary school bioinformatics education, at ISCB's 2010 international conference, Intelligent Systems in Molecular Biology (ISMB), the ISCB Education committee organized a half-day tutorial aimed at secondary school biology and chemistry teachers in the Boston area interested in learning about bioinformatics and how to include it in their curricula. 
The tutorial also attracted researchers involved in organizing or formulating outreach programs in their community. The main focus of the ISMB tutorial was the presentation of lesson plans by a secondary school teacher (David Form, a biology teacher at Nashoba Regional High School, Bolton, Massachusetts) who has successfully incorporated bioinformatics into his courses for more than five years. His is one example of such an effort and is embraced in the Ten Simple Rules and its supplementary material found in this issue. Also in this issue we have an article by Suzanne Gallagher and colleagues on the experience of teaching secondary school level bioinformatics in Boulder, Colorado. There are many examples of outreach efforts to high school students that we would like to feature in coming months, which incorporate bioinformatics into their programs (see Table 1). [Table 1: Examples of Online Resources and Outreach Programs.] There are many other examples of educators doing similar work in school districts worldwide. A recent issue of Briefings in Bioinformatics was dedicated to bioinformatics education [7] with a specific example of programs for secondary school students [8], [9]. The ISCB Education committee is building a resource of information useful to secondary school teachers who would like to incorporate bioinformatics into their curriculum. In addition, the committee has begun to explore how to include bioinformatics in Advanced Placement courses and exams in the United States, which we also hope to feature in the Education section of the journal. We encourage feedback of any form, including comments on this editorial, and hearing about your experience teaching bioinformatics to secondary school students.

Journal ArticleDOI
01 Jan 2011-Database
TL;DR: Quality assurance for the RCSB PDB website is described at several distinct levels, including: hardware redundancy and failover, testing protocols for weekly database updates, testing and release procedures for major software updates and miscellaneous monitoring and troubleshooting tools and practices.
Abstract: The RCSB Protein Data Bank (RCSB PDB, www.pdb.org) is a key online resource for structural biology and related scientific disciplines. The website is used on average by 165 000 unique visitors per month, and more than 2000 other websites link to it. The amount and complexity of PDB data as well as the expectations on its usage are growing rapidly. Therefore, ensuring the reliability and robustness of the RCSB PDB query and distribution systems is crucially important and increasingly challenging. This article describes quality assurance for the RCSB PDB website at several distinct levels, including: (i) hardware redundancy and failover, (ii) testing protocols for weekly database updates, (iii) testing and release procedures for major software updates and (iv) miscellaneous monitoring and troubleshooting tools and practices. As such it provides suggestions for how other websites might be operated. Database URL: www.pdb.org

Journal ArticleDOI
TL;DR: Cobweb is a Java applet for real-time network visualization that allows new nodes to be interactively added to a network by querying a database on a server.
Abstract: Summary: Cobweb is a Java applet for real-time network visualisation; its strength lies in enabling the interactive exploration of networks. Therefore, it allows new nodes to be interactively added to a network by querying a database on a server. The network constantly rearranges to provide the most meaningful topological view. Availability: Cobweb is available under the GPLv3 and may be freely downloaded at http://bioinformatics.charite.de/cobweb.

Journal ArticleDOI
TL;DR: Ten simple rules from my own experiences, in both getting promoted and serving on such committees, for how you might maximize your chances of getting ahead under such circumstances.
Abstract: Getting a promotion or a new position are important parts of the scientific career process. Ironically, a committee whose membership has limited ability to truly judge your scholarly standing is often charged with making these decisions. Here are ten simple rules from my own experiences, in both getting promoted and serving on such committees, for how you might maximize your chances of getting ahead under such circumstances. The rules focus on what might be added to a CV, research statement, personal statement, or cover letter, depending on the format of the requested promotion materials. In part, the rules suggest that you educate the committee members, who have a range of expertise, on what they should find important in the promotion application provided by a computational biologist. Further, while some rules are generally applicable, the focus here is on promotion in an academic setting. Having said that, in such a setting teaching and community service are obviously important, but barely touched upon here. Rather, the focus is on how to maximize the appreciation of your research-related activities. As a final thought before we get started on the rules, this is not just about you, but an opportunity to educate a broad committee on what is important in our field. Use that opportunity well, for it will serve future generations of computational biologists.


Journal ArticleDOI
TL;DR: The evolution required by RCSB PDB to meet these challenges provides insight into the motivation and challenges of developing and maintaining a major biological resource, particularly the one used in understanding the molecular details of living systems in both normal and disease states.
Abstract: The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB) supports scientific research and education by providing an essential resource of information about biomolecular structures. As a member of the Worldwide Protein Data Bank (wwPDB), the RCSB PDB curates and annotates the data about the experimentally determined three-dimensional structures of proteins and nucleic acids that are deposited into the PDB archive. The RCSB PDB also provides online resources to access the data in the archive, including a relational database supporting simple and complex query and reporting, visualization tools, structure-sequence comparison tools, access to the associated literature, and educational services. In the 11 years (1999–2010) since RCSB PDB has been in operation, the amount of data in the archive has increased six-fold, along with an increase in the complexity of structures being determined and in the number of experimental methods used. The evolution required by RCSB PDB to meet these challenges provides insight into the motivation and challenges of developing and maintaining a major biological resource, particularly the one used in understanding the molecular details of living systems in both normal and disease states. © 2011 John Wiley & Sons, Ltd. WIREs Comput Mol Sci 2011 1 782–789 DOI: 10.1002/wcms.57

Journal ArticleDOI
TL;DR: A distributed multipoles expansion approach that allows the partitioning of the charge distribution into subsystems so that the multipole expansion of each component of the partition, and therefore of their superposition, is valid outside an enclosing surface of the molecule of arbitrary shape.
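As background for the summary above, the standard spherical multipole expansion and its distributed form can be written out. This is a textbook sketch in Gaussian units, not the paper's own formulation. The subsystem densities $\rho_k$, expansion centers $\mathbf{r}_k$, and enclosing radii $R_k$ are notation assumed here for illustration, and how the paper actually chooses the partition is not stated in this listing.

```latex
% Charge density split into subsystems: \rho(\mathbf{r}) = \sum_k \rho_k(\mathbf{r}).
% Multipole moments of subsystem k about its own center \mathbf{r}_k,
% with (s', \theta', \varphi') the spherical coordinates of \mathbf{r}' - \mathbf{r}_k:
q^{(k)}_{lm} = \int \rho_k(\mathbf{r}')\, s'^{\,l}\, Y^{*}_{lm}(\theta', \varphi')\, d^{3}r'

% Potential of subsystem k, valid outside a sphere of radius R_k that encloses \rho_k,
% with (s_k, \theta_k, \varphi_k) the spherical coordinates of \mathbf{r} - \mathbf{r}_k:
\Phi_k(\mathbf{r}) = \sum_{l=0}^{\infty} \sum_{m=-l}^{l}
    \frac{4\pi}{2l+1}\, q^{(k)}_{lm}\, \frac{Y_{lm}(\theta_k, \varphi_k)}{s_k^{\,l+1}},
    \qquad s_k > R_k

% Superposition, valid wherever every term is valid, i.e. outside the union of spheres:
\Phi(\mathbf{r}) = \sum_k \Phi_k(\mathbf{r}),
    \qquad s_k > R_k \ \text{for all } k
```

A single expansion about one center converges only outside one sphere enclosing the whole molecule. With many small subsystems the excluded region shrinks to a union of small spheres, which can follow a molecular surface of arbitrary shape much more closely.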


Journal ArticleDOI
TL;DR: It is argued here that these other views could increase the rate of scientific discovery through provision of a digital continuum that includes the data and methods used to reach the major conclusions of the work.
Abstract: The scientific workflow, from initial ideas to data generated during hypothesis testing, to conclusions drawn from those experiments, is increasingly in a digital form. The publishing process loses much of that digital context since, while a PDF or HTML page is digital, it really is a reflection of an analog past in terms of what can be done with that content. While a PDF of a research article has merit as a concise description, it is only one view on that work and others are possible. I argue here that these other views could increase the rate of scientific discovery through provision of a digital continuum that includes the data and methods (where possible) used to reach the major conclusions of the work. Such changes are afoot, being driven from the bottom up by scientists and the top down by publishers.

Journal ArticleDOI
TL;DR: In 2010, PLoS Computational Biology launched two new features to enrich the journal: “The Roots of Bioinformatics” and “PLoS Conference Postcards”, and has seen growth not only in submissions, but in readership as well.
Abstract: PLoS Computational Biology celebrated its fifth anniversary in 2010, and all in our community, either as readers, authors, or editors, should take pride in what has been accomplished in such a short space of time. In the past year we received 1,403 new Research Articles, a 295% increase from our first year of operation in 2005–2006 and a 17% increase over 2009. Of the articles submitted in 2010, 875 (62%) were rejected, and 70% of these were before review. We have seen growth not only in submissions, but in readership as well. Currently, around 16,000 readers receive the electronic table of contents, a 14% increase over the previous year. We published 392 Research Articles this year, along with 23 “front section” articles (Reviews, Perspectives, Education), down from 33 in the previous year. Eighty Associate Editors handled the combined submissions, with a total of 26 new editors joining this past year and six departing. We are proud to say that virtually every editor we asked to join accepted, a testament to how our community values the journal. These editors worked with more than 180 guest editors and 1,800 reviewers to handle the submissions, and we are of course very grateful for their support (Table S1). Table 1 provides a list of Research Articles we have published since 2005 through October 2010 that have accrued over 10,000 downloads and shows the diversity of highly accessed papers published by the journal. Note that these are downloads from the PLoS Web site only, and do not include downloads from PubMed Central. Readers are free to review download statistics for all research and non-research articles published across the PLoS journals through the Microsoft Excel spreadsheet that can be found at http://www.ploscompbiol.org/static/plos-alm.zip. Individual article metrics and comments are available from the respective tabs associated with each article. 
[Table 1: List of published Research Articles that have accrued over 10,000 downloads since launch.] In 2010, we launched two new features to enrich the journal: “The Roots of Bioinformatics” and “PLoS Conference Postcards”. The Roots of Bioinformatics was eloquently introduced by the Series Editor, David B. Searls, in June [1] and was followed in July by Russell F. Doolittle’s insightful reflections on the roots of protein evolution, which went back as far as the 1950s when chemistry, rather than computers, ruled [2]. More such reflections will follow in 2011. Conference Postcards act as a counterpoint to the rich roots retrospectives by providing current views of the field of computational biology, as young scientists present crisp perspectives on what they perceive as conference highlights. We published Postcards from January’s Pacific Symposium on Biocomputing (PSB) meeting held in Hawaii [3] and from the Intelligent Systems for Molecular Biology (ISMB) meeting held in Boston in July [4]. At the latter we learnt about various sessions held at ISMB, namely the Highlights session, the ISCB Student Council Symposium’s “speed dating” event, and reports from Satellite meetings. We look forward to digging deeper and receiving Postcards from further afield in 2011. PLoS Computational Biology continues its strong relationship with the International Society for Computational Biology (ISCB) through postings on its Web site and activities at ISMB. At ISMB 2010 in Boston, PLoS Computational Biology ran a Workshop entitled “Where and How to Get Published” in which we endeavored to make the path to getting published a little less inscrutable. The first half was led by journal co-founders Philip E. Bourne and Steven E. Brenner and provided guidelines on how to write a good paper, and the second half included questions and advice from editors and authors from a range of career stages, which resulted in a broad discussion of what journals want and the state of publishing today. 
Presenters’ materials from the Workshop are available on the new PLoS Blog (http://blogs.plos.org/plos/2010/10/materials-from-plos%E2%80%99-workshop-at-ismb-2010/). We have three major goals for 2011. First, to reduce the time to decision for submitted manuscripts, which currently averages 10 days for those papers rejected before review and 40–50 days for those reviewed. Second, to introduce a new section called “Editors’ Outlook”, comprising invited mini-reviews from members of our Editorial Board who will provide insights into their respective fields, discussing what is hot and what we can expect going forward. Collectively, these will provide an ongoing and insightful look into the broad and rapidly expanding field of computational biology, a field of endeavor the journal is proud to serve. Specifically, current experimental techniques are leading to an unprecedented increase in the rate at which data are becoming available. When combined with the vast growth in computational power, we can expect rapid growth in computational papers. Computational biology is the area that helps in organizing the data, in making sense of observations, and in using these to make experimentally testable predictions. Our third goal is to keep abreast of these developments and keep PLoS Computational Biology the number one journal in the field.