Showing papers in "Journal of Chemical Information and Modeling in 2011"

Journal Article

LigPlot+: multiple ligand-protein interaction diagrams for drug discovery.

Roman A. Laskowski¹, Mark B. Swindells

05 Oct 2011-Journal of Chemical Information and Modeling

TL;DR: A graphical system for automatically generating multiple 2D diagrams of ligand-protein interactions from 3D coordinates that facilitates popular research tasks, such as analyzing a series of small molecules binding to the same protein target, a single ligand binding to homologous proteins, or the completely general case where both protein and ligand change.

Abstract: We describe a graphical system for automatically generating multiple 2D diagrams of ligand–protein interactions from 3D coordinates. The diagrams portray the hydrogen-bond interaction patterns and hydrophobic contacts between the ligand(s) and the main-chain or side-chain elements of the protein. The system is able to plot, in the same orientation, related sets of ligand–protein interactions. This facilitates popular research tasks, such as analyzing a series of small molecules binding to the same protein target, a single ligand binding to homologous proteins, or the completely general case where both protein and ligand change.

3,840 citations

Journal Article

Assessing the performance of the MM/PBSA and MM/GBSA methods. 1. The accuracy of binding free energy calculations based on molecular dynamics simulations.

Tingjun Hou¹, Junmei Wang², Youyong Li¹, Wei Wang³

Soochow University (Suzhou)¹, University of Texas Southwestern Medical Center², University of California, San Diego³

24 Jan 2011-Journal of Chemical Information and Modeling

TL;DR: An extensive study of 59 ligands interacting with six different proteins finds that MM/PBSA can serve as a powerful tool in drug design, where correct ranking of inhibitors is often emphasized, and the accuracy of the binding free energies calculated by three Generalized Born (GB) models is evaluated.

Abstract: The Molecular Mechanics/Poisson−Boltzmann Surface Area (MM/PBSA) and the Molecular Mechanics/Generalized Born Surface Area (MM/GBSA) methods calculate binding free energies for macromolecules by combining molecular mechanics calculations and continuum solvation models. To systematically evaluate the performance of these methods, we report here an extensive study of 59 ligands interacting with six different proteins. First, we explored the effects of the length of the molecular dynamics (MD) simulation, ranging from 400 to 4800 ps, and the solute dielectric constant (1, 2, or 4) on the binding free energies predicted by MM/PBSA. The following three important conclusions could be observed: (1) MD simulation length has an obvious impact on the predictions, and longer MD simulation is not always necessary to achieve better predictions. (2) The predictions are quite sensitive to the solute dielectric constant, and this parameter should be carefully determined according to the characteristics of the protein/lig...
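
For orientation, the end-point decomposition usually associated with MM/PBSA and MM/GBSA can be written as below. This is the textbook form rather than an equation quoted from the abstract, and the symbol names are assumptions chosen for illustration.

```latex
\Delta G_{\mathrm{bind}} = G_{\mathrm{complex}} - G_{\mathrm{protein}} - G_{\mathrm{ligand}},
\qquad
G = E_{\mathrm{MM}} + G_{\mathrm{solv}}^{\mathrm{polar}} + G_{\mathrm{solv}}^{\mathrm{nonpolar}} - TS
```

In this form the polar solvation term comes from the Poisson-Boltzmann or Generalized Born model, which is where the solute dielectric constant discussed in the abstract enters, and the nonpolar term is typically estimated from the solvent-accessible surface area.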

1,926 citations

Journal Article

TRAVIS - a free analyzer and visualizer for Monte Carlo and molecular dynamics trajectories.

Martin Brehm¹, Barbara Kirchner¹

Leipzig University¹

27 Jul 2011-Journal of Chemical Information and Modeling

TL;DR: Some of the algorithms that are implemented in TRAVIS are presented - many of them widely known for a long time, but some of them also to appear in literature for the first time.

Abstract: We present TRAVIS (“TRajectory Analyzer and VISualizer”), a free program package for analyzing and visualizing Monte Carlo and molecular dynamics trajectories. The aim of TRAVIS is to collect as many analyses as possible in one program, creating a powerful tool and making it unnecessary to use many different programs for evaluating simulations. This should greatly rationalize and simplify the workflow of analyzing trajectories. TRAVIS is written in C++, open-source freeware and licensed under the terms of the GNU General Public License (GPLv3). It is easy to install (platform independent, no external libraries) and easy to use. In this article, we present some of the algorithms that are implemented in TRAVIS - many of them widely known for a long time, but some of them also to appear in literature for the first time. All shown analyses only require a standard MD trajectory as input data.

854 citations

Journal Article

FRED Pose Prediction and Virtual Screening Accuracy

Mark McGann¹

OpenEye Scientific Software¹

16 Feb 2011-Journal of Chemical Information and Modeling

TL;DR: This analysis shows that most docking programs are effective overall but highly inconsistent, tending to do well on one system and poorly on the next; differences in mean performance on DUD are statistically significant (95% confidence) 61% of the time when using a global enrichment metric (AUC).

Abstract: Results of a previous docking study are reanalyzed and extended to include results from the docking program FRED and a detailed statistical analysis of both structure reproduction and virtual screening results. FRED is run both in a traditional docking mode and in a hybrid mode that makes use of the structure of a bound ligand in addition to the protein structure to screen molecules. This analysis shows that most docking programs are effective overall but highly inconsistent, tending to do well on one system and poorly on the next. Comparing methods, the difference in mean performance on DUD is found to be statistically significant (95% confidence) 61% of the time when using a global enrichment metric (AUC). Early enrichment metrics are found to have relatively poor statistical power, with 0.5% early enrichment only able to distinguish methods to 95% confidence 14% of the time.
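
The two kinds of metric contrasted in the abstract can be sketched as follows. This is a minimal illustration, not the paper's evaluation code: the function names are hypothetical, higher scores are assumed to be better (many docking scores run the other way and would need negating), and the early-enrichment definition shown is one common convention among several.

```python
def roc_auc(active_scores, decoy_scores):
    """Global enrichment: probability that a random active outscores a random decoy.

    Assumes higher score = better; ties count as half a win.
    """
    wins = 0.0
    for a in active_scores:
        for d in decoy_scores:
            if a > d:
                wins += 1.0
            elif a == d:
                wins += 0.5
    return wins / (len(active_scores) * len(decoy_scores))


def early_enrichment(active_scores, decoy_scores, fraction):
    """Enrichment factor in the top `fraction` of the ranked list (one common definition)."""
    labelled = [(s, True) for s in active_scores] + [(s, False) for s in decoy_scores]
    ranked = sorted(labelled, key=lambda pair: pair[0], reverse=True)
    n_top = max(1, round(fraction * len(ranked)))
    hits = sum(1 for _, is_active in ranked[:n_top] if is_active)
    # Fraction of actives recovered, divided by the fraction of the list inspected.
    return (hits / len(active_scores)) / (n_top / len(ranked))
```

Because an early-enrichment value depends on only a handful of top-ranked molecules, it is plausible that it separates methods less reliably than AUC, which uses every active/decoy pair.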

581 citations

Journal Article

Real external predictivity of QSAR models: how to evaluate it? Comparison of different validation criteria and proposal of using the concordance correlation coefficient.

Nicola Chirico¹, Paola Gramatica¹

University of Insubria¹

12 Aug 2011-Journal of Chemical Information and Modeling

TL;DR: The concordance correlation coefficient is proposed as a complementary or alternative, more prudent measure of whether a QSAR model is externally predictive; it works well on real data sets, where it seems to be more stable, and helps in making decisions when the validation measures are in conflict.

Abstract: The main utility of QSAR models is their ability to predict activities/properties for new chemicals, and this external prediction ability is evaluated by means of various validation criteria. As a measure for such evaluation the OECD guidelines have proposed the predictive squared correlation coefficient Q²F1 (Shi et al.). However, other validation criteria have been proposed by other authors: the Golbraikh-Tropsha method, r²m (Roy), Q²F2 (Schuurmann et al.), Q²F3 (Consonni et al.). In QSAR studies these measures are usually in accordance, though this is not always the case, thus doubts can arise when contradictory results are obtained. It is likely that none of the aforementioned criteria is the best in every situation, so a comparative study using simulated data sets is proposed here, using threshold values suggested by the proponents or those widely used in QSAR modeling. In addition, a different and simple external validation measure, the concordance correlation coefficient (CCC), is proposed and comp...
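
The concordance correlation coefficient named in the abstract is commonly computed with Lin's formula. The sketch below assumes that standard definition; the function and argument names are hypothetical, and any acceptance threshold is left out because the truncated abstract does not state one.

```python
def concordance_correlation(observed, predicted):
    """Lin's concordance correlation coefficient between two equal-length sequences."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    # Sums of squares and cross-products about the respective means.
    s_oo = sum((o - mean_obs) ** 2 for o in observed)
    s_pp = sum((p - mean_pred) ** 2 for p in predicted)
    s_op = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(observed, predicted))
    # The n * (difference of means)^2 term penalises systematic offset,
    # which an ordinary correlation coefficient would ignore.
    return 2.0 * s_op / (s_oo + s_pp + n * (mean_obs - mean_pred) ** 2)
```

For example, predictions that are perfectly correlated with the observations but shifted by a constant give a Pearson correlation of 1 and a CCC below 1, which is the sense in which the measure is more prudent.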

552 citations

Journal Article

NNScore 2.0: a neural-network receptor-ligand scoring function.

Jacob D. Durrant¹, J. Andrew McCammon¹

University of California, San Diego¹

03 Nov 2011-Journal of Chemical Information and Modeling

TL;DR: The purpose of the current work is to further confirm that neural-network scoring functions are effective, even when compared to the scoring functions of state-of-the-art docking programs, such as AutoDock, the most commonly cited program, and AutoDock Vina, thought to be two orders of magnitude faster.

Abstract: NNScore is a neural-network-based scoring function designed to aid the computational identification of small-molecule ligands. While the test cases included in the original NNScore article demonstrated the utility of the program, the application examples were limited. The purpose of the current work is to further confirm that neural-network scoring functions are effective, even when compared to the scoring functions of state-of-the-art docking programs, such as AutoDock, the most commonly cited program, and AutoDock Vina, thought to be two orders of magnitude faster. Aside from providing additional validation of the original NNScore function, we here present a second neural-network scoring function, NNScore 2.0. NNScore 2.0 considers many more binding characteristics when predicting affinity than does the original NNScore. The network output of NNScore 2.0 also differs from that of NNScore 1.0; rather than a binary classification of ligand potency, NNScore 2.0 provides a single estimate of the pKd. To fac...

260 citations

Journal Article

DSX: A Knowledge-Based Scoring Function for the Assessment of Protein–Ligand Complexes

Gerd Neudert¹, Gerhard Klebe¹

University of Marburg¹

04 Oct 2011-Journal of Chemical Information and Modeling

TL;DR: The new knowledge-based scoring function DSX, which consists of distance-dependent pair potentials, novel torsion angle potentials, and newly defined solvent accessible surface-dependent potentials, is introduced, featuring superior performance with respect to docking and ranking power and runtime requirements.

Abstract: We introduce the new knowledge-based scoring function DSX that consists of distance-dependent pair potentials, novel torsion angle potentials, and newly defined solvent accessible surface-dependent potentials. DSX pair potentials are based on the statistical formalism of DrugScore, extended by a much more specialized set of atom types. The original DrugScore-like reference state is rather unstable with respect to modifications in the used atom types. Therefore, an important method to overcome this problem and to allow for robust results when deriving pair potentials for arbitrary sets of atom types is presented. A validation based on a carefully prepared test set is shown, enabling direct comparison to the majority of other popular scoring functions. Here, DSX features superior performance with respect to docking- and ranking power and runtime requirements. Furthermore, the beneficial combination with torsion angle-dependent and desolvation-dependent potentials is demonstrated. DSX is robust, flexible, an...

257 citations

Journal Article

A Machine Learning-Based Method To Improve Docking Scoring Functions and Its Application to Drug Repurposing

Sarah L. Kinnings¹, Nina Liu², Peter J. Tonge², Richard M. Jackson¹, Lei Xie³,⁴, Philip E. Bourne³

University of Leeds¹, Stony Brook University², University of Montana³, City University of New York⁴

03 Feb 2011-Journal of Chemical Information and Modeling

TL;DR: This paper shows how the use of support vector machines (SVMs), trained by associating sets of individual energy terms retrieved from molecular docking with the known binding affinity of each compound from high-throughput screening experiments, can be used to improve the correlation between known binding affinities and those predicted by the docking program eHiTS.

Abstract: Docking scoring functions are notoriously weak predictors of binding affinity. They typically assign a common set of weights to the individual energy terms that contribute to the overall energy score; however, these weights should be gene family dependent. In addition, they incorrectly assume that individual interactions contribute toward the total binding affinity in an additive manner. In reality, noncovalent interactions often depend on one another in a nonlinear manner. In this paper, we show how the use of support vector machines (SVMs), trained by associating sets of individual energy terms retrieved from molecular docking with the known binding affinity of each compound from high-throughput screening experiments, can be used to improve the correlation between known binding affinities and those predicted by the docking program eHiTS. We construct two prediction models: a regression model trained using IC50 values from BindingDB, and a classification model trained using active and decoy compounds from the Directory of Useful Decoys (DUD). Moreover, to address the issue of overrepresentation of negative data in high-throughput screening data sets, we have designed a multiple-planar SVM training procedure for the classification model. The increased performance that both SVMs give when compared with the original eHiTS scoring function highlights the potential for using nonlinear methods when deriving overall energy scores from their individual components. We apply the above methodology to train a new scoring function for direct inhibitors of Mycobacterium tuberculosis (M.tb) InhA. By combining ligand binding site comparison with the new scoring function, we propose that phosphodiesterase inhibitors can potentially be repurposed to target M.tb InhA. Our methodology may be applied to other gene families for which target structures and activity data are available, as demonstrated in the work presented here.

177 citations

Journal Article

Rapid shape-based ligand alignment and virtual screening method based on atom/feature-pair similarities and volume overlap scoring.

G. Madhavi Sastry¹, Steven L. Dixon¹, Woody Sherman¹

Schrödinger¹

15 Sep 2011-Journal of Chemical Information and Modeling

TL;DR: A new shape-based flexible ligand superposition and virtual screening method, Phase Shape, is shown to rapidly produce accurate 3D ligand alignments and efficiently enrich actives in virtual screening.

Abstract: Shape-based methods for aligning and scoring ligands have proven to be valuable in the field of computer-aided drug design. Here, we describe a new shape-based flexible ligand superposition and virtual screening method, Phase Shape, which is shown to rapidly produce accurate 3D ligand alignments and efficiently enrich actives in virtual screening. We describe the methodology, which is based on the principle of atom distribution triplets to rapidly define trial alignments, followed by refinement of top alignments to maximize the volume overlap. The method can be run in a shape-only mode or it can include atom types or pharmacophore feature encoding, the latter consistently producing the best results for database screening. We apply Phase Shape to flexibly align molecules that bind to the same target and show that the method consistently produces correct alignments when compared with crystal structures. We then illustrate the effectiveness of the method for identifying active compounds in virtual screening ...

173 citations

Journal Article

Learning to Predict Chemical Reactions

Matthew A. Kayala¹, Chloé-Agathe Azencott¹, Jonathan H. Chen¹, Pierre Baldi¹

University of California, Irvine¹

26 Sep 2011-Journal of Chemical Information and Modeling

TL;DR: This work proposes a new approach to reaction prediction that draws on elements from each of three poles (physical laws, rule-based expert systems, and inductive machine learning), describing single mechanistic reactions as interactions between coarse approximations of molecular orbitals (MOs) and using topological and physicochemical attributes as descriptors.

Abstract: Being able to predict the course of arbitrary chemical reactions is essential to the theory and applications of organic chemistry. Approaches to the reaction prediction problems can be organized around three poles corresponding to: (1) physical laws; (2) rule-based expert systems; and (3) inductive machine learning. Previous approaches at these poles, respectively, are not high throughput, are not generalizable or scalable, and lack sufficient data and structure to be implemented. We propose a new approach to reaction prediction utilizing elements from each pole. Using a physically inspired conceptualization, we describe single mechanistic reactions as interactions between coarse approximations of molecular orbitals (MOs) and use topological and physicochemical attributes as descriptors. Using an existing rule-based system (Reaction Explorer), we derive a restricted chemistry data set consisting of 1630 full multistep reactions with 2358 distinct starting materials and intermediates, associated with 2989 ...

168 citations

Journal Article

LigBuilder 2: A Practical de Novo Drug Design Approach

Yaxia Yuan¹, Jianfeng Pei¹, Luhua Lai¹

Peking University¹

03 May 2011-Journal of Chemical Information and Modeling

TL;DR: A cavity detection procedure is implemented to detect the positions and shapes of the binding sites on the surface of a given protein structure and to quantitatively estimate drugability in LigBuilder 2.0.

Abstract: We have developed a new version (2.0) of the de novo drug design program LigBuilder. With LigBuilder 2.0, the synthesis accessibility of designed compounds can be analyzed, and a cavity detection procedure is implemented to detect the positions and shapes of the binding sites on the surface of a given protein structure and to quantitatively estimate drugability. Ligands are designed to best fit the detected cavities using a set of rules for evaluation. Drug-like and privileged fragments are used to construct the ligands with the aid of internal and external absorption, distribution, metabolism, excretion, and toxicity (ADME/T) and drug-like filters.

Journal Article

Chemical name to structure: OPSIN, an open source solution.

Daniel M. Lowe¹, Peter T. Corbett¹, Peter Murray-Rust¹, Robert C. Glen¹

University of Cambridge¹

09 Mar 2011-Journal of Chemical Information and Modeling

TL;DR: An open source, freely available, algorithm that interprets the majority of organic chemical nomenclature in a fast and precise manner using an approach based on a regular grammar that can serve as the basis for future open source developments of chemical name interpretation.

Abstract: We have produced an open source, freely available, algorithm (Open Parser for Systematic IUPAC Nomenclature, OPSIN) that interprets the majority of organic chemical nomenclature in a fast and precise manner. This has been achieved using an approach based on a regular grammar. This grammar is used to guide tokenization, a potentially difficult problem in chemical names. From the parsed chemical name, an XML parse tree is constructed that is operated on in a stepwise manner until the structure has been reconstructed from the name. Results from OPSIN on various computer generated name/structure pair sets are presented. These show exceptionally high precision (99.8%+) and, when using general organic chemical nomenclature, high recall (98.7−99.2%). This software can serve as the basis for future open source developments of chemical name interpretation.

Journal Article

Classification of cytochrome P450 inhibitors and noninhibitors using combined classifiers.

Feixiong Cheng¹, Yue Yu¹, Jie Shen¹, Lei Yang¹, Weihua Li¹, Guixia Liu¹, Philip W. Lee¹,², Yun Tang¹

East China University of Science and Technology¹, Kyoto University²

14 Apr 2011-Journal of Chemical Information and Modeling

TL;DR: These classification models are applicable for virtual screening of inhibitors of the five major CYP isoforms or can be used as simple filters of potential chemicals in drug discovery.

Abstract: Adverse side effects of drug-drug interactions induced by human cytochrome P450 (CYP) inhibition are an important consideration, especially during the research phase of drug discovery. It is highly desirable to develop computational models that can predict the inhibitive effect of a compound against a specific CYP isoform. In this study, inhibitor-predicting models were developed for five major CYP isoforms, namely 1A2, 2C9, 2C19, 2D6, and 3A4, using a combined classifier algorithm on a large data set containing more than 24,700 unique compounds, extracted from PubChem. The combined classifiers algorithm is an ensemble of different independent machine learning classifiers including support vector machine, C4.5 decision tree, k-nearest neighbor, and naive Bayes, fused by a back-propagation artificial neural network (BP-ANN). All developed models were validated by 5-fold cross-validation and a diverse validation set composed of about 9000 diverse unique compounds. The range of the area under the receiver operating characteristic curve (AUC) for the validation sets was 0.764 to 0.815 for CYP1A2, 0.837 to 0.861 for CYP2C9, 0.793 to 0.842 for CYP2C19, 0.839 to 0.886 for CYP2D6, and 0.754 to 0.790 for CYP3A4, respectively, using the newly developed combined classifiers. The overall performance of the combined classifiers fused by BP-ANN was superior to that of three classic fusion techniques (Mean, Maximum, and Multiply). The chemical spaces of data sets were explored by multidimensional scaling plots, and the use of applicability domain improved the prediction accuracies of models. In addition, some representative substructure fragments differentiating CYP inhibitors and noninhibitors were characterized by the substructure fragment analysis. These classification models are applicable for virtual screening of inhibitors of the five major CYP isoforms or can be used as simple filters of potential chemicals in drug discovery.
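
The three classic fusion techniques used as baselines in the abstract (Mean, Maximum, and Multiply) can be sketched as below. This is an illustrative reading of those rule names, not the authors' code: the function name `fuse` and the assumption that each base classifier outputs a probability of "inhibitor" are hypothetical, and the BP-ANN fusion itself is not reproduced.

```python
from math import prod


def fuse(probabilities, rule):
    """Combine per-classifier inhibitor probabilities with a fixed fusion rule.

    `probabilities` holds one value in [0, 1] per base classifier
    (for example SVM, C4.5, k-NN, and naive Bayes).
    """
    if rule == "mean":
        return sum(probabilities) / len(probabilities)
    if rule == "maximum":
        return max(probabilities)
    if rule == "multiply":
        return prod(probabilities)
    raise ValueError(f"unknown fusion rule: {rule!r}")
```

A learned combiner such as the BP-ANN described in the abstract replaces these fixed rules with weights fitted to training data, which is one plausible reason it could outperform them.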

Journal Article

SHAFTS: a hybrid approach for 3D molecular similarity calculation. 1. Method and assessment of virtual screening.

Xiaofeng Liu¹, Hualiang Jiang², Honglin Li¹

East China University of Science and Technology¹, Chinese Academy of Sciences²

25 Aug 2011-Journal of Chemical Information and Modeling

TL;DR: SHAFTS outperformed several other widely used virtual screening methods in terms of enrichment of known active compounds as well as novel chemotypes, indicating its robustness in hit compound identification and its potential for scaffold hopping in virtual screening.

Abstract: We developed a novel approach called SHAFTS (SHApe-FeaTure Similarity) for 3D molecular similarity calculation and ligand-based virtual screening. SHAFTS adopts a hybrid similarity metric combined with molecular shape and colored (labeled) chemistry groups annotated by pharmacophore features for 3D similarity calculation and ranking, which is designed to integrate the strength of pharmacophore matching and volumetric overlay approaches. A feature triplet hashing method is used for fast molecular alignment poses enumeration, and the optimal superposition between the target and the query molecules can be prioritized by calculating corresponding “hybrid similarities”. SHAFTS is suitable for large-scale virtual screening with single or multiple bioactive compounds as the query “templates” regardless of whether corresponding experimentally determined conformations are available. Two public test sets (DUD and Jain’s sets) including active and decoy molecules from a panel of useful drug targets were adopted to e...

Journal Article

CSAR Benchmark Exercise of 2010: Combined Evaluation Across All Submitted Scoring Functions

Richard D. Smith¹, James B. Dunbar¹, Peter Man-Un Ung¹, Emilio Xavier Esposito¹, Chao Yie Yang¹, Shaomeng Wang¹, Heather A. Carlson¹

University of Michigan¹

29 Aug 2011-Journal of Chemical Information and Modeling

TL;DR: Poorly scored complexes were found to have ligands that were the same size as those in well-scored complexes, but hydrogen bonding and torsional strain were significantly different, pointing to a need for CSAR to develop data sets of congeneric series with a range of hydrogen-bonding and hydrophobic characteristics and a range of rotatable bonds.

Abstract: As part of the Community Structure-Activity Resource (CSAR) center, a set of 343 high-quality, protein–ligand crystal structures were assembled with experimentally determined Kd or Ki information from the literature. We encouraged the community to score the crystallographic poses of the complexes by any method of their choice. The goal of the exercise was to (1) evaluate the current ability of the field to predict activity from structure and (2) investigate the properties of the complexes and methods that appear to hinder scoring. A total of 19 different methods were submitted with numerous parameter variations for a total of 64 sets of scores from 16 participating groups. Linear regression and nonparametric tests were used to correlate scores to the experimental values. Correlation to experiment for the various methods ranged R² = 0.58–0.12, Spearman ρ = 0.74–0.37, Kendall τ = 0.55–0.25, and median unsigned error = 1.00–1.68 pKd units. All types of scoring functions, force field based, knowledge based, an...
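
The nonparametric statistics quoted in the abstract (Spearman ρ and Kendall τ) can be computed as in the sketch below. It is a minimal pure-Python illustration with hypothetical function names; the Kendall variant shown is tau-a, which is an assumption because the abstract does not say which variant was used.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / (s_xx * s_yy) ** 0.5


def spearman_rho(scores, experimental):
    """Spearman rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(scores), average_ranks(experimental))


def kendall_tau_a(scores, experimental):
    """Kendall tau-a: (concordant - discordant pairs) / total pairs."""
    n = len(scores)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            sign = (scores[i] - scores[j]) * (experimental[i] - experimental[j])
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)
```

Both statistics depend only on the ordering of the scores, so they can compare scoring functions whose outputs are on different scales or are not in pKd units at all.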

Journal Article

Anisotropic solvent model of the lipid bilayer. 2. Energetics of insertion of small molecules, peptides, and proteins in membranes.

Andrei L. Lomize¹, Irina D. Pogozheva¹, Henry I. Mosberg¹

University of Michigan¹

25 Mar 2011-Journal of Chemical Information and Modeling

TL;DR: A new computational approach to calculating binding energies and spatial positions of small molecules, peptides, and proteins in the lipid bilayer has been developed and applied for the large-scale calculations of spatial positions in membranes of more than 1000 peripheral and integral proteins.

Abstract: A new computational approach to calculating binding energies and spatial positions of small molecules, peptides, and proteins in the lipid bilayer has been developed. The method combines an anisotropic solvent representation of the lipid bilayer and universal solvation model, which predicts transfer energies of molecules from water to an arbitrary medium with defined polarity properties. The universal solvation model accounts for hydrophobic, van der Waals, hydrogen-bonding, and electrostatic solute−solvent interactions. The lipid bilayer is represented as a fluid anisotropic environment described by profiles of dielectric constant (ε), solvatochromic dipolarity parameter (π*), and hydrogen bonding acidity and basicity parameters (α and β). The polarity profiles were calculated using published distributions of quasi-molecular segments of lipids determined by neutron and X-ray scattering for DOPC bilayer and spin-labeling data that define concentration of water in the lipid acyl chain region. The model als...

Journal Article

Pharmer: efficient and exact pharmacophore search

David Ryan Koes¹, Carlos J. Camacho¹

University of Pittsburgh¹

27 Jun 2011-Journal of Chemical Information and Modeling

TL;DR: Pharmer is a new computational approach to pharmacophore search, a key component of many drug discovery efforts, that scales with the breadth and complexity of the query, not the size of the compound library being screened.

Abstract: Pharmacophore search is a key component of many drug discovery efforts. Pharmer is a new computational approach to pharmacophore search that scales with the breadth and complexity of the query, not the size of the compound library being screened. Two novel methods for organizing pharmacophore data, the Pharmer KDB-tree and Bloom fingerprints, enable Pharmer to perform an exact pharmacophore search of almost two million structures in less than a minute. In general, Pharmer is more than an order of magnitude faster than existing technologies. The complete source code is available under an open-source license at http://pharmer.sourceforge.net.

Journal Article

Solvated interaction energy (SIE) for scoring protein-ligand binding affinities. 2. Benchmark in the CSAR-2010 scoring exercise.

Traian Sulea¹, Qizhi Cui¹, Enrico O. Purisima¹

National Research Council¹

13 Jul 2011-Journal of Chemical Information and Modeling

TL;DR: The SIE function was tested in the Community Structure-Activity Resource (CSAR) scoring challenge consisting of high-resolution cocrystal structures for 343 protein-ligand complexes with high-quality binding affinity data and high diversity with respect to protein targets; manual curation of protonation states, tautomeric states, and metal ions near the interface was found to be a critical step for accurately testing the performance of the SIE function.

Abstract: Solvated interaction energy (SIE) is an end-point physics-based scoring function for predicting binding affinities from force-field nonbonded interaction terms, continuum solvation, and configurational entropy linear compensation. We tested the SIE function in the Community Structure–Activity Resource (CSAR) scoring challenge consisting of high-resolution cocrystal structures for 343 protein–ligand complexes with high-quality binding affinity data and high diversity with respect to protein targets. Particular emphasis was placed on the sensitivity of SIE predictions to the assignment of protonation and tautomeric states in the complex and the treatment of metal ions near the protein–ligand interface. These were manually curated from an originally distributed CSAR-HiQ data set version, leading to the currently distributed CSAR-NRC-HiQ version. We found that this manual curation was a critical step for accurately testing the performance of the SIE function. The standard SIE parametrization, previously calib...

Journal Article

CSAR Benchmark Exercise of 2010: Selection of the Protein–Ligand Complexes

James B. Dunbar¹, Richard D. Smith¹, Chao Yie Yang¹, Peter Man-Un Ung¹, Katrina W. Lexa¹, Nickolay A. Khazanov¹, Jeanne A. Stuckey², Shaomeng Wang¹, Heather A. Carlson¹

University of Michigan¹, Life Sciences Institute²

22 Jul 2011-Journal of Chemical Information and Modeling

TL;DR: The details of how the data set was initially selected, and the process by which it matured to better fit the needs of the community, are presented; the exercise underscores the value of a supportive, collaborative effort in moving the field forward.

Abstract: A major goal in drug design is the improvement of computational methods for docking and scoring. The Community Structure Activity Resource (CSAR) aims to collect available data from industry and academia which may be used for this purpose (www.csardock.org). Also, CSAR is charged with organizing community-wide exercises based on the collected data. The first of these exercises was aimed to gauge the overall state of docking and scoring, using a large and diverse data set of protein–ligand complexes. Participants were asked to calculate the affinity of the complexes as provided and then recalculate with changes which may improve their specific method. This first data set was selected from existing PDB entries which had binding data (Kd or Ki) in Binding MOAD, augmented with entries from PDBbind. The final data set contains 343 diverse protein–ligand complexes and spans 14 pKd. Sixteen proteins have three or more complexes in the data set, from which a user could start an inspection of congeneric series. In...

Journal Article•DOI•

Aromatic-aromatic interactions in proteins: beyond the dimer.

Esteban Omar Lanzarotti¹, Rolf R. Biekofsky¹, Darío A. Estrin¹, Marcelo A. Martí¹, Adrian Turjanski¹•Institutions (1)

Facultad de Ciencias Exactas y Naturales¹

27 Jun 2011-Journal of Chemical Information and Modeling

TL;DR: This study surveyed protein structures deposited in the Protein Data Bank to find and characterize clusters of aromatic residues larger than dimers, and shows a possible role for aromatic clusters in folding and protein-protein interactions.

Abstract: Aromatic residues are key widespread elements of protein structures and have been shown to be important for structure stability, folding, protein–protein recognition, and ligand binding. The interactions of pairs of aromatic residues (aromatic dimers) have been extensively studied in protein structures. Isolated aromatic molecules tend to form higher order clusters, like trimers, tetramers, and pentamers, that adopt particular well-defined structures. Taking this into account, we have surveyed protein structures deposited in the Protein Data Bank in order to find clusters of aromatic residues in proteins larger than dimers and characterized them. Our results show that larger clusters are found in one of every two unique proteins crystallized so far, that the clusters are built adopting the same trimer motifs found for benzene clusters in vacuum, and that they are clearly nonlocal, bringing sites that are distant in primary structure together. We extensively analyze the trimer and tetramer conformations and found two ...

Journal Article•DOI•

Scaffold Diversity of Exemplified Medicinal Chemistry Space

Sarah R. Langdon¹, Nathan J. Brown¹, Julian Blagg¹•Institutions (1)

Institute of Cancer Research¹

31 Aug 2011-Journal of Chemical Information and Modeling

TL;DR: This study highlights the need for diversification of compound libraries used in hit discovery by focusing library enrichment on the synthesis of compounds with novel or underrepresented scaffolds.

Abstract: The scaffold diversity of 7 representative commercial and proprietary compound libraries is explored for the first time using both Murcko frameworks and Scaffold Trees. We show that Level 1 of the Scaffold Tree is useful for the characterization of scaffold diversity in compound libraries and offers advantages over the use of Murcko frameworks. This analysis also demonstrates that the majority of compounds in the libraries we analyzed contain only a small number of well represented scaffolds and that a high percentage of singleton scaffolds represent the remaining compounds. We use Tree Maps to clearly visualize the scaffold space of representative compound libraries, for example, to display highly populated scaffolds and clusters of structurally similar scaffolds. This study further highlights the need for diversification of compound libraries used in hit discovery by focusing library enrichment on the synthesis of compounds with novel or underrepresented scaffolds.
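
The scaffold statistics mentioned in this abstract (a few well-represented scaffolds plus many singletons) can be illustrated with a minimal sketch. It assumes each compound has already been reduced to a scaffold identifier (for example a Murcko framework or a Scaffold Tree Level 1 string produced by a cheminformatics toolkit). The function name `scaffold_summary`, the returned keys, and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def scaffold_summary(scaffolds):
    """Summarize how compounds are distributed over scaffolds.

    `scaffolds` holds one scaffold identifier per compound. The identifiers
    are assumed to come from an external tool; this sketch only counts them.
    """
    counts = Counter(scaffolds)
    n_compounds = len(scaffolds)
    n_scaffolds = len(counts)

    # Singleton scaffolds are those represented by exactly one compound.
    n_singletons = sum(1 for c in counts.values() if c == 1)

    # Smallest number of most-populated scaffolds covering half the compounds.
    covered = 0
    n_for_half = 0
    for _, c in counts.most_common():
        covered += c
        n_for_half += 1
        if 2 * covered >= n_compounds:
            break

    return {
        "compounds": n_compounds,
        "scaffolds": n_scaffolds,
        "singleton_scaffolds": n_singletons,
        "singleton_fraction": n_singletons / n_scaffolds if n_scaffolds else 0.0,
        "scaffolds_covering_half": n_for_half,
    }


# Toy library with hypothetical scaffold labels: one dominant scaffold,
# one small series, and two singletons.
library = ["A"] * 6 + ["B"] * 2 + ["C", "D"]
summary = scaffold_summary(library)
```

In this toy library a single scaffold covers more than half of the compounds while half of the distinct scaffolds are singletons, which is the kind of skew the abstract describes.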

Journal Article•DOI•

AADS--an automated active site identification, docking, and scoring protocol for protein targets based on physicochemical descriptors

Tanya Singh¹, Debasish Biswas¹, Bhyravabhotla Jayaram¹•Institutions (1)

Indian Institutes of Technology¹

15 Sep 2011-Journal of Chemical Information and Modeling

TL;DR: A robust automated active site detection, docking, and scoring (AADS) protocol for proteins with known structures is reported; the predicted structures and energetics of the complexes agree quite well with experiment when tested on a data set of 170 protein-ligand complexes with known structures and binding affinities.

Abstract: We report here a robust automated active site detection, docking, and scoring (AADS) protocol for proteins with known structures. The active site finder identifies all cavities in a protein and scores them based on the physicochemical properties of functional groups lining the cavities in the protein. The accuracy realized on 620 proteins with sizes ranging from 100 to 600 amino acids with known drug active sites is 100% when the top ten cavity points are considered. These top ten cavity points identified are then submitted for an automated docking of an input ligand/candidate molecule. The docking protocol uses an all atom energy based Monte Carlo method. Eight low energy docked structures corresponding to different locations and orientations of the candidate molecule are stored at each cavity point giving 80 docked structures overall which are then ranked using an effective free energy function and top five structures are selected. The predicted structure and energetics of the complexes agree quite well...
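
The pose bookkeeping described in this abstract (ten cavity points, eight stored poses per point, 80 poses pooled and ranked, top five kept) can be sketched as below. This is only an illustration of the pooling and ranking step under stated assumptions: the scores are placeholder numbers, lower scores are assumed to be more favorable, and `select_top_poses` is a hypothetical helper, not part of the AADS software.

```python
import heapq


def select_top_poses(poses_by_cavity, n_keep=5):
    """Pool docked poses from all cavity points and keep the best-scoring ones.

    `poses_by_cavity` maps a cavity identifier to a list of (pose_id, score)
    pairs. Lower scores are treated as more favorable (an assumption made
    for this sketch).
    """
    pooled = [
        (score, cavity, pose_id)
        for cavity, poses in poses_by_cavity.items()
        for pose_id, score in poses
    ]
    return heapq.nsmallest(n_keep, pooled)


# Placeholder data: 10 cavity points x 8 poses = 80 pooled poses.
# The scores are synthetic and only serve to make the example deterministic.
poses_by_cavity = {
    cavity: [(pose, -0.1 * (cavity * 8 + pose)) for pose in range(8)]
    for cavity in range(10)
}
n_pooled = sum(len(p) for p in poses_by_cavity.values())
top_five = select_top_poses(poses_by_cavity)
```

With these synthetic scores the five retained poses all come from the last cavity point; with real scores they could be spread over several cavities.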

Journal Article•DOI•

Reproducing crystal binding modes of ligand functional groups using Site-Identification by Ligand Competitive Saturation (SILCS) simulations.

E. Prabhu Raman¹, Wenbo Yu¹, Olgun Guvench², Alexander D. MacKerell¹•Institutions (2)

University of Maryland, Baltimore¹, New England College²

25 Apr 2011-Journal of Chemical Information and Modeling

TL;DR: Results show that SILCS can recapitulate the known location of functional groups of bound inhibitors for a number of proteins, suggesting that the method may be of utility for rational drug design.

Abstract: The applicability of a computational method, Site Identification by Ligand Competitive Saturation (SILCS), to identify regions on a protein surface with which different types of functional groups o...

Journal Article•DOI•

A collection of robust organic synthesis reactions for in silico molecule design.

Markus Hartenfeller¹, Martin Eberle¹, Peter Meier¹, Cristina Nieto-Oberhuber¹, Karl-Heinz Altmann², Gisbert Schneider², Edgar Jacoby¹, Steffen Renner¹•Institutions (2)

Novartis¹, École Polytechnique Fédérale de Lausanne²

11 Nov 2011-Journal of Chemical Information and Modeling

TL;DR: A focused collection of organic synthesis reactions for computer-based molecule construction inspired by real-world chemistry and compiled in close collaboration with medicinal chemists to achieve high practical relevance is presented.

Abstract: A focused collection of organic synthesis reactions for computer-based molecule construction is presented. It is inspired by real-world chemistry and has been compiled in close collaboration with medicinal chemists to achieve high practical relevance. Virtual molecules assembled from existing starting material connected by these reactions are supposed to have an enhanced chance to be amenable to real chemical synthesis. About 50% of the reactions in the dataset are ring-forming reactions, which fosters the assembly of novel ring systems and innovative chemotypes. A comparison with a recent survey of the reactions used in early drug discovery revealed considerable overlaps with the collection presented here. The dataset is available encoded as computer-readable Reaction SMARTS expressions from the Supporting Information presented for this paper.

Journal Article•DOI•

Support vector regression scoring of receptor-ligand complexes for rank-ordering and virtual screening of chemical libraries.

Liwei Li¹, Bo Wang², Samy O. Meroueh²•Institutions (2)

Indiana University¹, Indiana University – Purdue University Indianapolis²

26 Jul 2011-Journal of Chemical Information and Modeling

TL;DR: A variant of SVR-KB (SVR-KBD) was developed by following a target-specific tailoring strategy that was previously employed to derive SVM-SP; it showed a much higher enrichment, outperforming all other scoring functions tested, and was comparable in performance to the authors' previously derived scoring function SVM-SP.

Abstract: The community structure–activity resource (CSAR) data sets are used to develop and test a support vector machine-based scoring function in regression mode (SVR). Two scoring functions (SVR-KB and SVR-EP) are derived with the objective of reproducing the trend of the experimental binding affinities provided within the two CSAR data sets. The features used to train SVR-KB are knowledge-based pairwise potentials, while SVR-EP is based on physicochemical properties. SVR-KB and SVR-EP were compared to seven other widely used scoring functions, including Glide, X-score, GoldScore, ChemScore, Vina, Dock, and PMF. Results showed that SVR-KB trained with features obtained from three-dimensional complexes of the PDBbind data set outperformed all other scoring functions, including best performing X-score, by nearly 0.1 using three correlation coefficients, namely Pearson, Spearman, and Kendall. It was interesting that higher performance in rank ordering did not translate into greater enrichment in virtual screening ...
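
The three correlation coefficients named in this abstract (Pearson, Spearman, and Kendall) can be computed as in the following minimal, standard-library sketch. It is not the authors' evaluation code: the affinity values are made-up placeholders, the function names are hypothetical, and Kendall's coefficient is implemented here in its tau-b form, which is one common choice when ties may occur.

```python
import math


def pearson(x, y):
    """Pearson linear correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def kendall_tau_b(x, y):
    """Kendall tau-b from concordant, discordant, and tied pairs."""
    conc = disc = tie_x = tie_y = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied in both variables: ignored by tau-b
            if dx == 0:
                tie_x += 1
            elif dy == 0:
                tie_y += 1
            elif dx * dy > 0:
                conc += 1
            else:
                disc += 1
    return (conc - disc) / math.sqrt((conc + disc + tie_x) * (conc + disc + tie_y))


# Placeholder data (not from the paper): experimental vs. predicted affinities.
experimental = [4.0, 5.5, 6.1, 7.3, 9.0]
predicted = [4.2, 5.0, 6.8, 6.5, 8.8]

r = pearson(experimental, predicted)
rho = spearman(experimental, predicted)    # 0.9 for this toy data
tau = kendall_tau_b(experimental, predicted)  # 0.8 for this toy data
```

The toy data contain a single swapped pair, so the two rank-based coefficients drop below 1 while remaining high; Pearson additionally depends on the actual magnitudes of the values.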

Journal Article•DOI•

Rationalizing Tight Ligand Binding through Cooperative Interaction Networks

Bernd Kuhn¹, Julian E. Fuchs¹, Michael Reutlinger¹, Martin Stahl¹, Neil R. Taylor•Institutions (1)

Hoffmann-La Roche¹

27 Dec 2011-Journal of Chemical Information and Modeling

TL;DR: This study introduces two new concepts into the computational description of molecular recognition and takes a broader view of noncovalent interactions and describes protein–ligand binding with a comprehensive set of favorable and unfavorable contact types, including for example halogen bonding and orthogonal multipolar interactions.

Abstract: Small modifications of the molecular structure of a ligand sometimes cause strong gains in binding affinity to a protein target, rendering a weakly active chemical series suddenly attractive for further optimization. Our goal in this study is to better rationalize and predict the occurrence of such interaction hot-spots in receptor binding sites. To this end, we introduce two new concepts into the computational description of molecular recognition. First, we take a broader view of noncovalent interactions and describe protein–ligand binding with a comprehensive set of favorable and unfavorable contact types, including for example halogen bonding and orthogonal multipolar interactions. Second, we go beyond the commonly used pairwise additive treatment of atomic interactions and use a small world network approach to describe how interactions are modulated by their environment. This approach allows us to capture local cooperativity effects and considerably improves the performance of a newly derived empirica...

Journal Article•DOI•

A multiscale simulation system for the prediction of drug-induced cardiotoxicity.

Cristian Obiol-Pardo¹, Julio Gomis-Tena², Ferran Sanz¹, Javier Saiz², Manuel Pastor¹•Institutions (2)

Pompeu Fabra University¹, Polytechnic University of Valencia²

20 Jan 2011-Journal of Chemical Information and Modeling

TL;DR: These results can be considered a proof of concept, suggesting that multiscale prediction systems can be suitable for preliminary screening in lead discovery, before the compound is physically available, or in early preclinical development, when they can be fed with experimentally obtained data.

Abstract: The preclinical assessment of drug-induced ventricular arrhythmia, a major concern for regulators, is typically based on experimental or computational models focused on the potassium channel hERG (human ether-a-go-go-related gene, Kv11.1). Even if the role of this ion channel in the ventricular repolarization is of critical importance, the complexity of the events involved means that a cardiac safety assessment based only on hERG has a high risk of producing either false positive or false negative results. We introduce a multiscale simulation system aiming to produce a better cardiotoxicity assessment. At the molecular scale, the proposed system uses a combination of docking simulations on two potassium channels, hERG and KCNQ1, plus three-dimensional quantitative structure−activity relationship modeling for predicting how the tested compound will block the potassium currents IKr and IKs. The obtained results have been introduced in electrophysiological models of the cardiomyocytes and the ventricular tissue, allow...

Journal Article•DOI•

Predictive power of molecular dynamics receptor structures in virtual screening.

Sara E. Nichols¹, Riccardo Baron¹, Anthony Ivetac¹, J. Andrew McCammon¹•Institutions (1)

University of California, San Diego¹

27 Jun 2011-Journal of Chemical Information and Modeling

TL;DR: A critical analysis of the predictive power of MD snapshots to blind virtual screening by docking is presented, evaluating two well-characterized systems of varying flexibility in ligand-bound and unbound configurations.

Abstract: Molecular dynamics (MD) simulation is a well-established method for understanding protein dynamics. Conformations from unrestrained MD simulations have yet to be assessed for blind virtual screening (VS) by docking. This study presents a critical analysis of the predictive power of MD snapshots in this regard, evaluating two well-characterized systems of varying flexibility in ligand-bound and unbound configurations. Results from such VS predictions are discussed with respect to experimentally determined structures. In all cases, MD simulations provide snapshots that improve VS predictive power over known crystal structures, possibly due to sampling more relevant receptor conformations. Additionally, MD can move conformations previously not amenable to docking into the predictive range.

Journal Article•DOI•

Understanding the Impact of the P-loop Conformation on Kinase Selectivity

Cristiano Ruch Werneck Guimarães¹, Brajesh K. Rai¹, Michael John Munchhof¹, Shenping Liu¹, Jian Wang¹, Samit Kumar Bhattacharya¹, Leonard Buckbinder¹•Institutions (1)

Pfizer¹

24 May 2011-Journal of Chemical Information and Modeling

TL;DR: Statistical and computational analyses of the crystal structure database demonstrate that inhibitors that induce the P-loop folded conformation tend to be more selective, especially if they take advantage of this specific conformation by interacting more favorably with a conserved Tyr or Phe residue from the P-loop.

Abstract: This work addresses the link between selectivity and an unusual, folded conformation for the P-loop observed initially for MAP4K4 and subsequently for other kinases. Statistical and computational analyses of our crystal structure database demonstrate that inhibitors that induce the P-loop folded conformation tend to be more selective, especially if they take advantage of this specific conformation by interacting more favorably with a conserved Tyr or Phe residue from the P-loop.

Journal Article•DOI•

Detailed mechanism of squalene epoxidase inhibition by terbinafine.

Marcin Nowosielski, Marcin Hoffmann¹, Lucjan Wyrwicz, Piotr Stępniak, Dariusz Plewczynski², Michal Lazniewski²,³, Krzysztof Ginalski², Leszek Rychlewski•Institutions (3)

Adam Mickiewicz University in Poznań¹, University of Warsaw², Medical University of Warsaw³

13 Jan 2011-Journal of Chemical Information and Modeling

TL;DR: The results, elucidating at a molecular level the mode of terbinafine inhibitory activity, can be utilized in designing more potent or selective antifungal drugs or even medicines lowering cholesterol in humans.

Abstract: Squalene epoxidase (SE) is a key flavin adenine dinucleotide (FAD)-dependent enzyme of the ergosterol and cholesterol biosynthetic pathways and an attractive potential target for drugs used to inhibit the growth of pathogenic fungi or to lower cholesterol levels. Although many studies on the activity of allylamine drugs have been published during the last 30 years, up until now no detailed mechanism of squalene epoxidase inhibition has been presented. Our study presents such a model at atomic resolution for the yeast Saccharomyces cerevisiae. The data resulting from the modeling studies are in excellent agreement with experimental findings. A fully atomic three-dimensional (3D) model of squalene epoxidase (EC 1.14.99.7) from S. cerevisiae was built with the help of the 3D-Jury approach and further screened based on data known from mutation experiments leading to terbinafine resistance. Docking studies followed by molecular dynamics simulations and quantum interaction energy calculations [MP2/6-31G(d)] resulted...
