
Showing papers in "Journal of Computer-aided Molecular Design in 2012"


Journal ArticleDOI
TL;DR: The docking performance of the FRED and HYBRID programs is evaluated on two standardized datasets from the Docking and Scoring Symposium of the ACS Spring 2011 national meeting; the evaluation includes cognate docking and virtual screening performance.
Abstract: The docking performance of the FRED and HYBRID programs is evaluated on two standardized datasets from the Docking and Scoring Symposium of the ACS Spring 2011 national meeting. The evaluation includes cognate docking and virtual screening performance. FRED docks 70 % of the structures to within 2 Å in the cognate docking test. In the virtual screening test, FRED is found to have a mean AUC of 0.75. The HYBRID program uses a modified version of FRED’s algorithm that uses both ligand- and structure-based information to dock molecules, which increases its mean AUC to 0.78. HYBRID can also implicitly account for protein flexibility by making use of multiple crystal structures. Using multiple crystal structures improves HYBRID’s performance (mean AUC 0.80) with a negligible increase in docking time (~15 %).

373 citations


Journal ArticleDOI
TL;DR: Docking failures, defined as cases where the top scoring pose is greater than 2 Å from the experimental structure, are shown to be largely due to the absence of bound waters in the source dataset, highlighting the need to include these and other crucial information in future standardized sets.
Abstract: The results of cognate docking with the prepared Astex dataset provided by the organizers of the “Docking and Scoring: A Review of Docking Programs” session at the 241st ACS national meeting are presented. The MOE software with the newly developed GBVI/WSA dG scoring function is used throughout the study. For 80 % of the Astex targets, the MOE docker produces a top-scoring pose within 2 Å of the X-ray structure. For 91 % of the targets a pose within 2 Å of the X-ray structure is produced in the top 30 poses. Docking failures, defined as cases where the top scoring pose is greater than 2 Å from the experimental structure, are shown to be largely due to the absence of bound waters in the source dataset, highlighting the need to include these and other crucial information in future standardized sets. Docking success is shown to depend heavily on data preparation. A “dataset preparation” error of 0.5 kcal/mol is shown to cause fluctuations of over 20 % in docking success rates.

324 citations
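
The success rates quoted in the entry above (the share of targets whose top-scoring pose, or any of the top 30 poses, lies within 2 Å of the X-ray structure) can be illustrated with a minimal sketch. The function name `success_rate` and the RMSD values are hypothetical, chosen only for illustration; they are not taken from the paper or from MOE.

```python
from typing import Sequence


def success_rate(rmsds_per_target: Sequence[Sequence[float]],
                 top_n: int = 1,
                 cutoff: float = 2.0) -> float:
    """Fraction of targets with at least one of the top_n ranked poses
    at or below `cutoff` (in angstroms) from the reference structure.

    Each inner sequence holds pose RMSDs for one target, ordered by
    docking rank (best-scoring pose first).
    """
    if not rmsds_per_target:
        raise ValueError("at least one target is required")
    hits = sum(
        1 for rmsds in rmsds_per_target
        if any(r <= cutoff for r in rmsds[:top_n])
    )
    return hits / len(rmsds_per_target)


# Made-up RMSDs for four targets, three ranked poses each.
example = [
    [0.8, 1.5, 3.0],
    [2.6, 1.9, 4.1],
    [3.2, 3.5, 2.8],
    [1.1, 0.9, 1.4],
]
top1 = success_rate(example, top_n=1)
top3 = success_rate(example, top_n=3)
print(f"top-1: {top1:.0%}, top-3: {top3:.0%}")
```

Widening `top_n` can only keep or raise the rate, which is why a top-30 figure is never lower than the top-1 figure for the same run.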


Journal ArticleDOI
TL;DR: Flexible docking and scoring using the internal coordinate mechanics software (ICM) was benchmarked for ligand binding mode prediction against the 85 co-crystal structures in the modified Astex data set and significant improvements up to ROC AUC = 82.2 and ROC(2%) = 45.2 were achieved following best practices for flexible pocket refinement and out-of-pocket binding rescore.
Abstract: Flexible docking and scoring using the internal coordinate mechanics software (ICM) was benchmarked for ligand binding mode prediction against the 85 co-crystal structures in the modified Astex data set. The ICM virtual ligand screening was tested against the 40 DUD target benchmarks and 11-target WOMBAT sets. The self-docking accuracy was evaluated for the top 1 and top 3 scoring poses at each ligand binding site with near native conformations below 2 Å RMSD found in 91 and 95% of the predictions, respectively. The virtual ligand screening using single rigid pocket conformations provided the median area under the ROC curves equal to 69.4 with 22.0% true positives recovered at 2% false positive rate. Significant improvements up to ROC AUC = 82.2 and ROC(2%) = 45.2 were achieved following our best practices for flexible pocket refinement and out-of-pocket binding rescore. The virtual screening can be further improved by considering multiple conformations of the target.

276 citations
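
The two screening metrics reported in the entry above, the area under the ROC curve and the fraction of true positives recovered at a 2% false-positive rate, can be sketched as follows. This is an illustration under stated assumptions, not the ICM procedure: the function names and scores are hypothetical, a higher score is assumed to mean a better-ranked compound, and values are returned as fractions (the abstract quotes them as percentages).

```python
from typing import Sequence


def roc_auc(active_scores: Sequence[float],
            decoy_scores: Sequence[float]) -> float:
    """AUC as the probability that a random active outscores a random
    decoy; ties count as half a win."""
    wins = 0.0
    for a in active_scores:
        for d in decoy_scores:
            if a > d:
                wins += 1.0
            elif a == d:
                wins += 0.5
    return wins / (len(active_scores) * len(decoy_scores))


def tpr_at_fpr(active_scores: Sequence[float],
               decoy_scores: Sequence[float],
               max_fpr: float = 0.02) -> float:
    """Largest true-positive rate reachable by any score threshold whose
    false-positive rate does not exceed max_fpr."""
    best = 0.0
    for t in set(active_scores) | set(decoy_scores):
        fpr = sum(d >= t for d in decoy_scores) / len(decoy_scores)
        if fpr <= max_fpr:
            tpr = sum(a >= t for a in active_scores) / len(active_scores)
            best = max(best, tpr)
    return best


# Made-up scores: 4 actives and 50 decoys.
actives = [9.0, 8.0, 4.0, 2.0]
decoys = [8.5, 3.0] + [1.0] * 48

auc = roc_auc(actives, decoys)
early = tpr_at_fpr(actives, decoys, max_fpr=0.02)
print(f"AUC = {auc:.2f}, TPR at 2% FPR = {early:.2f}")
```

The pairwise AUC is quadratic in the number of compounds, which is acceptable for a sketch; a rank-based formulation would be the usual choice for large decoy sets.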


Journal ArticleDOI
TL;DR: In the context of the problems now facing the pharmaceutical industry, it is asked how to best address drug discovery needs of the next quarter century using molecular dynamics simulations, and some possible approaches are suggested.
Abstract: Molecular dynamics simulations can now track rapid processes—those occurring in less than about a millisecond—at atomic resolution for many biologically relevant systems. These simulations appear poised to exert a significant impact on how new drugs are found, perhaps even transforming the very process of drug discovery. We predict here future results we can expect from, and enhancements we need to make in, molecular dynamics simulations over the coming 25 years, and in so doing set out several Grand Challenges for the field. In the context of the problems now facing the pharmaceutical industry, we ask how we can best address drug discovery needs of the next quarter century using molecular dynamics simulations, and we suggest some possible approaches.

254 citations


Journal ArticleDOI
TL;DR: Docking protocols developed for cross-docking, which address protein flexibility and produce discrete families of predicted poses, produced substantially better performance for pose prediction and use of multiple protein conformations significantly improved screening enrichment.
Abstract: Benchmarks for molecular docking have historically focused on re-docking the cognate ligand of a well-determined protein-ligand complex to measure geometric pose prediction accuracy, and measurement of virtual screening performance has been focused on increasingly large and diverse sets of target protein structures, cognate ligands, and various types of decoy sets. Here, pose prediction is reported on the Astex Diverse set of 85 protein ligand complexes, and virtual screening performance is reported on the DUD set of 40 protein targets. In both cases, prepared structures of targets and ligands were provided by symposium organizers. The re-prepared data sets yielded results not significantly different than previous reports of Surflex-Dock on the two benchmarks. Minor changes to protein coordinates resulting from complex pre-optimization had large effects on observed performance, highlighting the limitations of cognate ligand re-docking for pose prediction assessment. Docking protocols developed for cross-docking, which address protein flexibility and produce discrete families of predicted poses, produced substantially better performance for pose prediction. Virtual screening performance was shown to benefit from employing and combining multiple screening methods: docking, 2D molecular similarity, and 3D molecular similarity. In addition, use of multiple protein conformations significantly improved screening enrichment.

213 citations


Journal ArticleDOI
TL;DR: This short article will concentrate only on cheminformatics applications and the workflow tools most commonly used in cheminformatics, namely Pipeline Pilot and KNIME.
Abstract: There are many examples of scientific workflow systems [1, 2]; in this short article I will concentrate only on cheminformatics applications and the workflow tools most commonly used in cheminformatics, namely Pipeline Pilot [3] and KNIME [4]. Workflow solutions have been used for years in bioinformatics and other sciences, and some also have applications in so-called “business intelligence” and “predictive analytics”. Readers can find details of Discovery Net, Galaxy, Kepler, Triana, SOMA, SMILA, VisTrails, and others on the Web. Kappler has compared Competitive Workflow, Taverna and Pipeline Pilot [5]. Taverna has been widely used in bioinformatics but is also used with the Chemistry Development Kit (CDK) [6, 7]. CDK-Taverna workflows are made freely available at myExperiment.org [8]. (myExperiment.org also includes KNIME workflows.) DiscoveryNet was one of the earliest examples of a scientific workflow system; its concepts were later commercialized in InforSense Knowledge Discovery Environment (KDE). My 2007 review [1] centered on Pipeline Pilot and InforSense KDE; KNIME was then a relative newcomer. In 2009 the loss-making InforSense organization was acquired by IDBS and KDE has made progress in translational medicine [9]. InforSense’s ChemSense [10] used ChemAxon’s JChem Cartridge, and ChemAxon chemical structure, property prediction, and enumeration tools. ChemSense’s three major pharmaceutical customers have turned to other solutions. The InforSense Suite lives on but is not seen as a “personal productivity tool”; rather it is integrated into the IDBS ELN platform. KNIME and Pipeline Pilot are now the market leaders in personal productivity in cheminformatics.

151 citations


Journal ArticleDOI
TL;DR: It is concluded that existing docking programs already perform close to optimally in the cognate pose prediction experiments currently carried out and that more stringent pose prediction tests should be used in the future to employ cross-docking sets.
Abstract: The performance of all four GOLD scoring functions has been evaluated for pose prediction and virtual screening under the standardized conditions of the comparative docking and scoring experiment reported in this Edition. Excellent pose prediction and good virtual screening performance were demonstrated using unmodified protein models and default parameter settings. The best performing scoring function for both pose prediction and virtual screening was demonstrated to be the recently introduced scoring function ChemPLP. We conclude that existing docking programs already perform close to optimally in the cognate pose prediction experiments currently carried out and that more stringent pose prediction tests should be used in the future. These should employ cross-docking sets. Evaluation of virtual screening performance remains problematic and much remains to be done to improve the usefulness of publicly available active and decoy sets for virtual screening. Finally we suggest that, for certain target/scoring function combinations, good enrichment may sometimes be a consequence of 2D property recognition rather than a modelling of the correct 3D interactions.

143 citations


Journal ArticleDOI
TL;DR: Overall, the breadth and number of experiments performed provide a useful snapshot of current capabilities of DOCK6 as well as starting points to guide future development efforts to further improve sampling and scoring.
Abstract: In conjunction with the recent American Chemical Society symposium titled “Docking and Scoring: A Review of Docking Programs” the performance of the DOCK6 program was evaluated through (1) pose reproduction and (2) database enrichment calculations on a common set of organizer-specified systems and datasets (ASTEX, DUD, WOMBAT). Representative baseline grid score results averaged over five docking runs yield a relatively high pose identification success rate of 72.5 % (symmetry corrected rmsd) and sampling rate of 91.9 % for the multi site ASTEX set (N = 147) using organizer-supplied structures. Numerous additional docking experiments showed that ligand starting conditions, symmetry, multiple binding sites, clustering, and receptor preparation protocols all affect success. Encouragingly, in some cases, use of more sophisticated scoring and sampling methods yielded results which were comparable (Amber score ligand movable protocol) or exceeded (LMOD score) analogous baseline grid-score results. The analysis highlights the potential benefit and challenges associated with including receptor flexibility and indicates that different scoring functions have system dependent strengths and weaknesses. Enrichment studies with the DUD database prepared using the SB2010 preparation protocol and native ligand pairings yielded individual area under the curve (AUC) values derived from receiver operating characteristic curve analysis ranging from 0.29 (bad enrichment) to 0.96 (good enrichment) with an average value of 0.60 (27/38 have AUC ≥ 0.5). Strong early enrichment was also observed in the critically important 1.0–2.0 % region. Somewhat surprisingly, an alternative receptor preparation protocol yielded comparable results. As expected, semi-random pairings yielded poorer enrichments, in particular, for unrelated receptors. Overall, the breadth and number of experiments performed provide a useful snapshot of current capabilities of DOCK6 as well as starting points to guide future development efforts to further improve sampling and scoring.

131 citations


Journal ArticleDOI
TL;DR: The SAMPL3 community exercise included the first ever blind prediction challenge for host–guest binding affinities, through the incorporation of 11 new host–guest complexes, and ten participating research groups addressed this challenge with a variety of approaches.
Abstract: The computational prediction of protein–ligand binding affinities is of central interest in early-stage drug-discovery, and there is a widely recognized need for improved methods. Low molecular weight receptors and their ligands—i.e., host–guest systems—represent valuable test-beds for such affinity prediction methods, because their small size makes for fast calculations and relatively facile numerical convergence. The SAMPL3 community exercise included the first ever blind prediction challenge for host–guest binding affinities, through the incorporation of 11 new host–guest complexes. Ten participating research groups addressed this challenge with a variety of approaches. Statistical assessment indicates that, although most methods performed well at predicting some general trends in binding affinity, overall accuracy was not high, as all the methods suffered from either poor correlation or high RMS errors or both. There was no clear advantage in using explicit versus implicit solvent models, any particular force field, or any particular approach to conformational sampling. In a few cases, predictions using very similar energy models but different sampling and/or free-energy methods resulted in significantly different results. The protonation states of one host and some guest molecules emerged as key uncertainties beyond the choice of computational approach. The present results have implications for methods development and future blind prediction exercises.

118 citations


Journal ArticleDOI
TL;DR: In both Astex and DUD datasets, docking performance is significantly improved employing a best-practices preparation scheme over using minimally-prepared structures from the PDB.
Abstract: Glide SP mode enrichment results for two preparations of the DUD dataset and native ligand docking RMSDs for two preparations of the Astex dataset are presented. Following a best-practices preparation scheme, an average RMSD of 1.140 Å for native ligand docking with Glide SP is computed. Following the same best-practices preparation scheme for the DUD dataset an average area under the ROC curve (AUC) of 0.80 and average early enrichment via the ROC (0.1 %) metric of 0.12 were observed. 74 and 56 % of the 39 best-practices prepared targets showed AUC over 0.7 and 0.8, respectively. Average AUC was greater than 0.7 for all best-practices protein families demonstrating consistent enrichment performance across a broad range of proteins and ligand chemotypes. In both Astex and DUD datasets, docking performance is significantly improved employing a best-practices preparation scheme over using minimally-prepared structures from the PDB. Enrichment results for WScore, a new scoring function and sampling methodology integrating WaterMap and Glide, are presented for four DUD targets, hivrt, hsp90, cdk2, and fxa. WScore performance in early enrichment is consistently strong and all systems examined show AUC > 0.9 and superior early enrichment to DUD best-practices Glide SP results.

114 citations


Journal ArticleDOI
TL;DR: This study validated the new scoring function of HYDE by applying it in large-scale docking experiments and carried out a very detailed analysis of the data that revealed interesting pitfalls, which should be addressed in future benchmark datasets.
Abstract: The HYDE scoring function consistently describes hydrogen bonding, the hydrophobic effect and desolvation. It relies on HYdration and DEsolvation terms which are calibrated using octanol/water partition coefficients of small molecules. We do not use affinity data for calibration, therefore HYDE is generally applicable to all protein targets. HYDE reflects the Gibbs free energy of binding while only considering the essential interactions of protein–ligand complexes. The greatest benefit of HYDE is that it yields a very intuitive atom-based score, which can be mapped onto the ligand and protein atoms. This allows the direct visualization of the score and consequently facilitates analysis of protein–ligand complexes during the lead optimization process. In this study, we validated our new scoring function by applying it in large-scale docking experiments. We could successfully predict the correct binding mode in 93% of complexes in redocking calculations on the Astex diverse set, while our performance in virtual screening experiments using the DUD dataset showed significant enrichment values with a mean AUC of 0.77 across all protein targets with little or no structural defects. As part of these studies, we also carried out a very detailed analysis of the data that revealed interesting pitfalls, which we highlight here and which should be addressed in future benchmark datasets.

Journal ArticleDOI
TL;DR: This interaction study provided promising ligands to inhibit quorum sensing (QS) mediated virulence factor production in P. aeruginosa.
Abstract: Drugs have been discovered in the past mainly either by identification of active components from traditional remedies or by unpredicted discovery. A key motivation for the study of structure based virtual screening is the exploitation of such information to design targeted drugs. In this study, structure based virtual screening was used to search for putative quorum sensing inhibitors (QSI) of Pseudomonas aeruginosa. The virtual screening programme Glide version 5.5 was applied to screen 1,920 natural compounds/drugs against the LasR and RhlR receptor proteins of P. aeruginosa. Based on the results of the in silico docking analysis, the five top ranking compounds, namely rosmarinic acid, naringin, chlorogenic acid, morin and mangiferin, were subjected to in vitro bioassays against the laboratory strain PAO1 and two antibiotic resistant clinical isolates, P. aeruginosa AS1 (GU447237) and P. aeruginosa AS2 (GU447238). Of the five compounds studied, all except mangiferin showed significant inhibition of the production of protease, elastase and hemolysin. Further, all five compounds potentially inhibited the biofilm related behaviours. This interaction study provided promising ligands to inhibit quorum sensing (QS) mediated virulence factor production in P. aeruginosa.

Journal ArticleDOI
TL;DR: An analysis of the geometry of ligands bound to proteins is presented and the role of small molecule crystal structures is highlighted in enabling molecular modellers to critically evaluate a ligand model’s quality and investigate protein-induced strain.
Abstract: The protein databank now contains the structures of over 11,000 ligands bound to proteins. These structures are invaluable in applied areas such as structure-based drug design, but are also the substrate for understanding the energetics of intermolecular interactions with proteins. Despite their obvious importance, the careful analysis of ligands bound to protein structures lags behind the analysis of the protein structures themselves. We present an analysis of the geometry of ligands bound to proteins and highlight the role of small molecule crystal structures in enabling molecular modellers to critically evaluate a ligand model’s quality and investigate protein-induced strain.

Journal ArticleDOI
TL;DR: This work shows that carbonyl–halogen bonds may be used to expand the patentable medicinal chemistry space, redefining halogens as key features and making the QM information available for automatic molecular recognition in virtual high throughput screening.
Abstract: Halogen bonds are specific embodiments of the sigma hole bonding paradigm. They represent directional interactions between the halogens chlorine, bromine, or iodine and an electron donor as binding partner. Using quantum chemical calculations at the MP2 level, we systematically explore how they can be used in molecular design to address the omnipresent carbonyls of the protein backbone. We characterize energetics and directionality and elucidate their spatial variability in sub-optimal geometries that are expected to occur in protein–ligand complexes featuring a multitude of concomitant interactions. By deriving simple rules, we aid medicinal chemists and chemical biologists in easily exploiting them for scaffold decoration and design. Our work shows that carbonyl–halogen bonds may be used to expand the patentable medicinal chemistry space, redefining halogens as key features. Furthermore, this data will be useful for implementing halogen bonds into pharmacophore models or scoring functions making the QM information available for automatic molecular recognition in virtual high throughput screening.

Journal ArticleDOI
TL;DR: Alchemical hydration free energy calculations for the set of small molecules comprising the 2011 Statistical Assessment of Modeling of Proteins and Ligands challenge find a number of chemical trends within each molecular series which can be explained, but there are also some surprises.
Abstract: Hydration free energy calculations have become important tests of force fields. Alchemical free energy calculations based on molecular dynamics simulations provide a rigorous way to calculate these free energies for a particular force field, given sufficient sampling. Here, we report results of alchemical hydration free energy calculations for the set of small molecules comprising the 2011 Statistical Assessment of Modeling of Proteins and Ligands challenge. Our calculations are largely based on the Generalized Amber Force Field with several different charge models, and we achieved RMS errors in the 1.4–2.2 kcal/mol range depending on charge model, marginally higher than what we typically observed in previous studies (Mobley et al. in J Phys Chem B 111(9):2242–2254, 2007, J Chem Theory Comput 5(2):350–358, 2009, J Phys Chem B 115:1329–1332, 2011; Nicholls et al. in J Med Chem 51:769–779, 2008; Klimovich and Mobley in J Comput Aided Mol Design 24(4):307–316, 2010). The test set consists of ethane, biphenyl, and a dibenzyl dioxin, as well as a series of chlorinated derivatives of each. We found that, for this set, using high-quality partial charges from MP2/cc-PVTZ SCRF RESP fits provided marginally improved agreement with experiment over using AM1-BCC partial charges as we have more typically done, in keeping with our recent findings (Mobley et al. in J Phys Chem B 115:1329–1332, 2011). Switching to OPLS Lennard–Jones parameters with AM1-BCC charges also improves agreement with experiment. We also find a number of chemical trends within each molecular series which we can explain, but there are also some surprises, including some that are captured by the calculations and some that are not.

Journal ArticleDOI
TL;DR: It is argued that despite limitations of today's force fields, current simulation tools and force fields now provide the potential for real benefits in a variety of applications, however, these same tools also provide irreproducible results which are often poorly interpreted.
Abstract: Molecular simulations see widespread and increasing use in computation and molecular design, especially within the area of molecular simulations applied to biomolecular binding and interactions, our focus here. However, force field accuracy remains a concern for many practitioners, and it is often not clear what level of accuracy is really needed for payoffs in a discovery setting. Here, I argue that despite limitations of today’s force fields, current simulation tools and force fields now provide the potential for real benefits in a variety of applications. However, these same tools also provide irreproducible results which are often poorly interpreted. Continued progress in the field requires more honesty in assessment and care in evaluation of simulation results, especially with respect to convergence.

Journal ArticleDOI
Ori Kalid, Dora Warshaviak, Sharon Shechter, Woody Sherman, Sharon Shacham
TL;DR: The Consensus Induced Fit Docking approach for adapting a protein binding site to accommodate multiple diverse ligands for virtual screening results in a single binding site structure that can bind diverse chemotypes and is thus highly useful for efficient structure-based virtual screening.
Abstract: We present the Consensus Induced Fit Docking (cIFD) approach for adapting a protein binding site to accommodate multiple diverse ligands for virtual screening. This novel approach results in a single binding site structure that can bind diverse chemotypes and is thus highly useful for efficient structure-based virtual screening. We first describe the cIFD method and its validation on three targets that were previously shown to be challenging for docking programs (COX-2, estrogen receptor, and HIV reverse transcriptase). We then demonstrate the application of cIFD to the challenging discovery of irreversible Crm1 inhibitors. We report the identification of 33 novel Crm1 inhibitors, which resulted from the testing of 402 purchased compounds selected from a screening set containing 261,680 compounds. This corresponds to a hit rate of 8.2 %. The novel Crm1 inhibitors reveal diverse chemical structures, validating the utility of the cIFD method in a real-world drug discovery project. This approach offers a pragmatic way to implicitly account for protein flexibility without the additional computational costs of ensemble docking or including full protein flexibility during virtual screening.
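
The hit rate quoted in the abstract above follows directly from the reported counts. A minimal arithmetic check, using only the numbers stated in the abstract (the variable names are illustrative):

```python
# Counts as reported in the abstract.
confirmed_hits = 33         # novel Crm1 inhibitors identified
compounds_tested = 402      # purchased compounds that were tested
screening_set = 261_680     # compounds in the virtual screening set

hit_rate = confirmed_hits / compounds_tested
tested_fraction = compounds_tested / screening_set

print(f"hit rate: {100 * hit_rate:.1f} %")
print(f"fraction of screening set tested: {100 * tested_fraction:.2f} %")
```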

Journal ArticleDOI
TL;DR: This work states that QSAR approaches, including recent advances in 3D-QSAR, are advantageous during the lead optimization phase of drug discovery and complementary with bioinformatics and growing data accessibility.
Abstract: QSAR approaches, including recent advances in 3D-QSAR, are advantageous during the lead optimization phase of drug discovery and complementary with bioinformatics and growing data accessibility. Hints for future QSAR practitioners are also offered.

Journal ArticleDOI
TL;DR: The results of this exercise demonstrate that the field in general has difficulty predicting the transfer energies of more highly chlorinated compounds, and that methods seem to be erring in the same direction.
Abstract: Prediction of the free energy of solvation of a small molecule, or its transfer energy, is a necessary step along the path towards calculating the interactions between molecules that occur in an aqueous environment. A set of these transfer energies was gathered from the literature for series of chlorinated molecules with varying numbers of chlorines based on ethane, biphenyl, and dibenzo-p-dioxin. This focused set of molecules was then provided as a blinded challenge to assess the ability of current computational solvation methods to accurately model the interactions between water and increasingly chlorinated compounds. This was presented as part of the SAMPL3 challenge, which represented the fourth iterative blind prediction challenge involving transfer energies. The results of this exercise demonstrate that the field in general has difficulty predicting the transfer energies of more highly chlorinated compounds, and that methods seem to be erring in the same direction.

Journal ArticleDOI
TL;DR: The results reveal that the protonation state of Asp25/Asp25′ strongly affects the dynamics, the overall affinity and the interactions of the inhibitor with individual residues, and may assist in the design of new inhibitors against HIV-1 PR variants that are resistant against current drugs.
Abstract: Amprenavir (APV) is a high affinity (0.15 nM) HIV-1 protease (PR) inhibitor. However, the affinities of the drug resistant protease variants V32I, I50V, I54V, I54M, I84V and L90M to amprenavir are decreased 3 to 30-fold compared to the wild-type. In this work, the popular molecular mechanics Poisson-Boltzmann surface area method has been used to investigate the effectiveness of amprenavir against the wild-type and these mutated protease variants. Our results reveal that the protonation state of Asp25/Asp25' strongly affects the dynamics, the overall affinity and the interactions of the inhibitor with individual residues. We emphasize that, in contrast to what is often assumed, the protonation state may not be inferred from the affinities but requires pK(a) calculations. At neutral pH, Asp25 and Asp25' are ionized or protonated, respectively, as suggested from pK(a) calculations. This protonation state was thus mainly considered in our study. Mutation induced changes in binding affinities are in agreement with the experimental findings. The decomposition of the binding free energy reveals the mechanisms underlying binding and drug resistance. Drug resistance arises from an increase in the energetic contribution from the van der Waals interactions between APV and PR (V32I, I50V, and I84V mutant) or a rise in the energetic contribution from the electrostatic interactions between the inhibitor and its target (I54M and I54V mutant). For the V32I mutant, also an increased free energy for the polar solvation contributes to the drug resistance. For the L90M mutant, a rise in the van der Waals energy for APV-PR interactions is compensated by a decrease in the polar solvation free energy such that the net binding affinity remains unchanged. Detailed understanding of the molecular forces governing binding and drug resistance might assist in the design of new inhibitors against HIV-1 PR variants that are resistant against current drugs.

Journal ArticleDOI
TL;DR: BEDAM calculations are described to predict the free energies of binding of a series of anaesthetic drugs to a recently characterized acyclic cucurbituril host, offering the prospect of utilizing host-guest binding free energy data for force field validation and development.
Abstract: BEDAM calculations are described to predict the free energies of binding of a series of anaesthetic drugs to a recently characterized acyclic cucurbituril host. The modeling predictions, conducted as part of the SAMPL3 host-guest affinity blind challenge, are generally in good quantitative agreement with the experimental measurements. The correlation coefficient between computed and measured binding free energies is 70% with high statistical significance. Multiple conformational stereoisomers and protonation states of the guests have been considered. Better agreement is obtained with high statistical confidence under acidic modeling conditions. It is shown that this level of quantitative agreement could not have been reached without taking into account reorganization energy and configurational entropy effects. Extensive conformational variability of the host, the guests and their complexes is observed in the simulations, affecting binding free energy estimates and structural predictions. A conformational reservoir technique is introduced as part of the parallel Hamiltonian replica exchange molecular dynamics BEDAM protocol to fully capture conformational variability. It is shown that these advanced computational strategies lead to converged free energy estimates for these systems, offering the prospect of utilizing host-guest binding free energy data for force field validation and development.

Journal ArticleDOI
TL;DR: A combined quantum chemistry and molecular dynamics simulation study on how the ILs dissolve cellulose, emphasizing that the chloride anions play a critically important role and the imidazolium cations also present a remarkable contribution in the cellulose dissolution.
Abstract: While N,N′-dialkylimidazolium ionic liquids (ILs) have been well-established as effective solvents for dissolution and processing of cellulose, the detailed mechanism at the molecular level still remains unclear. In this work, we present a combined quantum chemistry and molecular dynamics simulation study on how the ILs dissolve cellulose. On the basis of calculations on 1-butyl-3-methylimidazolium chloride, one of the most effective ILs for dissolving cellulose, we further studied the molecular behavior of cellulose models (i.e. cellulose oligomers with degrees of polymerization n = 2, 4, and 6) in the IL, including the structural features and hydrogen bonding patterns. The collected data indicate that both chloride anions and imidazolium cations of the IL interact with the oligomer via hydrogen bonds. However, the anions occupy the first coordination shell of the oligomer, and the strength and number of hydrogen bonds and the interaction energy between anions and the oligomer are much larger than those between cations and the oligomer. It is observed that the intramolecular hydrogen bond in the oligomer is broken under the combined effect of anions and cations. The present results emphasize that the chloride anions play a critically important role and that the imidazolium cations also make a remarkable contribution to the cellulose dissolution. This point of view differs from the previous one, which only underlines the importance of the chloride anions in the cellulose dissolution. The present results improve our understanding of cellulose dissolution in imidazolium chloride ILs.

Journal ArticleDOI
TL;DR: The capability of FLAP models to uncover selectivity aspects although single AR subtype models were not trained for this purpose is demonstrated and the novel FLAPPharm tool for pharmacophore generation is applied for the first time.
Abstract: FLAP fingerprints are applied in the ligand-, structure- and pharmacophore-based mode in a case study on antagonists of all four adenosine receptor (AR) subtypes. Structurally diverse antagonist collections with respect to the different ARs were constructed by including binding data to human species only. FLAP models discriminate well between “active” (=highly potent) and “inactive” (=weakly potent) AR antagonists, as indicated by enrichment curves, numbers of false positives, and AUC values. For all FLAP modes, model predictivity slightly decreases as follows: A2BR > A2AR > A3R > A1R antagonists. General performance of FLAP modes in this study is: ligand- > structure- > pharmacophore-based mode. We also compared the FLAP performance with other common ligand- and structure-based fingerprints. Concerning the ligand-based mode, FLAP model performance is superior to ECFP4 and ROCS for all AR subtypes. Although focusing on the early part of the A2A, A2B and A3 enrichment curves, ECFP4 and ROCS still retain a satisfactory retrieval of actives. FLAP is also superior when comparing the structure-based mode with PLANTS and GOLD. In this study we applied for the first time the novel FLAPPharm tool for pharmacophore generation. Pharmacophore hypotheses generated with this tool convincingly match formerly published data. Finally, we could demonstrate the capability of FLAP models to uncover selectivity aspects although single AR subtype models were not trained for this purpose.

Journal ArticleDOI
TL;DR: Estimating affinities for the binding of 34 ligands to trypsin and nine guest molecules to three different hosts in the SAMPL3 blind challenge, using the MM/PBSA, MM/GBSA, LIE, continuum LIE, and Glide score methods, finds that the success of the methods is system-dependent.
Abstract: We have estimated affinities for the binding of 34 ligands to trypsin and nine guest molecules to three different hosts in the SAMPL3 blind challenge, using the MM/PBSA, MM/GBSA, LIE, continuum LIE, and Glide score methods. For the trypsin challenge, none of the methods were able to accurately predict the experimental results. For the MM/GB(PB)SA and LIE methods, the rankings were essentially random and the mean absolute deviations were much worse than a null hypothesis giving the same affinity to all ligands. Glide scoring gave a Kendall’s τ index better than random, but the ranking is still only mediocre, τ = 0.2. However, the range of affinities is small and most of the pairs of ligands have an experimental affinity difference that is not statistically significant. Removing those pairs improves the ranking metric to 0.4–1.0 for all methods except CLIE. Half of the trypsin ligands were non-binders according to the binding assay. The LIE methods could not separate the inactive ligands from the active ones better than a random guess, whereas MM/GBSA and MM/PBSA were slightly better than random (area under the receiver-operating-characteristic curve, AUC = 0.65–0.68), and Glide scoring was even better (AUC = 0.79). For the first host, MM/GBSA and MM/PBSA reproduce the experimental ranking fairly well, with τ = 0.6 and 0.5, respectively, whereas the Glide scoring was considerably worse, with τ = 0.4, highlighting that the success of the methods is system-dependent.
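
The two ranking metrics quoted in this abstract, Kendall's τ and the ROC AUC, can be illustrated with a minimal pure-Python sketch. The function names and toy numbers below are hypothetical and are not taken from the paper. The τ variant shown is the simple tau-a, which ignores ties in the denominator, and may differ from the variant the authors used.

```python
from itertools import combinations


def kendall_tau(predicted, experimental):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    if len(predicted) != len(experimental):
        raise ValueError("both lists must have the same length")
    pairs = list(combinations(range(len(predicted)), 2))
    concordant = discordant = 0
    for i, j in pairs:
        # The sign of the product says whether both rankings order the pair alike.
        sign = (predicted[i] - predicted[j]) * (experimental[i] - experimental[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)


def roc_auc(scores, is_active):
    """Probability that a random active outscores a random inactive.

    Assumes a higher score means "more likely active"; ties count as 0.5.
    """
    actives = [s for s, a in zip(scores, is_active) if a]
    inactives = [s for s, a in zip(scores, is_active) if not a]
    wins = sum(
        1.0 if a > i else 0.5 if a == i else 0.0
        for a in actives
        for i in inactives
    )
    return wins / (len(actives) * len(inactives))


# Toy data (made up): three ligands with one swapped pair, four scored compounds.
tau = kendall_tau([1.0, 2.0, 3.0], [1.0, 3.0, 2.0])
auc = roc_auc([0.9, 0.8, 0.3, 0.1], [True, False, True, False])
```

A τ of 0 or an AUC of 0.5 corresponds to the "random" baseline the abstract refers to. If the scores are binding free energies, where more negative means stronger binding, they would need to be negated before being passed to `roc_auc`.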

Journal ArticleDOI
TL;DR: This special issue of the Journal of Computer-Aided Molecular Design is the culmination of the 4th Statistical Assessment of the Modeling of Proteins and Ligands (SAMPL) challenge and workshop, and SAMPL3 was the first blinded challenge to include prediction of host–guest binding affinities.
Abstract: This special issue of the Journal of Computer-Aided Molecular Design is the culmination of the 4th Statistical Assessment of the Modeling of Proteins and Ligands (SAMPL) challenge and workshop. SAMPL3 had three datasets: blinded small-molecule hydration energies, provided by Peter Guthrie [1]; two novel host–guest systems, including eleven unpublished binding energies, provided by Adam Urbach and Lyle Isaacs [2]; and a monumental dataset including structural and affinity data for 500 fragments against trypsin, provided by Tom Peat [3]. The SAMPL3 workshop saw over 40 attendees while the SAMPL3 challenge received 103 submissions from 23 participating groups using a variety of methods including: discrete and dynamic conformational sampling; implicit, semi-implicit and explicit water models; and a myriad of force fields and charge models. Gilson [2] and Geballe [1] have provided summaries of the host–guest challenge and solvation-energy challenge respectively. As with prior SAMPL challenges, many different approaches generated high-quality predictions yet no single technique distinguished itself significantly. Nevertheless, many important insights into the strengths and limitations of computational and experimental methods were developed through SAMPL. SAMPL3 was the first blinded challenge to include prediction of host–guest binding affinities. Host–guest binding affinities provided an outstanding blind challenge, as they are simple enough to encourage participants to recognize, explore and address assumptions and errors. Most participants had great difficulty modeling the aspartyl-protease-like formal charges found in the host molecules. This is concerning, for while ionization sites occur commonly in protein–ligand systems, rarely are they addressed at the level of detail participants found necessary for this host–guest system.
More host–guest examples should be included in future SAMPL challenges as their streamlined nature highlights assumptions that can be too easily overlooked. SAMPL is one of several projects that provide blinded or prospective experimental challenges to the computational community [4–6]. These projects are intended to serve as both a guidepost for computational progress and a meeting ground for experimental and computational scientists. However, it has been a struggle to generate mutual interest between computational and experimental scientists. This year, SAMPL had a breakthrough in the form of Lyle Isaacs (host–guest affinities) and Tom Peat (trypsin structures and affinities). Lyle and Tom are experimentalists who provided data to SAMPL3, attended the SAMPL workshop, and provided insights and challenges to the computational scientists. We hope they are the first of many experimentalists to join SAMPL and challenge theorists with their data. Unfortunately, no experimental collaborator has emerged to provide prospective hydration free energies, which have been part of each of the four SAMPL evaluations. Hydration energies are the most basic measure of the solvation of molecules in water. Aqueous solvation plays a critical role in most biophysical and biochemical phenomena, and our ability to accurately predict biophysical processes is limited by our ability to accurately calculate solvation interactions. Hydration energies represent one of the simplest experiments that allow us to evaluate these predictions. As a consequence, comparison to hydration energies is a fundamental tool for evaluating force fields and electrostatic models (for example see [7]).
A. G. Skillman, OpenEye Scientific Software, Santa Fe, NM, USA (e-mail: skillman@eyesopen.com)

Journal ArticleDOI
TL;DR: Macrocycles from the Aurora project were screened in a kinase panel and were found to be active on other kinase targets, mainly JAKs, FLT3 and CDKs; macrocycles with an ether linker showed good selectivity for JAK2 over JAK3 and CDKs.
Abstract: Macrocycles from our Aurora project were screened in a kinase panel and were found to be active on other kinase targets, mainly JAKs, FLT3 and CDKs. Subsequently these compounds became leads in our JAK2 project. Macrocycles with a basic nitrogen in the linker form a salt bridge with Asp86 in CDK2 and Asp698 in FLT3. This residue is conserved in most CDKs resulting in potent pan CDK inhibition. One of the main project objectives was to achieve JAK2 potency with 100-fold selectivity against CDKs. Macrocycles with an ether linker have potent JAK2 activity with the ether oxygen forming a hydrogen bond to Ser936. A hydrogen bond to the equivalent residues of JAK3 and most CDKs cannot be formed resulting in good selectivity for JAK2 over JAK3 and CDKs. Further optimization of the macrocyclic linker and side chain increased JAK2 and FLT3 activity as well as improving DMPK properties. The selective JAK2/FLT3 inhibitor 11 (Pacritinib, SB1518) has successfully finished phase 2 clinical trials for myelofibrosis and lymphoma. Another selective JAK2/FLT3 inhibitor, 33 (SB1578), has entered phase 1 clinical development for the non-oncology indication rheumatoid arthritis.

Journal ArticleDOI
TL;DR: A substrate binding comparison of CHI II with the mesophilic chitinase from Coccidioides immitis, 1D2K, suggested that the psychrophilic adaptation and catalytic activity at low temperatures were achieved through a reduction in the number of salt bridges, fewer hydrogen bonds and an increase in the exposure of the hydrophobic side chains to the solvent.
Abstract: The structure of psychrophilic chitinase (CHI II) from Glaciozyma antarctica PI12 has yet to be studied in detail. Due to its low sequence identity (<30 %), the structural prediction of CHI II is a challenge. A 3D model of CHI II was built by first using a threading approach to search for a suitable template and to generate an optimum target-template alignment, followed by model building using MODELLER9v7. Analysis of the catalytic insertion domain structure in CHI II revealed an increase in the number of aromatic residues and longer loops compared to mesophilic and thermophilic chitinases. A molecular dynamics simulation was used to examine the stability of the CHI II structure at 273, 288 and 300 K. Structural analysis of the substrate-binding cleft revealed a few exposed aromatic residues. Substitutions of certain amino acids in the surface and loop regions of CHI II conferred an increased flexibility to the enzyme, allowing for an adaptation to cold temperatures. A substrate binding comparison of CHI II with the mesophilic chitinase from Coccidioides immitis, 1D2K, suggested that the psychrophilic adaptation and catalytic activity at low temperatures were achieved through a reduction in the number of salt bridges, fewer hydrogen bonds and an increase in the exposure of the hydrophobic side chains to the solvent.

Journal ArticleDOI
TL;DR: A major part of the future of QSAR analysis, and its application to modeling biological potency, ADME-Tox properties, general use in virtual screening applications, as well as its expanding use into new fields for building QSPR models, lies in developing strategies that combine and use 1D through nD molecular descriptors.
Abstract: The usefulness and utility of QSAR modeling depends heavily on the ability to estimate the values of molecular descriptors relevant to the endpoints of interest followed by an optimized selection of descriptors to form the best QSAR models from a representative set of the endpoints of interest. The performance of a QSAR model is directly related to its molecular descriptors. QSAR modeling, specifically model construction and optimization, has benefited from its ability to borrow from other unrelated fields, yet the molecular descriptors that form QSAR models have remained basically unchanged in both form and preferred usage. There are many types of endpoints that require multiple classes of descriptors (descriptors that encode 1D through multi-dimensional, 4D and above, content) to most fully capture the molecular features and interactions that contribute to the endpoint. The advantages of QSAR models constructed from multiple, and different, descriptor classes have been demonstrated in the exploration of markedly different, and principally biological, systems and endpoints. Multiple examples of such QSAR applications using different descriptor sets are described and examined. The take-home message is that a major part of the future of QSAR analysis, and its application to modeling biological potency, ADME-Tox properties, general use in virtual screening applications, as well as its expanding use into new fields for building QSPR models, lies in developing strategies that combine and use 1D through nD molecular descriptors.

Journal ArticleDOI
TL;DR: The next 25 years will undoubtedly show a series of translational science activities that are aimed at better communication between all parties involved, from quantum chemistry to bedside and from academia to industry.
Abstract: In its first 25 years JCAMD has been disseminating a large number of techniques aimed at finding better medicines faster. These include genetic algorithms, CoMFA, QSAR, structure based techniques, homology modelling, high throughput screening, combichem, and dozens more that were a hype in their time and that now are just a useful addition to the drug designer's toolbox. Despite massive efforts throughout academic and industrial drug design research departments, the number of FDA-approved new molecular entities per year stagnates, and the pharmaceutical industry is reorganising accordingly. The recent spate of industrial consolidations and the concomitant move towards outsourcing of research activities require better integration of all activities along the chain from bench to bedside. The next 25 years will undoubtedly show a series of translational science activities that are aimed at better communication between all parties involved, from quantum chemistry to bedside and from academia to industry. This will above all include understanding the underlying biological problem and optimal use of all available data.

Journal ArticleDOI
TL;DR: The pharmacophoric models and associated QSAR equation were employed to screen the National Cancer Institute (NCI) list of compounds. Eight submicromolar ROCKII inhibitors were identified, and the successful pharmacophore models were found to be comparable with the crystallographically resolved ROCKII binding pocket.
Abstract: Rho Kinase (ROCKII) has been recently implicated in several cardiovascular diseases, prompting several attempts to discover and optimize new ROCKII inhibitors. Towards this end we explored the pharmacophoric space of 138 ROCKII inhibitors to identify high quality pharmacophores. The pharmacophoric models were subsequently allowed to compete within a quantitative structure–activity relationship (QSAR) context. Genetic algorithm and multiple linear regression analysis were employed to select an optimal combination of pharmacophoric models and 2D physicochemical descriptors capable of accessing a self-consistent QSAR of optimal predictive potential ($r_{77} = 0.84$, $F = 18.18$, $r^2_{\mathrm{LOO}} = 0.639$, $r^2_{\mathrm{PRESS}} = 0.494$ against 19 external test inhibitors). Two orthogonal pharmacophores emerged in the QSAR equation, suggesting the existence of at least two binding modes accessible to ligands within the ROCKII binding pocket. Receiver operating characteristic (ROC) curve analyses established the validity of the QSAR-selected pharmacophores. Moreover, the successful pharmacophore models were found to be comparable with the crystallographically resolved ROCKII binding pocket. We employed the pharmacophoric models and associated QSAR equation to screen the National Cancer Institute (NCI) list of compounds. Eight submicromolar ROCKII inhibitors were identified. The most potent gave IC50 values of 0.7 and 1.0 μM.
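
The leave-one-out (LOO) statistic reported in this abstract can be illustrated with a small sketch. This is not the authors' workflow, which combined a genetic algorithm with multiple linear regression over many descriptors. The version below assumes a single hypothetical descriptor, ordinary least squares, and made-up data, purely to show how a cross-validated r² is computed from the predictive residual sum of squares (PRESS).

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loo_r2(xs, ys):
    """Leave-one-out cross-validated r^2: 1 - PRESS / total sum of squares."""
    press = 0.0
    for k in range(len(xs)):
        # Refit the model without compound k, then predict compound k.
        slope, intercept = fit_line(xs[:k] + xs[k + 1:], ys[:k] + ys[k + 1:])
        press += (ys[k] - (slope * xs[k] + intercept)) ** 2
    mean_y = sum(ys) / len(ys)
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_total


# Hypothetical descriptor values and activities (illustrative only).
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
activity = [2.1, 3.9, 6.2, 7.8, 10.1]
q2 = loo_r2(descriptor, activity)
```

Because each compound is predicted by a model that never saw it, the LOO value is generally lower than the fitted correlation, which is consistent with the gap between the fitted and cross-validated statistics quoted in the abstract.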