
Showing papers by "David L. Mobley" published in 2018


Journal ArticleDOI
TL;DR: This Review summarizes the state-of-the-art of the two fields, and highlights where there is latent chemical space for collaborative exploration by the two groups.
Abstract: On planet Earth, water is everywhere: the majority of the surface is covered with it; it is a key component of all life; its vapour and droplets fill the lower atmosphere; and even rocks contain it and undergo geomorphological changes because of it. A community of physical scientists largely drives studies of the chemistry of water and aqueous solutions, with expertise in biochemistry, spectroscopy and computer modelling. More recently, however, supramolecular chemists, with their expertise in macrocyclic synthesis and measuring supramolecular interactions, have renewed their interest in water-mediated non-covalent interactions. These two groups offer complementary expertise that, if harnessed, could accelerate our understanding of aqueous supramolecular chemistry and water writ large. This Review summarizes the state-of-the-art of the two fields, and highlights where there is latent chemical space for collaborative exploration by the two groups.

118 citations


Journal ArticleDOI
TL;DR: This work describes a new approach to assigning force field parameters via direct chemical perception, which operates directly on the unmodified chemical graph of the molecule to assign parameters, and implements a new force field format, called the SMIRKS Native Open Force Field (SMIRNOFF) format.
Abstract: Traditional approaches to specifying a molecular mechanics force field encode all the information needed to assign force field parameters to a given molecule into a discrete set of atom types. This is equivalent to a representation consisting of a molecular graph comprising a set of vertices, which represent atoms labeled by atom type, and unlabeled edges, which represent chemical bonds. Bond stretch, angle bend, and dihedral parameters are then assigned by looking up bonded pairs, triplets, and quartets of atom types in parameter tables to assign valence terms and using the atom types themselves to assign nonbonded parameters. This approach, which we call indirect chemical perception because it operates on the intermediate graph of atom-typed nodes, creates a number of technical problems. For example, atom types must be sufficiently complex to encode all necessary information about the molecular environment, making it difficult to extend force fields encoded this way. Atom typing also results in a prolif...

103 citations


Journal ArticleDOI
TL;DR: An overview of the SAMPL6 host–guest binding affinity prediction challenge, which featured three supramolecular hosts. Comparison with previous rounds shows an overall improvement in the correlation obtained by the affinity predictions for OA and TEMOA systems, but a surprising lack of improvement in root mean square error over the past several challenge rounds.
Abstract: Accurately predicting the binding affinities of small organic molecules to biological macromolecules can greatly accelerate drug discovery by reducing the number of compounds that must be synthesized to realize desired potency and selectivity goals. Unfortunately, the process of assessing the accuracy of current computational approaches to affinity prediction against binding data to biological macromolecules is frustrated by several challenges, such as slow conformational dynamics, multiple titratable groups, and the lack of high-quality blinded datasets. Over the last several SAMPL blind challenge exercises, host-guest systems have emerged as a practical and effective way to circumvent these challenges in assessing the predictive performance of current-generation quantitative modeling tools, while still providing systems capable of possessing tight binding affinities. Here, we present an overview of the SAMPL6 host-guest binding affinity prediction challenge, which featured three supramolecular hosts: octa-acid (OA), the closely related tetra-endo-methyl-octa-acid (TEMOA), and cucurbit[8]uril (CB8), along with 21 small organic guest molecules. A total of 119 entries were received from ten participating groups employing a variety of methods that spanned from electronic structure and movable type calculations in implicit solvent to alchemical and potential of mean force strategies using empirical force fields with explicit solvent models. While empirical models tended to obtain better performance than first-principle methods, it was not possible to identify a single approach that consistently provided superior results across all host-guest systems and statistical metrics. Moreover, the accuracy of the methodologies generally displayed a substantial dependence on the system considered, emphasizing the need for host diversity in blind evaluations. 
Several entries exploited previous experimental measurements of similar host-guest systems in an effort to improve their physics-based predictions via some manner of rudimentary machine learning; while this strategy succeeded in reducing systematic errors, it did not correspond to an improvement in statistical correlation. Comparison to previous rounds of the host-guest binding free energy challenge highlights an overall improvement in the correlation obtained by the affinity predictions for OA and TEMOA systems, but a surprising lack of improvement regarding root mean square error over the past several challenge rounds. The data suggest that further refinement of force field parameters, as well as improved treatment of chemical effects (e.g., buffer salt conditions, protonation states), may be required to further enhance predictive accuracy.
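The distinction drawn above between correlation and root mean square error can be illustrated with a minimal sketch. The binding free energies below are hypothetical illustration values, not SAMPL6 data, and the function names are assumptions introduced here. The example shows how a constant systematic offset leaves the Pearson correlation perfect while the RMSE stays large.

```python
import math

def rmse(pred, expt):
    """Root mean square error between predicted and experimental values."""
    return math.sqrt(sum((p - e) ** 2 for p, e in zip(pred, expt)) / len(pred))

def pearson_r(pred, expt):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(pred)
    mean_p = sum(pred) / n
    mean_e = sum(expt) / n
    cov = sum((p - mean_p) * (e - mean_e) for p, e in zip(pred, expt))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_e = sum((e - mean_e) ** 2 for e in expt)
    return cov / math.sqrt(var_p * var_e)

# Hypothetical binding free energies in kcal/mol (illustration only).
expt = [-5.0, -6.0, -7.0, -8.0]
# Predictions that rank every guest correctly but are shifted by 2 kcal/mol.
pred = [e - 2.0 for e in expt]

print("RMSE:", rmse(pred, expt))
print("Pearson r:", pearson_r(pred, expt))
```

This is why the two metrics can move independently across challenge rounds: a correction that removes a systematic shift lowers RMSE without changing the ranking, and vice versa.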

99 citations


Journal ArticleDOI
TL;DR: In this paper, the reproducibility of relative alchemical free energy (RAFE) simulations is evaluated for a set of small organic molecules, and it is demonstrated that hydration free energies can be reproduced to within about 0.2 kcal/mol across AMBER, CHARMM, GROMACS, and SOMD.
Abstract: Alchemical free energy calculations are an increasingly important modern simulation technique to calculate free energy changes on binding or solvation. Contemporary molecular simulation software such as AMBER, CHARMM, GROMACS, and SOMD include support for the method. Implementation details vary among those codes, but users expect reliability and reproducibility, i.e., for a given molecular model and set of force field parameters, comparable free energy differences should be obtained within statistical bounds regardless of the code used. Relative alchemical free energy (RAFE) simulation is increasingly used to support molecule discovery projects, yet the reproducibility of the methodology has been less well tested than its absolute counterpart. Here we present RAFE calculations of hydration free energies for a set of small organic molecules and demonstrate that free energies can be reproduced to within about 0.2 kcal/mol with the aforementioned codes. Absolute alchemical free energy simulations have been c...
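As a rough sketch of the bookkeeping behind a relative hydration free energy, the standard thermodynamic cycle gives the relative value as the difference between the alchemical transformation A→B in solution and the same transformation in vacuum. The snippet below applies that cycle and then checks agreement across codes against a tolerance. The labels `code_A` to `code_C` and all numbers are hypothetical placeholders, not results from the paper.

```python
from itertools import combinations

def relative_hydration_free_energy(dg_solution, dg_vacuum):
    """ddG_hyd(A->B) = dG_solution(A->B) - dG_vacuum(A->B), in kcal/mol."""
    return dg_solution - dg_vacuum

def max_pairwise_deviation(values):
    """Largest absolute difference between any two values."""
    return max(abs(a - b) for a, b in combinations(values, 2))

# Hypothetical per-code leg free energies (solution, vacuum) in kcal/mol.
legs = {
    "code_A": (3.10, 1.00),
    "code_B": (3.25, 1.05),
    "code_C": (3.05, 0.90),
}
ddg = {name: relative_hydration_free_energy(*pair) for name, pair in legs.items()}

TOLERANCE = 0.2  # kcal/mol, the reproducibility level quoted in the abstract
spread = max_pairwise_deviation(list(ddg.values()))
print(ddg)
print("max deviation:", spread, "within tolerance:", spread <= TOLERANCE)
```

A real comparison would also account for the statistical uncertainty of each estimate, which this sketch omits.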

65 citations


Journal ArticleDOI
TL;DR: This work develops and applies a nonequilibrium candidate Monte Carlo (NCMC) method to improve sampling of ligand binding modes and is making this approach available via the new Binding modes of ligands using enhanced sampling (BLUES) package which is freely available on GitHub.
Abstract: Accurately predicting protein–ligand binding affinities and binding modes is a major goal in computational chemistry, but even the prediction of ligand binding modes in proteins poses major challenges. Here, we focus on solving the binding mode prediction problem for rigid fragments. That is, we focus on computing the dominant placement, conformation, and orientations of a relatively rigid, fragment-like ligand in a receptor, and the populations of the multiple binding modes which may be relevant. This problem is important in its own right, but is even more timely given the recent success of alchemical free energy calculations. Alchemical calculations are increasingly used to predict binding free energies of ligands to receptors. However, the accuracy of these calculations is dependent on proper sampling of the relevant ligand binding modes. Unfortunately, ligand binding modes may often be uncertain, hard to predict, and/or slow to interconvert on simulation time scales, so proper sampling with current te...

50 citations


Journal ArticleDOI
TL;DR: This work used UV absorbance-based pKa measurements to construct a high-quality experimental reference dataset of macroscopic pKas for the evaluation of computational pKa prediction methodologies that was utilized in the SAMPL6 pKa challenge.
Abstract: Determining the net charge and protonation states populated by a small molecule in an environment of interest or the cost of altering those protonation states upon transfer to another environment is a prerequisite for predicting its physicochemical and pharmaceutical properties. The environment of interest can be aqueous, an organic solvent, a protein binding site, or a lipid bilayer. Predicting the protonation state of a small molecule is essential to predicting its interactions with biological macromolecules using computational models. Incorrectly modeling the dominant protonation state, shifts in dominant protonation state, or the population of significant mixtures of protonation states can lead to large modeling errors that degrade the accuracy of physical modeling. Low accuracy hinders the use of physical modeling approaches for molecular design. For small molecules, the acid dissociation constant (pKa) is the primary quantity needed to determine the ionic states populated by a molecule in an aqueous solution at a given pH. As part of the SAMPL6 community challenge, we organized a blind pKa prediction component to assess the accuracy with which contemporary pKa prediction methods can predict this quantity, with the ultimate aim of assessing the expected impact on modeling errors this would induce. While a multitude of approaches for predicting pKa values currently exist, predicting the pKas of drug-like molecules can be difficult due to challenging properties such as multiple titratable sites, heterocycles, and tautomerization. For this challenge, we focused on a set of 24 small molecules selected to resemble selective kinase inhibitors, an important class of therapeutics replete with titratable moieties.
Using a Sirius T3 instrument that performs automated acid–base titrations, we used UV absorbance-based pKa measurements to construct a high-quality experimental reference dataset of macroscopic pKas for the evaluation of computational pKa prediction methodologies that was utilized in the SAMPL6 pKa challenge. For several compounds in which the microscopic protonation states associated with macroscopic pKas were ambiguous, we performed follow-up NMR experiments to disambiguate the microstates involved in the transition. This dataset provides a useful standard benchmark dataset for the evaluation of pKa prediction methodologies on kinase inhibitor-like compounds.
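To make concrete how a pKa determines the ionic states populated at a given pH, here is a minimal sketch for the simplest case of a single titratable site, using the Henderson–Hasselbalch relation. The function name and the example values are assumptions for illustration; the challenge molecules often have multiple titratable sites, which this one-site model does not cover.

```python
def fraction_deprotonated(pka: float, ph: float) -> float:
    """Fraction of a monoprotic acid HA present as A- at the given pH.

    Follows from Henderson-Hasselbalch: [A-]/[HA] = 10**(pH - pKa).
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))

# Hypothetical site with pKa 4.0, evaluated at a few pH values.
for ph in (2.0, 4.0, 7.4):
    print(f"pH {ph}: {fraction_deprotonated(4.0, ph):.4f} deprotonated")
```

At pH equal to the pKa the two states are equally populated, so an error of one pKa unit in a prediction shifts the population ratio by a factor of ten.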

36 citations


Journal ArticleDOI
TL;DR: This work addresses the performance of different atomic charges to reproduce experimental hydration free energies in the FreeSolv database in combination with the GAFF force field and finds the best agreement with the experimental data is observed for the AM1-BCC and the MBIS atomic charges.
Abstract: Computer simulations of biomolecular systems often use force fields, which are combinations of simple empirical atom-based functions to describe the molecular interactions. Even though polarizable force fields give a more detailed description of intermolecular interactions, nonpolarizable force fields, developed several decades ago, are often still preferred because of their reduced computation cost. Electrostatic interactions play a major role in biomolecular systems and are therein described by atomic point charges. In this work, we address the performance of different atomic charges to reproduce experimental hydration free energies in the FreeSolv database in combination with the GAFF force field. Atomic charges were calculated by two atoms-in-molecules approaches, Hirshfeld-I and Minimal Basis Iterative Stockholder (MBIS). To account for polarization effects, the charges were derived from the solute's electron density computed with an implicit solvent model, and the energy required to polarize the solute was added to the free energy cycle. The calculated hydration free energies were analyzed with an error model, revealing systematic errors associated with specific functional groups or chemical elements. The best agreement with the experimental data is observed for the AM1-BCC and the MBIS atomic charge methods. The latter includes the solvent polarization and presents a root-mean-square error of 2.0 kcal mol⁻¹ for the 613 organic molecules studied. The largest deviation was observed for phosphorus-containing molecules and the molecules with amide, ester and amine functional groups.

31 citations


Posted ContentDOI
13 Jul 2018-bioRxiv
TL;DR: This work describes a new approach to assigning parameters for molecular mechanics force fields based on the industry standard SMARTS chemical perception language, where parameters are assigned directly based on substructure queries operating on the molecule(s) being parameterized, thereby avoiding the intermediate step of assigning atom types.
Abstract: Here, we focus on testing and improving force fields for molecular modeling, which see widespread use in diverse areas of computational chemistry and biomolecular simulation. A key issue affecting the accuracy and transferability of these force fields is the use of atom typing. Traditional approaches to defining molecular mechanics force fields must encode, within a discrete set of atom types, all information which will ever be needed about the chemical environment; parameters are then assigned by looking up combinations of these atom types in tables. This atom typing approach leads to a wide variety of problems such as inextensible atom-typing machinery, enormous difficulty in expanding parameters encoded by atom types, and unnecessary proliferation of encoded parameters. Here, we describe a new approach to assigning parameters for molecular mechanics force fields based on the industry standard SMARTS chemical perception language (with extensions to identify specific atoms available in SMIRKS). In this approach, each force field term (bonds, angles, torsions, and nonbonded interactions) features separate definitions assigned in a hierarchical manner without using atom types. We accomplish this using direct chemical perception, where parameters are assigned directly based on substructure queries operating on the molecule(s) being parameterized, thereby avoiding the intermediate step of assigning atom types, a step which can be considered indirect chemical perception. Direct chemical perception allows for substantial simplification of force fields, as well as additional generality in the substructure queries. This approach is applicable to a wide variety of (bio)molecular systems, and can greatly reduce the number of parameters needed to create a complete force field.
Further flexibility can also be gained by allowing force field terms to be interpolated based on the assignment of fractional bond orders via the same procedure used to assign partial charges. As an example of the utility of this approach, we provide a minimalist small molecule force field derived from Merck's parm@Frosst (an Amber parm99 descendant), in which a parameter definition file only approximately 300 lines long can parameterize a large and diverse spectrum of pharmaceutically relevant small molecule chemical space. We benchmark this minimalist force field on the FreeSolv small molecule hydration free energy set and calculations of densities and dielectric constants from the ThermoML Archive, demonstrating that it achieves comparable accuracy to the Generalized Amber Force Field (GAFF) that consists of many thousands of parameters.

22 citations


Journal ArticleDOI
TL;DR: The intent of this study is to provide a way for developers of implicit solvent model parameter sets to understand the sensitivity of their target properties (solvation energy) on underlying choices for solute radius and charge parameters.
Abstract: Atomic radii and charges are two major parameters used in implicit solvent electrostatics and energy calculations. The optimization problem for charges and radii is underdetermined, leading to uncertainty in the values of these parameters and in the results of solvation energy calculations using these parameters. This paper presents a new method for quantifying this uncertainty in implicit solvation calculations of small molecules using surrogate models based on generalized polynomial chaos (gPC) expansions. There are relatively few atom types used to specify radii parameters in implicit solvation calculations; therefore, surrogate models for these low-dimensional spaces could be constructed using least-squares fitting. However, there are many more types of atomic charges; therefore, construction of surrogate models for the charge parameter space requires compressed sensing combined with an iterative rotation method to enhance problem sparsity. We demonstrate the application of the method by presenting results for the uncertainties in small molecule solvation energies based on these approaches. The method presented in this paper is a promising approach for efficiently quantifying uncertainty in a wide range of force field parametrization problems, including those beyond continuum solvation calculations. The intent of this study is to provide a way for developers of implicit solvent model parameter sets to understand the sensitivity of their target properties (solvation energy) on underlying choices for solute radius and charge parameters.

20 citations


Journal ArticleDOI
TL;DR: A general model for predicting microscopic and macroscopic pKas using a Gaussian process regression trained using physical and chemical features of each ionizable group, along with good agreement in quantile–quantile plots, indicating it can predict its own accuracy.
Abstract: A variety of fields would benefit from accurate pKa predictions, especially drug design due to the effect a change in ionization state can have on a molecule's physicochemical properties. Participants in the recent SAMPL6 blind challenge were asked to submit predictions for microscopic and macroscopic pKas of 24 drug-like small molecules. We recently built a general model for predicting pKas using a Gaussian process regression trained using physical and chemical features of each ionizable group. Our pipeline takes a molecular graph and uses the OpenEye Toolkits to calculate features describing the removal of a proton. These features are fed into a Scikit-learn Gaussian process to predict microscopic pKas which are then used to analytically determine macroscopic pKas. Our Gaussian process is trained on a set of 2700 macroscopic pKas from monoprotic and select diprotic molecules. Here, we share our results for microscopic and macroscopic predictions in the SAMPL6 challenge. Overall, we ranked in the middle of the pack compared to other participants, but our fairly good agreement with experiment is still promising considering the challenge molecules are chemically diverse and often polyprotic while our training set is predominantly monoprotic. Of particular importance to us when building this model was to include an uncertainty estimate based on the chemistry of the molecule that would reflect the likely accuracy of our prediction. Our model reports large uncertainties for the molecules that appear to have chemistry outside our domain of applicability, along with good agreement in quantile-quantile plots, indicating it can predict its own accuracy.
The challenge highlighted a variety of means to improve our model, including adding more polyprotic molecules to our training set and more carefully considering what functional groups we do or do not identify as ionizable.
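The abstract does not spell out how macroscopic pKas are obtained analytically from microscopic ones, so the following is only a sketch of one standard relation, not necessarily the authors' pipeline. When a single protonation state can lose a proton from any of several sites, the macroscopic dissociation constant for that step is the sum of the microscopic constants. The function name and input values are hypothetical.

```python
import math

def macroscopic_pka(micro_pkas):
    """Macroscopic pKa for one deprotonation step fed by parallel microscopic
    deprotonations from the same starting state: Ka_macro = sum(Ka_micro)."""
    total_ka = sum(10.0 ** (-pka) for pka in micro_pkas)
    return -math.log10(total_ka)

# Two hypothetical equivalent sites, each with microscopic pKa 7.0.
print(macroscopic_pka([7.0, 7.0]))
# One dominant site and one much weaker site.
print(macroscopic_pka([6.0, 9.0]))
```

With two equivalent sites the macroscopic pKa is lower than the microscopic value by log10(2), and with one dominant site the macroscopic value is close to that site's microscopic pKa.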

18 citations


Journal ArticleDOI
TL;DR: A concerted approach combining depth-dependent fluorescence quenching with Molecular Dynamics simulation to decipher dynamic interactions of membrane proteins with lipid bilayers. The results indicate that membrane partitioning of the neutral E362 is more favorable energetically, but causes stronger perturbation of the bilayer, than that of the charged E362.
Abstract: Dynamic disorder of the lipid bilayer presents a challenge for establishing structure–function relationships in membranous systems. The resulting structural heterogeneity is especially evident for peripheral and spontaneously inserting membrane proteins, which are not constrained by the well-defined transmembrane topology and exert their action in the context of intimate interaction with lipids. Here, we propose a concerted approach combining depth-dependent fluorescence quenching with Molecular Dynamics simulation to decipher dynamic interactions of membrane proteins with the lipid bilayers. We apply this approach to characterize membrane-mediated action of the diphtheria toxin translocation domain. First, we use a combination of the steady-state and time-resolved fluorescence spectroscopy to characterize bilayer penetration of the NBD probe selectively attached to different sites of the protein into membranes containing lipid-attached nitroxyl quenching groups. The constructed quenching profiles are analyzed with the Distribution Analysis methodology allowing for accurate determination of transverse distribution of the probe. The results obtained for 12 NBD-labeled single-Cys mutants are consistent with the so-called Open-Channel topology model. The experimentally determined quenching profiles for labeling sites corresponding to L350, N373, and P378 were used as initial constraints for positioning TH8–9 hairpin into the lipid bilayer for Molecular Dynamics simulation. Finally, we used alchemical free energy calculations to characterize protonation of E362 in soluble translocation domain and membrane-inserted conformation of its TH8–9 fragment. Our results indicate that membrane partitioning of the neutral E362 is more favorable energetically (by ~ 6 kcal/mol), but causes stronger perturbation of the bilayer, than the charged E362.

Journal ArticleDOI
TL;DR: Solubility prediction of drug-like solids remains computationally challenging, and it appears that both the underlying energy model and the computational approach applied may need improvement before the approach is suitable for routine use.
Abstract: Background: Solubility is a physical property of high importance to the pharmaceutical industry, the prediction of which for potential drugs has so far been a hard task. We attempted to predict the solubility of acetylsalicylic acid (ASA) by estimating the absolute chemical potentials of its most stable polymorph and of solutions with different concentrations of the drug molecule. Methods: Chemical potentials were estimated from all-atom molecular dynamics simulations. We used the Einstein molecule method (EMM) to predict the absolute chemical potential of the solid and solvation free energy calculations to predict the excess chemical potentials of the liquid-phase systems. Results: Reliable estimations of the chemical potentials for the solid and for a single ASA molecule using the EMM required an extremely large number of intermediate states for the free energy calculations, meaning that the calculations were extremely demanding computationally. Despite the computational cost, however, the computed value did not agree well with the experimental value, potentially due to limitations with the underlying energy model. Perhaps better values could be obtained with a better energy model; however, it seems likely computational cost may remain a limiting factor for use of this particular approach to solubility estimation. Conclusions: Solubility prediction of drug-like solids remains computationally challenging, and it appears that both the underlying energy model and the computational approach applied may need improvement before the approach is suitable for routine use.

Posted ContentDOI
04 Jun 2018-ChemRxiv
TL;DR: In the SAMPL6 blind challenge, this paper used a Gaussian process regression trained using physical and chemical features of each ionizable group to predict microscopic and macroscopic pKas of 24 drug-like small molecules.
Abstract: A variety of fields would benefit from accurate pKa predictions, especially drug design due to the effect a change in ionization state can have on a molecule's physicochemical properties. Participants in the recent SAMPL6 blind challenge were asked to submit predictions for microscopic and macroscopic pKas of 24 drug-like small molecules. We recently built a general model for predicting pKas using a Gaussian process regression trained using physical and chemical features of each ionizable group. Our pipeline takes a molecular graph and uses the OpenEye Toolkits to calculate features describing the removal of a proton. These features are fed into a Scikit-learn Gaussian process to predict microscopic pKas which are then used to analytically determine macroscopic pKas. Our Gaussian process is trained on a set of 2,700 macroscopic pKas from monoprotic and select diprotic molecules. Here, we share our results for microscopic and macroscopic predictions in the SAMPL6 challenge. Overall, we ranked in the middle of the pack compared to other participants, but our fairly good agreement with experiment is still promising considering the challenge molecules are chemically diverse and often polyprotic while our training set is predominantly monoprotic. Of particular importance to us when building this model was to include an uncertainty estimate based on the chemistry of the molecule that would reflect the likely accuracy of our prediction. Our model reports large uncertainties for the molecules that appear to have chemistry outside our domain of applicability, along with good agreement in quantile-quantile plots, indicating it can predict its own accuracy. The challenge highlighted a variety of means to improve our model, including adding more polyprotic molecules to our training set and more carefully considering what functional groups we do or do not identify as ionizable.

Posted ContentDOI
19 Jul 2018-bioRxiv
TL;DR: An overview of the SAMPL6 host-guest binding affinity prediction challenge, which featured three supramolecular hosts. Comparison with previous rounds shows an overall improvement in the correlation obtained by the affinity predictions for OA and TEMOA systems, but a surprising lack of improvement in root mean square error over the past several challenge rounds.
Abstract: The ability to accurately predict the binding affinities of small organic molecules to biological macromolecules would greatly accelerate drug discovery by reducing the number of compounds that must be synthesized to realize desired potency and selectivity goals. Unfortunately, the process of assessing the accuracy of current quantitative physical and empirical modeling approaches to affinity prediction against binding data to biological macromolecules is frustrated by several challenges, such as slow conformational dynamics, multiple titratable groups, and the lack of high-quality blinded datasets. Over the last several SAMPL blind challenge exercises, host-guest systems have emerged as a practical and effective way to circumvent these challenges in assessing the predictive performance of current-generation quantitative modeling tools, while still providing systems capable of possessing tight binding affinities. Here, we present an overview of the SAMPL6 host-guest binding affinity prediction challenge, which featured three supramolecular hosts: octa-acid (OA), the closely related tetra-endo-methyl-octa-acid (TEMOA), and cucurbit[8]uril (CB8), along with 21 small organic guest molecules. A total of 119 entries were received from 10 participating groups employing a variety of methods that spanned electronic structure and movable type calculations in implicit solvent to alchemical and potential of mean force strategies using empirical force fields and explicit solvent models. While empirical models tended to obtain better performance, it was not possible to identify a single approach consistently providing superior predictions across all host-guest systems and statistical metrics, and the accuracy of the methodologies generally displayed a substantial dependence on the systems considered, arguing for the importance of considering a diverse set of hosts in blind evaluations. 
Several entries exploited previous experimental measurements of similar host-guest systems in an effort to improve their physics-based predictions via some manner of rudimentary machine learning; while this strategy succeeded in reducing systematic errors, it was not able to generate a corresponding improvement in correlation statistics. Comparison to previous rounds of the host-guest binding free energy challenge highlights an overall improvement in the correlation obtained by the affinity predictions for OA and TEMOA systems, but a surprising lack of improvement in root mean square error over the past several challenge rounds. The data suggest that further refinement of force field parameters and improved treatment of chemical effects (e.g., buffer salt conditions, protonation states) may be required to continue to enhance predictive accuracy.

Posted ContentDOI
13 Jul 2018-bioRxiv
TL;DR: A blind pKa prediction component is organized to assess the accuracy with which contemporary pKa prediction methods can predict this quantity, with the ultimate aim of assessing the expected impact on modeling errors this would induce.
Abstract: Determining the net charge and protonation states populated by a small molecule in an environment of interest (such as solvent, a protein binding site, or a lipid bilayer), or the cost of altering those protonation states upon transfer to another environment, is a prerequisite for predicting its physicochemical and pharmaceutical properties, as well as interactions with biological macromolecules using computational models. Incorrectly modeling the dominant protonation state, shifts in dominant protonation state, or the population of significant mixtures of protonation states can lead to large modeling errors that degrade the accuracy of physical modeling and hinder the ability to use physical modeling approaches for molecular design. For small molecules, the acid dissociation constant (pKa) is the primary quantity needed to determine the ionic states populated by a molecule in an aqueous solution at a given pH. As part of the SAMPL6 community challenge, we organized a blind pKa prediction component to assess the accuracy with which contemporary pKa prediction methods can predict this quantity, with the ultimate aim of assessing the expected impact on modeling errors this would induce. While a multitude of approaches for predicting pKa values currently exist, predicting the pKas of drug-like molecules can be difficult due to challenging properties such as multiple titratable sites, heterocycles, and tautomerization. For this challenge, we focused on a set of 24 small molecules selected to resemble selective kinase inhibitors, an important class of therapeutics replete with titratable moieties. Using a Sirius T3 instrument that performs automated acid-base titrations, we used UV absorbance-based pKa measurements to construct a high-quality experimental reference dataset of macroscopic pKas for the evaluation of computational pKa prediction methodologies that was utilized in the SAMPL6 pKa challenge.
For several compounds in which the microscopic protonation states associated with macroscopic pKas were ambiguous, we performed follow-up NMR experiments to disambiguate the microstates involved in the transition. This dataset provides a useful standard benchmark dataset for the evaluation of pKa prediction methodologies on kinase inhibitor-like compounds.


Posted ContentDOI
04 Jun 2018-ChemRxiv
TL;DR: In this article, the relative alchemical free energy (RAFE) simulation is used to support molecule discovery projects, and the reproducibility of the methodology has been less well tested than its absolute counterpart.
Abstract: Alchemical free energy calculations are an increasingly important modern simulation technique. Contemporary molecular simulation software such as AMBER, CHARMM, GROMACS and SOMD include support for the method. Implementation details vary among those codes but users expect reliability and reproducibility, i.e. for a given molecular model and set of force field parameters, comparable free energies should be obtained within statistical bounds regardless of the code used. Relative alchemical free energy (RAFE) simulation is increasingly used to support molecule discovery projects, yet the reproducibility of the methodology has been less well tested than its absolute counterpart. Here we present RAFE calculations of hydration free energies for a set of small organic molecules and demonstrate that free energies can be reproduced to within about 0.2 kcal/mol with the aforementioned codes. Achieving this level of reproducibility requires considerable attention to detail and package-specific simulation protocols, and no universally applicable protocol emerges. The benchmarks and protocols reported here should be useful for the community to validate new and future versions of software for free energy calculations.

Posted ContentDOI
23 Nov 2018-ChemRxiv
TL;DR: In this paper, the relative energetic stabilities of syn and anti acetic acid were compared using ab initio quantum mechanical calculations and atomistic molecular dynamics simulations, and it was shown that while the syn conformation is the preferred state, the anti state may in some cases also be present under normal NPT conditions in solution.
Abstract: Accurate hydrogen placement in molecular modeling is crucial for studying the interactions and dynamics of biomolecular systems. Hydrogen atoms are difficult to locate with many experimental structural characterization approaches, for example because they scatter X-ray radiation only weakly. Hydrogen atoms are usually added and positioned in silico when preparing experimental structures for modeling and simulation. The carboxyl functional group is a prototypical example of a functional group that requires protonation during structure preparation. To our knowledge, when in their neutral form, carboxylic acids are typically protonated in the syn conformation by default in classical molecular modeling packages, with no consideration of alternative conformations, though we are not aware of any careful examination of this topic. Here, we investigate the general belief that carboxylic acids should always be protonated in the syn conformation. We calculate and compare the relative energetic stabilities of syn and anti acetic acid using ab initio quantum mechanical calculations and atomistic molecular dynamics simulations. We show that while the syn conformation is the preferred state, the anti state may in some cases also be present under normal NPT conditions in solution.
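To show how a relative stability translates into a population, here is a minimal two-state Boltzmann sketch. It assumes the syn/anti difference can be treated as a free energy difference between two non-degenerate states, and the energy values used are hypothetical placeholders rather than numbers from the paper.

```python
import math

GAS_CONSTANT_KCAL = 0.0019872041  # kcal/(mol K)

def anti_population(delta_g_kcal: float, temperature_k: float = 298.15) -> float:
    """Equilibrium fraction in the anti state for a two-state syn/anti model.

    delta_g_kcal is G(anti) - G(syn); positive values favor syn.
    """
    weight = math.exp(-delta_g_kcal / (GAS_CONSTANT_KCAL * temperature_k))
    return weight / (1.0 + weight)

# Hypothetical free energy differences in kcal/mol (illustration only).
for delta_g in (0.5, 2.0, 5.0):
    print(f"dG = {delta_g} kcal/mol -> anti fraction {anti_population(delta_g):.4f}")
```

Even when syn is clearly preferred, a difference of a few kcal/mol still leaves a small but nonzero anti population at room temperature, which is consistent with the qualitative conclusion of the abstract.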