Open Access Journal Article

Group Contribution and Machine Learning Approaches to Predict Abraham Solute Parameters, Solvation Free Energy, and Solvation Enthalpy

TLDR
In this paper, a group contribution method (SoluteGC) and a machine learning model (SoluteML) are presented to predict Abraham solute parameters, along with a machine learning model (DirectML) that predicts solvation free energy and enthalpy at 298 K. The results show that the DirectML model is superior to the SoluteGC and SoluteML models for both predictions and can provide accuracy comparable to that of advanced quantum chemistry methods.
Abstract
We present a group contribution method (SoluteGC) and a machine learning model (SoluteML) to predict the Abraham solute parameters, as well as a machine learning model (DirectML) to predict solvation free energy and enthalpy at 298 K. The proposed group contribution method uses atom-centered functional groups with corrections for ring and polycyclic strain, while the machine learning models adopt a directed message passing neural network. The solute parameters predicted from SoluteGC and SoluteML are used to calculate solvation free energy and enthalpy via linear free energy relationships. Extensive data sets containing 8366 solute parameters, 20,253 solvation free energies, and 6322 solvation enthalpies are compiled in this work to train the models. The three models are each evaluated on the same test sets using both random and substructure-based solute splits for solvation free energy and enthalpy predictions. The results show that the DirectML model is superior to the SoluteML and SoluteGC models for both predictions and can provide accuracy comparable to that of advanced quantum chemistry methods. Even though the DirectML model performs better in general, all three models are useful for various purposes. Uncertain predicted values can be identified by comparing the three models, and when the three models are combined, they can provide more accurate predictions than any one of them individually. Finally, we present our compiled solute parameter, solvation free energy, and solvation enthalpy databases (SoluteDB, dGsolvDBx, dHsolvDB) and provide public access to our final prediction models through a simple web-based tool, software packages, and source code.
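The route from solute parameters to a solvation free energy, and the idea of comparing and combining the three models, can be illustrated with a minimal sketch. It assumes the commonly used Abraham gas-to-solvent form log K = c + eE + sS + aA + bB + lL and the conversion ΔG_solv = -RT ln(10) log K. The function names, the placeholder numbers, and the choice of a plain average for combining models are assumptions made for illustration only; they are not fitted values and not necessarily the combination scheme used by the authors.

```python
import math
from statistics import mean

R_KJ = 8.314462618e-3  # molar gas constant in kJ/(mol K)

def lfer(solute, coeffs):
    """Evaluate the linear relationship c + e*E + s*S + a*A + b*B + l*L.

    `solute` holds the solute parameters under keys E, S, A, B, L;
    `coeffs` holds solvent-specific coefficients under keys c, e, s, a, b, l.
    """
    return coeffs["c"] + sum(coeffs[k.lower()] * solute[k] for k in "ESABL")

def dg_solv_from_log_k(log_k, temperature=298.0):
    """Convert a gas-to-solvent log K into a solvation free energy in kJ/mol."""
    return -R_KJ * temperature * math.log(10) * log_k

def combine(predictions):
    """Return (mean, spread) of per-model predictions.

    A plain average is an assumption; the spread (max minus min) is one
    simple way to flag cases where the models disagree.
    """
    values = list(predictions.values())
    return mean(values), max(values) - min(values)

# Placeholder inputs for illustration only, not data from the paper.
solute = {"E": 0.5, "S": 0.5, "A": 0.0, "B": 0.5, "L": 3.0}
coeffs = {"c": 0.2, "e": -0.1, "s": 1.0, "a": 3.0, "b": 1.0, "l": 0.8}

dg = dg_solv_from_log_k(lfer(solute, coeffs))
avg, spread = combine({"SoluteGC": -20.0, "SoluteML": -21.0, "DirectML": -22.0})
```

A large spread between the three predictions would mark a value as uncertain in the sense described above. A linear relationship of the same shape, with its own set of coefficients, could be evaluated with `lfer` for solvation enthalpy.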


Citations
Journal Article

High accuracy barrier heights, enthalpies, and rate coefficients for chemical reactions

TL;DR: In this article, the authors used CCSD(T)-F12a/cc-pVDZ-F12//ωB97X-D3/def2-TZVP to obtain high-quality single point calculations for nearly 22,000 unique stable species and transition states.

Journal Article

Abraham Solvation Parameter Model: Examination of Possible Intramolecular Hydrogen-Bonding Using Calculated Solute Descriptors

TL;DR: In this article, the Abraham model was used to predict the solubility of 4,5-dihydroxyanthraquinone-2-carboxylic acid.

Journal Article

Predicting Solubility Limits of Organic Solutes for a Wide Range of Solvents and Temperatures.

TL;DR: In this article, the authors present a fast and convenient computational method for estimating the solubility of solid neutral organic molecules in water and many organic solvents for a broad range of temperatures.

Journal Article

RMG Database for Chemical Property Prediction

TL;DR: The RMG database provides kineticists with easy access to estimates of the many parameters they need to model and analyze kinetic systems by enabling easy hypothesis testing on pathways, by providing parameters for model construction, and by providing checks on kinetic parameters from other sources.