Journal Article

Machine learning meets volcano plots: Computational discovery of cross-coupling catalysts

12 Sep 2018-Chemical Science (The Royal Society of Chemistry)-Vol. 9, Iss: 35, pp 7069-7077
TL;DR: The application of modern machine learning to challenges in atomistic simulation is gaining traction, and the potential for innovation in this area is being explored.
Abstract: The application of modern machine learning to challenges in atomistic simulation is gaining traction. We present new machine learning models that can predict the energy of the oxidative addition process between a transition metal complex and a substrate for C–C cross-coupling reactions. In turn, this quantity can be used as a descriptor to estimate the activity of homogeneous catalysts using molecular volcano plots. The versatility of this approach is illustrated for vast libraries of organometallic catalysts based on Pt, Pd, Ni, Cu, Ag, and Au combined with 91 ligands. Out-of-sample machine learning predictions were made on a total of 18 062 compounds, leading to 557 catalyst candidates falling into the ideal thermodynamic window. This number was further refined by searching for candidates with an estimated price lower than 10 US$ per mmol. The 37 catalyst finalists are dominated by palladium phosphine ligand combinations but also include the earth-abundant transition metal (Cu) with less common ligands. Our results indicate that modern statistical learning techniques can be applied to the computational discovery of readily available and promising catalyst candidates.
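
The two-stage filter described in the abstract (thermodynamic window first, then price) could be sketched roughly as follows. This is a minimal illustration, not the authors' code: the `Candidate` class, the `screen` function, the window bounds, the ligand labels, and every descriptor value and price in the sample data are hypothetical. Only the price cut-off of 10 US$ per mmol comes from the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    metal: str
    ligands: str                 # hypothetical ligand label
    predicted_descriptor: float  # ML-predicted oxidative addition energy (arbitrary units here)
    price_usd_per_mmol: float    # estimated catalyst price


# ASSUMPTION: placeholder bounds for the "ideal thermodynamic window";
# the abstract does not state the actual values.
WINDOW = (-35.0, -25.0)
MAX_PRICE = 10.0  # from the abstract: lower than 10 US$ per mmol


def screen(candidates, window=WINDOW, max_price=MAX_PRICE):
    """Return (shortlisted, finalists): window filter first, then price filter."""
    low, high = window
    shortlisted = [c for c in candidates if low <= c.predicted_descriptor <= high]
    finalists = [c for c in shortlisted if c.price_usd_per_mmol < max_price]
    return shortlisted, finalists


# Hypothetical sample data, made up for illustration only.
candidates = [
    Candidate("Pd", "L1/L2", -30.0, 4.0),   # inside window, affordable
    Candidate("Pt", "L1/L3", -30.0, 25.0),  # inside window, too expensive
    Candidate("Cu", "L4/L5", -10.0, 1.0),   # outside window
]

shortlisted, finalists = screen(candidates)
print(len(shortlisted), len(finalists))
```

Applying the window before the price keeps the two criteria independent, so either threshold can be changed without recomputing the other.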


Citations
01 Feb 1995
TL;DR: In this paper, the unpolarized absorption and circular dichroism spectra of the fundamental vibrational transitions of the chiral molecule, 4-methyl-2-oxetanone, are calculated ab initio using DFT, MP2, and SCF methodologies and a 5S4P2D/3S2P (TZ2P) basis set.
Abstract: The unpolarized absorption and circular dichroism spectra of the fundamental vibrational transitions of the chiral molecule, 4-methyl-2-oxetanone, are calculated ab initio. Harmonic force fields are obtained using Density Functional Theory (DFT), MP2, and SCF methodologies and a 5S4P2D/3S2P (TZ2P) basis set. DFT calculations use the Local Spin Density Approximation (LSDA), BLYP, and Becke3LYP (B3LYP) density functionals. Mid-IR spectra predicted using LSDA, BLYP, and B3LYP force fields are of significantly different quality, the B3LYP force field yielding spectra in clearly superior, and overall excellent, agreement with experiment. The MP2 force field yields spectra in slightly worse agreement with experiment than the B3LYP force field. The SCF force field yields spectra in poor agreement with experiment. The basis set dependence of B3LYP force fields is also explored: the 6-31G* and TZ2P basis sets give very similar results while the 3-21G basis set yields spectra in substantially worse agreement with experiment.

1,652 citations

Journal Article
TL;DR: In this paper, a deep multi-task artificial neural network is used to predict multiple electronic ground- and excited-state properties, such as atomization energy, polarizability, frontier orbital eigenvalues, ionization potential, electron affinity, and excitation energies.
Abstract: The combination of modern scientific computing with electronic structure theory can lead to an unprecedented amount of data amenable to intelligent data analysis for the identification of meaningful, novel, and predictive structure-property relationships. Such relationships enable high-throughput screening for relevant properties in an exponentially growing pool of virtual compounds that are synthetically accessible. Here, we present a machine learning (ML) model, trained on a database of ab initio calculation results for thousands of organic molecules, that simultaneously predicts multiple electronic ground- and excited-state properties. The properties include atomization energy, polarizability, frontier orbital eigenvalues, ionization potential, electron affinity, and excitation energies. The ML model is based on a deep multi-task artificial neural network, exploiting underlying correlations between various molecular properties. The input is identical to ab initio methods, i.e., nuclear charges and Cartesian coordinates of all atoms. For small organic molecules the accuracy of such a "Quantum Machine" is similar, and sometimes superior, to modern quantum-chemical methods, at negligible computational cost.

456 citations

Journal Article
TL;DR: In this article, the authors provide an in-depth, critical review of ML-guided design and discovery of energy materials, a field where a novel material with superior performance (e.g., higher energy density, higher energy conversion efficiency, etc.) can have a transformative impact on the urgent global problem of climate change.
Abstract: DOI: 10.1002/aenm.201903242 materials in silico,[19–22] high computational costs and poor scaling still limit their effectiveness in exploring unconstrained chemical spaces and/or complex real-world materials. For instance, high-throughput DFT screening works typically limit the search space to hundreds or, at best, thousands of materials, while DFT simulations of materials are mostly limited to typically less than 1000 atoms, i.e., bulk crystals and isolated molecules. ML therefore offers a solution to the materials exploration problem, making predictions of new materials or properties from existing data, which in turn can drive the generation of more data that can be used to further refine the ML models. Here, we will provide an in-depth, critical review of ML-guided design and discovery of energy materials, a field where a novel material with superior performance (e.g., higher energy density, higher energy conversion efficiency, etc.) can have a transformative impact on the urgent global problem of climate change. This review is structured along the steps in a typical workflow for materials ML model building, as shown in Figure 1. The next four sections will provide a concise overview of ML concepts designed to give the reader an appreciation of state-of-the-art techniques as well as resources for building ML models for materials. Section 6 reviews the actual application of ML techniques to the discovery and design of various classes of energy materials, from energy storage (e.g., batteries, fuel cells, etc.) to energy conversion (e.g., thermoelectrics, catalysis, etc.). The final section outlines our perspectives on various challenges and opportunities in ML for energy materials design.

282 citations

Journal Article
TL;DR: The discovery and development of catalysts and catalytic processes are essential components to maintaining an ecological balance in the future, and recent revolutions made in data science could have a...
Abstract: The discovery and development of catalysts and catalytic processes are essential components to maintaining an ecological balance in the future. Recent revolutions made in data science could have a ...

272 citations

Journal Article
Pavlo O. Dral
TL;DR: A view on the current state of affairs in this new exciting research field is offered, challenges of using ML in QC applications are described, and potential future developments are outlined.
Abstract: As the quantum chemistry (QC) community embraces machine learning (ML), the number of new methods and applications based on the combination of QC and ML is surging. In this Perspective, a view of the current state of affairs in this new and exciting research field is offered, challenges of using machine learning in quantum chemistry applications are described, and potential future developments are outlined. Specifically, examples of how machine learning is used to improve the accuracy and accelerate quantum chemical research are shown. Generalization and classification of existing techniques are provided to ease the navigation in the sea of literature and to guide researchers entering the field. The emphasis of this Perspective is on supervised machine learning.

261 citations

References
Journal Article
TL;DR: In this article, a semiempirical exchange-correlation functional containing local-spin-density, gradient, and exact-exchange terms is tested. The functional performs significantly better than previous functionals with gradient corrections only, and fits experimental atomization energies with an impressively small average absolute deviation of 2.4 kcal/mol.
Abstract: Despite the remarkable thermochemical accuracy of Kohn–Sham density‐functional theories with gradient corrections for exchange‐correlation [see, for example, A. D. Becke, J. Chem. Phys. 96, 2155 (1992)], we believe that further improvements are unlikely unless exact‐exchange information is considered. Arguments to support this view are presented, and a semiempirical exchange‐correlation functional containing local‐spin‐density, gradient, and exact‐exchange terms is tested on 56 atomization energies, 42 ionization potentials, 8 proton affinities, and 10 total atomic energies of first‐ and second‐row systems. This functional performs significantly better than previous functionals with gradient corrections only, and fits experimental atomization energies with an impressively small average absolute deviation of 2.4 kcal/mol.

87,732 citations

Journal Article
TL;DR: Numerical calculations on a number of atoms, positive ions, and molecules, of both open- and closed-shell type, show that density-functional formulas for the correlation energy and correlation potential give correlation energies within a few percent.
Abstract: A correlation-energy formula due to Colle and Salvetti [Theor. Chim. Acta 37, 329 (1975)], in which the correlation energy density is expressed in terms of the electron density and a Laplacian of the second-order Hartree-Fock density matrix, is restated as a formula involving the density and local kinetic-energy density. On insertion of gradient expansions for the local kinetic-energy density, density-functional formulas for the correlation energy and correlation potential are then obtained. Through numerical calculations on a number of atoms, positive ions, and molecules, of both open- and closed-shell type, it is demonstrated that these formulas, like the original Colle-Salvetti formulas, give correlation energies within a few percent.

84,646 citations

Book
Vladimir Vapnik
01 Jan 1995
TL;DR: Setting of the learning problem; consistency of learning processes; bounds on the rate of convergence of learning processes; controlling the generalization ability of learning processes; constructing learning algorithms; what is important in learning theory?
Abstract: Setting of the learning problem; consistency of learning processes; bounds on the rate of convergence of learning processes; controlling the generalization ability of learning processes; constructing learning algorithms; what is important in learning theory?

40,147 citations

Journal Article
TL;DR: High generalization ability of support-vector networks utilizing polynomial input transformations is demonstrated and the performance of the support-vector network is compared to various classical learning algorithms that all took part in a benchmark study of Optical Character Recognition.
Abstract: The support-vector network is a new learning machine for two-group classification problems. The machine conceptually implements the following idea: input vectors are non-linearly mapped to a very high-dimension feature space. In this feature space a linear decision surface is constructed. Special properties of the decision surface ensure high generalization ability of the learning machine. The idea behind the support-vector network was previously implemented for the restricted case where the training data can be separated without errors. We here extend this result to non-separable training data. High generalization ability of support-vector networks utilizing polynomial input transformations is demonstrated. We also compare the performance of the support-vector network to various classical learning algorithms that all took part in a benchmark study of Optical Character Recognition.

37,861 citations

Journal Article
TL;DR: The revised DFT-D method is proposed as a general tool for the computation of the dispersion energy in molecules and solids of any kind with DFT and related (low-cost) electronic structure methods for large systems.
Abstract: The method of dispersion correction as an add-on to standard Kohn-Sham density functional theory (DFT-D) has been refined regarding higher accuracy, broader range of applicability, and less empiricism. The main new ingredients are atom-pairwise specific dispersion coefficients and cutoff radii that are both computed from first principles. The coefficients for new eighth-order dispersion terms are computed using established recursion relations. System (geometry) dependent information is used for the first time in a DFT-D type approach by employing the new concept of fractional coordination numbers (CN). They are used to interpolate between dispersion coefficients of atoms in different chemical environments. The method only requires adjustment of two global parameters for each density functional, is asymptotically exact for a gas of weakly interacting neutral atoms, and easily allows the computation of atomic forces. Three-body nonadditivity terms are considered. The method has been assessed on standard benchmark sets for inter- and intramolecular noncovalent interactions with a particular emphasis on a consistent description of light and heavy element systems. The mean absolute deviations for the S22 benchmark set of noncovalent interactions for 11 standard density functionals decrease by 15%-40% compared to the previous (already accurate) DFT-D version. Spectacular improvements are found for a tripeptide-folding model and all tested metallic systems. The rectification of the long-range behavior and the use of more accurate C6 coefficients also lead to a much better description of large (infinite) systems as shown for graphene sheets and the adsorption of benzene on an Ag(111) surface. For graphene it is found that the inclusion of three-body terms substantially (by about 10%) weakens the interlayer binding. We propose the revised DFT-D method as a general tool for the computation of the dispersion energy in molecules and solids of any kind with DFT and related (low-cost) electronic structure methods for large systems.

32,589 citations