Open Access Journal Article

Machine Learning for Accurate Force Calculations in Molecular Dynamics Simulations

- 29 Jul 2020 -

- Vol. 124, Iss: 34, pp 6954-6967


TLDR

This work explores an approach to make use of the data obtained using the quantum mechanical density functional theory on small systems and use deep learning to subsequently simulate large systems by taking liquid argon as a test case.



doi.org/10.26434/chemrxiv.12271289.v1


Punyaslok Pattnaik, Shampa Raghunathan, Tarun Kalluri, Prabhakar Bhimalapuram, C. V. Jawahar, U. Deva Priyakumar

Submitted date: 08/05/2020 • Posted date: 08/05/2020

Licence: CC BY-NC-ND 4.0

Citation information: Pattnaik, Punyaslok; Raghunathan, Shampa; Kalluri, Tarun; Bhimalapuram, Prabhakar; Jawahar, C. V.; Priyakumar, U. Deva (2020): Machine Learning for Accurate Force Calculations in Molecular Dynamics Simulations. ChemRxiv. Preprint. https://doi.org/10.26434/chemrxiv.12271289.v1

The computationally expensive nature of ab initio molecular dynamics simulations severely limits their ability to simulate large system sizes and long time scales, both of which are necessary to imitate experimental conditions. In this work, we explore an approach to make use of the data obtained using the quantum mechanical density functional theory (DFT) on small systems and use deep learning to subsequently simulate large systems, taking liquid argon as a test case. A suitable vector representation was chosen to represent the surrounding environment of each Ar atom, and a Δ-NetFF machine learning model was introduced, in which the neural network was trained to predict the difference between the resultant forces obtained by DFT and by classical force fields. Molecular dynamics simulations were then performed using forces from the neural network for various system sizes and time scales, depending on the properties we calculated. A comparison of properties obtained from the classical force field and the neural network model was presented alongside available experimental data to validate the proposed method.


Punyaslok Pattnaik,† Shampa Raghunathan,† Tarun Kalluri,‡ Prabhakar Bhimalapuram,† C. V. Jawahar,‡ and U. Deva Priyakumar*,†

†Center for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology, Hyderabad 500 032, India

‡Center for Visual Information Technology, KCIS, International Institute of Information Technology, Hyderabad 500 032, India

E-mail: deva@iiit.ac.in


Introduction

The modeling of a condensed-phase system involving chemical processes spanning multiple time and length scales is particularly challenging. Ab initio molecular dynamics (AIMD), which explicitly treats electronic degrees of freedom, is naturally the first method of choice; however, it is computationally demanding, which prohibits its application to large molecular systems.[1,2] Classical molecular dynamics (MD) simulations employing force fields can properly sample the phase space of large systems (up to a million atoms),[3,4] but the underlying interatomic potentials are often not accurate enough to obtain quantitatively accurate results. Their transferability to situations that were not originally used in the parameterization is questionable, which further limits their accuracy.

The need to construct a multiscale model (considering electronic and nuclear dynamics[5–7] and their coupling to slower, cooperative motions of the system) to capture accurate dynamics of chemical processes cannot be overstated.[8–10] The fundamental question is: can one quantify the relevance of atomistic models to electronic interactions employing any numerical formalism, and how do the corresponding MD errors reflect emergent features in the ab initio driving forces? One of the most successful approaches relies on quantum mechanical (QM) calculations on gas-phase clusters (sometimes considering an implicit solvent model) to parameterize a model meant for bulk-phase simulations. Another, empirical, procedure is based on the minimization of a "loss function" or "objective function" between simulated and experimental physical properties.

With the increasing availability of computational resources and data, machine learning (ML) techniques have been widely applied to predict quantum mechanical properties.[11–17] A plethora of sophisticated ML approaches exist. For predicting ground-state energies, approaches such as boosted regression tree algorithms, high-dimensional neural network potential energy surfaces using symmetry functions, continuous-filter convolutional layers and single-atom atomic environment vectors (AEV) have been used. Atomization energies for molecules have been predicted based on nuclear charges and atomic positions only. Multiple electronic ground- and excited-state properties have also been predicted simultaneously using Coulomb matrices in conjunction with deep multi-task artificial neural networks.[14,20] The bag-of-bonds model was used to predict accurate electronic properties of molecules, such as their polarizability and molecular frontier orbital energies.[21,22] Using artificial neural networks (ANNs), energies of molecules have also been predicted as a sum of intrinsic bond energies, while also providing valuable insight into the relative strengths of bonds as a function of their molecular environment. ANNs have also been used along with genetic algorithm (GA) optimization to discover unconventional spin-crossover complexes, which emphasizes their power for discovering new inorganic materials. Recently, an ML model was proposed in which a novel molecular descriptor inspired by classical force field terms (bonds, angles, non-bonded interactions and dihedrals) is used to perform geometry optimizations along with predicting energies. This model employs feed-forward fully connected deep neural networks. Graph neural networks were used to predict solvent-solute interaction maps for studying solvation free energies of drug-like molecules (solutes). Instead of applying ML techniques to directly compute properties of new molecules through interpolation in chemical compound space, ML of force field parameters was recently performed for semi-empirical modeling.

In recent years, ML has emerged as a potential technique for developing a new generation of highly accurate force fields (FFs) for simulations of molecules and materials. Ramprasad and coworkers[27–29] have developed ML-based atomistic force fields for MD simulations, focusing mainly on bulk solid-state materials. Another approach, on-the-fly ML of QM forces in MD simulations, was recently reported by Li and coworkers for bulk Si. The smooth overlap of atomic positions (SOAP) metric has been used to construct potential energy surfaces, and its performance was evaluated for small silicon clusters. Gaussian approximation potentials have been used to generate trajectories for water dimers, the energetic path for a migrating vacancy, and the transformation of rhombohedral graphite to diamond. Another widely used class of ML-FFs, based on Gaussian process (GP) regression, was developed for studying a 19-atom Ni nanocluster as well as adsorption energies of small molecules on NiGa and RhAu nanoclusters.[33,34] Interatomic potentials for metallic aluminium and carbon, and dimer potentials for noble gases, have been reconstructed using neural networks. The effects of such fitted potentials on the physical properties calculated from their trajectories at different physical conditions, such as temperature and pressure, need to be studied to further support their future applications. Machine learning is also being successfully used to analyze long-time-scale simulation data on large systems.[36,37]

As the development of ML-FFs for MD simulations expands towards assessing and improving the accuracy and transferability of the models, learning and predicting atomic forces has met with notable success, because atomic forces can be seen as true QM observables within the Born-Oppenheimer (BO) approximation, in accordance with the Hellmann-Feynman theorem. The energy of a molecular system would then be recovered through appropriate integration of the force-field kernel. ML models are often trained on reference data obtained from QM-based methods, such as density functional theory (DFT) within the Kohn-Sham formalism. DFT continues to be one of the most popular and widely used QM targets, from the molecular regime[14,19,21,27,39–42] to condensed matter and materials informatics.[11,15,20,31,43–48]
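The relation between forces and energy invoked here can be written out explicitly. The following is the standard textbook form, given as a reminder rather than as the notation of the paper; the symbols (nuclear positions R_I, electronic wavefunction Ψ, electronic Hamiltonian H, end points R_a and R_b) are introduced for illustration. The first line is the Hellmann-Feynman force on nucleus I for an exact electronic eigenstate, and the second recovers an energy difference as a line integral of the forces along a path between two configurations.

```latex
\begin{align}
\mathbf{F}_I &= -\nabla_{\mathbf{R}_I} E(\mathbf{R})
  = -\left\langle \Psi \,\middle|\, \nabla_{\mathbf{R}_I} \hat{H} \,\middle|\, \Psi \right\rangle, \\
E(\mathbf{R}_b) - E(\mathbf{R}_a) &= -\int_{\mathbf{R}_a}^{\mathbf{R}_b} \sum_I \mathbf{F}_I(\mathbf{R}) \cdot \mathrm{d}\mathbf{R}_I .
\end{align}
```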

In this article, we explore an approach in which DFT calculations for smaller systems can be used, in conjunction with machine learning, to simulate larger systems at a computational effort comparable to classical force fields, while being able to predict forces similar to DFT. A Δ-NetFF model, in which the NN is trained on the difference between the forces obtained from the molecular mechanical force field and from the quantum mechanical DFT approach, was introduced for constructing the force field for MD simulations. The predictive power of the present ML ...
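The Δ-learning idea described in this paragraph can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a Lennard-Jones potential as the classical force field, with commonly quoted argon parameters (epsilon of about 0.0103 eV, sigma of about 3.4 Å) that are not taken from the source, and the names `lj_forces`, `delta_targets` and `total_force` are hypothetical. The "DFT" forces in the demo are random placeholders standing in for real reference data.

```python
import numpy as np

def lj_forces(pos, box, epsilon=0.0103, sigma=3.4, rcut=8.0):
    """Lennard-Jones forces (eV/Å) in a cubic periodic box; assumed classical FF."""
    pos = np.asarray(pos, dtype=float)
    forces = np.zeros_like(pos)
    for i in range(len(pos) - 1):
        d = pos[i] - pos[i + 1:]          # vectors from atoms j > i to atom i
        d -= box * np.round(d / box)      # minimum-image convention
        r2 = np.sum(d * d, axis=1)
        mask = r2 < rcut ** 2             # plain cutoff, no shifting
        sr6 = (sigma ** 2 / r2[mask]) ** 3
        # F_i = 24*eps*(2*(sigma/r)^12 - (sigma/r)^6) / r^2 * d
        coef = 24.0 * epsilon * (2.0 * sr6 ** 2 - sr6) / r2[mask]
        fij = coef[:, None] * d[mask]
        forces[i] += fij.sum(axis=0)
        forces[i + 1:][mask] -= fij       # Newton's third law
    return forces

def delta_targets(f_dft, f_classical):
    """Training target: the correction the network has to learn."""
    return f_dft - f_classical

def total_force(f_classical, delta_pred):
    """Force used in MD: classical force plus the predicted correction."""
    return f_classical + delta_pred

# Demo on a jittered 3x3x3 grid in a 16.7 Å box (27 atoms, illustrative only).
rng = np.random.default_rng(0)
box = 16.7
grid = np.linspace(0.0, box, 3, endpoint=False) + box / 6.0
pos = np.array([[x, y, z] for x in grid for y in grid for z in grid])
pos += rng.normal(scale=0.2, size=pos.shape)

f_classical = lj_forces(pos, box)
f_dft = f_classical + rng.normal(scale=0.01, size=pos.shape)  # placeholder, NOT real DFT data
delta = delta_targets(f_dft, f_classical)   # what a network would be trained to reproduce
f_model = total_force(f_classical, delta)   # a perfect network would recover f_dft
```

One consequence of this design, under the stated assumptions, is that the network only has to represent the (typically smaller) correction term, while the classical force field supplies the bulk of the force.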


Figures

Figure 3: Distribution of the three components of the force before aligning the vector connecting the origin and the nearest atom to the X-axis.

Table 4: Comparison of viscosity coefficients calculated using various methods with the experimental values at different thermodynamic conditions.

Figure 12: Time evolution of viscosity coefficients calculated using various methods at different thermodynamic conditions.

Figure 5: Architecture of the neural network used for the present Δ-NetFF model. The numbers of inputs, outputs and nodes in each hidden layer are given in parentheses.

Figure 9: Self-diffusion coefficients of argon obtained at different temperatures at 13.07 bar and 58.6 bar.

Figure 1: Workflow to train the Δ-NetFF and to perform simulations using the predicted forces.
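The Figure 3 caption refers to aligning the vector from the origin (the central atom) to the nearest atom with the X-axis. A minimal sketch of such an alignment is given below, assuming neighbour positions expressed relative to the central atom and a rotation built with Rodrigues' formula. The function names `rotation_to_x` and `align_environment` are hypothetical, and the paper's actual procedure may differ in detail.

```python
import numpy as np

def rotation_to_x(v):
    """Return a 3x3 rotation matrix R such that R @ v points along +X."""
    a = np.asarray(v, dtype=float)
    a = a / np.linalg.norm(a)
    x_hat = np.array([1.0, 0.0, 0.0])
    axis = np.cross(a, x_hat)
    s = np.linalg.norm(axis)          # sine of the rotation angle
    c = float(np.dot(a, x_hat))       # cosine of the rotation angle
    if s < 1e-12:
        # Already along +X, or antiparallel (then rotate by pi about Z).
        return np.eye(3) if c > 0 else np.diag([-1.0, -1.0, 1.0])
    k = axis / s
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    # Rodrigues' rotation formula
    return np.eye(3) + s * K + (1.0 - c) * (K @ K)

def align_environment(rel_pos, force):
    """Rotate neighbour positions and the force so the nearest neighbour lies on +X."""
    rel_pos = np.asarray(rel_pos, dtype=float)
    nearest = rel_pos[np.argmin(np.linalg.norm(rel_pos, axis=1))]
    R = rotation_to_x(nearest)
    return rel_pos @ R.T, R @ np.asarray(force, dtype=float), R

# Demo with random neighbours and a random force vector (illustrative data only).
rng = np.random.default_rng(1)
rel = rng.normal(scale=4.0, size=(10, 3))
f = rng.normal(size=3)
rel_aligned, f_aligned, R = align_environment(rel, f)
```

A force predicted in the aligned frame can be mapped back to the laboratory frame with `R.T @ f_aligned`. Note that this alignment leaves the rotation about the X-axis itself undetermined.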




Frequently Asked Questions (17)

Q1. What are the contributions in "Machine learning for accurate force calculations in molecular dynamics simulations"?

In this work, the authors explore an approach to make use of the data obtained using the quantum mechanical density functional theory (DFT) on small systems and use deep learning to subsequently simulate large systems, taking liquid argon as a test case. A suitable vector representation was chosen to represent the surrounding environment of each Ar atom, and a Δ-NetFF machine learning model was introduced, in which the neural network was trained to predict the difference between the resultant forces obtained by DFT and by classical force fields.

Q2. What have the authors stated for future works in "Machine learning for accurate force calculations in molecular dynamics simulations"?

Diffusion coefficient and viscosity calculations indicate that the new forces bias the simulation closer towards experimental data, suggesting that these properties calculated from a long DFT simulation would most likely be closer to experimental values than those from a purely classical force field. Future work to modify the feature vector to extend this method to complex multicomponent systems is in progress. Finally, the time comparison data further emphasize the efficiency of the model for running long and multiple replicate simulations, which are vital in the calculation of thermodynamic and kinetic properties.

Q3. What is the common method used to train ML models?

ML models are often trained on reference data obtained from QM-based methods, such as density functional theory (DFT) within the Kohn-Sham formalism.

Q4. Why are these regions excluded from the fitting procedure?

Due to the well-known[84] large oscillations at very short times and noise at very long times, these regions are excluded from the fitting procedure.

Q5. What is the common method used to predict electronic properties of molecules?

The bag-of-bonds model was used to predict accurate electronic properties of molecules, such as their polarizability and molecular frontier orbital energies.

Q6. Why was DFT chosen as the ab initio method to calculate forces?

Calculating forces for each frame: DFT was chosen as the ab initio method to calculate forces because of its speed of calculations, so that a dataset can be generated in a reasonable amount of time.

Q7. What is the successful approach to predicting bulk phase systems?

One of the most successful approaches relies on quantum mechanical (QM) calculations on gas-phase (sometimes considering the implicit solvent model) clusters to parameterize a model meant for bulk phase simulations.

Q8. How have ANNs been used to discover new inorganic materials?

ANNs have also been used along with genetic algorithm (GA) optimization to discover unconventional spin-crossover complexes, which emphasizes their power for discovering new inorganic materials.

Q9. What is the effect of random sampling on the atoms?

Random sampling makes it very hard to place all points at some distance apart from each other, which is essential in their case to avoid two atoms being too close together, resulting in non-physical contacts.

Q10. What is the role of ML in predicting atomic forces?

As the area of development of ML-FFs for MD simulations is expanding towards assessing and improving the accuracy and transferability of the model, learning and predicting atomic forces have been receiving notable successes.

Q11. What is the general shape of the plots at both pressures?

The general shape of the plots at both pressures indicates that the model-generated trajectories follow the trend of the classical trajectories, due to the F_classical contribution in the force model.

Q12. Why is it not feasible to calculate diffusion coefficients using a DFT trajectory?

Since diffusion coefficient calculations require longer time scale trajectories (especially for gaseous states, owing to the high mean free path),[77] it is not feasible to calculate this property using a DFT trajectory.
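To make concrete why long trajectories matter here, the sketch below estimates a self-diffusion coefficient from the slope of the mean-squared displacement (MSD) via the Einstein relation, MSD(t) ≈ 6Dt in three dimensions. This estimator is an assumption: the text above does not state which method was used. The function name `diffusion_from_msd`, the fit window and the synthetic MSD curve are all hypothetical.

```python
import numpy as np

def diffusion_from_msd(t_ps, msd_A2, t_min, t_max):
    """Estimate D (cm^2/s) from a linear fit of MSD (Å^2) vs time (ps) in [t_min, t_max]."""
    t_ps = np.asarray(t_ps, dtype=float)
    msd_A2 = np.asarray(msd_A2, dtype=float)
    window = (t_ps >= t_min) & (t_ps <= t_max)   # skip the short-time transient
    slope, _intercept = np.polyfit(t_ps[window], msd_A2[window], 1)
    d_A2_per_ps = slope / 6.0                    # Einstein relation in 3D
    return d_A2_per_ps * 1e-4                    # 1 Å^2/ps = 1e-4 cm^2/s

# Synthetic MSD: linear diffusive regime plus a short-time transient (not simulation data).
t = np.linspace(0.0, 100.0, 1001)
d_true = 0.25                                    # Å^2/ps, arbitrary illustrative value
msd = 6.0 * d_true * t + 2.0 * (1.0 - np.exp(-t / 0.5))
d_est = diffusion_from_msd(t, msd, t_min=10.0, t_max=80.0)
```

The linear regime only becomes usable once the transient has decayed, so the trajectory has to be long compared with that initial period, which is consistent with the point made in the answer above.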

Q13. How have multiple electronic, ground, and excited-state properties been predicted simultaneously?

Multiple electronic ground- and excited-state properties have also been predicted simultaneously using Coulomb matrices in conjunction with deep multi-task artificial neural networks.

Q14. How can the authors test the effectiveness of the trained model in real world scenarios?

The effectiveness of the trained model in real world scenarios can be better understood by calculating physical properties using trajectories generated by using the neural network in addition to the ability to predict the target force data.

Q15. What is the need to construct a multiscale model to capture accurate dynamics of chemical processes?

The need to construct a multiscale model (considering electronic and nuclear dynamics[5–7] and their coupling to slower, cooperative motions of the system) to capture accurate dynamics of chemical processes cannot be overstated.

Q16. What are the effects of fitted potentials on the calculation of physical properties?

The effects of such fitted potentials on the physical properties calculated from their trajectories at different physical conditions, such as temperature and pressure, need to be studied to further support their future applications.

Q17. How many argon atoms were in the Δ-NetFF model?

The code for the Δ-NetFF model is freely available from https://github.com/devalab/delNetFF. The primary system used for generating the data for ML contained 96 argon atoms in a cubic box of size 16.7 x 16.7 x 16.7 Å³.
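As a quick sanity check on the system size quoted above, the number density and mass density implied by 96 argon atoms in a 16.7 Å cubic box can be worked out directly. The sketch below uses the standard atomic weight of argon (39.948 g/mol) and the Avogadro constant, which are general reference values rather than figures from the source; the variable names are illustrative.

```python
AVOGADRO = 6.02214076e23    # 1/mol (exact SI value)
AR_MOLAR_MASS = 39.948      # g/mol, standard atomic weight of argon

n_atoms = 96
box_length_A = 16.7                          # Å
volume_A3 = box_length_A ** 3                # Å^3
volume_cm3 = volume_A3 * 1e-24               # 1 Å^3 = 1e-24 cm^3

number_density = n_atoms / volume_A3         # atoms per Å^3
mass_g = n_atoms * AR_MOLAR_MASS / AVOGADRO  # total mass in grams
mass_density = mass_g / volume_cm3           # g/cm^3

print(f"number density = {number_density:.4f} atoms/Å^3")
print(f"mass density   = {mass_density:.3f} g/cm^3")
```

This gives roughly 0.0206 atoms/Å³ and about 1.37 g/cm³, which is in the range expected for a dense liquid rather than a gas.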
