Machine Learning for Accurate Force Calculations in Molecular Dynamics Simulations

doi.org/10.26434/chemrxiv.12271289.v1
Punyaslok Pattnaik, Shampa Raghunathan, Tarun Kalluri, Prabhakar Bhimalapuram, C. V. Jawahar, U. Deva
Priyakumar
Submitted date: 08/05/2020 Posted date: 08/05/2020
Licence: CC BY-NC-ND 4.0
Citation information: Pattnaik, Punyaslok; Raghunathan, Shampa; Kalluri, Tarun; Bhimalapuram, Prabhakar;
Jawahar, C. V.; Priyakumar, U. Deva (2020): Machine Learning for Accurate Force Calculations in Molecular
Dynamics Simulations. ChemRxiv. Preprint. https://doi.org/10.26434/chemrxiv.12271289.v1
The computationally expensive nature of ab initio molecular dynamics simulations severely limits its ability to
simulate large system sizes and long time scales, both of which are necessary to imitate experimental
conditions. In this work, we explore an approach to make use of the data obtained using the quantum
mechanical density functional theory (DFT) on small systems and use deep learning to subsequently simulate
large systems by taking liquid argon as a test case. A suitable vector representation was chosen to represent
the surrounding environment of each Ar atom, and a Δ-NetFF machine learning model, in which the neural
network was trained to predict the difference in resultant forces obtained by DFT and classical force fields, was
introduced. Molecular dynamics simulations were then performed using forces from the neural network for
various system sizes and time scales depending on the properties we calculated. A comparison of properties
obtained from the classical force field and the neural network model was presented alongside available
experimental data to validate the proposed method.
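
The difference-learning idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Lennard-Jones parameters, the cutoff, the function names and the noisy stand-in for DFT forces are all assumptions made for illustration. The sketch only shows how a training target (DFT force minus classical force) and a final force (classical force plus predicted correction) would be assembled.

```python
import numpy as np

def lj_forces(pos, box, epsilon=0.238, sigma=3.4, r_cut=8.0):
    """Lennard-Jones forces in a cubic periodic box (minimum-image convention).

    epsilon (kcal/mol) and sigma (angstrom) are typical textbook values for
    argon and are placeholders, not the parameters used in the paper.
    """
    forces = np.zeros_like(pos)
    for i in range(len(pos) - 1):
        d = pos[i] - pos[i + 1:]                 # vectors pointing from j to i
        d -= box * np.round(d / box)             # minimum-image wrap
        r2 = np.sum(d * d, axis=1)
        mask = r2 < r_cut ** 2
        sr6 = (sigma ** 2 / r2[mask]) ** 3
        # F_i = 24*eps*(2*(s/r)^12 - (s/r)^6) / r^2 * d
        coef = 24.0 * epsilon * (2.0 * sr6 ** 2 - sr6) / r2[mask]
        f = coef[:, None] * d[mask]
        forces[i] += f.sum(axis=0)
        forces[i + 1:][mask] -= f                # Newton's third law
    return forces

def delta_targets(f_dft, f_classical):
    """Training target: what the network must add to the classical force."""
    return f_dft - f_classical

def total_forces(f_classical, delta_pred):
    """Force used in MD: classical baseline plus the learned correction."""
    return f_classical + delta_pred

# Hypothetical demo: a perturbed 4x4x4 grid in a 16.7 angstrom box.
rng = np.random.default_rng(0)
box = 16.7
grid = np.arange(4) * (box / 4)
pos = np.array([[x, y, z] for x in grid for y in grid for z in grid])
pos += rng.normal(scale=0.1, size=pos.shape)

f_classical = lj_forces(pos, box)
# Stand-in for reference DFT forces (random noise, NOT real data).
f_dft = f_classical + rng.normal(scale=0.01, size=pos.shape)
target = delta_targets(f_dft, f_classical)
f_model = total_forces(f_classical, target)      # a perfect model recovers f_dft
```

Because the network only has to learn the correction, the classical term still shapes the dynamics wherever the correction is small.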

Punyaslok Pattnaik, Shampa Raghunathan, Tarun Kalluri, Prabhakar Bhimalapuram,
C. V. Jawahar, and U. Deva Priyakumar
Center for Computational Natural Sciences and Bioinformatics, International Institute of
Information Technology, Hyderabad 500 032, India
Center for Visual Information Technology, KCIS, International Institute of Information
Technology, Hyderabad 500 032, India
E-mail: deva@iiit.ac.in
Introduction
The modeling of a condensed phase system involving chemical processes spanning multiple
time and length scales is particularly challenging. Ab initio molecular dynamics (AIMD),
which explicitly treats electronic degrees of freedom, is naturally the first method of choice;
however, it is computationally demanding, which prohibits its application to large molecular
systems.1,2 Classical molecular dynamics (MD) simulations employing force fields can
properly sample the phase space of large systems (up to a million atoms),3,4 but the underlying
interatomic potentials are often not accurate enough to obtain quantitatively accurate results.
Their transferability to situations that were not originally used in the parameterization is
questionable, which further limits their accuracy.
The need to construct a multiscale model (considering electronic, nuclear dynamics5–7
and their coupling to slower, cooperative motions of the system) to capture accurate
dynamics of chemical processes cannot be overstated.8–10 The fundamental question is: Can
one quantify the relevance of atomistic models to electronic interactions employing any
numerical formalism, and how do the corresponding MD errors reflect emergent features in ab
initio driving forces? One of the most successful approaches relies on quantum mechanical (QM)
calculations on gas-phase (sometimes considering the implicit solvent model) clusters to
parameterize a model meant for bulk phase simulations. Another empirical procedure is based
on the minimization of a “loss function” or “objective function” between simulated and
experimental physical properties.
With the increasing availability of computational resources and data, machine learning
(ML) techniques have been popularly applied to predict quantum mechanical properties.11–17
A plethora of sophisticated ML approaches exist. For predicting ground state energies,
approaches such as boosted regression tree algorithms,18 high-dimensional neural network
potential energy surfaces using symmetry functions,11 continuous-filter convolutional
layers17 and single-atom atomic environment vectors (AEV)13 have been used. Atomization
energies for molecules have been predicted based on nuclear charges and atomic positions
only.19 Multiple electronic, ground, and excited-state properties have also been predicted
simultaneously using Coulomb matrices in conjunction with deep multi-task artificial neural
networks.14,20 The bag-of-bonds model was used to predict accurate electronic properties
of molecules, such as their polarizability and molecular frontier orbital energies.21,22 Using
artificial neural networks (ANNs), energies of molecules have also been predicted as a sum
of intrinsic bond energies, while also providing valuable insight into the relative strengths
of bonds as a function of their molecular environment.16 ANNs have also been used along
with genetic algorithm (GA) optimization to discover unconventional spin-crossover
complexes, which emphasizes their power for discovering new inorganic materials.23 Recently,
an ML model was proposed in which a novel molecular descriptor, inspired by the classical
force field terms (bonds, angles, non-bonded interactions and dihedrals), is used to perform
geometry optimizations of molecules along with predicting their energies.24 This model
employs feed-forward fully connected deep neural networks. Graph neural networks were used
to predict a solvent-solute interaction map25 for studying solvation free energies of
drug-like molecules/solutes. Instead of applying ML techniques to directly compute properties
of new molecules through interpolation in chemical compound space, ML of force field
parameters was recently performed for semi-empirical modeling.26
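
To make the notion of an atom-centred descriptor concrete, here is a minimal sketch of a radial symmetry function of the Behler-Parrinello type, one of the representations mentioned above. It is not the feature vector used in this work. The cutoff radius, the `etas` widths and the function names are illustrative assumptions. Each atom gets a sum of Gaussians over neighbour distances, damped by a smooth cosine cutoff, which makes the result invariant to translation and to relabelling of neighbours.

```python
import numpy as np

def cutoff(r, r_cut):
    """Smooth cosine cutoff: 1 at r = 0, decaying to 0 at r = r_cut."""
    return np.where(r < r_cut, 0.5 * (np.cos(np.pi * r / r_cut) + 1.0), 0.0)

def radial_descriptor(pos, box, etas, r_shift=0.0, r_cut=8.0):
    """Radial symmetry functions G[i, k] = sum_j exp(-eta_k (r_ij - r_s)^2) f_c(r_ij).

    pos  : (n_atoms, 3) positions in a cubic periodic box of edge `box`
    etas : Gaussian width parameters, one descriptor component per value
    """
    etas = np.asarray(etas, dtype=float)
    d = pos[:, None, :] - pos[None, :, :]        # all pair vectors
    d -= box * np.round(d / box)                 # minimum-image convention
    r = np.linalg.norm(d, axis=-1)               # (n_atoms, n_atoms) distances
    fc = cutoff(r, r_cut)
    np.fill_diagonal(fc, 0.0)                    # an atom is not its own neighbour
    gauss = np.exp(-etas[None, None, :] * (r[:, :, None] - r_shift) ** 2)
    return np.sum(gauss * fc[:, :, None], axis=1)

# Hypothetical usage: 20 random atoms, two descriptor components per atom.
rng = np.random.default_rng(1)
pos = rng.uniform(0.0, 16.7, size=(20, 3))
g = radial_descriptor(pos, 16.7, etas=[0.05, 0.5])
```

A purely radial descriptor like this one is a scalar fingerprint. Predicting force vectors needs directional information as well, which this sketch does not attempt to provide.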
In recent years, machine learning (ML) has emerged as a potential technique for
developing a new generation of highly accurate force fields (FFs) for simulations of molecules
and materials. Ramprasad and coworkers27–29 have developed ML-based atomistic force
fields for MD simulations. They have mainly focused on bulk solid-state materials. Another
approach, on-the-fly ML of QM forces in MD simulations, was recently reported by Li and
coworkers30 on bulk Si. The smooth overlap of atomic positions (SOAP) metric has been
used to construct potential energy surfaces, and its performance was evaluated for small
silicon clusters.15 Gaussian approximation potentials have been used to generate trajectories
for water dimers, the energetics path for a migrating vacancy and the transformation of
rhombohedral graphite to diamond.31 Another popularly used class of ML-FFs based on Gaussian
process (GP) regression was developed for studying a 19-atom Ni nanocluster32 as well as
adsorption energies of small molecules on NiGa and RhAu nanoclusters.33,34 Interatomic
potentials for metallic aluminium, carbon and dimer potentials for noble gases have been
reconstructed using neural networks.35 The effects of such fitted potentials on the calculation
of physical properties obtained from their trajectories, at different physical conditions such
as temperature and pressure, need to be studied to further support their future applications.
Machine learning is also being successfully used to analyze long time scale simulation
data on large systems.36,37
As the development of ML-FFs for MD simulations expands towards assessing
and improving the accuracy and transferability of the model, learning and predicting atomic
forces has been receiving notable success. This is because atomic forces can be seen as true QM
observables within the Born-Oppenheimer (BO) approximation, in accordance with the
Hellmann-Feynman theorem.38 The energy of a molecular system would then be recovered through
appropriate integration of the force-field kernel. ML models are often trained on reference
data obtained from QM-based methods, such as density functional theory (DFT) within the
Kohn-Sham formalism. DFT continues to be one of the most popular and widely used QM targets,
from the molecular regime14,19,21,27,39–42 to condensed matter and materials informatics.11,15,20,31,43–48
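
The statement that energies can be recovered by integrating forces follows from F = -dE/dR, so that E(R_b) - E(R_a) equals minus the line integral of F from R_a to R_b. The sketch below checks this numerically for a single Lennard-Jones pair. The potential, its parameters and the function names are assumptions chosen only because the exact energy is known in closed form. It does not reproduce the force-field kernel integration referred to in the text.

```python
import numpy as np

def lj_energy(r, epsilon=0.238, sigma=3.4):
    """Pair energy U(r) = 4*eps*((s/r)^12 - (s/r)^6); placeholder parameters."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 ** 2 - sr6)

def lj_radial_force(r, epsilon=0.238, sigma=3.4):
    """Radial force F(r) = -dU/dr = 24*eps*(2*(s/r)^12 - (s/r)^6) / r."""
    sr6 = (sigma / r) ** 6
    return 24.0 * epsilon * (2.0 * sr6 ** 2 - sr6) / r

def energy_difference_from_force(force_fn, r_a, r_b, n_points=20001):
    """Return E(r_b) - E(r_a) = -integral of F dr, via the trapezoidal rule."""
    r = np.linspace(r_a, r_b, n_points)
    f = force_fn(r)
    work = np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(r))
    return -work

# Compare the integrated force with the closed-form energy difference.
r_a, r_b = 3.5, 8.0
de_from_force = energy_difference_from_force(lj_radial_force, r_a, r_b)
de_exact = lj_energy(r_b) - lj_energy(r_a)
```

Only energy differences are recovered this way; the absolute energy needs a reference point.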
In this article, we explore an approach in which DFT calculations for smaller systems can
be used, in conjunction with machine learning, to simulate larger systems at a computational
effort comparable to classical force fields, while being able to predict forces similar to DFT.
A Δ-NetFF model that uses the difference in forces obtained from the molecular mechanical
force field and the quantum mechanical DFT approaches to train the NN was introduced
for constructing the force field for MD simulations. The predictive power of the present ML

Frequently Asked Questions (17)
Q1. What are the contributions in "Machine learning for accurate force calculations in molecular dynamics simulations" ?

In this work, the authors explore an approach to make use of the data obtained using the quantum mechanical density functional theory (DFT) on small systems and use deep learning to subsequently simulate large systems by taking liquid argon as a test case. A suitable vector representation was chosen to represent the surrounding environment of each Ar atom, and a Δ-NetFF machine learning model, in which the neural network was trained to predict the difference in resultant forces obtained by DFT and classical force fields, was introduced.

Diffusion coefficient and viscosity calculations indicate that the new forces bias the simulation closer towards experimental data, indicating that these properties calculated from a long DFT simulation would most likely be closer to experimental values as compared to a purely classical force field. Future work to modify the feature vector to extend this method to complex multicomponent systems is in progress. Finally, the time comparison data further emphasizes the efficiency of the model to run long and multiple replicate simulations which are vital in the calculation of thermodynamic and kinetic properties.

Due to the well-known,84 large oscillations at very short times and noise at very long times, these regions are excluded from the fitting procedure.

Calculating forces for each frame
DFT was chosen as the ab initio method to calculate forces because of its speed of calculations, so that a dataset can be generated in a reasonable amount of time.

random sampling makes it very hard to place all points at some distance apart from each other, which is essential in their case to avoid two atoms being too close together resulting in non-physical contacts. 

The general shape of the plots at both pressures indicates that the model-generated trajectories follow the trend of the classical trajectories, due to the F_classical contribution in the force model.

Since diffusion coefficient calculations require longer time scale trajectories (especially for gaseous states, owing to the high mean free path),77 it is not feasible to calculate this property using a DFT trajectory.
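
One common route from a trajectory to a diffusion coefficient is the Einstein relation, in which the mean squared displacement (MSD) grows as 6Dt at long times in three dimensions. The sketch below assumes that route; the extracted text does not say which estimator the authors used. The function names, the fit window and the synthetic random-walk trajectory are all illustrative assumptions, and positions must be unwrapped (not folded back into the periodic box).

```python
import numpy as np

def mean_squared_displacement(traj):
    """MSD relative to the first frame.

    traj : (n_frames, n_atoms, 3) array of unwrapped positions
    """
    disp = traj - traj[0]
    return np.mean(np.sum(disp ** 2, axis=2), axis=1)

def diffusion_coefficient(msd, dt, fit_start, fit_stop):
    """Einstein estimate D = slope / 6 from a linear fit of MSD against time.

    fit_start and fit_stop are frame indices bounding the fit window
    (the choice of window is an assumption of this sketch).
    """
    t = np.arange(len(msd)) * dt
    slope, _intercept = np.polyfit(t[fit_start:fit_stop], msd[fit_start:fit_stop], 1)
    return slope / 6.0

# Synthetic check: a Brownian random walk with a known diffusion coefficient.
rng = np.random.default_rng(0)
d_true, dt = 0.5, 1.0
steps = rng.normal(scale=np.sqrt(2.0 * d_true * dt), size=(200, 5000, 3))
traj = np.cumsum(steps, axis=0)

msd = mean_squared_displacement(traj)
d_est = diffusion_coefficient(msd, dt, fit_start=20, fit_stop=180)
```

The estimate is only as good as the linear regime it is fitted to, which is why long trajectories matter for this property.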

The effectiveness of the trained model in real world scenarios can be better understood by calculating physical properties using trajectories generated by using the neural network in addition to the ability to predict the target force data.

The code for the Δ-NetFF model is freely available from https://github.com/devalab/delNetFF. The primary system used for generating the data for ML contained 96 argon atoms in a cubic box of size 16.7 × 16.7 × 16.7 Å³.
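
As a quick consistency check on the stated system size, the number density and mass density implied by 96 argon atoms in a 16.7 Å cubic box can be worked out directly. The helper names below are hypothetical, and the molar mass of argon (39.948 g/mol) is the standard tabulated value rather than a number taken from the paper.

```python
AVOGADRO = 6.02214076e23      # particles per mole (exact SI value)
AR_MOLAR_MASS = 39.948        # g/mol, standard atomic weight of argon

def number_density(n_atoms, box_length):
    """Atoms per cubic angstrom for a cubic box of edge box_length (angstrom)."""
    return n_atoms / box_length ** 3

def mass_density(n_atoms, box_length, molar_mass=AR_MOLAR_MASS):
    """Mass density in g/cm^3 for a cubic box of edge box_length (angstrom)."""
    volume_cm3 = (box_length * 1e-8) ** 3     # 1 angstrom = 1e-8 cm
    return n_atoms * molar_mass / (AVOGADRO * volume_cm3)

rho_n = number_density(96, 16.7)   # about 0.0206 atoms per cubic angstrom
rho_m = mass_density(96, 16.7)     # about 1.37 g/cm^3
```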