Discovering Collective Variables of Molecular Transitions via Genetic Algorithms and Neural Networks

doi:10.1021/ACS.JCTC.0C00981

Open Access Journal Article

Ferry Hooft, +2 more

04 Mar 2021

Journal of Chemical Theory and Computati...

Vol. 17, Iss. 4, pp. 2294-2306

TLDR

This article proposes a framework that finds an optimal set of CVs from a pool of candidates by combining artificial neural networks with genetic algorithms. The successful retrieval of optimal CVs is illustrated with two case studies: the well-known conformational change in the alanine dipeptide molecule and the more intricate transition of a base pair in B-DNA.

Abstract:

With the continual improvement of computing hardware and algorithms, simulations have become a powerful tool for understanding a wide range of (bio)molecular processes. To handle the large simulation data sets and to accelerate slow, activated transitions, a condensed set of descriptors, or collective variables (CVs), is needed to discern the relevant dynamics that describe the molecular process of interest. However, proposing an adequate set of CVs that can capture the intrinsic reaction coordinate of the molecular transition is often extremely difficult. Here, we present a framework to find an optimal set of CVs from a pool of candidates using a combination of artificial neural networks and genetic algorithms. The approach effectively replaces the encoder of an autoencoder network with genes to represent the latent space, i.e., the CVs. Given a selection of CVs as input, the network is trained to recover the atom coordinates underlying the CV values at points along the transition. The network performance is used as an estimator of the fitness of the input CVs. Two genetic algorithms optimize the CV selection and the neural network architecture. The successful retrieval of optimal CVs by this framework is illustrated with two case studies: the well-known conformational change in the alanine dipeptide molecule and the more intricate transition of a base pair in B-DNA from the classic Watson-Crick pairing to the alternative Hoogsteen pairing. Key advantages of our framework include optimal, interpretable CVs; no need for costly calculation of committor or time-correlation functions; and automatic hyperparameter optimization. In addition, we show that applying a time delay between the network input and output allows for enhanced selection of slow variables. Moreover, the network can also be used to generate molecular configurations of unexplored microstates, for example, for augmentation of the simulation data.
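The selection loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the trajectory is synthetic, the trained neural network is replaced by a 1-nearest-neighbour decoder to keep the example dependency-free, only the CV-selection genetic algorithm is shown (not the second one that tunes the network architecture), and every name (`make_toy_trajectory`, `fitness`, `select_cvs`) and parameter value is hypothetical.

```python
import random


def make_toy_trajectory(n_frames=120, n_candidates=6, seed=0):
    """Synthetic stand-in for simulation data (an assumption, not from the paper).

    Candidate CVs 0 and 3 determine the 'atom coordinates'; the others are noise.
    """
    rng = random.Random(seed)
    cvs = [[rng.random() for _ in range(n_candidates)] for _ in range(n_frames)]
    coords = [[c[0] + c[3], c[0] - c[3], c[0] * c[3]] for c in cvs]
    return cvs, coords


def fitness(genes, cvs, coords):
    """Negative mean squared reconstruction error of coordinates from selected CVs.

    A 1-nearest-neighbour decoder stands in for the trained network: the first
    half of the frames acts as training data, the second half as test data.
    """
    inputs = [[frame[g] for g in genes] for frame in cvs]
    n = len(inputs)
    split = n // 2
    err = 0.0
    for i in range(split, n):
        nearest = min(
            range(split),
            key=lambda j: sum((a - b) ** 2 for a, b in zip(inputs[i], inputs[j])),
        )
        err += sum((a - b) ** 2 for a, b in zip(coords[i], coords[nearest]))
    return -err / (n - split)


def select_cvs(cvs, coords, n_genes=2, pop_size=8, generations=10, seed=1):
    """Genetic algorithm over CV subsets; each individual is a tuple of CV indices."""
    rng = random.Random(seed)
    n_candidates = len(cvs[0])

    def score(genes):
        return fitness(genes, cvs, coords)

    population = [
        tuple(sorted(rng.sample(range(n_candidates), n_genes)))
        for _ in range(pop_size)
    ]
    history = []  # best fitness seen in each generation
    for _ in range(generations):
        ranked = sorted(set(population), key=score, reverse=True)
        history.append(score(ranked[0]))
        parents = ranked[: max(1, len(ranked) // 2)]
        children = [ranked[0]]  # elitism: the best individual survives unchanged
        while len(children) < pop_size:
            a, b = rng.choice(parents), rng.choice(parents)
            pool = sorted(set(a) | set(b))
            child = rng.sample(pool, n_genes)  # crossover: mix the parents' genes
            if rng.random() < 0.3:  # mutation: swap one gene for an unused CV
                unused = [g for g in range(n_candidates) if g not in child]
                child[rng.randrange(n_genes)] = rng.choice(unused)
            children.append(tuple(sorted(child)))
        population = children
    best = max(population, key=score)
    return best, score(best), history


if __name__ == "__main__":
    toy_cvs, toy_coords = make_toy_trajectory()
    best_genes, best_score, _ = select_cvs(toy_cvs, toy_coords)
    print("selected CV indices:", best_genes, "fitness:", best_score)
```

In this sketch the fitness is higher when the selected CVs allow the coordinates to be reconstructed more accurately, which mirrors the idea of using network performance as the fitness estimator. The time delay mentioned in the abstract would correspond to pairing the CV values at frame t with the coordinates at a later frame; that is omitted here because the toy frames are statistically independent.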

Citations

Reaction coordinates in complex systems-a perspective

Explaining reaction coordinates of alanine dipeptide isomerization obtained from deep neural networks using Explainable Artificial Intelligence (XAI).

DeepCV: A Deep Learning Framework for Blind Search of Collective Variables in Expanded Configurational Space

Computational strategies for protein conformational ensemble detection.
