Author

Ferry Hooft

Bio: Ferry Hooft is an academic researcher from the University of Amsterdam. The author has contributed to research in topics: Autoencoder & Hyperparameter optimization. The author has an h-index of 1 and has co-authored 1 publication receiving 6 citations.

Papers
Journal ArticleDOI
TL;DR: This article proposes a framework that finds an optimal set of CVs from a pool of candidates using a combination of artificial neural networks and genetic algorithms. The successful retrieval of optimal CVs is illustrated with two case studies: the well-known conformational change in the alanine dipeptide molecule and the more intricate transition of a base pair in B-DNA.
Abstract: With the continual improvement of computing hardware and algorithms, simulations have become a powerful tool for understanding all sorts of (bio)molecular processes. To handle the large simulation data sets and to accelerate slow, activated transitions, a condensed set of descriptors, or collective variables (CVs), is needed to discern the relevant dynamics that describes the molecular process of interest. However, proposing an adequate set of CVs that can capture the intrinsic reaction coordinate of the molecular transition is often extremely difficult. Here, we present a framework to find an optimal set of CVs from a pool of candidates using a combination of artificial neural networks and genetic algorithms. The approach effectively replaces the encoder of an autoencoder network with genes to represent the latent space, i.e., the CVs. Given a selection of CVs as input, the network is trained to recover the atom coordinates underlying the CV values at points along the transition. The network performance is used as an estimator of the fitness of the input CVs. Two genetic algorithms optimize the CV selection and the neural network architecture. The successful retrieval of optimal CVs by this framework is illustrated with two case studies: the well-known conformational change in the alanine dipeptide molecule and the more intricate transition of a base pair in B-DNA from the classic Watson-Crick pairing to the alternative Hoogsteen pairing. Key advantages of our framework include the following: optimal interpretable CVs, avoiding costly calculation of committor or time-correlation functions, and automatic hyperparameter optimization. In addition, we show that applying a time-delay between the network input and output allows for enhanced selection of slow variables. Moreover, the network can also be used to generate molecular configurations of unexplored microstates, for example, for augmentation of the simulation data.
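
The selection loop described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the neural-network decoder is replaced here by a linear least-squares decoder to keep the example self-contained, only one genetic algorithm is shown (the one choosing CVs, not the one tuning the architecture), and the names `decoder_fitness` and `select_cvs` as well as all parameter values are hypothetical.

```python
import numpy as np


def decoder_fitness(cvs, coords, lag=0):
    """Fitness of a CV selection: negative mean squared reconstruction error.

    Assumption: a linear least-squares decoder stands in for the trained
    neural network described in the abstract.
    """
    if lag > 0:
        # Time delay: decode the coordinates `lag` frames after the CV values.
        cvs, coords = cvs[:-lag], coords[lag:]
    design = np.column_stack([cvs, np.ones(len(cvs))])  # CV columns plus bias
    weights, *_ = np.linalg.lstsq(design, coords, rcond=None)
    residual = coords - design @ weights
    return -float(np.mean(residual ** 2))


def select_cvs(candidates, coords, n_select=2, pop_size=20,
               generations=30, mutation_rate=0.3, lag=0, seed=0):
    """Toy genetic algorithm; each genome is a sorted tuple of CV indices."""
    rng = np.random.default_rng(seed)
    n_candidates = candidates.shape[1]

    def fitness(genome):
        return decoder_fitness(candidates[:, list(genome)], coords, lag)

    def random_genome():
        picks = rng.choice(n_candidates, size=n_select, replace=False)
        return tuple(sorted(int(i) for i in picks))

    population = [random_genome() for _ in range(pop_size)]
    for _ in range(generations):
        # Keep the best unique genomes as parents (elitism).
        ranked = sorted(set(population), key=fitness, reverse=True)
        parents = ranked[: max(2, pop_size // 4)]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.choice(len(parents), size=2)
            # Crossover: draw the child's genes from the union of both parents.
            pool = sorted(set(parents[a]) | set(parents[b]))
            child = [int(i) for i in rng.choice(pool, size=n_select, replace=False)]
            # Mutation: swap one gene for a candidate the child does not use.
            unused = [i for i in range(n_candidates) if i not in child]
            if unused and rng.random() < mutation_rate:
                child[int(rng.integers(n_select))] = int(rng.choice(unused))
            children.append(tuple(sorted(child)))
        population = parents + children
    return max(set(population), key=fitness)
```

Calling `select_cvs(candidates, coords, n_select=2)` with `candidates` of shape `(frames, n_candidate_cvs)` and `coords` of shape `(frames, n_coordinates)` returns the indices of the selected CVs. Passing `lag > 0` scores each selection on how well it predicts the coordinates a few frames later, which mirrors the time-delay idea in the abstract.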

22 citations


Cited by
Journal ArticleDOI
TL;DR: A number of methods have been developed to reduce the vast amount of high-dimensional data to a small number of essential degrees of freedom representing the reaction coordinate, and, if the reaction coordinate is known, a variety of approaches have been proposed to enhance the sampling along the important degrees of freedom.
Abstract: In molecular simulations, the identification of suitable reaction coordinates is central to both the analysis and sampling of transitions between metastable states in complex systems. If sufficient simulation data are available, a number of methods have been developed to reduce the vast amount of high-dimensional data to a small number of essential degrees of freedom representing the reaction coordinate. Likewise, if the reaction coordinate is known, a variety of approaches have been proposed to enhance the sampling along the important degrees of freedom. Often, however, neither one nor the other is available. One of the key questions is therefore how to construct reaction coordinates and evaluate their validity. Another challenge arises from the physical interpretation of reaction coordinates, which is often addressed by correlating physically meaningful parameters with conceptually well-defined but abstract reaction coordinates. Furthermore, machine-learning-based methods are becoming more and more applicable to the reaction coordinate problem. This perspective highlights central aspects in the identification and evaluation of reaction coordinates and discusses recent ideas regarding automated computational frameworks to combine the optimization of reaction coordinates and enhanced sampling.

12 citations

Journal ArticleDOI
TL;DR: The present study offers an AI-aided framework to explain the appropriate reaction coordinates, which acquires considerable significance when the number of degrees of freedom increases.
Abstract: A method for obtaining appropriate reaction coordinates is required to identify transition states distinguishing the product and reactant in complex molecular systems. Recently, abundant research has been devoted to obtaining reaction coordinates using artificial neural networks from the deep learning literature, where many collective variables are typically utilized in the input layer. However, it is difficult to explain the details of which collective variables contribute to the predicted reaction coordinates owing to the complexity of the nonlinear functions in deep neural networks. To overcome this limitation, we used Explainable Artificial Intelligence (XAI) methods of the Local Interpretable Model-agnostic Explanation (LIME) and the game theory-based framework known as Shapley Additive exPlanations (SHAP). We demonstrated that XAI enables us to obtain the degree of contribution of each collective variable to reaction coordinates that are determined by nonlinear regressions with deep learning for the committor of the alanine dipeptide isomerization in vacuum. In particular, both LIME and SHAP provide important features to the predicted reaction coordinates, which are characterized by appropriate dihedral angles consistent with those previously reported from the committor test analysis. The present study offers an AI-aided framework to explain the appropriate reaction coordinates, which acquires considerable significance when the number of degrees of freedom increases.
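
The Shapley-style attribution mentioned in this abstract can be illustrated with a small, exact calculation. This is only a sketch of the underlying game-theoretic idea, not the SHAP library and not the cited authors' code: it enumerates every feature subset (feasible only for a handful of inputs), and "removing" a feature by replacing it with a baseline value is an assumption made here. The function name `shapley_values` is hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley contribution of each input feature to model(x).

    Assumption: a feature that is absent from a coalition takes its value
    from `baseline` instead of from `x`.
    """
    n = len(x)

    def value(subset):
        # Evaluate the model with only the features in `subset` taken from x.
        masked = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(masked)

    contributions = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                total += weight * (value(members | {i}) - value(members))
        contributions.append(total)
    return contributions
```

The contributions always sum to `model(x) - model(baseline)`, which is the property that makes them readable as a per-variable share of the prediction. In the setting of the abstract, `x` would hold the collective variables of one configuration and `model` would be the trained regression for the committor.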

8 citations

Journal ArticleDOI
TL;DR: DeepCV is open-source software written in Python/C++ and based on the TensorFlow framework. It can be used to calculate molecular features, train models, generate CVs, validate rare events from sampling, and analyze a trajectory for chemical reactions of interest.
Abstract: We present Deep learning for Collective Variables (DeepCV), a computer code that provides an efficient and customizable implementation of the deep autoencoder neural network (DAENN) algorithm that has been developed in our group for computing collective variables (CVs) and can be used with enhanced sampling methods to reconstruct free energy surfaces of chemical reactions. DeepCV can be used to conveniently calculate molecular features, train models, generate CVs, validate rare events from sampling, and analyze a trajectory for chemical reactions of interest. We use DeepCV in an example study of the conformational transition of cyclohexene, where metadynamics simulations are performed using DAENN-generated CVs. The results show that the adopted CVs give free energies in line with those obtained by previously developed CVs and experimental results. DeepCV is open-source software written in Python/C++ object-oriented languages, based on the TensorFlow framework and distributed free of charge for noncommercial purposes, which can be incorporated into general molecular dynamics software. DeepCV also comes with several additional tools, i.e., an application program interface (API), documentation, and tutorials.

6 citations

Journal ArticleDOI
TL;DR: The ability to detect and juggle protein conformations supplemented by a physics-based understanding has implications for not only in vivo problems but also for resistance impeding drug discovery and bionano-sensor design.

5 citations