Journal ArticleDOI

Constructing the equilibrium ensemble of folding pathways from short off-equilibrium simulations

10 Nov 2009 - Proceedings of the National Academy of Sciences of the United States of America (National Academy of Sciences) - Vol. 106, Iss. 45, pp. 19011-19016
TL;DR: An approach is presented that allows for the reconstruction of the full ensemble of folding pathways from simulations that are much shorter than the folding time, and reveals the existence of misfolded trap states outside the network of efficient folding intermediates that significantly reduce the folding speed.
Abstract: Characterizing the equilibrium ensemble of folding pathways, including their relative probability, is one of the major challenges in protein folding theory today. Although this information is in principle accessible via all-atom molecular dynamics simulations, it is difficult to compute in practice because protein folding is a rare event and the affordable simulation length is typically not sufficient to observe an appreciable number of folding events, unless very simplified protein models are used. Here we present an approach that allows for the reconstruction of the full ensemble of folding pathways from simulations that are much shorter than the folding time. This approach can be applied to all-atom protein simulations in explicit solvent. It does not use a predefined reaction coordinate but is based on partitioning the state space into small conformational states and constructing a Markov model between them. A theory is presented that allows for the extraction of the full ensemble of transition pathways from the unfolded to the folded configurations. The approach is applied to the folding of a PinWW domain in explicit solvent where the folding time is two orders of magnitude larger than the length of individual simulations. The results are in good agreement with kinetic experimental data and give detailed insights about the nature of the folding process which is shown to be surprisingly complex and parallel. The analysis reveals the existence of misfolded trap states outside the network of efficient folding intermediates that significantly reduce the folding speed.
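
The Markov-model step described in the abstract can be illustrated with a minimal sketch. It assumes the simulations have already been discretized into integer state labels; the three hypothetical trajectories, the state labels, and the function names below are illustrative and are not taken from the paper. Because the estimated transition probabilities are conditional on the starting state, short trajectories launched from different, non-equilibrium starting points can be pooled into one model.

```python
def count_transitions(trajectories, n_states, lag=1):
    """Count i -> j transitions at the given lag, pooled over all trajectories."""
    counts = [[0] * n_states for _ in range(n_states)]
    for traj in trajectories:
        for t in range(len(traj) - lag):
            counts[traj[t]][traj[t + lag]] += 1
    return counts


def transition_matrix(counts):
    """Row-normalize counts into transition probabilities T[i][j]."""
    matrix = []
    for row in counts:
        total = sum(row)
        if total == 0:
            raise ValueError("a state has no outgoing transitions")
        matrix.append([c / total for c in row])
    return matrix


def stationary_distribution(T, tol=1e-13, max_iter=100000):
    """Propagate a uniform distribution with T until it stops changing."""
    n = len(T)
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        new = [sum(pi[i] * T[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(new, pi)) < tol:
            return new
        pi = new
    return pi


# Hypothetical discretized trajectories: 0 = unfolded-like, 1 = intermediate,
# 2 = folded-like. No single trajectory travels all the way from 0 to 2.
trajectories = [
    [0, 0, 1, 0, 0, 1],
    [1, 1, 2, 1, 1],
    [2, 2, 2, 1, 2, 2],
]
T = transition_matrix(count_transitions(trajectories, n_states=3))
pi = stationary_distribution(T)
```

Although none of the toy trajectories connects state 0 to state 2 directly, the pooled model links them through state 1 and yields an equilibrium distribution over all three states. A real application would use far more states and data, and would need to check that the chosen lag time gives approximately Markovian dynamics.
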
Citations
Journal ArticleDOI
15 Oct 2010 - Science
TL;DR: Simulation of the folding of a WW domain showed a well-defined folding pathway and simulation of the dynamics of bovine pancreatic trypsin inhibitor showed interconversion between distinct conformational states.
Abstract: Molecular dynamics (MD) simulations are widely used to study protein motions at an atomic level of detail, but they have been limited to time scales shorter than those of many biologically critical conformational changes. We examined two fundamental processes in protein dynamics—protein folding and conformational change within the folded state—by means of extremely long all-atom MD simulations conducted on a special-purpose machine. Equilibrium simulations of a WW protein domain captured multiple folding and unfolding events that consistently follow a well-defined folding pathway; separate simulations of the protein’s constituent substructures shed light on possible determinants of this pathway. A 1-millisecond simulation of the folded protein BPTI reveals a small number of structurally distinct conformational states whose reversible interconversion is slower than local relaxations within those states by a factor of more than 1000.

1,650 citations

Journal ArticleDOI
TL;DR: An upper bound for the approximation error made by modeling molecular dynamics with a Markov chain is described and it is shown that this error can be made arbitrarily small with surprisingly little effort.
Abstract: Markov state models of molecular kinetics (MSMs), in which the long-time statistical dynamics of a molecule is approximated by a Markov chain on a discrete partition of configuration space, have seen widespread use in recent years. This approach has many appealing characteristics compared to straightforward molecular dynamics simulation and analysis, including the potential to mitigate the sampling problem by extracting long-time kinetic information from short trajectories and the ability to straightforwardly calculate expectation values and statistical uncertainties of various stationary and dynamical molecular observables. In this paper, we summarize the current state of the art in generation and validation of MSMs and give some important new results. We describe an upper bound for the approximation error made by modeling molecular dynamics with a MSM and we show that this error can be made arbitrarily small with surprisingly little effort. In contrast to previous practice, it becomes clear that the best MSM is not obtained by the most metastable discretization, but the MSM can be much improved if non-metastable states are introduced near the transition states. Moreover, we show that it is not necessary to resolve all slow processes by the state space partitioning, but individual dynamical processes of interest can be resolved separately. We also present an efficient estimator for reversible transition matrices and a robust test to validate that a MSM reproduces the kinetics of the molecular dynamics data.

1,082 citations
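
One common way to check that an MSM reproduces the kinetics of the underlying data is a Chapman-Kolmogorov comparison: the transition matrix estimated at lag τ, raised to the k-th power, should agree with the matrix estimated directly at lag kτ. The sketch below is an assumption-laden illustration and not necessarily the exact test proposed in the paper above; the synthetic three-state chain, the function names, and the choice k = 5 are all hypothetical.

```python
import random


def simulate_chain(T, n_steps, start=0, seed=0):
    """Generate a synthetic discrete trajectory from a known transition matrix."""
    rng = random.Random(seed)
    states = list(range(len(T)))
    traj = [start]
    for _ in range(n_steps - 1):
        traj.append(rng.choices(states, weights=T[traj[-1]])[0])
    return traj


def estimate_matrix(traj, n_states, lag):
    """Row-normalized transition counts at the given lag."""
    counts = [[0] * n_states for _ in range(n_states)]
    for t in range(len(traj) - lag):
        counts[traj[t]][traj[t + lag]] += 1
    return [[c / sum(row) for c in row] for row in counts]


def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_power(T, k):
    result = T
    for _ in range(k - 1):
        result = mat_mul(result, T)
    return result


def ck_deviation(traj, n_states, lag, k):
    """Largest entrywise gap between T(lag)^k and T(k * lag)."""
    predicted = mat_power(estimate_matrix(traj, n_states, lag), k)
    observed = estimate_matrix(traj, n_states, lag * k)
    return max(abs(predicted[i][j] - observed[i][j])
               for i in range(n_states) for j in range(n_states))


# Synthetic data from a chain that is Markovian by construction.
true_T = [[0.5, 0.5, 0.0], [0.2, 0.4, 0.4], [0.0, 0.4, 0.6]]
traj = simulate_chain(true_T, n_steps=50000)
deviation = ck_deviation(traj, n_states=3, lag=1, k=5)
```

For data that really are Markovian at the chosen lag, the deviation should be small and limited by statistical noise. A large deviation would suggest that the discretization or the lag time is inadequate.
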

Journal ArticleDOI
TL;DR: In a systematic review of scaffold architectures, the underlying effects and control options will be demonstrated, and suggestions will be given for designing effective multivalent binding systems, as well as for polyvalent therapeutics.
Abstract: Multivalent interactions can be applied universally for a targeted strengthening of an interaction between different interfaces or molecules. The binding partners form cooperative, multiple receptor-ligand interactions that are based on individually weak, noncovalent bonds and are thus generally reversible. Hence, multi- and polyvalent interactions play a decisive role in biological systems for recognition, adhesion, and signal processes. The scientific and practical realization of this principle will be demonstrated by the development of simple artificial and theoretical models, from natural systems to functional, application-oriented systems. In a systematic review of scaffold architectures, the underlying effects and control options will be demonstrated, and suggestions will be given for designing effective multivalent binding systems, as well as for polyvalent therapeutics.

820 citations

Journal ArticleDOI
TL;DR: The variational principle of conformation dynamics is used to derive an optimal way of identifying the "slow subspace" of a large set of prior order parameters - either generic internal coordinates or a user-defined set of parameters.
Abstract: A goal in the kinetic characterization of a macromolecular system is the description of its slow relaxation processes via (i) identification of the structural changes involved in these processes and (ii) estimation of the rates or timescales at which these slow processes occur. Most of the approaches to this task, including Markov models, master-equation models, and kinetic network models, start by discretizing the high-dimensional state space and then characterize relaxation processes in terms of the eigenvectors and eigenvalues of a discrete transition matrix. The practical success of such an approach depends very much on the ability to finely discretize the slow order parameters. How can this task be achieved in a high-dimensional configuration space without relying on subjective guesses of the slow order parameters? In this paper, we use the variational principle of conformation dynamics to derive an optimal way of identifying the "slow subspace" of a large set of prior order parameters - either generic internal coordinates or a user-defined set of parameters. Using a variational formulation of conformational dynamics, it is shown that an existing method, time-lagged independent component analysis, provides the optimal solution to this problem. In addition, optimal indicators (order parameters that indicate the progress of the slow transitions and thus may serve as reaction coordinates) are readily identified. We demonstrate that the slow subspace is well suited to construct accurate kinetic models of two sets of molecular dynamics simulations, the 6-residue fluorescent peptide MR121-GSGSW and the 30-residue intrinsically disordered peptide kinase inducible domain (KID). The identified optimal indicators reveal the structural changes associated with the slow processes of the molecular system under analysis.

813 citations
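
The idea behind time-lagged independent component analysis can be sketched for the special case of two input features, where the generalized eigenvalue problem C(τ) v = λ C(0) v reduces to a quadratic equation. This is a minimal illustration under stated assumptions, not the implementation used in the paper above: the synthetic data (one slow autoregressive coordinate mixed with fast noise), the symmetrized covariance estimator, and all function names are hypothetical choices. A practical implementation would handle arbitrary dimension with a linear-algebra library.

```python
import math
import random


def lagged_covariances(X, lag):
    """Symmetrized instantaneous (C0) and time-lagged (Ct) 2x2 covariances."""
    n = len(X) - lag
    mean = [sum(x[k] for x in X) / len(X) for k in range(2)]
    Y = [[x[0] - mean[0], x[1] - mean[1]] for x in X]
    C0 = [[0.0, 0.0], [0.0, 0.0]]
    Ct = [[0.0, 0.0], [0.0, 0.0]]
    for t in range(n):
        u, v = Y[t], Y[t + lag]
        for i in range(2):
            for j in range(2):
                C0[i][j] += (u[i] * u[j] + v[i] * v[j]) / (2 * n)
                Ct[i][j] += (u[i] * v[j] + v[i] * u[j]) / (2 * n)
    return C0, Ct


def tica_2d(X, lag):
    """Solve Ct v = lam C0 v for two features; returns [(lam, v), ...], slowest first."""
    C0, Ct = lagged_covariances(X, lag)
    a, b, c = C0[0][0], C0[0][1], C0[1][1]
    d, e, f = Ct[0][0], Ct[0][1], Ct[1][1]
    # det(Ct - lam * C0) = 0 expands to qa*lam^2 - qb*lam + qc = 0
    qa = a * c - b * b
    qb = a * f + c * d - 2 * b * e
    qc = d * f - e * e
    disc = math.sqrt(max(qb * qb - 4 * qa * qc, 0.0))
    components = []
    for lam in ((qb + disc) / (2 * qa), (qb - disc) / (2 * qa)):
        row1 = (d - lam * a, e - lam * b)
        row2 = (e - lam * b, f - lam * c)
        # Use the numerically larger row of (Ct - lam * C0) to get its null vector.
        row = row1 if abs(row1[0]) + abs(row1[1]) >= abs(row2[0]) + abs(row2[1]) else row2
        components.append((lam, (row[1], -row[0])))
    return components


# Synthetic features: x is slow (autoregressive), y is fast white noise,
# and the observed features are the mixtures x + y and x - y.
rng = random.Random(1)
x, X = 0.0, []
for _ in range(5000):
    x = 0.95 * x + rng.gauss(0.0, 1.0)
    y = rng.gauss(0.0, 1.0)
    X.append([x + y, x - y])

(lam_slow, v_slow), (lam_fast, v_fast) = tica_2d(X, lag=1)
```

The component with the largest eigenvalue has the highest autocorrelation at the chosen lag. In this toy case it should weight both features with the same sign, recovering the slow coordinate x from the two mixtures.
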

Journal ArticleDOI
TL;DR: The open-source Python package PyEMMA is presented; the work also derives a systematic and accurate way to coarse-grain MSMs to few states and to illustrate the structures of the metastable states of the system.
Abstract: Markov (state) models (MSMs) and related models of molecular kinetics have recently received a surge of interest as they can systematically reconcile simulation data from either a few long or many short simulations and allow us to analyze the essential metastable structures, thermodynamics, and kinetics of the molecular system under investigation. However, the estimation, validation, and analysis of such models is far from trivial and involves sophisticated and often numerically sensitive methods. In this work we present the open-source Python package PyEMMA (http://pyemma.org) that provides accurate and efficient algorithms for kinetic model construction. PyEMMA can read all common molecular dynamics data formats, helps in the selection of input features, provides easy access to dimension reduction algorithms such as principal component analysis (PCA) and time-lagged independent component analysis (TICA) and clustering algorithms such as k-means, and contains estimators for MSMs, hidden Markov models, an...

809 citations

References
Journal ArticleDOI
TL;DR: The software suite GROMACS (Groningen MAchine for Chemical Simulation) that was developed at the University of Groningen, The Netherlands, in the early 1990s is described, which is a very fast program for molecular dynamics simulation.
Abstract: This article describes the software suite GROMACS (Groningen MAchine for Chemical Simulation) that was developed at the University of Groningen, The Netherlands, in the early 1990s. The software, written in ANSI C, originates from a parallel hardware project, and is well suited for parallelization on processor clusters. By careful optimization of neighbor searching and of inner loop performance, GROMACS is a very fast program for molecular dynamics simulation. It does not have a force field of its own, but is compatible with GROMOS, OPLS, AMBER, and ENCAD force fields. In addition, it can handle polarizable shell models and flexible constraints. The program is versatile, as force routines can be added by the user, tabulated functions can be specified, and analyses can be easily customized. Nonequilibrium dynamics and free energy determinations are incorporated. Interfaces with popular quantum-chemical packages (MOPAC, GAMESS-UK, GAUSSIAN) are provided to perform mixed MM/QM simulations. The package includes about 100 utility and analysis programs. GROMACS is in the public domain and distributed (with source code and documentation) under the GNU General Public License. It is maintained by a group of developers from the Universities of Groningen, Uppsala, and Stockholm, and the Max Planck Institute for Polymer Research in Mainz. Its Web site is http://www.gromacs.org.

13,116 citations

Journal ArticleDOI
TL;DR: The general energy landscape picture provides a conceptual framework for understanding both two-state and multi-state folding kinetics; the authors hope to learn much more about the real shapes of protein folding landscapes.
Abstract: A new view of protein folding kinetics replaces the idea of ‘folding pathways’ with the broader notions of energy landscapes and folding funnels. New experiments are needed to explore them.

2,320 citations

"Constructing the equilibrium ensemb..." refers to a result in this paper

  • ...Moreover, the picture we suggest here is fully compatible with the widely accepted folding-funnel model (38, 39), which suggests a narrowing down of a large conformational heterogeneity to the native conformations via parallel routes....

Journal ArticleDOI
TL;DR: This article reviews the concepts and methods of transition path sampling, which allow computational studies of rare events without requiring prior knowledge of mechanisms, reaction coordinates, and transition states.
Abstract: This article reviews the concepts and methods of transition path sampling. These methods allow computational studies of rare events without requiring prior knowledge of mechanisms, reaction coordinates, and transition states. Based upon a statistical mechanics of trajectory space, they provide a perspective with which time dependent phenomena, even for systems driven far from equilibrium, can be examined with the same types of importance sampling tools that in the past have been applied so successfully to static equilibrium properties.

1,843 citations

"Constructing the equilibrium ensemb..." refers to background in this paper

  • ...What is the probability distribution of the trajectories leaving A and continuing on to B? That is, what is the typical sequence of I states used along the transition pathways? The essential ingredient required to compute the statistics of transition pathways is the committor probability, q_i, defined as the probability, when being at state i, that the system will reach the set B next rather than A (22, 24, 25)....
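
The committor defined in the excerpt above can be computed for a discrete Markov model from its boundary conditions (q_i = 0 for states in A, q_i = 1 for states in B) together with the relation q_i = Σ_j T_ij q_j for every intermediate state i. The sketch below solves these equations by simple fixed-point iteration. The four-state transition matrix, the choice of A and B, and the function name are hypothetical; the paper may well use a direct linear solve instead.

```python
def committor(T, A, B, tol=1e-13, max_iter=100000):
    """Probability of reaching set B before set A, starting from each state."""
    n = len(T)
    # Boundary conditions: q = 0 on A, q = 1 on B; intermediates start at 0.
    q = [1.0 if i in B else 0.0 for i in range(n)]
    intermediates = [i for i in range(n) if i not in A and i not in B]
    for _ in range(max_iter):
        change = 0.0
        for i in intermediates:
            new = sum(T[i][j] * q[j] for j in range(n))
            change = max(change, abs(new - q[i]))
            q[i] = new
        if change < tol:
            break
    return q


# Hypothetical 4-state model: state 0 is the set A, state 3 is the set B,
# states 1 and 2 are intermediates. Each row sums to 1.
T = [
    [0.8, 0.1, 0.1, 0.0],
    [0.1, 0.6, 0.1, 0.2],
    [0.1, 0.1, 0.8, 0.0],
    [0.0, 0.1, 0.0, 0.9],
]
q = committor(T, A={0}, B={3})
```

In this toy model, state 1 has a direct route to B and ends up with q_1 = 4/7, while state 2 can reach B only through state 1 and has q_2 = 2/7. Ordering intermediate states by their committor values gives a reaction-coordinate-free measure of progress from A to B.
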

Journal ArticleDOI
TL;DR: The main contributions a microscopic consideration can offer are (1) the understanding and interpretation of experimental results, (2) semiquantitative estimates of experimental results, and (3) the capability to interpolate or extrapolate experimental data into regions that are accessible only with difficulty in the laboratory.
Abstract: During recent decades it has become feasible to simulate the dynamics of molecular systems on a computer. The method of molecular dynamics (MD) solves Newton's equations of motion for a molecular system, which results in trajectories for all atoms in the system. From these atomic trajectories a variety of properties can be calculated. The aim of computer simulations of molecular systems is to compute macroscopic behavior from microscopic interactions. The main contributions a microscopic consideration can offer are (1) the understanding and (2) interpretation of experimental results, (3) semiquantitative estimates of experimental results, and (4) the capability to interpolate or extrapolate experimental data into regions that are accessible only with difficulty in the laboratory. One of the two basic problems in the field of molecular modeling and simulation is how to efficiently search the vast configuration space which is spanned by all possible molecular conformations for the global low (free) energy regions which will be populated by a molecular system in thermal equilibrium. The other basic problem is the derivation of a sufficiently accurate interaction energy function or force field for the molecular system of interest. An important part of the art of computer simulation is to choose the unavoidable assumptions, approximations and simplifications of the molecular model and computational procedure such that their contributions to the overall inaccuracy are of comparable size, without affecting significantly the property of interest. Methodology and some practical applications of computer simulation in the field of (bio)chemistry will be reviewed.

1,443 citations

Journal ArticleDOI
17 Mar 1995 - Science
TL;DR: Using experimental data, Onuchic et al. have estimated the extent, ruggedness, and slope of the folding funnel; similar parameters characterize the energy landscape of simple computer models of self-interacting necklaces of beads, which lack most of the details of real proteins.
Abstract: To fold, a protein navigates with remarkable ease through a complicated energy landscape as it explores many possible physical configurations. This feat is beginning to be quantitatively understood by means of statistical mechanics and simplified computer models (1). Folded proteins are marvels of molecular engineering and it is hard to avoid thinking that all of their complex structural features play a role in their folding through an obligate multistep mechanism. A unique folding pathway, if it exists, could be elucidated with classical chemical experiments. A newer view holds that in the earlier stages a protein possesses a large ensemble of structures. The problem is not to find a single route but to characterize the dynamics of the ensemble through a statistical description of the topography of the free-energy landscape. Folding is easy if the landscape resembles a many-dimensional funnel leading through a myriad of pathways to the native structure. Only a few parameters should be needed to characterize statistically the topography of and routes down the folding funnel. Using experimental data, Onuchic et al. have estimated the extent, ruggedness, and slope of the folding funnel (2). Similar parameters characterize the energy landscape of simple computer models of proteins. These models of self-interacting necklaces of beads, often on lattices, lack most of the details of real proteins, but establishing a quantitative correspondence between the landscapes of computer models and real proteins makes it possible to use simulations to understand folding kinetics. The extent of a protein energy landscape is huge. Before folding, each residue can take on about 10 different conformations; thus, a 60-residue protein can be in any of 10^60 states. An unguided search, like a

1,111 citations