scispace - formally typeset
Open AccessJournal ArticleDOI

Diffusion maps, reduction coordinates, and low dimensional representation of stochastic systems

Reads0
Chats0
TLDR
This paper uses the first few eigenfunctions of the backward Fokker–Planck diffusion operator as a coarse-grained low dimensional representation for the long-term evolution of a stochastic system and shows that they are optimal under a certain mean squared error criterion.
Abstract
The concise representation of complex high dimensional stochastic systems via a few reduced coordinates is an important problem in computational physics, chemistry, and biology. In this paper we use the first few eigenfunctions of the backward Fokker–Planck diffusion operator as a coarse-grained low dimensional representation for the long-term evolution of a stochastic system and show that they are optimal under a certain mean squared error criterion. We denote the mapping from physical space to these eigenfunctions as the diffusion map. While in high dimensional systems these eigenfunctions are difficult to compute numerically by conventional methods such as finite differences or finite elements, we describe a simple computational data-driven method to approximate them from a large set of simulated data. Our method is based on defining an appropriately weighted graph on the set of simulated data and computing the first few eigenvectors and eigenvalues of the corresponding random walk matrix on this graph...

read more

Content maybe subject to copyright    Report

DIFFUSION MAPS, REDUCTION COORDINATES AND LOW
DIMENSIONAL REPRESENTATION OF STOCHASTIC SYSTEMS
R.R. COIFMAN
, I.G. KEVREKIDIS
, S. LAFON
, M. MAGGIONI
§
, AND B. NADLER
Abstract.
The concise representation of complex high dimensional stochastic systems via a few reduced
coordinates is an important problem in computational physics, chemistry and biology. In this paper
we use the first few eigenfunctions of the backward Fokker-Planck diffusion operator as a coarse
grained low dimensional representation for the long term evolution of a stochastic system, and show
that they are optimal under a certain mean squared error criterion. We denote the mapping from
physical space to these eigenfunctions as the diffusion map. While in high dimensional systems these
eigenfunctions are difficult to compute numerically by conventional methods such as finite differences
or finite elements, we describe a simple computational data-driven method to approximate them from
a large set of simulated data. Our method is based on defining an appropriately weighted graph on the
set of simulated data, and computing the first few eigenvectors and eigenvalues of the corresponding
random walk matrix on this graph. Thus, our algorithm incorporates the local geometry and density
at each point into a global picture that merges in a natural way data from different simulation runs.
Furthermore, we describe lifting and restriction operators between the diffusion map space and the
original space. These operators facilitate the description of the coarse-grained dynamics, possibly
in the form of a low-dimensional effective free energy surface parameterized by the diffusion map
reduction coordinates. They also enable a systematic exploration of such effective free energy surfaces
through the design of additional “intelligently biased” computational experiments. We conclude by
demonstrating our method on a few examples.
Key words. Diffusion maps, dimensional reduction, stochastic dynamical systems, Fokker
Planck operator, metastable states, normalized graph Laplacian.
AMS subject classifications. 60H10, 60J60, 62M05
1. Introduction. Systems of stochastic differential equations (SDE’s) are com-
monly used as models for the time evolution of many chemical, physical and biological
systems of interacting particles [22, 45, 52]. There are two main approaches to the
study of such systems. The first is by detailed Brownian Dynamics (BD) or other
stochastic simulations, which follow the motion of each particle (or more generally
variable) in the system and generate one or more long trajectories. The second is
via analysis of the time evolution of the probability densities of these trajectories us-
ing the numerical solution of the corresponding time dependent Fokker-Planck (FP)
partial differential equation.
For typical high dimensional systems, both approaches suffer from severe limi-
tations, when applied directly. The main limitation of standard BD simulations is
the scale gap between the atomistic time scale of single particle motions, at which
the SDE’s are formulated, and the macroscopic time scales that characterize the long
term evolution and equilibration of these systems. This scale gap puts severe con-
straints on detailed simulations due to the requirement to accurately integrate the
fastest motions and degrees of freedom in the system, such as fast chemical reactions
and particle-particle collisions. Therefore, the time step in detailed simulations is typ-
ically constrained to b e orders of magnitude smaller than the characteristic times of
Department of Mathematics, Yale University, New Haven, CT, 06520.
Chemical Engineering and PACM, Princeton University, Princeton, NJ 08544.
Currently at Google, Inc.
§
Currently at department of mathematics, Duke University, Durham, NC 27708.
Corresponding author, Department of Computer Science and Applied Mathematics, Weizmann
Institute of Science, Rehovot, 76100, Israel. boaz.nadler@weizmann.ac.il.
1

the phenomena we wish to study. Moreover, for systems with well defined metastable
states and relatively rare transitions between them, direct simulations spend the ma-
jority of computer resources exploring the motion “within” the metastable states and
only an exponentially small part exploring the transitions “between them”, which are
often the quantity of interest.
The main limitation of standard computational methods that solve the FP equa-
tion is the curse of dimensionality. While for dimension n 3 the FP equation can
typically be solved numerically, in higher dimensional systems the solution of the
relevant partial differential equation is practically impossible by standard methods
such as finite differences or finite elements, since the number of grid points grows like
(1/h)
n
where h is the grid spacing. We note, however, that for some high dimensional
systems this direct computation is still possible with the construction of sparse grids
[9].
While in both approaches the detailed time evolution of a stochastic system re-
quires a high dimensional description with many degrees of freedom, often its long
term or coarse grained evolution is of a low dimensional nature. The main challenges
in this case are the identification of dynamically meaningful slow variables, or re-
duction coordinates
1
, and the description of the effective dynamics of the system in
this low dimensional representation. The main requirements for good reduction co-
ordinates is that they are dynamically meaningful, in the sense that we can write an
effective SDE for the long-term dynamics of the system based on these coordinates.
Thus, on a coarse enough time scale, the dynamics of the reduction coordinates is
approximately Markovian, without further dependence on the fine details of the high
dimensional system.
In some systems, either the form of the equations, prior experience, or some under-
lying physical intuition help determine good reduction coordinates. Then appropriate
equations can b e formulated in these variables, and in some special cases their exact
form can even be found by rigorous mathematics based on the Mori-Zwanzig pro-
jection approach [24, 55]. In more complex cases where a rigorous derivation of the
dynamics is mathematically intractable, many numerical approaches to solve these
tasks have been suggested in the literature, such as transition path sampling, the
nudged elastic band, the string method, the transfer operator approach, Perron clus-
ter analysis and many others, see [16, 17, 18, 19, 20, 26, 46], and references therein. In
addition, given further knowledge about the system, such as a good dividing surface
between reactant and product regions, several algorithms for the efficient computation
of the transition rates have been developed [44, 37, 53]. Despite these results, still in
many high dimensional systems useful low dimensional representations are far from
obvious.
In this paper we show that there is an intimate connection between the eigenfunc-
tions of the backward FP operator and useful global low dimensional representations,
and hence propose to use the first few of these eigenfunctions as reduction coordi-
nates. We show that the first few eigenfunctions are optimal under a mean square
error criterion for the approximation of probability densities in a suitable Hilbert
space, and denote the mapping from the physical space to the first few eigenfunctions
as a diffusion map. As in the case of the time-dependent FP equation, the compu-
1
We make a distinction between reduction coordinates of a general dynamical system, and the
reaction coordinate of chemical physics, which is typically a single variable that quantifies the progress
of a reaction. As we describe in the paper, reduction coordinates may also be introduced in the
absence of a chemical reaction and without well defined reactant and product regions.
2

tation of the eigenfunctions of the FP operator is practically impossible by standard
space discretization methods. In this paper we present a different approach, which
approximates these first few eigenfunctions from a large set of simulated data points.
Our algorithm is based on the definition of a weighted graph on the simulated points
and the subsequent computation of the first few eigenvalues and eigenvectors of a ran-
dom walk on this graph. The proposed method does not take into account the time
ordering of the simulated points, and can therefore easily merge data from different
simulation runs (with different initial conditions, different initial seeds of the random
number generator, etc.). As proven theoretically and shown in a few illustrative ex-
amples, in the presence of a spectral gap, the description of the system by the first
few eigenfunctions gives a dynamically meaningful low dimensional representation.
Furthermore, taking a step beyond data analysis and a low dimensional represen-
tation, we describe restriction and lifting operators between the original space and
the diffusion map space. These operators enable efficient extraction of the macro-
scopic dynamics in this lower dimensional representation. Specifically, following the
equation-free coarse molecular dynamics approach [31, 32, 33], we propose to explore
the effective free energy and diffusion coefficients as a function of the diffusion map
coordinates by a series of multiple short simulations appropriately initialized at given
values of these reduction coordinates. This methodology thus outlines a systematic
manner to bridge the scale gap and estimate macroscopic dynamics and quantities of
interest, such as mean exit times, transition probabilities, etc.
The paper is organized as follows. In section 2 we describe our problem and
present a concise review of known results in the theory of stochastic differential equa-
tions, making the paper reasonably self-contained. In section 3 we define the diffusion
distance between different configurations of a stochastic system and its relation to
the eigenfunctions of the FP operator and to a low dimensional representation of
the system. Section 4 describes an algorithm to approximate the diffusion map from
discrete data, as well as restriction and lifting operators that allow communication
between the two spaces. In section 5 we present applications of our method to a few
illustrative examples. We conclude in section 6 with a summary and discussion.
2. Problem Setup.
2.1. The Langevin Equation. Consider a stochastic system with n variables,
confined for simplicity to a finite compact connected region R
n
with smooth
reflecting boundaries. We assume that the time evolution of the system, described by
its state x(t) at time t (x(t) Ω), follows a first order stochastic differential equation
(SDE) written in non-dimensional form as
˙
x = −∇U (x) +
p
2
˙
w(2.1)
where U(x) is the potential energy of a configuration x, β = 1/k
B
T is a thermal factor,
and w(t) is standard Brownian motion in n dimensions. We assume the potential
U(x) to be smooth and in particular bounded from ab ove and below. However, much
of what follows, with suitable technical modifications, could be derived under more
general conditions, for example for a non-compact region Ω, or a potential U not
necessarily smooth or bounded, as long as the condition
Z
e
βU(x)
dx < (2.2)
is satisfied and under the assumption that the process is ergodic.
3

In this paper we focus on systems whose long time evolution is of a low dimen-
sional nature. This is the case, for example, in systems governed by rare events where
the potential U has a few deep wells separated by high barriers, or in systems with
well defined low dimensional manifolds where the potential U contains steep gradients
in all directions normal to the manifold, thus effectively constraining the system to
approximately lie on it. The task at hand is to find good low dimensional repre-
sentations of such systems and the characteristics of their coarse grained dynamics
in this representation. In the context of systems governed by rare events, typical
system level tasks include the identification of the metastable configurations and the
transition pathways and rates between them.
2.2. Forward and Backward Fokker-Planck Equations. Integration of the
SDE (2.1) produces random paths whose ensemble defines time dependent probability
distributions on Ω. To study the dynamics of the system, it is convenient to consider
the time evolution of these probability distributions. Specifically, from the theory of
stochastic processes [22, 45], the transition probability density p(x, t|x
0
, 0) of finding
the system at location x at time t, given an initial location x
0
at time t = 0 satisfies
the forward Fokker-Planck (also known as Smoluchowski) equation
p
t
= Lp =
1
β
p + · (pU)(2.3)
defined in (x, t) × R
+
, with reflecting (no flux) boundary conditions on Ω.
Under the smoothness assumption on the potential U and the compactness as-
sumption on the domain in which the Fokker-Planck equation (FPE) is defined,
the operator L has a discrete spectrum of non-positive eigenvalues {−λ
j
}
j=0
, with
λ
0
= 0 > λ
1
λ
2
. . ., with a single accumulation point at −∞ and with
associated eigenfunctions {ϕ
j
}
j=0
[13]. The solution of (2.3) can be written as
p(x, t|x
0
, 0) =
X
j=0
a
j
e
λ
j
t
ϕ
j
(x)(2.4)
where the coefficients a
j
depend on the initial conditions at time t = 0. Under fairly
general conditions on the potential U and the region Ω, the eigenfunctions ϕ
j
are
smooth bounded functions, and the sum in (2.4) converges uniformly in x for all
times t > t
0
with t
0
> 0, see for example [15]. The eigenfunction ϕ
0
(x) corresponding
to the eigenvalue λ
0
= 0 is given by the Boltzmann equilibrium distribution
ϕ
0
(x) = C
β
e
βU(x)
(2.5)
where C
β
is a temperature dependent normalization factor.
Since the stochastic process x(t) is ergodic, then regardless of the initial configu-
ration x
0
Ω,
lim
t→∞
p(x, t|x
0
, 0) = ϕ
0
(x)(2.6)
which means that a
0
= 1. Thus, according to (2.4) the approach to the equilibrium
density ϕ
0
(x) is governed by the next eigenfunctions {ϕ
j
}
j1
, and their corresponding
eigenvalues λ
j
and coefficients a
j
.
A different way to study the approach to equilibrium is to consider the time
evolution of averages of functions defined on Ω. Let f : R be a smooth function
in L
2
(Ω), and define
g(x, t) = E{f(x(t)) |x(0) = x}.(2.7)
4

Then, g satisfies the backward Fokker-Planck equation, also known as the Chapman-
Kolmogorov equation,
g
t
= L
g =
1
β
g g ·U(2.8)
in the domain (x, t) × R
+
, with initial conditions
g(x, 0) = f(x).(2.9)
The operator L
is the adjoint of L under the standard inner product in L
2
(Ω),
hu, vi =
Z
u(x)v(x)dx(2.10)
that is hLu, vi = hu, L
vi. Therefore, L
has the same eigenvalues {−λ
j
}
j0
as L
with corresponding eigenfunctions ψ
j
(x), and the solution to (2.8) can be written as
g(x, t) =
X
j
b
j
e
λ
j
t
ψ
j
(x).(2.11)
The eigenfunction corresponding to λ
0
= 0 is the constant function ψ
0
(x) = 1. Thus
lim
t→∞
g(x, t) = b
0
(2.12)
with the approach to this equilibrium constant governed by the next eigenfunctions
and eigenvalues {ψ
j
, λ
j
}, for j 1.
The operators L and L
are adjoint and thus the two sets of eigenfunctions ϕ
j
and ψ
j
can, and from now on will be normalized to be bi-orthonormal
hϕ
i
, ψ
j
i = δ
i,j
.(2.13)
Under this normalization, the coefficients a
j
, b
j
are given by
b
j
=
Z
f(x)ϕ
j
(x)dx(2.14)
and
a
j
=
Z
p(x, 0)ψ
j
(x)dx = ψ
j
(x
0
).(2.15)
One last theoretical result of interest is the connection between the eigenfunctions
ϕ
j
and ψ
j
. The transformation p(x) = e
U(x)
g(x) gives
Lp = e
U
L
g.(2.16)
Therefore, up to a normalization constant
ψ
j
(x) = ϕ
j
(x)e
U(x)
= ϕ
j
(x)
0
(x).(2.17)
Furthermore, under the normalization (2.13), the eigenfunctions ϕ
j
of the operator L
are orthonormal in L
2
(Ω, w(x)), where the inner product is with respect to the weight
function w(x) = 1
0
(x),
hu, vi
w
=
Z
u(x)v(x)w(x)dx.(2.18)
5

Citations
More filters
Journal ArticleDOI

Stochastic Processes in Physics and Chemistry

D Sherrington
- 01 Apr 1983 - 
TL;DR: Van Kampen as mentioned in this paper provides an extensive graduate-level introduction which is clear, cautious, interesting and readable, and could be expected to become an essential part of the library of every physical scientist concerned with problems involving fluctuations and stochastic processes.
Journal ArticleDOI

Enhancing Important Fluctuations: Rare Events and Metadynamics from a Conceptual Viewpoint

TL;DR: As demonstrated, metadynamics is not just a practical tool but can also be considered an important development in the theory of statistical mechanics.
Journal ArticleDOI

destiny: diffusion maps for large-scale single-cell data in R.

TL;DR: This work exemplarily applies destiny, an efficient R implementation of the diffusion map algorithm, to a recent time-resolved mass cytometry dataset of cellular reprogramming and presents an efficient nearest-neighbour approximation.
Journal ArticleDOI

Data-Driven Sparse Sensor Placement for Reconstruction: Demonstrating the Benefits of Exploiting Known Patterns

TL;DR: This article explores how to design optimal sensor locations for signal reconstruction in a framework that scales to arbitrarily large problems, leveraging modern techniques in machine learning and sparse sampling.
Journal ArticleDOI

Determination of reaction coordinates via locally scaled diffusion map

TL;DR: The technique is general enough to be applied to any system for which a Boltzmann-sampled set of molecular configurations is available, and the resulting global coordinates are correlated with the time scales of the molecular motion.
References
More filters
Journal ArticleDOI

Normalized cuts and image segmentation

TL;DR: This work treats image segmentation as a graph partitioning problem and proposes a novel global criterion, the normalized cut, for segmenting the graph, which measures both the total dissimilarity between the different groups as well as the total similarity within the groups.
Proceedings ArticleDOI

Normalized cuts and image segmentation

TL;DR: This work treats image segmentation as a graph partitioning problem and proposes a novel global criterion, the normalized cut, for segmenting the graph, which measures both the total dissimilarity between the different groups as well as the total similarity within the groups.
Book

Stochastic processes in physics and chemistry

TL;DR: In this article, the authors introduce the Fokker-planck equation, the Langevin approach, and the diffusion type of the master equation, as well as the statistics of jump events.
Book

Methods of Mathematical Physics

TL;DR: In this paper, the authors present an algebraic extension of LINEAR TRANSFORMATIONS and QUADRATIC FORMS, and apply it to EIGEN-VARIATIONS.
Related Papers (5)
Frequently Asked Questions (8)
Q1. What are the contributions mentioned in the paper "Diffusion maps, reduction coordinates and low dimensional representation of stochastic systems" ?

In this paper the authors use the first few eigenfunctions of the backward Fokker-Planck diffusion operator as a coarse grained low dimensional representation for the long term evolution of a stochastic system, and show that they are optimal under a certain mean squared error criterion. While in high dimensional systems these eigenfunctions are difficult to compute numerically by conventional methods such as finite differences or finite elements, the authors describe a simple computational data-driven method to approximate them from a large set of simulated data. Furthermore, the authors describe lifting and restriction operators between the diffusion map space and the original space. 

Beyond the benefits of dimensional reduction, the projection of the system onto the diffusion map coordinates also allows systematic design of computational experiments, where biased simulations are initialized at chosen values of the diffusion map coordinates, thus allowing efficient exploration of the dynamics of the system in these coordinates. 

since ψ1 is a slow coordinate, the equilibration time of the modified system is still of the same order of magnitude as the fast relaxation time τR. 

Their computational approach is closely related to the transfer operator approach [46], which also computes an approximation to the eigenfunctions of the FP operator [27], and to Perron cluster analysis [18, 19]. 

By computing the diffusion map, this simply amounts to searching for a rotation angle θ which makes the variable w cos(θ) + z sin(θ) as closest to oneto-one with ψ1 as possible. 

In this case, starting from xL a direct simulation would require an extremely long time to exit this well and find the other metastable states. 

The authors also note that [27] in fact considered the more general case of non-reversible diffusions and proved that the backward eigenfunctions can be used to partition the space into metastable states in this case as well. 

The main differences in their work is that the proposed dimensionality reduction is intrinsically related to the dynamics, and has provably good properties in approximating long-term behavior of the system.