What is the main reason for the use of diffusion maps?

Beyond the benefits of dimensional reduction, the projection of the system onto the diffusion map coordinates also allows systematic design of computational experiments, where biased simulations are initialized at chosen values of the diffusion map coordinates, thus allowing efficient exploration of the dynamics of the system in these coordinates.

What is the equilibration time of the modified system?

Since ψ1 is a slow coordinate, the equilibration time of the modified system is still of the same order of magnitude as the fast relaxation time τR.

What is the way to compute the eigenfunctions of the FP operator?

Their computational approach is closely related to the transfer operator approach [46], which also computes an approximation to the eigenfunctions of the FP operator [27], and to Perron cluster analysis [18, 19].

How do the authors compute the diffusion map?

By computing the diffusion map, this simply amounts to searching for a rotation angle θ which makes the variable w cos(θ) + z sin(θ) as close to one-to-one with ψ1 as possible.

How long would it take to find the other metastable states?

In this case, starting from xL a direct simulation would require an extremely long time to exit this well and find the other metastable states.

What is the way to solve the problem of non-reversible diffusions?

The authors also note that [27] in fact considered the more general case of non-reversible diffusions and proved that the backward eigenfunctions can be used to partition the space into metastable states in this case as well.

What is the main difference between the two approaches?

The main difference in their work is that the proposed dimensionality reduction is intrinsically related to the dynamics, and has provably good properties in approximating the long-term behavior of the system.

(Open Access) Diffusion maps, reduction coordinates, and low dimensional representation of stochastic systems (2008) | Ronald R. Coifman

Q: What are the contributions mentioned in the paper "Diffusion maps, reduction coordinates and low dimensional representation of stochastic systems"?

In this paper the authors use the first few eigenfunctions of the backward Fokker-Planck diffusion operator as a coarse grained low dimensional representation for the long term evolution of a stochastic system, and show that they are optimal under a certain mean squared error criterion. While in high dimensional systems these eigenfunctions are difficult to compute numerically by conventional methods such as finite differences or finite elements, the authors describe a simple computational data-driven method to approximate them from a large set of simulated data. Furthermore, the authors describe lifting and restriction operators between the diffusion map space and the original space.

DIFFUSION MAPS, REDUCTION COORDINATES AND LOW DIMENSIONAL REPRESENTATION OF STOCHASTIC SYSTEMS

R.R. COIFMAN∗, I.G. KEVREKIDIS†, S. LAFON∗‡, M. MAGGIONI∗§, AND B. NADLER

Abstract. The concise representation of complex high dimensional stochastic systems via a few reduced coordinates is an important problem in computational physics, chemistry and biology. In this paper we use the first few eigenfunctions of the backward Fokker-Planck diffusion operator as a coarse grained low dimensional representation for the long term evolution of a stochastic system, and show that they are optimal under a certain mean squared error criterion. We denote the mapping from physical space to these eigenfunctions as the diffusion map. While in high dimensional systems these eigenfunctions are difficult to compute numerically by conventional methods such as finite differences or finite elements, we describe a simple computational data-driven method to approximate them from a large set of simulated data. Our method is based on defining an appropriately weighted graph on the set of simulated data, and computing the first few eigenvectors and eigenvalues of the corresponding random walk matrix on this graph. Thus, our algorithm incorporates the local geometry and density at each point into a global picture that merges in a natural way data from different simulation runs. Furthermore, we describe lifting and restriction operators between the diffusion map space and the original space. These operators facilitate the description of the coarse-grained dynamics, possibly in the form of a low-dimensional effective free energy surface parameterized by the diffusion map reduction coordinates. They also enable a systematic exploration of such effective free energy surfaces through the design of additional "intelligently biased" computational experiments. We conclude by demonstrating our method on a few examples.

Key words. Diffusion maps, dimensional reduction, stochastic dynamical systems, Fokker-Planck operator, metastable states, normalized graph Laplacian.

AMS subject classifications. 60H10, 60J60, 62M05

1. Introduction. Systems of stochastic differential equations (SDE's) are commonly used as models for the time evolution of many chemical, physical and biological systems of interacting particles [22, 45, 52]. There are two main approaches to the study of such systems. The first is by detailed Brownian Dynamics (BD) or other stochastic simulations, which follow the motion of each particle (or more generally variable) in the system and generate one or more long trajectories. The second is via analysis of the time evolution of the probability densities of these trajectories using the numerical solution of the corresponding time dependent Fokker-Planck (FP) partial differential equation.

For typical high dimensional systems, both approaches suffer from severe limitations when applied directly. The main limitation of standard BD simulations is the scale gap between the atomistic time scale of single particle motions, at which the SDE's are formulated, and the macroscopic time scales that characterize the long term evolution and equilibration of these systems. This scale gap puts severe constraints on detailed simulations due to the requirement to accurately integrate the fastest motions and degrees of freedom in the system, such as fast chemical reactions and particle-particle collisions. Therefore, the time step in detailed simulations is typically constrained to be orders of magnitude smaller than the characteristic times of the phenomena we wish to study. Moreover, for systems with well defined metastable states and relatively rare transitions between them, direct simulations spend the majority of computer resources exploring the motion "within" the metastable states and only an exponentially small part exploring the transitions "between them", which are often the quantity of interest.

∗ Department of Mathematics, Yale University, New Haven, CT, 06520.

† Chemical Engineering and PACM, Princeton University, Princeton, NJ 08544.

‡ Currently at Google, Inc.

§ Currently at Department of Mathematics, Duke University, Durham, NC 27708.

Corresponding author, Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot, 76100, Israel. boaz.nadler@weizmann.ac.il.

The main limitation of standard computational methods that solve the FP equation is the curse of dimensionality. While for dimension $n \le 3$ the FP equation can typically be solved numerically, in higher dimensional systems the solution of the relevant partial differential equation is practically impossible by standard methods such as finite differences or finite elements, since the number of grid points grows like $(1/h)^n$, where $h$ is the grid spacing. We note, however, that for some high dimensional systems this direct computation is still possible with the construction of sparse grids [9].
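To make the growth rate concrete, the following small sketch counts the points of a uniform grid on the unit hypercube. The function name `grid_points` and the choice of a unit hypercube are illustrative assumptions, not part of the paper.

```python
def grid_points(h: float, n: int) -> int:
    """Number of points of a uniform grid with spacing h on the unit
    hypercube in n dimensions, i.e. (1/h)^n (illustrative helper)."""
    per_axis = round(1.0 / h)  # points per coordinate axis
    return per_axis ** n


# With h = 0.01 the count grows by a factor of 100 per added dimension.
for n in (1, 3, 6, 10):
    print(n, grid_points(0.01, n))
```

Even a modest resolution per axis becomes unmanageable well before the dimensions typical of many-particle systems.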

While in both approaches the detailed time evolution of a stochastic system requires a high dimensional description with many degrees of freedom, often its long term or coarse grained evolution is of a low dimensional nature. The main challenges in this case are the identification of dynamically meaningful slow variables, or reduction coordinates, and the description of the effective dynamics of the system in this low dimensional representation. The main requirement for good reduction coordinates is that they are dynamically meaningful, in the sense that we can write an effective SDE for the long-term dynamics of the system based on these coordinates. Thus, on a coarse enough time scale, the dynamics of the reduction coordinates is approximately Markovian, without further dependence on the fine details of the high dimensional system.

In some systems, either the form of the equations, prior experience, or some underlying physical intuition help determine good reduction coordinates. Then appropriate equations can be formulated in these variables, and in some special cases their exact form can even be found by rigorous mathematics based on the Mori-Zwanzig projection approach [24, 55]. In more complex cases where a rigorous derivation of the dynamics is mathematically intractable, many numerical approaches to solve these tasks have been suggested in the literature, such as transition path sampling, the nudged elastic band, the string method, the transfer operator approach, Perron cluster analysis and many others, see [16, 17, 18, 19, 20, 26, 46], and references therein. In addition, given further knowledge about the system, such as a good dividing surface between reactant and product regions, several algorithms for the efficient computation of the transition rates have been developed [44, 37, 53]. Despite these results, still in many high dimensional systems useful low dimensional representations are far from obvious.

We make a distinction between reduction coordinates of a general dynamical system, and the reaction coordinate of chemical physics, which is typically a single variable that quantifies the progress of a reaction. As we describe in the paper, reduction coordinates may also be introduced in the absence of a chemical reaction and without well defined reactant and product regions.

In this paper we show that there is an intimate connection between the eigenfunctions of the backward FP operator and useful global low dimensional representations, and hence propose to use the first few of these eigenfunctions as reduction coordinates. We show that the first few eigenfunctions are optimal under a mean square error criterion for the approximation of probability densities in a suitable Hilbert space, and denote the mapping from the physical space to the first few eigenfunctions as a diffusion map. As in the case of the time-dependent FP equation, the computation of the eigenfunctions of the FP operator is practically impossible by standard space discretization methods. In this paper we present a different approach, which approximates these first few eigenfunctions from a large set of simulated data points. Our algorithm is based on the definition of a weighted graph on the simulated points and the subsequent computation of the first few eigenvalues and eigenvectors of a random walk on this graph. The proposed method does not take into account the time ordering of the simulated points, and can therefore easily merge data from different simulation runs (with different initial conditions, different initial seeds of the random number generator, etc.). As proven theoretically and shown in a few illustrative examples, in the presence of a spectral gap, the description of the system by the first few eigenfunctions gives a dynamically meaningful low dimensional representation.
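The graph construction described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the Gaussian kernel, the bandwidth `eps`, the plain random-walk normalization, and the names `diffusion_map`, `cluster_a`, `cluster_b` are choices made here; the paper's own weighting is specified in its section 4, which is not part of this excerpt.

```python
import numpy as np


def diffusion_map(points, eps, n_coords=2):
    """Leading eigenpairs of a random walk on a Gaussian-kernel graph.

    Returns the n_coords + 1 largest eigenvalues (the first is the trivial
    eigenvalue 1) and the n_coords non-trivial right eigenvectors.
    Kernel and normalization are assumptions made for this sketch.
    """
    points = np.asarray(points, dtype=float)
    diffs = points[:, None, :] - points[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=-1)
    kernel = np.exp(-sq_dists / eps)        # symmetric edge weights
    degree = kernel.sum(axis=1)
    # The random walk matrix M = D^{-1} K is similar to the symmetric
    # matrix S = D^{-1/2} K D^{-1/2}; diagonalize S for stability.
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    sym = kernel * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(sym)
    order = np.argsort(eigvals)[::-1]
    eigvals = eigvals[order]
    right = eigvecs[:, order] * d_inv_sqrt[:, None]  # eigenvectors of M
    return eigvals[: n_coords + 1], right[:, 1 : n_coords + 1]


# Two weakly connected clouds of points; time ordering plays no role.
rng = np.random.default_rng(0)
cluster_a = rng.normal(loc=-1.0, scale=0.3, size=(100, 2))
cluster_b = rng.normal(loc=1.0, scale=0.3, size=(100, 2))
data = np.vstack([cluster_a, cluster_b])
eigvals, coords = diffusion_map(data, eps=1.0, n_coords=1)
```

In this toy data set the first non-trivial eigenvector is expected to take roughly constant values of opposite sign on the two clouds, which is the sense in which it labels the two metastable regions.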

Furthermore, taking a step beyond data analysis and a low dimensional representation, we describe restriction and lifting operators between the original space and the diffusion map space. These operators enable efficient extraction of the macroscopic dynamics in this lower dimensional representation. Specifically, following the equation-free coarse molecular dynamics approach [31, 32, 33], we propose to explore the effective free energy and diffusion coefficients as a function of the diffusion map coordinates by a series of multiple short simulations appropriately initialized at given values of these reduction coordinates. This methodology thus outlines a systematic manner to bridge the scale gap and estimate macroscopic dynamics and quantities of interest, such as mean exit times, transition probabilities, etc.

The paper is organized as follows. In section 2 we describe our problem and present a concise review of known results in the theory of stochastic differential equations, making the paper reasonably self-contained. In section 3 we define the diffusion distance between different configurations of a stochastic system and its relation to the eigenfunctions of the FP operator and to a low dimensional representation of the system. Section 4 describes an algorithm to approximate the diffusion map from discrete data, as well as restriction and lifting operators that allow communication between the two spaces. In section 5 we present applications of our method to a few illustrative examples. We conclude in section 6 with a summary and discussion.

2. Problem Setup.

2.1. The Langevin Equation. Consider a stochastic system with $n$ variables, confined for simplicity to a finite compact connected region $\Omega \subseteq \mathbb{R}^n$ with smooth reflecting boundaries. We assume that the time evolution of the system, described by its state $x(t)$ at time $t$ ($x(t) \in \Omega$), follows a first order stochastic differential equation (SDE) written in non-dimensional form as

$$\dot{x} = -\nabla U(x) + \sqrt{2/\beta}\,\dot{w} \qquad (2.1)$$

where $U(x)$ is the potential energy of a configuration $x$, $\beta = 1/k_B T$ is a thermal factor, and $w(t)$ is standard Brownian motion in $n$ dimensions. We assume the potential $U(x)$ to be smooth and in particular bounded from above and below. However, much of what follows, with suitable technical modifications, could be derived under more general conditions, for example for a non-compact region $\Omega$, or a potential $U$ not necessarily smooth or bounded, as long as the condition

$$\int_\Omega e^{-\beta U(x)}\,dx < \infty \qquad (2.2)$$

is satisfied and under the assumption that the process is ergodic.
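As an illustration of how trajectories of (2.1) are typically generated, here is a minimal Euler-Maruyama sketch. The double-well potential $U(x) = (x^2 - 1)^2$, the step size, and the names `simulate_langevin` and `grad_double_well` are assumptions made for this example; the reflecting boundary is omitted because the chosen potential already confines the motion.

```python
import numpy as np


def simulate_langevin(grad_u, x0, beta, dt, n_steps, rng):
    """Euler-Maruyama discretization of dx = -grad U(x) dt + sqrt(2/beta) dw.

    Sketch only: no reflecting boundary is imposed (assumption).
    """
    x = np.array(x0, dtype=float)
    traj = np.empty((n_steps + 1,) + x.shape)
    traj[0] = x
    noise_scale = np.sqrt(2.0 * dt / beta)  # std of one Brownian increment
    for k in range(n_steps):
        x = x - grad_u(x) * dt + noise_scale * rng.standard_normal(x.shape)
        traj[k + 1] = x
    return traj


def grad_double_well(x):
    """Gradient of the illustrative potential U(x) = (x^2 - 1)^2."""
    return 4.0 * x * (x ** 2 - 1.0)


rng = np.random.default_rng(1)
traj = simulate_langevin(grad_double_well, x0=[-1.0], beta=4.0,
                         dt=1e-3, n_steps=5000, rng=rng)
```

The step size has to resolve the stiffest part of the potential, which is the source of the time scale gap discussed in the introduction.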

In this paper we focus on systems whose long time evolution is of a low dimensional nature. This is the case, for example, in systems governed by rare events where the potential $U$ has a few deep wells separated by high barriers, or in systems with well defined low dimensional manifolds where the potential $U$ contains steep gradients in all directions normal to the manifold, thus effectively constraining the system to approximately lie on it. The task at hand is to find good low dimensional representations of such systems and the characteristics of their coarse grained dynamics in this representation. In the context of systems governed by rare events, typical system level tasks include the identification of the metastable configurations and the transition pathways and rates between them.

2.2. Forward and Backward Fokker-Planck Equations. Integration of the SDE (2.1) produces random paths whose ensemble defines time dependent probability distributions on $\Omega$. To study the dynamics of the system, it is convenient to consider the time evolution of these probability distributions. Specifically, from the theory of stochastic processes [22, 45], the transition probability density $p(x, t|x_0, 0)$ of finding the system at location $x$ at time $t$, given an initial location $x_0$ at time $t = 0$, satisfies the forward Fokker-Planck (also known as Smoluchowski) equation

$$\frac{\partial p}{\partial t} = Lp = \frac{1}{\beta}\Delta p + \nabla \cdot (p\nabla U) \qquad (2.3)$$

defined in $(x, t) \in \Omega \times \mathbb{R}^+$, with reflecting (no flux) boundary conditions on $\partial\Omega$.

Under the smoothness assumption on the potential $U$ and the compactness assumption on the domain $\Omega$ in which the Fokker-Planck equation (FPE) is defined, the operator $L$ has a discrete spectrum of non-positive eigenvalues $\{-\lambda_j\}_{j=0}^{\infty}$, with $\lambda_0 = 0 > -\lambda_1 \ge -\lambda_2 \ge \ldots$, with a single accumulation point at $-\infty$ and with associated eigenfunctions $\{\varphi_j\}_{j=0}^{\infty}$ [13]. The solution of (2.3) can be written as

$$p(x, t|x_0, 0) = \sum_{j=0}^{\infty} a_j e^{-\lambda_j t}\varphi_j(x) \qquad (2.4)$$

where the coefficients $a_j$ depend on the initial conditions at time $t = 0$. Under fairly general conditions on the potential $U$ and the region $\Omega$, the eigenfunctions $\varphi_j$ are smooth bounded functions, and the sum in (2.4) converges uniformly in $x$ for all times $t > t_0$ with $t_0 > 0$, see for example [15]. The eigenfunction $\varphi_0(x)$ corresponding to the eigenvalue $\lambda_0 = 0$ is given by the Boltzmann equilibrium distribution

$$\varphi_0(x) = C_\beta e^{-\beta U(x)} \qquad (2.5)$$

where $C_\beta$ is a temperature dependent normalization factor.
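For a one dimensional potential the normalization factor in (2.5) can be evaluated by quadrature. The sketch below assumes a double-well potential, a truncated interval, and a trapezoid rule; the names `boltzmann_density` and `double_well` are hypothetical.

```python
import numpy as np


def boltzmann_density(potential, grid, beta):
    """Equilibrium density proportional to exp(-beta U) on a uniform 1D
    grid, normalized with the trapezoid rule (illustrative sketch)."""
    weights = np.exp(-beta * potential(grid))
    h = grid[1] - grid[0]
    integral = h * np.sum(0.5 * (weights[1:] + weights[:-1]))
    return weights / integral   # the prefactor 1/integral plays the role of C


def double_well(x):
    """Illustrative potential with wells at x = -1 and x = +1."""
    return (x ** 2 - 1.0) ** 2


grid = np.linspace(-2.0, 2.0, 401)
phi0 = boltzmann_density(double_well, grid, beta=4.0)
```

Raising `beta` (lowering the temperature) concentrates the density in the two wells and suppresses it at the barrier, which is why transitions become rare.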

Since the stochastic process $x(t)$ is ergodic, then regardless of the initial configuration $x_0 \in \Omega$,

$$\lim_{t\to\infty} p(x, t|x_0, 0) = \varphi_0(x) \qquad (2.6)$$

which means that $a_0 = 1$. Thus, according to (2.4) the approach to the equilibrium density $\varphi_0(x)$ is governed by the next eigenfunctions $\{\varphi_j\}_{j\ge 1}$, and their corresponding eigenvalues $\lambda_j$ and coefficients $a_j$.

A different way to study the approach to equilibrium is to consider the time evolution of averages of functions defined on $\Omega$. Let $f : \Omega \to \mathbb{R}$ be a smooth function in $L^2(\Omega)$, and define

$$g(x, t) = E\{f(x(t)) \mid x(0) = x\}. \qquad (2.7)$$

Then, $g$ satisfies the backward Fokker-Planck equation, also known as the Chapman-Kolmogorov equation,

$$\frac{\partial g}{\partial t} = L^* g = \frac{1}{\beta}\Delta g - \nabla g \cdot \nabla U \qquad (2.8)$$

in the domain $(x, t) \in \Omega \times \mathbb{R}^+$, with initial conditions

$$g(x, 0) = f(x). \qquad (2.9)$$
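The conditional average in (2.7) can be estimated directly by averaging over simulated paths. The following sketch does this for a scalar state with an Euler-Maruyama integrator; the quadratic potential $U(x) = x^2/2$ is chosen here only because $g(x, t) = x e^{-t}$ is then known in closed form for $f(x) = x$, and the name `estimate_g` and all parameter values are assumptions.

```python
import numpy as np


def estimate_g(f, grad_u, x, t, beta, dt=1e-2, n_paths=20000, seed=0):
    """Monte Carlo estimate of g(x, t) = E{ f(x(t)) | x(0) = x } for a
    scalar state, using Euler-Maruyama paths (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    n_steps = int(round(t / dt))
    paths = np.full(n_paths, float(x))      # all paths start at x
    noise_scale = np.sqrt(2.0 * dt / beta)
    for _ in range(n_steps):
        noise = rng.standard_normal(n_paths)
        paths = paths - grad_u(paths) * dt + noise_scale * noise
    return f(paths).mean()


# Quadratic potential U(x) = x^2 / 2, observable f(x) = x.
g_est = estimate_g(f=lambda y: y, grad_u=lambda y: y,
                   x=1.0, t=1.0, beta=2.0)
g_exact = np.exp(-1.0)
```

The estimate carries both a sampling error that shrinks with the number of paths and a discretization error that shrinks with the time step.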

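The operator $L^*$ is the adjoint of $L$ under the standard inner product in $L^2(\Omega)$,

$$\langle u, v\rangle = \int_\Omega u(x)v(x)\,dx \qquad (2.10)$$

that is $\langle Lu, v\rangle = \langle u, L^* v\rangle$. Therefore, $L^*$ has the same eigenvalues $\{-\lambda_j\}_{j\ge 0}$ as $L$ with corresponding eigenfunctions $\psi_j(x)$, and the solution to (2.8) can be written as

$$g(x, t) = \sum_{j=0}^{\infty} b_j e^{-\lambda_j t}\psi_j(x). \qquad (2.11)$$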

The eigenfunction corresponding to $\lambda_0 = 0$ is the constant function $\psi_0(x) = 1$. Thus

$$\lim_{t\to\infty} g(x, t) = b_0 \qquad (2.12)$$

with the approach to this equilibrium constant governed by the next eigenfunctions and eigenvalues $\{\psi_j, \lambda_j\}$, for $j \ge 1$.

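The operators $L$ and $L^*$ are adjoint and thus the two sets of eigenfunctions $\varphi_j$ and $\psi_j$ can, and from now on will, be normalized to be bi-orthonormal

$$\langle \varphi_i, \psi_j\rangle = \delta_{i,j}. \qquad (2.13)$$

Under this normalization, the coefficients $a_j$, $b_j$ are given by

$$b_j = \int_\Omega f(x)\varphi_j(x)\,dx \qquad (2.14)$$

and

$$a_j = \int_\Omega p(x, 0)\psi_j(x)\,dx = \psi_j(x_0). \qquad (2.15)$$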

One last theoretical result of interest is the connection between the eigenfunctions $\varphi_j$ and $\psi_j$. The transformation $p(x) = e^{-\beta U(x)} g(x)$ gives

$$Lp = e^{-\beta U} L^* g. \qquad (2.16)$$

Therefore, up to a normalization constant

$$\psi_j(x) = \varphi_j(x)e^{\beta U(x)} = \varphi_j(x)/\varphi_0(x). \qquad (2.17)$$
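For readers who want to check (2.16), the steps can be written out as follows, keeping the factor $\beta$ explicit as in (2.3), (2.5) and (2.8). This is a routine verification added for convenience, not a reproduction of the authors' own derivation.

```latex
\begin{aligned}
\nabla p &= e^{-\beta U}\left(\nabla g - \beta g \nabla U\right), \\
Lp &= \frac{1}{\beta}\nabla\cdot\nabla p + \nabla\cdot\left(p\nabla U\right) \\
   &= \frac{1}{\beta}\nabla\cdot\left(e^{-\beta U}\nabla g\right)
      - \nabla\cdot\left(e^{-\beta U} g\nabla U\right)
      + \nabla\cdot\left(e^{-\beta U} g\nabla U\right) \\
   &= \frac{1}{\beta}e^{-\beta U}\left(\Delta g - \beta\nabla U\cdot\nabla g\right)
    = e^{-\beta U}\left(\frac{1}{\beta}\Delta g - \nabla g\cdot\nabla U\right)
    = e^{-\beta U}L^{*}g .
\end{aligned}
```

Setting $g = \psi_j$ with $L^*\psi_j = -\lambda_j\psi_j$ gives $L(e^{-\beta U}\psi_j) = -\lambda_j e^{-\beta U}\psi_j$, so $e^{-\beta U}\psi_j$ is proportional to $\varphi_j$, which is the content of (2.17).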

Furthermore, under the normalization (2.13), the eigenfunctions $\varphi_j$ of the operator $L$ are orthonormal in $L^2(\Omega, w(x))$, where the inner product is with respect to the weight function $w(x) = 1/\varphi_0(x)$,

$$\langle u, v\rangle_w = \int_\Omega u(x)v(x)w(x)\,dx. \qquad (2.18)$$
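In one dimension the relations of this section can be checked numerically. The sketch below discretizes the backward operator as the generator of a nearest-neighbour jump process that satisfies detailed balance with respect to $e^{-\beta U}$, so its transpose plays the role of the forward operator. The discretization scheme, the double-well potential, the grid, and the names `build_generator` and `eigenpairs` are assumptions of this example and are not taken from the paper.

```python
import numpy as np


def build_generator(u, h, beta):
    """Matrix approximating the backward operator L* on a uniform 1D grid.

    Jump rates obey detailed balance w.r.t. exp(-beta U); no jumps leave
    the grid, which mimics reflecting boundaries (assumed scheme).
    """
    n = u.size
    q = np.zeros((n, n))
    i = np.arange(n - 1)
    rate = 1.0 / (beta * h ** 2)
    q[i, i + 1] = rate * np.exp(-0.5 * beta * (u[1:] - u[:-1]))
    q[i + 1, i] = rate * np.exp(-0.5 * beta * (u[:-1] - u[1:]))
    q[np.arange(n), np.arange(n)] = -q.sum(axis=1)  # rows sum to zero
    return q


def eigenpairs(q, u, h, beta, n_modes=3):
    """Leading eigenvalues lam_j and discrete eigenfunctions phi_j
    (of q.T) and psi_j (of q), scaled so that h * phi.T @ psi = I."""
    pi = np.exp(-beta * u)
    pi /= pi.sum() * h                    # discrete Boltzmann density
    s = np.sqrt(pi)
    sym = q * s[:, None] / s[None, :]     # symmetric by detailed balance
    sym = 0.5 * (sym + sym.T)             # remove round-off asymmetry
    vals, vecs = np.linalg.eigh(sym)
    order = np.argsort(vals)[::-1][:n_modes]
    v = vecs[:, order] / np.sqrt(h)
    psi = v / s[:, None]
    phi = v * s[:, None]
    sign = np.where(psi[0] < 0, -1.0, 1.0)  # fix signs so psi_0 = +1
    return -vals[order], phi * sign, psi * sign, pi


beta = 4.0
grid = np.linspace(-1.6, 1.6, 161)
h = grid[1] - grid[0]
u = (grid ** 2 - 1.0) ** 2                # illustrative double well
q = build_generator(u, h, beta)
lam, phi, psi, pi = eigenpairs(q, u, h, beta)
```

In this discrete setting `phi[:, 0]` reproduces the Boltzmann density, `psi[:, 0]` is constant, and `psi` equals `phi` divided by `phi[:, 0]` column by column, mirroring (2.5), (2.13) and (2.17). Such a grid-based computation is only feasible in very low dimension, which is the limitation the data-driven approach of the paper is meant to avoid.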
