
HAL Id: hal-02475962
https://hal.archives-ouvertes.fr/hal-02475962v2
Submitted on 10 Jun 2020
To cite this version:
Pierre Dubois, Thomas Gomez, Laurent Planckaert, Laurent Perret. Data-driven predictions of the Lorenz system. Physica D: Nonlinear Phenomena, Elsevier, 2020, 408, pp.132495. doi:10.1016/j.physd.2020.132495. hal-02475962v2

Data-driven predictions of the Lorenz system

Pierre Dubois (a, *), Thomas Gomez (a), Laurent Planckaert (a), Laurent Perret (b)

(a) Univ. Lille, CNRS, ONERA, Arts et Metiers Institute of Technology, Centrale Lille, UMR 9014 - LMFL - Laboratoire de Mécanique des Fluides de Lille - Kampé de Fériet, F-59000 Lille, France
(b) Centrale Nantes, LHEEA UMR CNRS 6598, Nantes, France

(*) Corresponding author: pierre.dubois@onera.fr
Abstract
This paper investigates the use of a data-driven method to model the dynamics of the chaotic Lorenz
system. An architecture based on a recurrent neural network with long and short term dependencies
predicts multiple time steps ahead the position and velocity of a particle using a sequence of past
states as input. To account for modeling errors and make a continuous forecast, a dense artificial
neural network assimilates online data to detect and update wrong predictions such as non-relevant
switchings between lobes. The data-driven strategy leads to good prediction scores and does not require
statistics of errors to be known, thus providing significant benefits compared to a simple Kalman filter
update.
Keywords: data-driven modeling, data assimilation, chaotic system, neural networks
1. Introduction
Chaotic dynamical systems exhibit characteristics (nonlinearities, boundedness, initial condition sensitivity) [1] encountered in real-world problems such as meteorology [2] and oceanography [3]. The multiple time steps ahead prediction of such a system is challenging because governing equations may be unknown or too costly to evaluate. For instance, the Navier-Stokes equations require prohibitive computational resources to predict with great accuracy the velocity field of a turbulent flow [4].
Data-driven modeling of dynamical systems is an active research field whose objective is to infer dynamics from data [5]. Regressive methods in machine learning [6] are particularly suitable for such tasks and have proven to reliably reconstruct the state of a given system [7]. If parameters are not overfitted to training examples, the data-driven model can also be used for predictive tasks, provided the input lies in the input domain used for training. Main techniques in the literature include autoregressive techniques [8], dynamic mode decomposition (DMD) [9], Hankel alternative view of Koopman (HAVOK) [10] or unsupervised methods such as CROM [11]. Neural networks are also of increasing interest since they can perform nonlinear regressions that are fast to evaluate. Architectures with recurrent units are recommended for time-series predictions because memory is incorporated in the prediction process. Neural networks can then learn chaotic dynamics [12] and predict the future state with great accuracy [13].
However, errors in modeling can lead to poor multiple time steps ahead predictions of chaotic dynamical systems: a tiny change in the initial condition results in a big change in the output [12]. To overcome the propagation of uncertainties from the dynamical model (bad regression choice in a data-driven approach or bad turbulence modeling in CFD, for instance), data assimilation (DA) techniques have been developed [14].

They combine the predicted state of a system with online measurements to get an updated state. Such methods have successfully been applied in fluid mechanics to obtain a better description of initial or boundary conditions by finding the best compromise between experimental measurements and CFD predictions [15]. Nevertheless, the dynamical model can be slow to evaluate (limiting its use to offline assimilations) and errors (initial condition, dynamical model, measurements and uncertainties) can be hard to estimate in real-world applications.
In this paper, a data-driven approach is used to discover a dynamical model for the Lorenz system. To handle the chaotic nature of the system, a recurrent neural network (RNN) dealing with long and short term dependencies (LSTM) is considered [16]. To correct modeling errors, a dense neural network (denoted hereafter DAN) whose design is based on Kalman filtering techniques is developed. Results are promising for predicting multiple steps ahead the position and velocity of a particle on the Lorenz attractor, using only the initial sequence and real-time measurements of the complete acceleration, the complete velocity or a single component of the velocity.
The paper is organized as follows. In Section 2, the overall strategy is presented, along with a quick overview of how neural networks work. In Section 3, results on the low dimensional Lorenz system are shown, with a particular interest in the impact of forecast horizon and noise. A discussion is given in Section 4, followed by concluding remarks.
2. Strategy
2.1. Proposed methodology
This paper investigates the use of neural networks to continuously predict a chaotic system using a data-driven dynamical model and online measurements. The method is summarized in Figure 1 and contains the following steps:
- Consider m temporal states of the system. The sequence is denoted [s]_{t-m+1}^{t}, where s is the state of the system, of dimension n_f.
- Predict n future states using a RNN with long and short-term memory (LSTM). This gives a predicted sequence [s^b]_{t+1}^{t+n}, where superscript b indicates a prediction.
- Predict the measured sequence. This gives [y^b]_{t+1}^{t+n}, where y^b is the predicted measure of the state. The mapping between the state space and the measurement space is performed by a dense neural network called the shallow encoder (SE).
- Assimilate the exact sequence of measurements [y]_{t+1}^{t+n} and update the predicted sequence of states. This work is performed by a dense neural network which gives an updated sequence [s^a]_{t+1}^{t+n}, where superscript a stands for "analyzed". The network is called the data assimilation network (DAN).
- Construct [s^a]_{t+n-m+1}^{t+n} by adding m - n updated states from the previous iteration. This gives a new input that can be used to cycle and continue the forecasting process (a minimal sketch of this cycle is given below).
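The following sketch illustrates how this cycle could be organized in code; the function names and the exact inputs expected by the DAN are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def forecast_cycle(s_input, get_measurements, rnn_lstm, se, dan, n_cycles):
    """Illustrative forecast/assimilation loop (hypothetical interfaces).

    s_input          : array (m, n_f) holding the last m known or updated states
    get_measurements : callable returning the real measurement sequence [y]_{t+1}^{t+n}
    rnn_lstm, se, dan: trained networks (dynamical model, shallow encoder, DAN)
    """
    trajectory = []
    for _ in range(n_cycles):
        # 1) predict n future states [s^b]_{t+1}^{t+n} from the last m states
        s_pred = rnn_lstm.predict(s_input[np.newaxis])[0]            # (n, n_f)
        # 2) map predicted states to predicted measurements [y^b]_{t+1}^{t+n}
        y_pred = se.predict(s_pred[np.newaxis])[0]                    # (n, n_y)
        # 3) assimilate the real measurements [y]_{t+1}^{t+n}
        y_real = get_measurements()                                   # (n, n_y)
        s_upd = dan.predict([s_pred[np.newaxis], y_pred[np.newaxis],
                             y_real[np.newaxis]])[0]                  # (n, n_f)
        trajectory.append(s_upd)
        # 4) next input: the last m - n states of the previous input
        #    followed by the n freshly updated states
        n = s_upd.shape[0]
        s_input = np.vstack([s_input[n:], s_upd])                     # (m, n_f)
    return np.vstack(trajectory)
```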
In this section, we give a quick overview of neural networks and explain the architectures behind the dynamical model (RNN-LSTM), the measurement operator (SE) and the data assimilation process (DAN).
2.2. Quick overview of neural networks
Figure 1: Summary of the data-driven method to make predictions of a chaotic system. A data-driven dynamical model (RNN-LSTM) maps the input sequence [s]_{t-m+1}^{t} to n future states [s^b]_{t+1}^{t+n}; the shallow encoder (SE) maps them to predicted measurements [y^b]_{t+1}^{t+n}; these are combined by the data assimilation network (DAN) with the real measurement sequence [y]_{t+1}^{t+n} to give the updated sequence [s^a]_{t+1}^{t+n}, from which the next input [s^a]_{t+n-m+1}^{t+n} is built.

A neuron is a unit passing a sum of weighted inputs through an activation function that introduces nonlinearities. These functions are classically a sigmoid σ(x) = 1/(1 + e^{-x}), a hyperbolic tangent tanh(x) or a rectified linear unit relu(x) = max(0, x). When neurons are organized in fully connected layers, the resulting network is called a dense neural network. The universal approximation theorem [17] states that any function can be approximated by a sufficiently large network, i.e. one hidden layer with a large number of neurons. Just like a linear regression y = ax + b aims at learning the best a and b parameters, a neural network regression y = NN(x) aims at learning the best weights and biases in the network by optimizing a loss function evaluated on a set of training data.
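As a minimal sketch of such a regression with the Keras library [21] (the data, layer size and optimizer are arbitrary illustrative choices, not the paper's settings):

```python
import numpy as np
from tensorflow import keras

# toy regression data: learn y = sin(x) on [-3, 3]
x = np.linspace(-3.0, 3.0, 1000).reshape(-1, 1)
y = np.sin(x)

# dense network y = NN(x): one hidden layer of 32 neurons (arbitrary size)
model = keras.Sequential([
    keras.layers.Dense(32, activation="tanh", input_shape=(1,)),
    keras.layers.Dense(1),
])

# learning = optimizing a loss (here the mean squared error) on training data
model.compile(optimizer="adam", loss="mse")
model.fit(x, y, epochs=200, batch_size=32, verbose=0)

print(model.predict(np.array([[0.5]])))  # should be close to sin(0.5) ~ 0.479
```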
Although they are universal approximators, dense neural networks face some limitations: they may suffer from vanishing or exploding gradients (arising from derivatives of activation functions, see [18]), are prone to overfitting (fitting that corresponds too closely to the training data) and inputs are not individually processed. Other architectures of artificial neural networks have therefore been developed, including convolutional networks (CNN, for image recognition) or recurrent neural networks (RNN, whose inputs are taken sequentially). Recurrent networks use their internal state (denoted h) to process each input from the sequence of inputs. This internal state is computed using an activation function but, to avoid the limitations of dense networks, its form is more elaborate. For example, Long Short-Term Memory (LSTM) cells [19] are combinations of classical activation functions (sigmoids and tanh) that incorporate a long and short term memory mechanism through the cell state (see Figure 2).
Several techniques exist to learn the parameters of a neural network. The most common is gradient descent, which iteratively updates parameters according to the gradient of the cost function with respect to weights and biases. Gradients are computed by backpropagating errors through the network, using backpropagation for dense neural networks or backpropagation through time for RNN [6]. The equations can be found in [20] for the curious reader. In this paper, all neural networks are implemented using the Keras library [21].
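As a toy illustration of a gradient-descent step (not taken from the paper), applied to the earlier linear regression y = ax + b with a mean squared error loss:

```python
import numpy as np

def gradient_descent_step(a, b, x, y, lr=0.1):
    """One gradient-descent update of the parameters of y_pred = a*x + b
    for the loss L = mean((a*x + b - y)**2)."""
    error = a * x + b - y
    grad_a = 2.0 * np.mean(error * x)   # dL/da
    grad_b = 2.0 * np.mean(error)       # dL/db
    return a - lr * grad_a, b - lr * grad_b

# fit y = 3x + 1 from noisy samples
x = np.random.rand(200)
y = 3.0 * x + 1.0 + 0.01 * np.random.randn(200)
a, b = 0.0, 0.0
for _ in range(2000):
    a, b = gradient_descent_step(a, b, x, y)
print(a, b)  # converges towards (3, 1)
```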
In this paper, hyperparameters are not tuned. No grid search or genetic optimization is attempted; the number of neurons, the number of hidden layers and the activation functions are found by successive trials. The architectures defined here must therefore not be considered as a rule of thumb.
2.3. Novelty of the work
This paper proposes a regressive framework for assimilating data, as opposed to standard data assimilation techniques whose architecture does not depend on the problem. Besides, the present paper considers time marching of an entire sequence of the state, while the most standard approaches involve a time marching of the predicted state at regular time units. More details about existing works are given in Section 4.
2.4. Dynamical model
The first step is to establish a dynamical model mapping m previous states s(t) to n future states. The chosen architecture is summarized in Figure 3.

Figure 2: Two types of recurrent neural networks: (a) a simple RNN handling short-term dependencies via a hidden state h, and (b, c) an RNN-LSTM handling short and long-term dependencies via a hidden state h, a cell state C and gating mechanisms. Each time step s(j) from the input sequence is combined with h(j-1) (and C(j-1) for the LSTM-RNN) computed at the previous time step. (c) LSTM cell: the recurrent unit is composed of a cell state and gating mechanisms; the cell state C is modified when fed with a new time step from the input sequence, forgetting past information (via the Forget Gate, FG), storing new information (via the Input Gate, IG) and creating a short-term memory (via the Output Gate, OG). Mathematical details are given in the appendix.
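For reference, the standard LSTM cell of [19], to which the gates FG, IG and OG above correspond, can be written as follows (this is the textbook formulation; the authors' exact equations are given in their appendix, which is not reproduced here):

```latex
\begin{aligned}
f(j) &= \sigma\big(W_f [h(j-1), s(j)] + b_f\big) && \text{forget gate (FG)}\\
i(j) &= \sigma\big(W_i [h(j-1), s(j)] + b_i\big) && \text{input gate (IG)}\\
\tilde{C}(j) &= \tanh\big(W_C [h(j-1), s(j)] + b_C\big) && \text{candidate cell state}\\
C(j) &= f(j) \odot C(j-1) + i(j) \odot \tilde{C}(j) && \text{cell state update}\\
o(j) &= \sigma\big(W_o [h(j-1), s(j)] + b_o\big) && \text{output gate (OG)}\\
h(j) &= o(j) \odot \tanh\big(C(j)\big) && \text{hidden state}
\end{aligned}
```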
In the recurrent layer, 2m LSTM cells (making the cell state a 2m dimensional vector) process the input sequence [s]_{t-m+1}^{t}. This results in a final output o(t) = h(t) summarizing all relevant information from the input sequence. In dense layers, the final output from the recurrent layer is used to predict n future states [s^b]_{t+1}^{t+n}. Concerning the number of recurrent units, it has been chosen to echo the results of Faqih et al. [1], where the best scores were obtained by considering twice as many neurons as the history window. The authors made this conclusion after trying to predict multiple steps ahead the state of the Lorenz 63 system using a dense neural network with radial basis functions.
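A minimal Keras sketch consistent with this description (the values of m, n and n_f are placeholders, and the intermediate dense layer and output reshaping are assumptions, since the exact dense layers are not detailed here):

```python
from tensorflow import keras

m, n, n_f = 10, 5, 6   # history window, forecast horizon, state dimension (placeholders)

# recurrent layer: 2m LSTM cells processing the input sequence [s]_{t-m+1}^{t}
inputs = keras.Input(shape=(m, n_f))
h = keras.layers.LSTM(2 * m)(inputs)               # final output h(t) summarizing the sequence
# dense layers mapping h(t) to the n future states [s^b]_{t+1}^{t+n}
x = keras.layers.Dense(64, activation="relu")(h)   # hidden width is an arbitrary assumption
out = keras.layers.Dense(n * n_f)(x)
out = keras.layers.Reshape((n, n_f))(out)

model = keras.Model(inputs, out)
model.compile(optimizer="adam", loss="mse")
model.summary()
```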
About the training of the model, the procedure is as follows:
1. Simulate the system to get data t → s(t). For the considered Lorenz system, only one trajectory is simulated but it covers a good region of the phase space.
2. Split data into training and testing sets. In this work, 2/3 of the data is used for the

References

- Kingma, D. P., Ba, J. Adam: A Method for Stochastic Optimization. ICLR, 2015.
- Hochreiter, S., Schmidhuber, J. Long short-term memory. Neural Computation, 1997.
- Hornik, K., Stinchcombe, M., White, H. Multilayer feedforward networks are universal approximators. Neural Networks, 1989.
- Lorenz, E. N. Deterministic nonperiodic flow. Journal of the Atmospheric Sciences, 1963.
Frequently Asked Questions (15)
Q1. What are the contributions in "Data-driven predictions of the lorenz system" ?

This paper investigates the use of a data-driven method to model the dynamics of the chaotic Lorenz system. An architecture based on a recurrent neural network with long and short term dependencies predicts multiple time steps ahead the position and velocity of a particle using a sequence of past states as input. 

Future works could include the tuning of hyperparameters ( to have an optimal design for each neural networks ) and the application to a high dimensional attractor where, similarly to Lorenz system, extreme events could be encountered. 

To make a continuous forecast of the state using a data-driven dynamical model, it is necessary to limit the accumulation of prediction errors [23] by incorporating online data in the prediction process. 

The system is simulated using a Runge Kutta 4 method, a random initial condition and a time step of 0.005s, for a total of 15000 samples. 

Future works could include the tuning of hyperparameters (to have an optimal design for each neural networks) and the application to a high dimensional attractor where, similarly to Lorenz system, extreme events could be encountered. 

Other architectures of artificial neural networks have then been developed, including convolutional networks (CNN, for image recognition) or recurrent neural networks (RNN, inputs are taken sequentially). 

To avoid overfitting and ensure that weights and biases learned during training are relevant for future use on test set, errors evaluated on training and validation sets should be close. 

errors in modeling can lead to bad multiple time steps ahead predictions of chaotic dynamical systems: a tiny change in the initial condition results in a big change in the output [12]. 

It appears that small sequences of vxare linearly correlated to all features in the state (linear correlation coefficient close to 1), which is no longer the case for medium and large sequences where nonlinearities arise (linear correlation coefficient between 0.6 and 0.7). 

Vashista [27] directly train a RNN - LSTM network to simulate ensemble kalman filter data assimilation using the differentiable architecture search framework. 

Results are promising for predicting multiple steps ahead the position and velocity of a particle on the Lorenz attractor, using only the initial sequence and real-time measurements of the complete acceleration, the complete velocity or a single component of the velocity. 

In Loh et al. [23], authors update LSTM predictions of flow rates in gaz wells using an ensemble kalman filter, thus estimating errors via the covariance of an ensemble of predictions. 

As expected, increasing the forecast window leads to a bigger impact on the global score (e2/e1 increasing) because prediction errors accumulate on longer sequences. 

Following this method, forcing statistics appear nongaussian, with long tails corresponding to rare intermitting forcing preceding switching events (see Figures 7a and 7c). 

The authors can observe that the DAN performs better for medium and large sequences but has poor performance on small sequences compared to the Kalman filter.