Well-Posedness And Accuracy Of The Ensemble Kalman Filter In Discrete And Continuous Time
Summary (4 min read)
1 Introduction
- The algorithm is used in oceanography, oil reservoir simulation and weather prediction [BVLE98, EVL00, Kal03, ORL08], for example.
- Its behaviour is not well understood.
- Section 4 contains theoretical analyses of the perturbed observation EnKF, without and with variance inflation.
2.1 Filtering Distribution
- The authors assume that the observed dynamics are governed by an evolution equation du dt = F (u) (2.1) which generates a one-parameter semigroup.
- The authors also assume that K ⊂ H is another Hilbert space, which acts as the observation space.
- The authors assume that noisy observations are made in K every h time units and write Ψ = Ψh.
- That is, given the distribution uj |Yj as well as the observation yj+1, find the distribution of uj+.
- The authors refer to the sequence P(uj |Yj) as the filtering distribution.
2.2 Assumptions
- To write down the EnKF as the authors do in section 3, and indeed to derive the continuum limit of the EnKF, as they do in section 5, they need make no further assumptions about the underlying dynamics and observation operator other than those made above.
- In infinite dimensions the existence of a global attractor in V follows from the techniques in [Tem97] for the Navier-Stokes equation by application of more subtle inequalities relating to the bilinear operator B – see section 2.2 in Chapter III of [Tem97].
- Other equations arising in dissipative fluid mechanics can be treated similarly.
- Whilst the preceding assumptions on the underlying dynamics apply to a range of interesting models arising in applications, the following assumptions on the observation model are rather restrictive; however the authors have been unable to extend the analysis without making them.
- The following consequence of Assumption 2.3 will be useful to us.
3.1 The Algorithm
- The analysis step is achieved by performing a randomised version of the Kalman update formula, and using the empirical covariance of the prediction ensemble to compute the Kalman gain.
- There are many variants on the basic EnKF idea and the authors will study the perturbed observation form of the method.
- The sequence of minimisers vj+1 can be written down explicitly by simply solving the quadratic minimization problem.
- This straightforward exercise yields the following result.
3.3 Connection to Randomized Maximum Likelihood
- The analysis step of EnKF can be understood in terms of the Randomised Maximum Likelihood (RML) method widely used in oil reservoir history matching applications [ORL08].
- The authors will now briefly describe this method.
- There are several reasons that this update step only produces approximate samples from the filtering distribution.
- The decision to use the empirical distribution instead of say the push-forward of the covariance uj |Yj gives a huge advantage to the EnKF in terms of computational efficiency.
3.5 Variance Inflation
- The minimization step of the EnKF computes an update which is a compromise between the model predictions and the data.
- This compromise is weighted by the empirical covariance on the model and the fixed noise covariance on the data.
- The model typically allows for unstable divergence of trajectories, whilst the data tends to stabilize.
- Variance inflation is a technique of adding stability to the algorithm by increasing the size of the model covariance in order to weight the data more heavily.
- Furthermore, by adding a positive definite operator, one eliminates the null-space of Ĉj+1 (which will always be present if the number of ensemble members is smaller than the dimension of H) effectively preventing the ensemble from becoming degenerate.
4 Discrete-Time Estimates
- The authors will derive long-time estimates for the discrete-time EnKF, under the Assumptions 2.3 and 2.5 on the dynamics and observation models respectively.
- The authors study the algorithm without and then with variance inflation.
- The technique is to consider evolution of the error between the filter and the true signal underlying the data.
- (4.1) Throughout this section the authors use E to denote expectation with respect to the independent i.i.d. noise sequences {ξj} and {ξ(k)j } and initial conditions u0 and v (k) 0 .
4.1 Well-Posedness Without Variance Inflation
- Note also that Ĉj+1 has rank K and let Pj+1 denote projection into the finite dimensional subspace orthogonal to the kernel of Ĉj+1.
- Here the authors have used the fact that Pj+1 projects onto a space of dimension at most K.
- The preceding result shows that the EnKF is well-posed and does not blow-up faster than exponentially.
- The authors now show that, with the addition of variance inflation, a stronger result can be proved, implying accuracy of the EnKF.
4.3 Accuracy With Variance Inflation
- The authors will focus on the variance inflation technique with A = α2I .
- Again assuming H = I and Γ = γ2I , the EnKF ensemble is governed by the following update equations.
- The authors will now show that with variance inflation, one obtains much stronger long-time estimates than without it.
- The proof is almost identical to the proof of Theorem 4.2.
- Choosing α large enough to ensure θ < 1 will result in filter boundedness; furthermore, if the observational noise standard deviation γ is small then choosing α large enough results in filter accuracy.
5 Derivation Of The Continuous Time Limit
- In this section the authors formally derive the continuous time scaling limits of the EnKF. (5.1b) The final step is to find an SDE for which the above represents a reasonable numerical scheme.
- Of course, this depends crucially on the choice of scaling parameter s.
- In fact, it is not hard to show that the one non-trivial limiting SDEs corresponds to the choice s =.
- The stabilizing term, which draws the ensemble member back towards the truth, and the noise, both act only orthogonal to the null-space of the empirical covariance of the set of particles.
6 Continuous-Time Estimates
- Under Assumptions 2.3 and 2.5.the authors.
- As can be seen in [DZ92, Theorem 4.17], these conditions are sufficient to utilise Itô’s formula.
- The authors note that, in the case of (6.1), it is not unreasonable to assume the existence of strong solutions.
- In finite dimensions this is in fact a consequence of the mean square estimate provided by the theorem; since the authors have been unable to prove this in the rather general infinite dimensional setting, however, they make it an assumption.
- The above argument does not work, but nevertheless it is still informative to see why it doesn’t work.
7 Numerical Results
- In this section the authors confirm the validity of the theorems derived in the previous sections for variants of the EnKF when applied to the dynamical system (2.2).
- Furthermore, the authors extend their numerical explorations beyond the strict range of validity of the theory and, in particular, consider the case of partial observations.
- The authors conduct all of their numerical experiments in the case of the incompressible Navier-Stokes equation on a two dimensional torus.
- The filter is always inaccurate when used without inflation.
- The authors thus turn to study the effect of inflation and note that their results indicate the filter can then always be made accurate, even in the case of partial observations, provided that sufficiently many low Fourier modes are observed.
7.1 Setup
- J∇ with J the canonical skew-symmetric matrix.
- The true initial condition u0 is randomly drawn from N(0, ν2A−2).
- The authors use the notation m(t) to denote the mean of the ensemble.
- The method used to approximate the forward model is a modification of a fourth-order Runge-Kutta method, ETD4RK [CM02], in which the Stokes semi-group is computed exactly by working in the incompressible Fourier basis {ψm(x)}m∈Z2\{0}, and Duhamel’s principle (variation of constants formula) is used to incorporate the nonlinear term.
7.2.1 Full observations
- Here the authors consider observations made at all numerically resolved, and hence observable, wavenumbers in the system; hence K = 322, not including padding in the spectral domain which avoids aliasing.
- Observations of the full-field are made every J = 20 time-steps.
- In Figure 2 there is no variance inflation and, whilst typical ensemble members remain bounded on the timescales shown, the error between the ensemble mean and the truth is O(1); indeed comparison with Figure 1 shows that the error in the mean is in fact worse than that of an ensemble evolving without access to data.
- Using variance inflation removes this problem and filter accuracy is obtained: see Figure 3.
7.2.2 Partial observations
- The observations are again made every J = 20 time-steps, but the authors will now consider observing only projections inside and outside a ring of radius |kλ| = 5 in Fourier space.
- Figure 4 shows that when observing all Fourier modes inside a ring of radius |kλ| = 5 the filter is accurate over long time-scales.
- In contrast, Figure 5 shows that observing all Fourier modes outside a ring of radius |kλ| = 5 does not provide enough information to induce accurate filtering.
7.3 Continuous Time
- And its relation to the underlying truth governed by (2.2), by means of numerical experiments.the authors.
- The authors thereby illustrate and extend the results of section 6.
- The Navier-Stokes equation 2.2 itself is solved by the method described in section 7.1.
- The stochastic process is also diagonalized in the Fourier basis (7.1) and then time-approximated by the EulerMaruyama scheme [KP92].
7.3.1 Full observations
- Here the authors consider observations made at all numerically resolved, and hence observable, wavenumbers in the system; hence K = 322, not including padding in the spectral domain which avoids aliasing.
- Figure 6 shows that, without inflation, the ensemble remains bounded, but the mean is inaccurate, on the time-scales of interest.
- In contrast Figure 7 demonstrates that inflation leads to accurate reconstruction of the truth via the ensemble mean.
8 Conclusions
- The authors have developed a method for the analysis of the EnKF.
- Instead of viewing it as an algorithm designed to accurately approximate the true filtering distribution, which it cannot do, in general, outside Gaussian scenarios and in the large ensemble limit, the authors study it as an algorithm for signal estimation in the finite (possibly small) ensemble limit.
- These positive results about the EnKF are encouraging and serve to underpin its perceived effectiveness in applications.
- On the other hand it is important to highlight that their analysis applies only to fully observed dynamics and interesting open questions remain concerning the partially observed case.
- These numerical results demonstrate two interesting potential extensions of their theory: (i) to strengthen well-posedness to obtain boundedness of trajectories, at least in mean square; (ii) to extend well-posedness and accuracy results to certain partial observation scenarios.
Did you find this useful? Give us your feedback
Citations
246 citations
206 citations
176 citations
167 citations
70 citations
References
6,284 citations
5,038 citations
4,144 citations
2,206 citations
1,801 citations
Related Papers (5)
Frequently Asked Questions (13)
Q2. What future works have the authors mentioned in the paper "Well-posedness and accuracy of the ensemble kalman filter in discrete and continuous time" ?
These numerical results demonstrate two interesting potential extensions of their theory: ( i ) to strengthen well-posedness to obtain boundedness of trajectories, at least in mean square ; ( ii ) to extend well-posedness and accuracy results to certain partial observation scenarios. Furthermore the authors highlight the fact that their results have assumed exact solution of the underlying differential equation model ; understanding how filtering interacts with numerical approximations, and potentially induces numerical instabilities, is a subject which requires further investigation ; this issue is highlighted in [ GM13 ].
Q3. What is the method used to approximate the forward model?
The method used to approximate the forward model is a modification of a fourth-order Runge-Kutta method, ETD4RK [CM02], in which the Stokes semi-group is computed exactly by working in the incompressible Fourier basis {ψm(x)}m∈Z2\\{0}, and Duhamel’s principle (variation of constants formula) is used to incorporate the nonlinear term.
Q4. What is the purpose of the EnKF?
The idea of the EnKF is to represent the filtering distribution through an ensemble of particles, to propagate this ensemble under the model to approximate the mapping P(uj |Yj) to P(uj+1|Yj) (refered to as prediction in the applied literature), and to update the ensemble distribution to include the data point Yj+1 by using a Gaussian approximation based on the second order statistics of the ensemble (refered to as analysis in the applied literature).
Q5. What is the effect of the noise contribution on the particles?
The perturbed observations noise contribution will act to prevent the particles from synchronizing which, in their absence, could happen.
Q6. What is the reason why the RML method is not an approximation of samples?
1|Yj is certainly not Gaussian in general, unless the dynamics are linear, hence the RML method becomes an approximation of samples.
Q7. What does it mean that the noise is finite dimensional?
However the fact that the noise is effectively finite dimensional, due to the presence of the finite rank covariance operator C, does mean that existence of strong solutions may well be established on a case-by-case basis in some infinite dimensional settings.
Q8. What is the simplest way to achieve the prediction step?
The prediction step is achieved by simply flowing forward the ensemble under the model dynamics, that isv̂ (k) j+1 = Ψ(v (k) j ) , for k = 1 . . .K.The analysis step is achieved by performing a randomised version of the Kalman update formula, and using the empirical covariance of the prediction ensemble to compute the Kalman gain.
Q9. What is the aim of this work?
The aim of their work is to try and build mathematical foundations for the analysis of the EnKF, in particular with regards to well-posedness (lack of blow-up) and accuracy (tracking the signal over arbitrarily long time-intervals).
Q10. What is the effect of inflation on the filter?
The authors thus turn to study the effect of inflation and note that their results indicate the filter can then always be made accurate, even in the case of partial observations, provided that sufficiently many low Fourier modes are observed.
Q11. What is the definition of a global attractor?
In infinite dimensions the existence of a global attractor in V follows from the techniques in [Tem97] for the Navier-Stokes equation by application of more subtle inequalities relating to the bilinear operator B – see section 2.2 in Chapter III of [Tem97].
Q12. What is the effect of variance inflation on the EnKF?
v(k) j+1 = Ψ(v (k) j ) + (α2 γ2 The author+ 1 γ2 Ĉj+1)y (k) j+1 . (4.4)The authors will now show that with variance inflation, one obtains much stronger long-time estimates than without it.
Q13. What is the inverse of the central limit theorem?
from the second equation the authors must have 1 − s/2 ≥ 1/2, for otherwise the stochastic terms would diverge when summed up, in accordance with the central limit theorem.