Predicting seasonal and hydro-meteorological impact in environmental variables modelling via Kalman filtering
Summary
1 Introduction
- The management of water resources has gained special prominence in domestic and international politics, given the complexity and uncertainty of the problems involved in the sustainable (environmental, social, and economic) management of natural water resources at both global and local scales.
- At the river basin scale there is a need to establish a methodology for systematic data monitoring, for the characterization of surface water quality and for the correct analysis of collected data (Vega et al. 1998).
- The River Ave differs from the other Northern region rivers not only because of its high pollution levels but also because of the large space-time variability of its pollutant concentrations.
- It was demonstrated that state-space models improved prediction accuracy in comparison with linear regression models.
2 Data set description
- Northern Environment and Natural Resources (DRAN) and the Institute of Water (INAG) monitor surface water quality monthly along the River Ave and its main adjacent streams through a network of monitoring sites. More than 23 variables are recorded to assess river water quality, which is affected by industry, domestic wastewater, agriculture, and wastewater treatment plants.
- In total, eight water monitoring sites are considered in this study: five on the River Ave's mainstream, namely Cantelães (CANT), Taipas (TAI), Riba d'Ave (RAV), Santo Tirso (STI), and Ponte Trofa (PTR), and three on the adjacent River Vizela, namely Golães (GOL), Ferro (FER), and Vizela Santo Adrião (VSA).
- These eight monitoring sites result from the restructuring of the water quality monitoring network in 1998, which implied the closure of other previous sites, and so the data set covers the period between May 1998 and December 2009.
- DO concentration is an important indicator since most aquatic fauna and flora need oxygen to survive.
- DO is reported in milligrams per liter (mg/l), the mass of oxygen in a liter of water, which is equivalent to "parts per million" (ppm).
3 Cluster analysis
- Taking into account previous works based on hydrological river basins (Shrestha and Kazama 2007; Costa and Gonçalves 2011), a cluster analysis (CA) was performed to group monitoring sites with similar water quality characteristics over time, based on the DO concentration levels.
- Hierarchical agglomerative CA was performed on the raw data set by means of Ward’s method.
- The monitoring sites dendrogram obtained by means of Ward’s method is shown in Figure 1.
- One set of locations has the best water quality indicators (the highest DO concentrations), including sites situated upstream on the Rivers Ave and Vizela (CANT corresponds to the source of the River Ave); these monitoring sites receive pollution mostly from domestic wastewater and from agricultural and manure discharges.
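The clustering step described in this section can be sketched in a few lines. This is only an illustration under stated assumptions: the DO series below are synthetic (the paper's data are not reproduced here), the base levels assigned to each site are arbitrary, and SciPy's `linkage` with `method="ward"` stands in for whatever software the authors used.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
sites = ["CANT", "TAI", "RAV", "STI", "PTR", "GOL", "FER", "VSA"]
n_months = 140  # May 1998 to December 2009, monthly

# Synthetic monthly DO series in mg/l (assumption: NOT the paper's data).
month = np.arange(n_months)
seasonal = np.cos(2 * np.pi * month / 12)
base_level = np.array([10.0, 9.5, 6.0, 5.5, 5.0, 9.8, 9.6, 6.2])  # arbitrary
do = base_level[:, None] + seasonal[None, :] + rng.normal(0, 0.5, (8, n_months))

# Hierarchical agglomerative clustering on the raw series with Ward's method.
Z = linkage(do, method="ward")

# Cut the dendrogram into two groups of monitoring sites.
labels = fcluster(Z, t=2, criterion="maxclust")
clusters = {c: [s for s, l in zip(sites, labels) if l == c]
            for c in sorted(set(labels))}
print(clusters)
```

Each row of `do` is one site's full time series, so sites are grouped by how similar their DO levels are across the whole observation period.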
5 The linear state-space model
- The main advantage of state-space models is that, through the Kalman filter recursions, they yield more accurate filtered predictions than the usual linear models.
- As expected, the more polluted cluster (Cluster II) is more affected by hydro-meteorological conditions because its calibration factor has higher values than that of Cluster I. Moreover, as the parameter estimates in Table 5 suggest, the relationship between seasonal and hydro-meteorological factors differs between the two clusters.
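To make the Kalman filter recursions concrete, here is a minimal sketch for a scalar regression with one time-varying coefficient. The model form (an observation equation y_t = x_t b_t + e_t and a stationary AR(1) state b_t = mu + phi (b_{t-1} - mu) + eps_t), the function name `kalman_filter_tvc`, and all parameter values are assumptions chosen for illustration; the paper's actual specification may differ, and its parameters are estimated rather than known.

```python
import numpy as np

def kalman_filter_tvc(y, x, mu, phi, var_state, var_obs):
    """Filtered estimates b_{t|t} and their variances for
    y_t = x_t * b_t + e_t,  b_t = mu + phi * (b_{t-1} - mu) + eps_t, |phi| < 1."""
    n = len(y)
    b_filt, p_filt = np.empty(n), np.empty(n)
    # Start from the stationary mean and variance of the AR(1) state.
    b, p = mu, var_state / (1.0 - phi**2)
    for t in range(n):
        # Prediction step.
        b_pred = mu + phi * (b - mu)
        p_pred = phi**2 * p + var_state
        if np.isnan(y[t]):
            # Missing observation: no update, keep the prediction.
            b, p = b_pred, p_pred
        else:
            # Update step with the new observation.
            innov = y[t] - x[t] * b_pred
            innov_var = x[t] ** 2 * p_pred + var_obs
            gain = p_pred * x[t] / innov_var
            b = b_pred + gain * innov
            p = (1.0 - gain * x[t]) * p_pred
        b_filt[t], p_filt[t] = b, p
    return b_filt, p_filt

# Simulated example (hypothetical parameter values).
rng = np.random.default_rng(1)
n, mu, phi, var_state, var_obs = 140, -0.7, 0.8, 0.09, 0.25
x = rng.normal(size=n)  # stands in for a hydro-meteorological covariate
beta = np.empty(n)
prev = mu
for t in range(n):
    prev = mu + phi * (prev - mu) + rng.normal(scale=np.sqrt(var_state))
    beta[t] = prev
y = x * beta + rng.normal(scale=np.sqrt(var_obs), size=n)

b_filt, p_filt = kalman_filter_tvc(y, x, mu, phi, var_state, var_obs)
rmse_filtered = np.sqrt(np.mean((y - x * b_filt) ** 2))
rmse_static = np.sqrt(np.mean((y - x * mu) ** 2))  # constant-coefficient fit
print(rmse_filtered, rmse_static)
```

The comparison at the end is an in-sample adjustment measure: the filtered prediction at time t already uses y_t, so it is not an out-of-sample forecast error.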
6 Conclusions
- The analysis presented in this paper shows that the hydro-meteorological factor, constructed from precipitation measurements in the River Ave's basin, improved the prediction accuracy.
- Besides, the linear state-space models, associated with the Kalman filter procedure, make it possible to distinguish the impact of the hydro-meteorological conditions from a structural component which can incorporate exogenous factors that affect the behaviour of the water quality variable.
- This modelling approach can effectively integrate these different components, and their impacts can be measured and monitored.
- This approach could be used to assess water quality evolution, namely in change point detection.
- Indeed, the analysis of calibration factors of the structural component, such as the seasonality, could detect important changes in the water quality variability and thus attenuate the effects of the hydro-meteorological conditions.
Citations
4,252 citations
144 citations
Cites methods from "Predicting seasonal and hydro-meteo..."
...In addition, the state space methods can easily handle missing observations; they are extendible to non-linear state space models, to hierarchical parameterizations, and to non-Gaussian errors (e.g. Durbin and Koopman 2012 and Gonçalves and Costa 2013)....
[...]
30 citations
Cites background from "Predicting seasonal and hydro-meteo..."
...A reduction of level of DO may cause long-term adverse effects in the aquatic environment (Gonçalves and Costa 2013), and a deficiency of DO is a sign of an unhealthy river (Mondal et al. 2016)....
[...]
14 citations
Cites methods from "Predicting seasonal and hydro-meteo..."
...The Kalman filter (Kalman 1960) is a recursive algorithm offering the optimal state estimate which is most consistent with the observation data at each time step because previous data affects the current data (Costa and Gonçalves 2011; Gonçalves and Costa 2013; Samain et al. 2008)....
[...]
6 citations
Cites methods from "Predicting seasonal and hydro-meteo..."
...ied changes in ambient mean air pollution levels following the introduction of a traffic management scheme at Marylebone Road, Central London, and Jarušková (1996) analyzed air pressure time series at Swiss meteorological stations....
[...]
References
5,071 citations
4,252 citations
"Predicting seasonal and hydro-meteo..." refers methods in this paper
...In many applications, the state-space models parameters are estimated by maximum Gaussian likelihood via the Newton-Raphson method (Harvey 1996) or, more often, by the EM algorithm (Shumway and Stoffer 1982)....
[...]
...As states are unobservable variables, their predictions are obtained by means of the Kalman filter algorithm (Harvey 1996)....
[...]
4,203 citations
"Predicting seasonal and hydro-meteo..." refers methods in this paper
...The method used for the calculation of the empirical semivariogram is the method of moments (Matheron 1963), modified for a random space-time process $\{Z(s,t) : s \in \mathbb{R}^2,\ t = 1, \ldots, T\}$....
[...]
1,513 citations
"Predicting seasonal and hydro-meteo..." refers methods in this paper
...In many applications, the state-space models parameters are estimated by maximum Gaussian likelihood via the Newton-Raphson method (Harvey 1996) or, more often, by the EM algorithm (Shumway and Stoffer 1982)....
[...]
1,481 citations
"Predicting seasonal and hydro-meteo..." refers background or methods in this paper
...Multivariate statistical analysis has been widely applied in water quality assessment and sources apportionment of water over the last years (Wunderlin et al. 2001; Simeonov et al. 2003; Shrestha and Kazama 2007)....
[...]
...A river is a system comprising both the main course and its tributaries, carrying the one-way flow of a significant load of matter in dissolved and particulate phases from both natural and anthropogenic sources (Shrestha and Kazama 2007)....
[...]
Frequently Asked Questions (18)
Q2. What contributions have the authors mentioned in the paper "Predicting seasonal and hydro-meteorological impact in environmental variables modelling via Kalman filtering"?
This study focuses on the potential improvement of environmental variables modelling by using linear state-space models, as an improvement of the linear regression model, and by incorporating a constructed hydro-meteorological covariate. This idea is illustrated with a rather extensive data set from the River Ave basin (Portugal) that consists mainly of monthly measurements of dissolved oxygen (DO) concentration in a network of water quality monitoring sites.
Q3. What is the linear predictor for a state-space model?
Assuming that the parameters of a state-space model are known, the Kalman filter recursions give the best linear predictors for filtering, forecasting, and smoothing the vector of states.
Q4. What is the main advantage of state-space models?
The main advantage of state-space models is that, through the Kalman filter recursions, they yield more accurate filtered predictions than the usual linear models.
Q5. What is the common method of analyzing water quality?
In several works, multivariate statistical analyses are applied to sets of water quality variables, usually quantitative analytical data consisting of physico-chemical variables.
Q6. What is the common approach to agglomerative clustering?
Hierarchical agglomerative clustering is the most common approach, providing intuitive similarity relationships between any given sample and the entire data set, and is typically illustrated by a dendrogram (McKenna 2003).
Q7. What is the method used for the calculation of the empirical semivariogram?
The method used for the calculation of the empirical semivariogram is the method of moments (Matheron 1963), modified for a random space-time process $\{Z(s,t) : s \in \mathbb{R}^2,\ t = 1, \ldots, T\}$.
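As an illustration of the method-of-moments estimator, the sketch below computes the classical (purely spatial) Matheron semivariogram, which averages half the squared differences between all pairs of sites whose separation falls in a distance bin. The function name `empirical_semivariogram` and the toy inputs are hypothetical, and the space-time modification used in the paper is not reproduced here.

```python
import numpy as np

def empirical_semivariogram(coords, values, bin_edges):
    """Matheron's estimator: for each distance bin, half the mean squared
    difference over all site pairs whose separation falls in that bin."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    i, j = np.triu_indices(len(values), k=1)  # every unordered pair once
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    sqdiff = (values[i] - values[j]) ** 2
    n_bins = len(bin_edges) - 1
    gamma = np.full(n_bins, np.nan)  # NaN marks bins with no pairs
    counts = np.zeros(n_bins, dtype=int)
    for k in range(n_bins):
        in_bin = (dist > bin_edges[k]) & (dist <= bin_edges[k + 1])
        counts[k] = in_bin.sum()
        if counts[k] > 0:
            gamma[k] = sqdiff[in_bin].mean() / 2.0
    return gamma, counts

# Toy example: three collinear sites (hypothetical values).
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
values = [0.0, 1.0, 3.0]
gamma, counts = empirical_semivariogram(coords, values, [0.0, 1.5, 2.5])
print(gamma, counts)
```

In the toy example the first bin holds the two pairs at distance 1 and the second bin holds the single pair at distance 2.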
Q8. What is the state noise covariance matrix?
The state noise covariance matrix is based on the relation $\Sigma_\beta = \Phi \Sigma_\beta \Phi' + \Sigma_\varepsilon$, which is valid in a stationary VAR(1) process, where $\Sigma_\beta$ is the covariance matrix of the vector of states.
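The stationary VAR(1) relation can be checked numerically. In the sketch below, the autoregressive matrix `Phi` and noise covariance `Sigma_eps` are hypothetical values chosen only so that the process is stationary; `scipy.linalg.solve_discrete_lyapunov` solves the discrete Lyapunov equation X = A X A' + Q, which has the same form as the relation between the state covariance and the state noise covariance.

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

# Hypothetical VAR(1) parameters (eigenvalues of Phi inside the unit circle).
Phi = np.array([[0.6, 0.1],
                [0.0, 0.5]])
Sigma_eps = np.array([[0.09, 0.01],
                      [0.01, 0.04]])

# Covariance of the states implied by the state noise covariance.
Sigma_beta = solve_discrete_lyapunov(Phi, Sigma_eps)

# Going the other way: given the covariance of the states, the same
# relation returns the state noise covariance matrix.
Sigma_eps_back = Sigma_beta - Phi @ Sigma_beta @ Phi.T

print(Sigma_beta)
print(np.allclose(Sigma_eps_back, Sigma_eps))
```

The second direction is the one that matters when the state covariance is specified or estimated first and the noise covariance is then derived from it.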
Q9. Why is it important to assess the adjustment of filtered predictions?
It is important to assess the adjustment of the filtered predictions $A_t\hat{\beta}_{t|t}$ because one of the contributions of the proposed model is its ability to separate a structural component that accommodates a global behaviour (such as seasonality) from another component associated with hydro-meteorological conditions, represented by the hydro-meteorological covariate, which must be filtered in order to obtain the best linear predictions.
Q10. What is the important factor in assessing the health of a water body?
Dissolved oxygen concentration is probably the most important factor in assessing the health of a water body, but other factors outside the water managers' direct control also determine a water body's health to a variable extent.
Q11. What is the common method used to classify water quality monitoring sites?
The proposed methodology starts by using a multivariate statistical approach, cluster analysis, to classify the water quality monitoring sites into homogeneous space-time groups based on the DO variable, which was selected as relevant for characterizing water quality.
Q12. What is the expected mean value of the hydro-meteorological factor in cluster II?
The expected mean value of the time-varying coefficient of the hydro-meteorological factor varies around zero in Cluster II and around −0.73 in Cluster I.
Q13. What is the importance of a time-space variability of rainfall in a geographical area?
Many hydrological and ecological studies recognize the importance of characterizing the time-space variability of precipitation in a geographical area (Goodrich et al. 1995), since it is essential for estimating the hydrological balance.
Q14. What is the semivariogram model for the month of June?
The best-performing semivariogram model was the Gaussian model with a nugget effect, for June in particular, as shown in Table 3.
Q15. How many sites are used to estimate the area rainfall?
A hydro-meteorological factor is constructed for each quality monitoring site (8 sites in total) based on the analysis of the space-time behaviour of the precipitation (monthly total) observed in a rain gauge network of 19 meteorological sites located in the area of the River Ave's basin, between 1931 and 2009.
Q16. What is the problem of interpolation of data from irregularly spaced rain gauges?
One of the problems faced by meteorologists and hydrologists who study spatial rainfall patterns is the interpolation of data from irregularly spaced rain gauges in order to determine mean area rainfalls or to characterize rainfall variability within a region or catchment (Dirks et al. 1998; Ciach and Krajewski 2006).
Q17. What is the cophenetic correlation coefficient of the dendrogram?
The dendrogram has a cophenetic correlation coefficient of 0.85 (i.e., the correlation between the actual dissimilarities recorded in the original dissimilarity matrix and the dissimilarities that can be read from the dendrogram).
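A cophenetic correlation coefficient of this kind can be computed as in the following sketch. The data are synthetic random series (an assumption; the value 0.85 reported above will not be reproduced), and SciPy's `cophenet` is used as one readily available implementation.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, cophenet
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)
data = rng.normal(size=(8, 140))  # synthetic: 8 sites x 140 months

# Original pairwise dissimilarities (condensed Euclidean distance matrix).
original = pdist(data)

# Dendrogram built with Ward's method, then its cophenetic distances:
# the height at which each pair of sites is first merged.
Z = linkage(data, method="ward")
coph_corr, coph_dists = cophenet(Z, original)

# The coefficient is the Pearson correlation between the two sets.
manual = np.corrcoef(original, coph_dists)[0, 1]
print(coph_corr, manual)
```

Values close to 1 indicate that the dendrogram preserves the original pairwise dissimilarities well.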
Q18. What is the semivariogram model to describe the spatial continuity of the process in June?
The estimated semivariogram model describing the spatial continuity of the process in June is given in Eq. (2):

$$\gamma_Z(h) = \begin{cases} 0, & h = 0 \\ 440.537 + 96.5\left(1 - \exp\left(-\left(\dfrac{\|h\|}{2770.083}\right)^2\right)\right), & h \neq 0 \end{cases} \qquad (2)$$

To assess the quality of the semivariogram fit, the authors performed a cross-validation procedure: they selected a given rain gauge site, say s0; using data from the other 18 sites, they fitted new semivariograms and applied ordinary (point) Kriging to obtain point estimates of Zt(s0) across time; finally, they evaluated the corresponding residuals (differences between estimated and true values of Zt(s0)).
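The leave-one-out idea behind this cross-validation can be sketched as follows. Several things here are assumptions made for illustration: the site coordinates and rainfall values are synthetic, the semivariogram parameters are hypothetical rather than the paper's estimates, and the semivariogram is kept fixed across folds, whereas the authors refitted it each time a site was left out.

```python
import numpy as np

def gaussian_semivariogram(h, nugget, partial_sill, range_par):
    """Gaussian model with a nugget effect; zero at h = 0 by definition."""
    h = np.asarray(h, dtype=float)
    gamma = nugget + partial_sill * (1.0 - np.exp(-(h / range_par) ** 2))
    return np.where(h == 0, 0.0, gamma)

def ordinary_kriging(coords, values, target, variogram):
    """Ordinary kriging point estimate at `target`, semivariogram form."""
    n = len(values)
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    # Kriging system: semivariogram matrix bordered by the
    # unbiasedness constraint (weights sum to one).
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = variogram(d)
    A[n, n] = 0.0
    b = np.ones(n + 1)
    b[:n] = variogram(np.linalg.norm(coords - target, axis=1))
    weights = np.linalg.solve(A, b)[:n]  # drop the Lagrange multiplier
    return weights @ values

def loo_residuals(coords, values, variogram):
    """Leave each site out in turn; return estimate minus observed value."""
    idx = np.arange(len(values))
    return np.array([
        ordinary_kriging(coords[idx != k], values[idx != k], coords[k], variogram)
        - values[k]
        for k in idx
    ])

# Synthetic rain gauge network of 19 sites (hypothetical data and parameters).
rng = np.random.default_rng(2)
coords = rng.uniform(0.0, 30.0, size=(19, 2))
rain = 100.0 + 2.0 * coords[:, 0] + rng.normal(0.0, 5.0, size=19)
vg = lambda h: gaussian_semivariogram(h, nugget=1.0, partial_sill=2.0, range_par=10.0)

residuals = loo_residuals(coords, rain, vg)
print(np.sqrt(np.mean(residuals ** 2)))
```

Repeating this for each month would give residuals across time at every left-out site, which is what the procedure above evaluates.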