Journal ArticleDOI

Predicting seasonal and hydro-meteorological impact in environmental variables modelling via Kalman filtering

TL;DR: In this paper, the potential improvement of environmental variables modelling by using linear state-space models, as an improvement of the linear regression model, and by incorporating a constructed hydro-meteorological covariate is discussed.
Abstract: This study focuses on the potential improvement of environmental variables modelling by using linear state-space models, as an improvement of the linear regression model, and by incorporating a constructed hydro-meteorological covariate. The Kalman filter predictors provide accurate predictions of calibration factors for both seasonal and hydro-meteorological components. This methodology can be used to analyze water quality behaviour while minimizing the effect of the hydrological conditions. The idea is illustrated with a fairly extensive data set from the River Ave basin (Portugal) that consists mainly of monthly measurements of dissolved oxygen (DO) concentration in a network of water quality monitoring sites. The hydro-meteorological factor is constructed for each monitoring site based on monthly precipitation estimates obtained from a rain gauge network combined with stochastic interpolation (kriging). A linear state-space model is fitted for each homogeneous group (obtained by clustering techniques) of water monitoring sites. The linear state-space models are fitted using distribution-free estimators developed in a separate section.

Summary (2 min read)

1 Introduction

  • The administration of hydrologic resources has gained special prominence in domestic and international politics because of the complexity and uncertainty of the problems associated with the sustainable administration (environmental, social, and economic) of natural water resources at both worldwide and local scales.
  • At the river basin scale there is a need to establish a methodology for systematic data monitoring, for the characterization of surface water quality, and for the correct analysis of collected data (Vega et al. 1998).
  • The River Ave differs from the other rivers of the Northern region not only because of its high pollution levels but also because of the large space-time variability of pollutant concentrations.
  • It was demonstrated that state-space models improved prediction accuracy in comparison with linear regression models.

2 Data set description

  • The Regional Directory for the Northern Environment and Natural Resources (DRAN) and the Institute of Water (INAG) monitor surface water quality monthly along the River Ave and its main adjacent streams with a network of monitoring sites, measuring more than 23 variables to assess how river water quality is affected by industry, domestic wastewater, agriculture, and wastewater treatment plants.
  • In total, eight water monitoring sites are considered in this study: five are located on the River Ave’s mainstream, namely Cantelães (CANT), Taipas (TAI), Riba d’Ave (RAV), Santo Tirso (STI), and Ponte Trofa (PTR), and three on the adjacent River Vizela, namely Golães (GOL), Ferro (FER), and Vizela Santo Adrião (VSA).
  • These eight monitoring sites result from the restructuring of the water quality monitoring network in 1998, which implied the closure of other previous sites, and so the data set covers the period between May 1998 and December 2009.
  • DO concentration is an important indicator since most aquatic fauna and flora need oxygen to survive.
  • DO is reported in milligrams per liter (mg/l), the mass of oxygen in a liter of water, which for dilute aqueous solutions is approximately the same as "parts per million" (ppm).

3 Cluster analysis

  • Taking into account previous works on hydrological river basins (Shrestha and Kazama 2007; Costa and Gonçalves 2011), a cluster analysis (CA) was performed to group monitoring sites with similar water quality characteristics over time, based on the DO concentration levels.
  • Hierarchical agglomerative CA was performed on the raw data set by means of Ward’s method.
  • The monitoring sites dendrogram obtained by means of Ward’s method is shown in Figure 1.
  • One set of locations has the best water quality indicators (the highest DO concentration values), including sites situated upstream on the Rivers Ave and Vizela (CANT corresponds to the source of the River Ave); these monitoring sites receive pollution mostly from domestic wastewater and from agricultural and manure discharges.

5 The linear state-space model

  • The main advantage of state-space models is that, through the Kalman filter recursions, they yield more accurate filtered predictions than the usual linear models.
  • As expected, the more polluted cluster (Cluster II) is more affected by hydro-meteorological conditions because its calibration factor has higher values than that of Cluster I. Moreover, as expected from the parameter estimates in Table 5, the relationship between seasonal and hydro-meteorological factors differs between the two clusters.

6 Conclusions

  • The analysis presented in this paper leads to the conclusion that the hydro-meteorological factor, constructed on the basis of precipitation measurements in the River Ave’s basin, improved the prediction accuracy.
  • Besides, the linear state-space models, associated with the Kalman filter procedure, make it possible to distinguish the impact of the hydro-meteorological conditions from a structural component, which can incorporate exogenous factors that affect the behaviour of the water quality variable.
  • This modelling approach can effectively integrate these different components, and their impacts can be measured and monitored.
  • This approach could be used to assess water quality evolution, namely in change point detection.
  • Indeed, the analysis of calibration factors of the structural component, such as the seasonality, could detect important changes in the water quality variability and thus attenuate the effects of the hydro-meteorological conditions.


Predicting seasonal and hydro-meteorological impact
in environmental variables modelling via Kalman
filtering
A. Manuela Gonçalves · Marco Costa
Keywords Hydrological basin · Water quality · State-space modelling ·
Kalman filter · Distribution-free estimation
A. Manuela Gonçalves
Departamento de Matemática e Aplicações, Universidade do Minho
Campus de Azurém da Universidade do Minho, 4800-058 Guimarães, Portugal
CMAT - Centro de Matemática da Universidade do Minho
E-mail: mneves@math.uminho.pt
Marco Costa
Escola Superior de Tecnologia e Gestão de Águeda, Universidade de Aveiro
Apartado 473, 3750-127 Águeda, Portugal
CMAF - Centro de Matemática e Aplicações Fundamentais da Universidade de Lisboa
E-mail: marco@ua.pt

1 Introduction
The administration of hydrologic resources has gained special prominence in domestic and international politics because of the complexity and uncertainty of the problems associated with the sustainable administration (environmental, social, and economic) of natural water resources at both worldwide and local scales.
The river basin, which is the primary unit of water resources planning and management, is usually subjected to pressures and changes due to human activities. At the river basin scale there is a need to establish a methodology for systematic data monitoring, for the characterization of surface water quality, and for the correct analysis of collected data (Vega et al. 1998). The main objective of surface water quality monitoring is to characterize water resources and to follow their space-time evolution in order to achieve appropriate administration.
A river is a system comprising both the main course and its tributaries, carrying the one-way flow of a significant load of matter in dissolved and particulate phases from both natural and anthropogenic sources (Shrestha and Kazama 2007). This study focuses on a fairly extensive data set from the River Ave’s basin in Northwest Portugal, consisting mainly of monthly measurements of physical-chemical and microbiological variables in a network of water quality monitoring sites and of monthly precipitation in a rain gauge network of meteorological monitoring sites. The River Ave’s hydrological basin has an approximate area of 1400 km² (from its source in Serra da Cabreira to its mouth in Vila do Conde); the river is 101 km long and its average flow at the mouth is about 40 m³/s. Its main adjacent streams are the River Este (flowing from the North) and the Rivers Selho and Vizela (from the South).
In the last thirty years, the River Ave’s hydrological basin, with the exception of its upstream areas, has been subjected to a growing rate of untreated effluent discharges from industrial activities, namely from the textile sector, which is strongly established in this region. This situation has contributed to the deterioration of water quality, making the water unsuitable for several uses (human consumption, industrial use, recreation, fishing, and irrigation) and thus posing a serious danger to public health (Oliveira et al. 2005). The River Ave differs from the other rivers of the Northern region not only because of its high pollution levels but also because of the large space-time variability of pollutant concentrations. The water quality measurements failed to comply with the minimum quality objectives for surface waters prescribed by Portuguese legislation. The Central Administration, through the Regional Directory for the Northern Environment and Natural Resources (DRAN) and the Institute of Water (INAG), has monitored the surface water quality along the River Ave and its main streams monthly since 1988 by means of a monitoring network of 20 water monitoring sites, which was resized in 1998 in order to comply with new legislation. This network has been continually restructured since 2007 in order to implement chemical status monitoring (2007) and, more recently, ecological status monitoring (2009), as stipulated by the Water Framework Directive (Machado et al. 2010).
Multivariate statistical analysis has been widely applied in water quality assessment and source apportionment over recent years (Wunderlin et al. 2001; Simeonov et al. 2003; Shrestha and Kazama 2007). In several works, multivariate statistical analyses are applied to sets of water quality variables, usually quantitative analytical data consisting of physico-chemical variables. If the goal is to investigate the time-space variations of water quality, as in Helena et al. (2000), or the natural and anthropogenic origins of contaminants in surface or ground water, as in Ato et al. (2010), the most suitable and most widely applied approach is principal components analysis (Liu et al. 2003; Lischeid 2009; Varol and Sen 2009). In some practical studies, data are available from a group of sample sites, usually water monitoring sites, which makes it possible to apply several statistical methodologies, for instance correlation analysis and parametric and non-parametric tests (Elhatip et al. 2008).
When a predictive model is needed, linear regression has been the most widely applied approach (e.g. Gonçalves and Alpuim 2011; Renwick et al. 2009). However, statistical models with fixed effects are unlikely to yield good predictive accuracy, particularly in situations where the relationship between predictor and predictand changes over time (Kokic 2010). This issue has previously been acknowledged for environmental data: Costa and Alpuim (2011) consider state-space models in the calibration of radar precipitation measurements, and Charles et al. (2004) and Greene et al. (2008) use hidden Markov chain models to represent an evolving climate system in statistical downscaling. Costa and Gonçalves (2011) proposed a methodology that combines the analysis of a set of sample sites, obtained by means of clustering procedures, with the fitting of predictive regression models and state-space models, in particular considering trends and seasonal components. In that work, it was demonstrated that state-space models improved prediction accuracy in comparison with linear regression models.
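The contrast drawn above between fixed-effects regression and state-space models can be written down in a generic form. The formulation below is a textbook-style sketch, not the specification fitted in this paper; the symbols $Y_t$, $x_t$, $\beta$, $\mu$, $\Phi$, $e_t$ and $\varepsilon_t$ are assumed notation introduced only for illustration.

```latex
\begin{align}
  \text{fixed effects:} \quad
    & Y_t = x_t^{\top}\beta + e_t, \\
  \text{state-space:} \quad
    & Y_t = x_t^{\top}\beta_t + e_t, \qquad
      \beta_t = \mu + \Phi\,(\beta_{t-1} - \mu) + \varepsilon_t .
\end{align}
```

In the first line the coefficients are constant, so the predictor-predictand relationship cannot change over time. In the second, the coefficients form an unobserved state that evolves stochastically around a mean level $\mu$, and the Kalman filter updates the prediction of $\beta_t$ as each new observation arrives.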
In this study, a linear state-space model is proposed for modelling continuous physical and chemical monitoring data. The model was applied to dissolved oxygen (DO) concentration levels (mg/l) at 8 monitoring sites in the River Ave’s basin over a 12-year period (1998-2009). Adequate dissolved oxygen is necessary for good water quality, and it is one of the most important variables in the assessment of river water quality and pollution grade.
The proposed methodology starts by using a multivariate statistical approach, cluster analysis, to classify the water quality monitoring sites into homogeneous space-time groups based on the DO quality variable, which was selected as relevant for characterizing the water quality. In a recent work, Costa and Gonçalves (2011) show that a set of water quality monitoring sites can be modelled by applying cluster techniques that minimize the number of models.
One of the problems faced by meteorologists and hydrologists who study spatial rainfall patterns is the interpolation of data from irregularly spaced rain gauges in order to determine mean area rainfall or to characterize rainfall variability within a region or catchment (Dirks et al. 1998; Ciach and Krajewski 2006). Many hydrological and ecological studies recognize the importance of characterizing the time-space variability of precipitation in a geographical area (Goodrich et al. 1995), since it is essential for estimating the hydrological balance. Water quality at a given location reflects the dominant conditions in the source basin of that location, namely the hydro-meteorological factors. The space-time behaviour of the quality variable is associated with the flow variation (variable dilution effect), which in turn is generally related to the seasonal rainfall variation.
We address the problem of area precipitation measurement in order to estimate a hydro-meteorological factor that will be used in modelling the surface water quality of river basins, particularly the dissolved oxygen variable. A hydro-meteorological factor is constructed for each of the 8 water quality monitoring sites based on the analysis of the space-time behaviour of the monthly total precipitation observed in a rain gauge network of 19 meteorological sites located in the area of the River Ave’s basin, between 1931 and 2009. A geostatistical approach with the ordinary Kriging method was chosen, with the main goal of identifying models that estimate the monthly average rainfall in a sub-basin associated with a water quality monitoring site where there are no observed values. Through stochastic interpolation (Kriging), the mean area rainfall during each month is estimated in the area of influence of each water quality monitoring site; this covariate will enter the model as a hydro-meteorological component, which is crucial in any water quality modelling process.
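To make the interpolation step more concrete, the following is a minimal sketch of ordinary kriging at a single point, with a mean area rainfall approximated by averaging kriged values over a few grid points. Everything specific here is an assumption made for illustration: the exponential semivariogram and its parameters, the gauge coordinates (in km), the rainfall values (in mm), and the function names `exp_variogram`, `ordinary_kriging` and `mean_area_rainfall`. It is not the variogram model or the procedure fitted by the authors.

```python
import numpy as np

def exp_variogram(h, sill=1.0, range_km=20.0):
    """Exponential semivariogram without nugget (assumed model and parameters)."""
    return sill * (1.0 - np.exp(-h / range_km))

def ordinary_kriging(coords, values, target, variogram=exp_variogram):
    """Return (estimate, weights) of ordinary kriging at one target point."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    n = len(values)
    # Gauge-to-gauge and gauge-to-target distances.
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    d0 = np.linalg.norm(coords - np.asarray(target, dtype=float), axis=1)
    # Kriging system: semivariances bordered by the unbiasedness constraint
    # (weights must sum to one), solved together with a Lagrange multiplier.
    a = np.zeros((n + 1, n + 1))
    a[:n, :n] = variogram(d)
    a[:n, n] = 1.0
    a[n, :n] = 1.0
    b = np.append(variogram(d0), 1.0)
    weights = np.linalg.solve(a, b)[:n]
    return float(weights @ values), weights

def mean_area_rainfall(coords, values, grid_points):
    """Approximate the mean area rainfall by averaging point estimates."""
    return float(np.mean([ordinary_kriging(coords, values, p)[0]
                          for p in grid_points]))

# Hypothetical rain gauges at the corners of a 10 km square, monthly totals in mm.
gauges = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
rain = [120.0, 95.0, 140.0, 110.0]

centre_estimate, centre_weights = ordinary_kriging(gauges, rain, (5.0, 5.0))
grid = [(x, y) for x in (2.5, 5.0, 7.5) for y in (2.5, 5.0, 7.5)]
area_mean = mean_area_rainfall(gauges, rain, grid)
print(centre_estimate, centre_weights, area_mean)
```

Because the weights are constrained to sum to one, the estimator is unbiased for an unknown constant mean, and with no nugget it reproduces the observed value at a gauge location. In a real application the semivariogram would be estimated from the gauge data rather than assumed.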
Finally, for each cluster, a linear state-space model was fitted to the DO concentration variable, taking into account the seasonal variation throughout the year and the estimated hydro-meteorological factor. The results demonstrate the effectiveness and advantages of modelling water quality variables according to this approach, which makes it possible to identify two different components: a seasonal factor and a hydro-meteorological factor.
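The following sketch illustrates how Kalman filter recursions produce one-step-ahead predictions when an observation is driven by a seasonal term and a hydro-meteorological covariate, each scaled by a time-varying calibration factor. It rests on several assumptions that are not taken from the paper: the two-column design matrix `H`, the stationary AR(1) evolution of the calibration factors around a mean `mu`, all parameter values, the synthetic data, and the function name `kalman_filter`. The parameters are treated as known here, whereas the paper estimates them with distribution-free estimators.

```python
import numpy as np

def kalman_filter(y, H, mu, Phi, Q, r):
    """One-step-ahead Kalman predictions for the assumed model
         y_t    = H[t] @ beta_t + e_t,                    Var(e_t)   = r
         beta_t = mu + Phi @ (beta_{t-1} - mu) + eps_t,   Var(eps_t) = Q
    Missing observations (NaN) skip the measurement update."""
    n, m = H.shape
    beta = mu.copy()               # predicted state, started at its mean (assumption)
    P = Q.copy()                   # predicted state covariance (simple assumed start)
    y_pred = np.empty(n)
    beta_filt = np.empty((n, m))
    for t in range(n):
        h = H[t]
        y_pred[t] = h @ beta       # prediction made before seeing y[t]
        if not np.isnan(y[t]):
            s = h @ P @ h + r      # innovation variance
            k = P @ h / s          # Kalman gain
            beta = beta + k * (y[t] - y_pred[t])
            P = P - np.outer(k, h @ P)
        beta_filt[t] = beta        # filtered calibration factors at time t
        beta = mu + Phi @ (beta - mu)      # time update to t + 1
        P = Phi @ P @ Phi.T + Q
    return y_pred, beta_filt

# Synthetic illustration: every number below is hypothetical.
rng = np.random.default_rng(0)
n = 140                                              # monthly time steps
month = np.arange(n) % 12
season = 9.0 + 1.5 * np.cos(2 * np.pi * month / 12)  # seasonal coefficients (mg/l)
hydro = rng.gamma(2.0, 1.0, size=n)                  # hydro-meteorological covariate
H = np.column_stack([season, hydro])
mu = np.array([1.0, 0.2])
Phi = np.diag([0.7, 0.5])
Q = np.diag([0.002, 0.01])
r = 0.25

beta_true = mu.copy()
y = np.empty(n)
for t in range(n):
    beta_true = mu + Phi @ (beta_true - mu) + rng.normal(0.0, np.sqrt(np.diag(Q)))
    y[t] = H[t] @ beta_true + rng.normal(0.0, np.sqrt(r))
y[[5, 40, 41]] = np.nan                              # a few missing months

y_pred, beta_filt = kalman_filter(y, H, mu, Phi, Q, r)
rmse = np.sqrt(np.nanmean((y - y_pred) ** 2))
print(f"one-step-ahead RMSE: {rmse:.3f}")
```

The two columns of `beta_filt` play the role of the seasonal and hydro-meteorological calibration factors. Skipping the update when an observation is missing is a standard way for state-space methods to handle gaps in monthly monitoring series.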
2 Data set description
The Regional Directory for the Northern Environment and Natural Resources (DRAN) and the Institute of Water (INAG) monitor surface water quality monthly along the River Ave and its main adjacent streams with a network of monitoring sites, measuring more than 23 variables to assess how river water quality is affected by industry, domestic wastewater, agriculture, and wastewater treatment plants. In total, eight water monitoring sites are considered in this study: five are located on the River Ave’s mainstream, namely Cantelães (CANT), Taipas (TAI), Riba d’Ave (RAV), Santo Tirso (STI), and Ponte Trofa (PTR), and three on the adjacent River Vizela, namely Golães (GOL), Ferro (FER), and Vizela Santo Adrião (VSA). These eight monitoring sites result from the restructuring of the water quality monitoring network in 1998, which implied the closure of other previous sites, and so the data set covers the period between May 1998 and December 2009. Table 1 summarizes basic statistics for the monthly measurements of the DO water quality variable at the 8 monitoring sites during this period.
Table 1 Minimum, maximum, mean, standard deviation (mg/l) and missing data rate of water quality variable DO concentration at the 8 monitoring sites in the River Ave’s basin
Monitoring site      CANT   TAI    RAV    STI    PTR    GOL    FER    VSA
Minimum              7.4    6.6    1.8    1.7    2.4    7.3    7.3    7.2
Maximum              12.8   11.72  11.7   12.0   11.7   11.7   11.7   12.4
Mean                 9.86   9.32   8.40   8.13   7.94   9.58   9.59   9.67
Standard deviation   1.06   1.13   1.82   2.16   1.92   1.05   1.08   1.13
Missing data rate    7.1%   8.6%   0.7%   1.4%   2.1%   8.6%   5.0%   8.6%
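As a small worked check of the missing data rates in Table 1, the sketch below counts the monthly time steps from May 1998 to December 2009 inclusive and shows which missing-value counts would reproduce the reported percentages. The counts in `inferred_missing` are inferred from the rounded rates and are not reported in the paper; the function names are illustrative.

```python
def months_inclusive(start_year, start_month, end_year, end_month):
    """Number of monthly time steps between two months, both included."""
    return (end_year - start_year) * 12 + (end_month - start_month) + 1

def missing_rate(n_missing, n_total):
    """Missing data rate in percent, rounded to one decimal place."""
    return round(100.0 * n_missing / n_total, 1)

n_months = months_inclusive(1998, 5, 2009, 12)

# Hypothetical counts of missing months, chosen to match the rates in Table 1.
inferred_missing = {"CANT": 10, "TAI": 12, "RAV": 1, "STI": 2,
                    "PTR": 3, "GOL": 12, "FER": 7, "VSA": 12}
rates = {site: missing_rate(k, n_months) for site, k in inferred_missing.items()}

print(n_months)
print(rates)
```

Under this reading, each series has 140 monthly time steps, and the sites with the highest missing data rate (8.6%) would be missing 12 monthly values.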
DO concentration is an important indicator since most aquatic fauna and flora need oxygen to survive. The river system both produces and consumes oxygen. If more oxygen is consumed than is produced, dissolved oxygen levels decline and some sensitive animals and plants may disappear. DO is measured in milligrams per liter (mg/l), the mass of oxygen in a liter of water, which for dilute aqueous solutions is approximately the same as "parts per million" (ppm). Dissolved oxygen concentration is probably the most important factor in assessing the health of a water body, but other factors outside the water manager’s direct control also determine a water body’s health to a variable extent. Organic pollution is the most common type of pollution in this basin and, consequently, a deficit in DO concentration is a frequent problem. This problem is aggravated by the existence of a sequence of small dams on the River Ave and its main adjacent rivers (Costa and Gonçalves 2011).
3 Cluster analysis
Taking into account previous works on hydrological river basins (Shrestha and Kazama 2007; Costa and Gonçalves 2011), a cluster analysis (CA) was performed to group monitoring sites with similar water quality characteristics over time, based on the DO concentration levels. Furthermore, this type of analysis reduces the number of models required in the modelling process.
CA is a group of multivariate techniques whose primary purpose is to assemble objects based on their characteristics. Hierarchical agglomerative clustering is the most common approach, providing intuitive similarity relationships between any given sample and the entire data set, and is typically illustrated by a dendrogram (McKenna 2003).
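A minimal sketch of hierarchical agglomerative clustering with Ward's criterion is given below. At each step it merges the pair of clusters whose union gives the smallest increase in the within-cluster sum of squares, which for Euclidean data equals n_a n_b / (n_a + n_b) times the squared distance between the two centroids. As a stand-in for the monthly DO series, which are not reproduced here, the example uses only the mean and standard deviation of each site from Table 1 as features; that choice, and the function name `ward_clustering`, are assumptions made for illustration, so the output is not the dendrogram of the paper.

```python
import numpy as np

def ward_clustering(X, n_clusters):
    """Agglomerative clustering of the rows of X with Ward's criterion.
    Returns a list of clusters, each a sorted list of row indices."""
    X = np.asarray(X, dtype=float)
    clusters = {i: [i] for i in range(len(X))}
    while len(clusters) > n_clusters:
        best = None
        keys = list(clusters)
        for i, a in enumerate(keys):
            for b in keys[i + 1:]:
                na, nb = len(clusters[a]), len(clusters[b])
                ca = X[clusters[a]].mean(axis=0)
                cb = X[clusters[b]].mean(axis=0)
                # Increase in within-cluster sum of squares if a and b merge.
                increase = na * nb / (na + nb) * np.sum((ca - cb) ** 2)
                if best is None or increase < best[0]:
                    best = (increase, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters.pop(b)
    return [sorted(members) for members in clusters.values()]

# Mean and standard deviation of DO (mg/l) per site, taken from Table 1.
sites = ["CANT", "TAI", "RAV", "STI", "PTR", "GOL", "FER", "VSA"]
features = np.array([
    [9.86, 1.06], [9.32, 1.13], [8.40, 1.82], [8.13, 2.16],
    [7.94, 1.92], [9.58, 1.05], [9.59, 1.08], [9.67, 1.13],
])

groups = ward_clustering(features, n_clusters=2)
for g in groups:
    print([sites[i] for i in g])
```

With these toy features the two groups are CANT, TAI, GOL, FER, VSA on one side and RAV, STI, PTR on the other, that is, the sites with higher and less variable DO are separated from those with lower and more variable DO. A full analysis would cluster the complete monthly series and inspect the dendrogram before choosing the number of clusters.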
In this study, hierarchical agglomerative CA was performed on the raw data
set by means of Ward’s method. Ward’s method uses a variance approach to
evaluate the distances between clusters, in an attempt to minimize the sum of
squares (SS) of any two clusters that can be formed at each step. As these types
of algorithms operate on dissimilarities, our first task is to build a dissimilarity
matrix based on some measure of dissimilarity that can be applied to any two

References
Book
30 Mar 1990
TL;DR: In this article, the Kalman filter and state space models were used for univariate structural time series models to estimate, predict, and smoothen the univariate time series model.
Abstract: List of figures Acknowledgement Preface Notation and conventions List of abbreviations 1. Introduction 2. Univariate time series models 3. State space models and the Kalman filter 4. Estimation, prediction and smoothing for univariate structural time series models 5. Testing and model selection 6. Extensions of the univariate model 7. Explanatory variables 8. Multivariate models 9. Continuous time Appendices Selected answers to exercises References Author index Subject index.

5,071 citations

Posted Content
TL;DR: In this paper, the authors provide a unified and comprehensive theory of structural time series models, including a detailed treatment of the Kalman filter for modeling economic and social time series, and address the special problems which the treatment of such series poses.
Abstract: In this book, Andrew Harvey sets out to provide a unified and comprehensive theory of structural time series models. Unlike the traditional ARIMA models, structural time series models consist explicitly of unobserved components, such as trends and seasonals, which have a direct interpretation. As a result the model selection methodology associated with structural models is much closer to econometric methodology. The link with econometrics is made even closer by the natural way in which the models can be extended to include explanatory variables and to cope with multivariate time series. From the technical point of view, state space models and the Kalman filter play a key role in the statistical treatment of structural time series models. The book includes a detailed treatment of the Kalman filter. This technique was originally developed in control engineering, but is becoming increasingly important in fields such as economics and operations research. This book is concerned primarily with modelling economic and social time series, and with addressing the special problems which the treatment of such series poses. The properties of the models and the methodological techniques used to select them are illustrated with various applications. These range from the modelling of trends and cycles in US macroeconomic time series to an evaluation of the effects of seat belt legislation in the UK.

4,252 citations


"Predicting seasonal and hydro-meteo..." refers methods in this paper

  • ...In many applications, the state-space models parameters are estimated by maximum Gaussian likelihood via the Newton-Raphson method (Harvey 1996) or, more often, by the EM algorithm (Shumway and Stoffer 1982)....

    [...]

  • ...As states are unobservable variables, their predictions are obtained by means of the Kalman filter algorithm (Harvey 1996)....

    [...]


Journal ArticleDOI
TL;DR: In this article, the author summarizes the principles of geostatistics, presented as a new science for estimating ore grades and ore reserves together with the errors of those estimates.
Abstract: Knowledge of ore grades and ore reserves as well as error estimation of these values, is fundamental for mining engineers and mining geologists. Until now no appropriate scientific approach to those estimation problems has existed: geostatistics, the principles of which are summarized in this paper, constitutes a new science leading to such an approach. The author criticizes classical statistical methods still in use, and shows some of the main results given by geostatistics. Any ore deposit evaluation as well as proper decision of starting mining operations should be preceded by a geostatistical investigation which may avoid economic failures.

4,203 citations


"Predicting seasonal and hydro-meteo..." refers methods in this paper


  • ...The method used for the calculation of the empirical semivariogram is the method of moments (Matheron 1963), modified for a random space-time process {Z(s, t) : s ∈ ℝ², t = 1, ..., T}....

    [...]

Journal ArticleDOI
TL;DR: In this article, an approach to smoothing and forecasting for time series with missing observations is proposed, where the EM algorithm is used in conjunction with the conventional Kalman smoothed estimators to derive a simple recursive procedure for estimating the parameters.
Abstract: An approach to smoothing and forecasting for time series with missing observations is proposed. For an underlying state-space model, the EM algorithm is used in conjunction with the conventional Kalman smoothed estimators to derive a simple recursive procedure for estimating the parameters by maximum likelihood. An example is given which involves smoothing and forecasting an economic series using the maximum likelihood estimators for the parameters.

1,513 citations


"Predicting seasonal and hydro-meteo..." refers methods in this paper

  • ...In many applications, the state-space models parameters are estimated by maximum Gaussian likelihood via the Newton-Raphson method (Harvey 1996) or, more often, by the EM algorithm (Shumway and Stoffer 1982)....

    [...]


Journal ArticleDOI
TL;DR: This study illustrates the usefulness of multivariate statistical techniques for analysis and interpretation of complex data sets, and in water quality assessment, identification of pollution sources/factors and understanding temporal/spatial variations in water quality for effective river water quality management.
Abstract: Multivariate statistical techniques, such as cluster analysis (CA), principal component analysis (PCA), factor analysis (FA) and discriminant analysis (DA), were applied for the evaluation of temporal/spatial variations and the interpretation of a large complex water quality data set of the Fuji river basin, generated during 8 years (1995–2002) monitoring of 12 parameters at 13 different sites (14 976 observations). Hierarchical cluster analysis grouped 13 sampling sites into three clusters, i.e., relatively less polluted (LP), medium polluted (MP) and highly polluted (HP) sites, based on the similarity of water quality characteristics. Factor analysis/principal component analysis, applied to the data sets of the three different groups obtained from cluster analysis, resulted in five, five and three latent factors explaining 73.18, 77.61 and 65.39% of the total variance in water quality data sets of LP, MP and HP areas, respectively. The varifactors obtained from factor analysis indicate that the parameters responsible for water quality variations are mainly related to discharge and temperature (natural), organic pollution (point source: domestic wastewater) in relatively less polluted areas; organic pollution (point source: domestic wastewater) and nutrients (non-point sources: agriculture and orchard plantations) in medium polluted areas; and organic pollution and nutrients (point sources: domestic wastewater, wastewater treatment plants and industries) in highly polluted areas in the basin. Discriminant analysis gave the best results for both spatial and temporal analysis. 
It provided an important data reduction as it uses only six parameters (discharge, temperature, dissolved oxygen, biochemical oxygen demand, electrical conductivity and nitrate nitrogen), affording more than 85% correct assignations in temporal analysis, and seven parameters (discharge, temperature, biochemical oxygen demand, pH, electrical conductivity, nitrate nitrogen and ammoniacal nitrogen), affording more than 81% correct assignations in spatial analysis, of three different sampling sites of the basin. Therefore, DA allowed a reduction in the dimensionality of the large data set, delineating a few indicator parameters responsible for large variations in water quality. Thus, this study illustrates the usefulness of multivariate statistical techniques for analysis and interpretation of complex data sets, and in water quality assessment, identification of pollution sources/factors and understanding temporal/spatial variations in water quality for effective river water quality management.

1,481 citations


"Predicting seasonal and hydro-meteo..." refers background or methods in this paper

  • ...Multivariate statistical analysis has been widely applied in water quality assessment and sources apportionment of water over the last years (Wunderlin et al. 2001; Simeonov et al. 2003; Shrestha and Kazama 2007)....

    [...]

  • ...A river is a system comprising both the main course and its tributaries, carrying the one-way flow of a significant load of matter in dissolved and particulate phases from both natural and anthropogenic sources (Shrestha and Kazama 2007)....

    [...]

Frequently Asked Questions (18)
Q1. What is the important variable in the assessment of river water quality?

Adequate dissolved oxygen is necessary for good water quality and it is one of the most important variables in the assessment of river water quality and pollution grade. 

This study focuses on the potential improvement of environmental variables modelling by using linear state-space models, as an improvement of the linear regression model, and by incorporating a constructed hydro-meteorological covariate. This idea is illustrated based on a rather extended data set relative to the River Ave basin (Portugal) that consists mainly of monthly measurements of dissolved oxygen concentration (DO) in a network of water quality monitoring sites. 

Assuming that the parameters of a state-space model are known, the Kalman filter recursions give the best linear predictors for filtering, forecasting, and smoothing the vector of states. 
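
To make the recursions concrete, here is a minimal sketch of the filtering step for a scalar state. It is not the paper's model: the observation equation y_t = h_t β_t + e_t, the state equation β_t = φ β_{t-1} + ε_t, and all names (`kalman_filter`, `phi`, `q`, `r`, `beta0`, `p0`) are assumptions chosen for illustration, with the parameters taken as known.

```python
def kalman_filter(y, h, phi, q, r, beta0, p0):
    """Scalar Kalman filter (illustrative model, not the paper's):
         y_t    = h_t * beta_t + e_t,        Var(e_t)   = r
         beta_t = phi * beta_{t-1} + eps_t,  Var(eps_t) = q
    y may contain None for missing observations.
    Returns the filtered state means and their variances."""
    beta, p = beta0, p0
    means, variances = [], []
    for y_t, h_t in zip(y, h):
        # Prediction step: propagate the state and its uncertainty.
        beta_pred = phi * beta
        p_pred = phi * p * phi + q
        if y_t is None:
            # No observation: the prediction is the best available estimate.
            beta, p = beta_pred, p_pred
        else:
            # Update step: correct the prediction with the innovation.
            innovation = y_t - h_t * beta_pred
            s = h_t * p_pred * h_t + r      # innovation variance
            gain = p_pred * h_t / s         # Kalman gain
            beta = beta_pred + gain * innovation
            p = (1.0 - gain * h_t) * p_pred
        means.append(beta)
        variances.append(p)
    return means, variances
```

Here `h` would hold the covariate values (for instance a hydro-meteorological factor) and the filtered `means` play the role of the time-varying coefficient; in a multivariate setting the same steps apply with matrices in place of scalars.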

The main advantage of state-space models is that, through the Kalman filter recursions, they yield more accurate filtered predictions than the usual linear models. 

In several works, multivariate statistical analyses are applied to sets of water quality variables, usually quantitative analytical data consisting of physico-chemical variables. 

Hierarchical agglomerative clustering is the most common approach, providing intuitive similarity relationships between any given sample and the entire data set, and is typically illustrated by a dendrogram (McKenna 2003). 

The method used for the calculation of the empirical semivariogram is the method of moments (Matheron 1963), modified for a random space-time process {Z(s, t) : s ∈ ℝ², t = 1, ..., T}. 
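
As an illustration, the method-of-moments estimator averages half the squared differences between pairs of sites that fall in the same distance class. The sketch below pools the squared differences over all time points, which is one plausible reading of the space-time modification; the exact estimator used in the paper is not shown in this excerpt, and the function name, argument layout and binning rule are assumptions.

```python
import math
from collections import defaultdict

def empirical_semivariogram(coords, values, bin_width):
    """Method-of-moments semivariogram, pooled over time.

    coords:    list of (x, y) site coordinates
    values:    values[t][i] is the observation at time t and site i
    bin_width: width of each distance class
    Returns {k: semivariance}, where class k covers distances
    in [k * bin_width, (k + 1) * bin_width)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):
            k = int(math.dist(coords[i], coords[j]) // bin_width)
            for row in values:
                # Accumulate squared differences for this pair at each time.
                sums[k] += (row[i] - row[j]) ** 2
                counts[k] += 1
    # Half the mean squared difference in each distance class.
    return {k: sums[k] / (2.0 * counts[k]) for k in sorted(sums)}
```

A semivariogram model (Gaussian, exponential, and so on) would then be fitted to these class-wise values.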

The state noise covariance matrix is based on the relation Σβ = ΦΣβΦ′ + Σε, which is valid for a stationary VAR(1) process, where Σβ is the covariance matrix of the vector of states. 
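
For a stationary VAR(1) process the state covariance satisfies Σβ = Φ Σβ Φ′ + Σε, so the state noise covariance can be recovered as Σε = Σβ − Φ Σβ Φ′ once Φ and Σβ have been estimated. The sketch below only performs that subtraction for small matrices stored as nested lists; the helper names are hypothetical, and how Φ and Σβ are estimated is outside its scope.

```python
def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    """Transpose of a matrix given as nested lists."""
    return [list(col) for col in zip(*a)]

def state_noise_covariance(phi, sigma_beta):
    """Sigma_eps = Sigma_beta - Phi Sigma_beta Phi' (stationary VAR(1))."""
    propagated = matmul(matmul(phi, sigma_beta), transpose(phi))
    n = len(sigma_beta)
    return [[sigma_beta[i][j] - propagated[i][j] for j in range(n)]
            for i in range(n)]
```

With estimated inputs the result is not guaranteed to be positive semi-definite, so in practice it is worth checking before it is passed to the Kalman filter.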

A_t β̂_{t|t} because one of the contributions of the proposed model is its ability to separate a structural component that accommodates a global behaviour (such as the seasonality) from another component associated with hydro-meteorological conditions, represented by the hydro-meteorological covariate, which must be filtered in order to obtain the best linear predictions. 

Dissolved oxygen concentration is probably the most important factor in assessing the health of a water body, but other factors outside the water managers' direct control also determine a water body’s health to a variable extent. 

The proposed methodology starts by using a multivariate statistical approach, cluster analysis, to classify the water quality monitoring sites into homogeneous space-time groups based on the DO quality variable, which was selected as relevant for characterizing the water quality. 

The expected mean value of the time-varying coefficient of the hydro-meteorological factor varies around zero in Cluster II and around −0.73 in Cluster I. 

Many hydrological and ecological studies recognize the importance of characterizing the time-space variability of precipitation in a geographical area (Goodrich et al. 1995), because it is essential for estimating the hydrological balance. 

The semivariogram model that performed best was the Gaussian model with a nugget effect, for June in particular, as shown in Table 3. 

A hydro-meteorological factor is constructed for each quality monitoring site (8 sites in total) based on the analysis of the space-time behaviour of the monthly total precipitation observed in a rain gauge network of 19 meteorological sites located in the area of the River Ave basin, between 1931 and 2009. 

One of the problems faced by meteorologists and hydrologists that study spatial rainfall patterns is the interpolation of data from irregularly spaced rain gauges in order to determine mean area rainfalls or to characterize rainfall variability within a region or catchment (Dirks et al. 1998; Ciach and Krajewski 2006). 

It has a cophenetic correlation coefficient of 0.85 (i.e., the correlation between the actual dissimilarities as recorded in the original dissimilarity matrix, and the dissimilarities which can be found in the dendrogram). 

The estimated semivariogram model to describe the spatial continuity of the process in June is postulated in Eq. (2):

γ_Z(h) = 0, for h = 0;
γ_Z(h) = 440.537 + 96.5 (1 − exp(−(‖h‖ / 2770.083)²)), for h ≠ 0. (2)

In order to assess the quality of the semivariogram fitting, the authors performed a cross-validation procedure as follows: the authors selected a given rain gauge monitoring site at, say, s0; based on data from the other 18 sites, the authors fitted new semivariograms and applied ordinary (point) kriging to obtain point estimates of Zt(s0) across time; and finally the authors evaluated the corresponding residuals (differences between estimated and true values of Zt(s0)).
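
The fitted model can be evaluated directly. The sketch below reads Eq. (2) as a Gaussian semivariogram with nugget 440.537 and partial sill 96.5, and takes 2770.083 as the range parameter. That last reading is an assumption, because the fraction is flattened in the extracted text; the constant and function names are hypothetical too.

```python
import math

NUGGET = 440.537        # discontinuity at the origin, from Eq. (2)
PARTIAL_SILL = 96.5     # sill minus nugget, from Eq. (2)
RANGE_PARAM = 2770.083  # assumed reading of the flattened denominator

def gaussian_semivariogram(h, nugget=NUGGET, partial_sill=PARTIAL_SILL,
                           range_param=RANGE_PARAM):
    """Gaussian semivariogram with nugget effect at distance h >= 0."""
    if h == 0:
        return 0.0
    return nugget + partial_sill * (1.0 - math.exp(-((h / range_param) ** 2)))
```

Under this reading the semivariogram jumps to about 440.5 just beyond the origin and levels off near 440.537 + 96.5 = 537.037 at large distances.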