
Showing papers in "Environmetrics in 1998"


Journal ArticleDOI
TL;DR: In this article, a new trophic index (TRIX) based on chlorophyll, oxygen saturation, mineral and total nitrogen and phosphorus, and applicable to coastal marine waters, is proposed.
Abstract: In pursuing earlier attempts to characterize the trophic state of inland waters, a new trophic index (TRIX) based on chlorophyll, oxygen saturation, mineral and total nitrogen and phosphorus, and applicable to coastal marine waters, is proposed. Numerically, the index is scaled from 0 to 10, covering a wide range of trophic conditions from oligotrophy to eutrophy. Secchi disk transparency combined with chlorophyll defines, in turn, a turbidity index (TRBIX) that serves as a complementary water quality index. The two indices are combined in a general water quality index (GWQI). Statistical properties and application of these indices to specific situations are discussed on examples pertaining to the NW Adriatic Sea. It is believed that these indices will simplify comparisons between different spatial and temporal trophic situations of marine coastal waters and make them more consistent. © 1998 John Wiley & Sons, Ltd.

474 citations
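As a rough illustration of how such an index can be computed, the sketch below evaluates a TRIX-style score in Python from the four components named in the abstract. The log-scaling constants (1.5 and 1.2) follow the commonly cited TRIX formulation and are assumptions here, not values taken from the paper.

```python
import math

def trix(chl_a, oxygen_sat_pct, din, total_p):
    """Sketch of a TRIX-style trophic index on a 0-10 scale.

    chl_a          : chlorophyll-a concentration (ug/L)
    oxygen_sat_pct : dissolved-oxygen saturation (%)
    din            : dissolved inorganic (mineral) nitrogen (ug/L)
    total_p        : total phosphorus (ug/L)

    The constants 1.5 and 1.2 follow the commonly cited TRIX formulation;
    treat them as assumptions, not the paper's exact values.
    """
    ad_o2 = abs(100.0 - oxygen_sat_pct)      # deviation of O2 saturation from 100%
    product = chl_a * ad_o2 * din * total_p
    return (math.log10(product) + 1.5) / 1.2

print(round(trix(chl_a=2.0, oxygen_sat_pct=85.0, din=80.0, total_p=25.0), 2))
```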


Journal ArticleDOI
TL;DR: In this paper, the authors consider methods for spatial modelling and prediction using different distance metrics, and under fixed boundary conditions, and illustrate using data from Charleston Harbor, an estuary on the coast of South Carolina, USA.
Abstract: Estuaries are among the earth's most valuable and productive environmental resources. To further our understanding of the impact of human activities on estuaries, there is a need for appropriate statistical methods for analyzing estuarine data. Estuaries possess a number of features that must be considered during spatial data analyses. Estuaries are irregularly shaped non-convex regions. Therefore, Euclidean distance may not be an appropriate distance metric for spatial analyses of estuaries, especially if the line segment connecting two sites intercepts land. Furthermore, some environmental variables may take deterministic values at estuarine boundaries. For example, shorelines are saturated with dissolved oxygen, and the salinity at estuarine mouths should be close to that of the ocean. This paper considers methods for spatial modelling and prediction using different distance metrics, and under fixed boundary conditions. These methods are illustrated using data from Charleston Harbor, an estuary on the coast of South Carolina, USA.

91 citations
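The point about Euclidean distance being inappropriate when the straight line between two sites crosses land can be illustrated with a small sketch: an "in-water" distance computed by breadth-first search over a toy land/water grid, compared with the straight-line distance. The grid and sites are invented for illustration; the paper's actual metrics and boundary-condition models are not reproduced here.

```python
import numpy as np
from collections import deque

# 0 = water, 1 = land; a toy "estuary" grid (an assumption for illustration)
grid = np.array([
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
])

def water_distance(grid, start, end):
    """Shortest in-water path length (4-neighbour steps) between two cells."""
    rows, cols = grid.shape
    dist = {start: 0}
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        if (r, c) == end:
            return dist[(r, c)]
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr, nc] == 0 \
                    and (nr, nc) not in dist:
                dist[(nr, nc)] = dist[(r, c)] + 1
                queue.append((nr, nc))
    return np.inf

a, b = (1, 0), (1, 4)                      # two sites on opposite shores
euclid = np.hypot(a[0] - b[0], a[1] - b[1])
print("Euclidean:", euclid, " in-water:", water_distance(grid, a, b))
```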


Journal ArticleDOI
TL;DR: In this paper, three approaches to modelling spatial data in which simulation plays a vital role are described and illustrated with examples: flexible regression surfaces checked with the bootstrap, an autologistic model for binary survey data fitted via the Gibbs sampler, and a bootstrap-based approach to spatio-temporal data.
Abstract: Three approaches to modelling spatial data in which simulation plays a vital role are described and illustrated with examples. The first approach uses flexible regression models, such as generalized additive models, together with locational covariates to fit a surface to spatial data. We show how the bootstrap can be used to quantify the effects of model selection uncertainty and to avoid oversmoothing. The second approach, which is appropriate for binary data, allows for local spatial correlation by the inclusion in a logistic regression model of a covariate derived from neighbouring values of the response variable. The resulting autologistic model can be fitted to survey data obtained from a random sample of sites by incorporating the Gibbs sampler into the modelling procedure. We show how this modelling strategy can be used not only to fit the autologistic model to sites included in the survey, but also to estimate the probability that a certain species is present in the unsurveyed sites. Our third approach relates to the analysis of spatio-temporal data. Here we model the distribution of a plant or animal species as a function of the distribution at an earlier time point. The bootstrap is used to estimate parameters and quantify their precision. © 1998 John Wiley & Sons, Ltd.

70 citations
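A minimal sketch of the second ingredient, a Gibbs sampler for an autologistic model on a lattice, is given below. The parameter values, lattice size and four-neighbour structure are illustrative assumptions; in the paper's setting, realisations like these would be used to estimate presence probabilities at unsurveyed sites.

```python
import numpy as np

rng = np.random.default_rng(0)

def gibbs_autologistic(shape=(20, 20), beta0=-0.5, beta1=1.0, sweeps=200):
    """Gibbs sampler for a simple autologistic model on a lattice.

    P(y_ij = 1 | neighbours) = logistic(beta0 + beta1 * sum of the 4 neighbours).
    beta0, beta1 and the lattice size are illustrative assumptions.
    """
    y = rng.integers(0, 2, size=shape)
    rows, cols = shape
    for _ in range(sweeps):
        for i in range(rows):
            for j in range(cols):
                nbr = sum(y[i + di, j + dj]
                          for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1))
                          if 0 <= i + di < rows and 0 <= j + dj < cols)
                p = 1.0 / (1.0 + np.exp(-(beta0 + beta1 * nbr)))
                y[i, j] = rng.random() < p
        # after burn-in, realisations of y can be averaged to approximate
        # presence probabilities at unsurveyed sites
    return y

field = gibbs_autologistic()
print("simulated presence proportion:", field.mean())
```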


Journal ArticleDOI
K. C. Johnson, Yang Mao, J. Argo, S. Dubois, R. Semenciw, J. Lava
TL;DR: The rationale and basic design of the National Enhanced Cancer Surveillance System, which is designed to facilitate timely, systematic evaluation of environment-cancer concerns and strengthen cancer surveillance in Canada, are described.
Abstract: This paper describes the rationale and basic design of the National Enhanced Cancer Surveillance System. The system is designed to facilitate timely, systematic evaluation of environment-cancer concerns and strengthen cancer surveillance in Canada. There are three key activities: (1) National community-level environmental quality database: The database, created by Health Canada, includes systematic, easily accessible, community-level information on air and water quality. Information includes a national inventory of municipal waste disposal sites, municipal drinking water data, air quality data, historic industrial location and productivity data. (2) Case-control surveillance: The Provincial Cancer Registries are collecting individual data from a large, Canada-wide, population-based series of newly-diagnosed cancer cases for 18 types of cancer and a population control group. Mailed questionnaires and telephone follow-up are used to gather data on residential and occupational histories, diet, physical activity and other risk factors for cancer. Data for over 20,000 cancer cases and 5000 controls have been assembled between 1994 and 1997. The cancer and population control data are being linked to the environmental database to facilitate systematic, community level, case-control assessment of cancer risk related to air and water quality. (3) Geographic surveillance network: Areas of high and/or unusual patterns of cancer incidence are being examined through temporal and spatial mapping, cluster analysis and risk factor evaluation. © 1998 John Wiley & Sons, Ltd.

67 citations


Journal ArticleDOI
TL;DR: In this article, a simulation study has been carried out to compare the results from using different randomization methods to assess the significance of the F-statistics for factor effects with analysis of variance.
Abstract: A simulation study has been carried out to compare the results from using different randomization methods to assess the significance of the F-statistics for factor effects with analysis of variance. Two-way and three-way designs with and without replication were considered, with the randomization of observations, the restricted randomization of observations, and the randomization of different types of residuals. Data from normal, uniform, exponential, and an empirical distribution were considered. It was found that usually all methods of randomization gave similar results, as did the use of the usual F-distribution tables, and that no method of analysis was clearly superior to the others under all conditions. © 1998 John Wiley & Sons, Ltd. Randomization methods for testing hypotheses are useful in environmental areas because their justification is easy to understand by government officials and the public at large, and they are applicable with the non-normal distributions that often occur. However, as soon as the required analysis becomes more than something as simple as the comparison of the mean values of several samples the appropriate randomization procedure becomes questionable. There has been controversy with several types of randomization analysis but in this paper we only consider simple factorial designs with analysis of variance. It was in this area that Crowley (1992) in his review of resampling methods remarked that 'contrasting views on interaction terms in factorial ANOVA ... needs to be resolved'. In discussing the merits of randomization tests we start with one important premise. This is that the major value of these tests is in situations where the data are grossly non-normal (e.g. with several extremely large values and many tied values), and sample sizes are not large. It is in these situations that more conventional methods are of questionable validity and the results of a randomization test may carry more weight. This premise is important because it tells us that our main concern should be with the performance of alternative methods on grossly non-normal data. Theoretical or simulation studies suggesting that one method is somewhat better than another with normally distributed data are not necessarily of much relevance.

56 citations
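The basic recipe being compared, randomizing observations and recomputing the F statistic to build a reference distribution, can be sketched as follows for a one-way layout with skewed data. The paper studies two- and three-way designs and several randomization schemes (restricted randomization, randomization of residuals), which this simplified example does not cover.

```python
import numpy as np

rng = np.random.default_rng(1)

def f_oneway(groups):
    """F statistic for a one-way layout, computed from sums of squares."""
    all_obs = np.concatenate(groups)
    grand = all_obs.mean()
    ss_between = sum(len(g) * (g.mean() - grand) ** 2 for g in groups)
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    df_b, df_w = len(groups) - 1, len(all_obs) - len(groups)
    return (ss_between / df_b) / (ss_within / df_w)

# skewed, non-normal toy data (an assumption for illustration)
groups = [rng.exponential(scale=s, size=10) for s in (1.0, 1.0, 2.5)]
f_obs = f_oneway(groups)

sizes = [len(g) for g in groups]
pooled = np.concatenate(groups)
f_null = []
for _ in range(4999):                       # randomize observations, recompute F
    perm = rng.permutation(pooled)
    splits = np.split(perm, np.cumsum(sizes)[:-1])
    f_null.append(f_oneway(splits))

p_value = (1 + sum(f >= f_obs for f in f_null)) / (1 + len(f_null))
print(f"F = {f_obs:.2f}, randomization p = {p_value:.4f}")
```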


Journal ArticleDOI
TL;DR: A blind test of the ability of a feed-forward artificial neural network to provide out-of-sample forecasting of rainfall run-off using real data and the extent to which the system was found to be non-linear is quantified.
Abstract: This paper presents the results of a blind test of the ability of a feed-forward artificial neural network to provide out-of-sample forecasting of rainfall run-off using real data. The results obtained are comparable with those obtained using the best methods currently available. The focus of the paper has been an easily repeatable experiment applied to rainfall and run-off data for a catchment area whose identity was not revealed to the experimenters, i.e. a blind experiment. To this end, a simple model has been specified, and the architecture of the neural network and the data preparation procedures adopted are discussed in detail. The results are presented and discussed in detail and the extent to which the system was found to be non-linear is quantified. © 1998 John Wiley & Sons, Ltd.

43 citations
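A minimal sketch of this kind of setup, a feed-forward network predicting run-off from lagged rainfall and run-off, is shown below using scikit-learn and synthetic data (the real catchment data were not published with the abstract). The lag structure, network size and train/test split are assumptions for illustration only.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(2)

# synthetic rainfall and run-off series (real catchment data would replace
# these); run-off responds to recent rainfall with noise
n = 600
rain = rng.gamma(shape=0.6, scale=5.0, size=n)
runoff = 0.4 * rain + 0.3 * np.roll(rain, 1) + 0.2 * np.roll(rain, 2) \
         + rng.normal(0, 0.5, size=n)

# predictors: rainfall at lags 0..2 and run-off at lag 1 (first rows dropped
# to avoid wrap-around from np.roll)
X = np.column_stack([rain, np.roll(rain, 1), np.roll(rain, 2),
                     np.roll(runoff, 1)])[3:]
y = runoff[3:]

split = int(0.8 * len(y))                      # out-of-sample block at the end
net = MLPRegressor(hidden_layer_sizes=(8,), max_iter=5000, random_state=0)
net.fit(X[:split], y[:split])
print("out-of-sample R^2:", round(net.score(X[split:], y[split:]), 3))
```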


Journal ArticleDOI
TL;DR: In this paper, the median ranked set sampling (MRSS) was used to estimate the population mean of a variable of interest when ranking is based on a concomitant variable.
Abstract: Ranked set sampling (RSS), as suggested by McIntyre (1952), assumes perfect ranking, i.e. without errors in ranking, but for most practical applications it is not easy to rank the units without errors in ranking. As pointed out by Dell and Clutter (1972), there will be a loss in precision due to the errors in ranking the units. To reduce the errors in ranking, Muttlak (1997) suggested using median ranked set sampling (MRSS). In this study, the MRSS is used to estimate the population mean of a variable of interest when ranking is based on a concomitant variable. The regression estimator uses an auxiliary variable to estimate the population mean of the variable of interest. When one compares the performance of the MRSS estimator to the RSS and regression estimators, it turns out that the use of MRSS is more efficient, i.e. gives results with smaller variance than RSS, for all the cases considered. The use of MRSS also gives much better results in terms of relative precision compared to the regression estimator for most cases considered in this study, unless the correlation between the variable of interest and the auxiliary variable is more than 90 per cent. © 1998 John Wiley & Sons, Ltd.

38 citations
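The selection step of MRSS with ranking on a concomitant variable can be sketched as follows: within each randomly drawn set, units are ranked on the concomitant x and only the median-ranked unit is measured on y. The population, set size and number of cycles are illustrative assumptions, and the sketch assumes an odd set size.

```python
import numpy as np

rng = np.random.default_rng(3)

def mrss_mean(y, x, set_size=5, n_cycles=20):
    """Median ranked set sampling estimate of mean(y), ranking on a
    concomitant variable x (set_size assumed odd in this sketch)."""
    measured = []
    for _ in range(n_cycles):
        for _ in range(set_size):
            idx = rng.choice(len(y), size=set_size, replace=False)
            order = idx[np.argsort(x[idx])]             # rank the set on x, not y
            measured.append(y[order[set_size // 2]])    # keep the median-ranked unit
    return np.mean(measured)

# population where x is a noisy concomitant of y (illustrative assumption)
N = 10_000
x = rng.normal(size=N)
y = 10 + 2 * x + rng.normal(scale=0.8, size=N)

print("true mean:", round(y.mean(), 3), " MRSS estimate:", round(mrss_mean(y, x), 3))
```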


Journal ArticleDOI
TL;DR: In this paper, a model for the "in control" process of one species, vendace (Coregonus albula), is constructed and used for univariate monitoring, where a set of five economically interesting species serve as bioindicators for the lake.
Abstract: Statistical surveillance comprises methods for repeated analysis of stochastic processes, aiming to detect a change in the underlying distribution. Such methods are widely used for industrial, medical, economic and other applications. By applying these general methods to data collected for environmetrical purposes, it might be possible to detect important changes fast and reliably. We exemplify the use of statistical surveillance on a data set of fish catches in Lake Mälaren, Sweden, 1964–93. A model for the ‘in control’ process of one species, vendace (Coregonus albula), is constructed and used for univariate monitoring. Further, we demonstrate the application of Hotelling's T2 and the Shannon–Wiener index for monitoring biodiversity, where a set of five economically interesting species serve as bioindicators for the lake. © 1998 John Wiley & Sons, Ltd.

35 citations
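A sketch of the two multivariate monitoring tools mentioned, the Shannon–Wiener diversity index and Hotelling's T², applied to hypothetical catch data for five species is given below; the "in control" period, species means and the out-of-control scenario are invented for illustration.

```python
import numpy as np

def shannon_wiener(counts):
    """Shannon-Wiener diversity index H' = -sum p_i * ln p_i."""
    p = np.asarray(counts, dtype=float)
    p = p[p > 0] / p.sum()
    return -(p * np.log(p)).sum()

def hotelling_t2(x, mean_ic, cov_ic):
    """Hotelling T^2 of a new multivariate observation against
    the 'in control' mean vector and covariance matrix."""
    d = np.asarray(x, dtype=float) - mean_ic
    return float(d @ np.linalg.solve(cov_ic, d))

# illustrative catches (kg) for five species over in-control years (assumed data)
rng = np.random.default_rng(4)
in_control = rng.normal(loc=[50, 30, 20, 15, 10], scale=5, size=(25, 5))
mean_ic, cov_ic = in_control.mean(axis=0), np.cov(in_control, rowvar=False)

new_year = np.array([48, 12, 22, 16, 11])          # hypothetical collapse of one species
print("H' =", round(shannon_wiener(new_year), 3),
      " T2 =", round(hotelling_t2(new_year, mean_ic, cov_ic), 1))
```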


Journal ArticleDOI
TL;DR: In this article, a spatial extension of existing univariate methods is presented that integrates information from the data sites and exploits knowledge of the spatial variation of the tidal and surge constituents of the sea level along a coastline, to produce estimates at any coastal location.
Abstract: The problem of estimating the probability of extreme sea-levels along a coastline has received little attention. Most of the existing analyses are univariate approaches that are applied independently to data from individual sites. We present a spatial extension of the methods that integrates information from the data sites and exploits knowledge of the spatial variation of the tidal and surge constituents of the sea-level along a coastline, to produce estimates at any coastal location. We illustrate the method by application to the UK east coast, providing a set of design level estimates along the entire coastline.

31 citations


Journal ArticleDOI
TL;DR: Le, Sun and Zidek, as mentioned in this paper, proposed a Bayesian approach for estimating air pollution at locations where monitoring data are not available, using the concentrations observed at other monitoring stations and possibly at different time periods.
Abstract: Health impact studies of air pollution often require estimates of pollutant concentrations at locations where monitoring data are not available, using the concentrations observed at other monitoring stations and possibly at different time periods. Recently, a Bayesian approach for such a temporal and spatial interpolation problem has been proposed by Le, Sun and Zidek (1997). One special feature of the method is that it does not require all sites to monitor the same set of pollutants. This feature is particularly relevant in environmental health studies where pollution data are often pooled together from several monitoring networks which may or may not monitor the same set of pollutants. The methodology is applied to data from the Province of Ontario, where monthly average concentrations for the summer months of nitrogen dioxide (NO2 in μg/m3), ozone (O3 in ppb), sulphur dioxide (SO2 in μg/m3) and sulfate ion (SO4 in μg/m3) are available for the period from January 1, 1983 to December 31, 1988 at 31 ambient monitoring sites. Detailed descriptions of spatial interpolation for air pollutant concentrations at 37 approximate centroids of Public Health Units in Ontario using all available data are presented. The methodology is empirically assessed by a cross-validation study where each of the 31 sites is successively removed and the remaining sites are used to predict its concentration levels. The methodology seems to perform well.

30 citations
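The cross-validation assessment described at the end of the abstract can be sketched as a leave-one-out loop over monitoring sites. The interpolator below is a simple inverse-distance-weighted predictor standing in for the Le-Sun-Zidek Bayesian method, and the site coordinates and pollutant levels are simulated; only the assessment loop itself mirrors the paper.

```python
import numpy as np

rng = np.random.default_rng(5)

def idw_predict(coords, values, target, power=2.0):
    """Inverse-distance-weighted prediction, a simple stand-in for the
    Bayesian interpolator, used only to illustrate the leave-one-out loop."""
    d = np.linalg.norm(coords - target, axis=1)
    w = 1.0 / np.maximum(d, 1e-9) ** power
    return float(w @ values / w.sum())

# 31 hypothetical monitoring sites with coordinates and a pollutant level
coords = rng.uniform(0, 100, size=(31, 2))
levels = 40 + 0.2 * coords[:, 0] + rng.normal(0, 2, size=31)

errors = []
for i in range(len(levels)):                       # remove each site in turn
    mask = np.arange(len(levels)) != i
    pred = idw_predict(coords[mask], levels[mask], coords[i])
    errors.append(pred - levels[i])

print("cross-validation RMSE:", round(np.sqrt(np.mean(np.square(errors))), 2))
```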


Journal ArticleDOI
Eric P. Smith
TL;DR: In this paper, the authors discuss a number of questions and concerns related to analysis and interpretation using the randomization method in multispecies studies, including the importance of the variables (species) and assumptions about the data.
Abstract: Data from ecological and biomonitoring studies are sometimes difficult to make inferences from owing to the high dimensionality of the data, the lack of normality and other problems. One approach for testing which has interested researchers is the randomization method. A general approach is based on replacing the multivariate data with distances between units, choosing a test statistic to summarize differences (due say to a treatment) and using a randomization test to assess the significance of the differences. This paper discusses a number of questions and concerns related to analysis and interpretation using this analytical approach. First, what can be said about the power of this test and how is the power related to the power of other tests under optimal conditions? Second, the variables (species) seem to get lost in the analysis. How important are they and should one be concerned about their importance to the power of the test? Finally, how important are assumptions about the data? These questions and others are discussed using examples from multispecies studies.
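The general approach described, replace the multivariate species data with distances between units, summarize treatment differences with a statistic, and assess it by randomization, can be sketched as follows. The abundance data, the Bray-Curtis distance and the between-minus-within statistic are illustrative choices, not those of any particular study discussed in the paper.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

rng = np.random.default_rng(11)

# toy multispecies abundance matrix: 10 control and 10 treated sampling units
# over 15 species (an assumption standing in for biomonitoring data)
control = rng.poisson(lam=5.0, size=(10, 15))
treated = rng.poisson(lam=np.r_[np.full(5, 8.0), np.full(10, 5.0)], size=(10, 15))
abund = np.vstack([control, treated])
groups = np.array([0] * 10 + [1] * 10)

D = squareform(pdist(abund, metric="braycurtis"))   # distances between units

def between_minus_within(D, groups):
    """Test statistic: mean between-group distance minus mean within-group distance."""
    same = groups[:, None] == groups[None, :]
    off = ~np.eye(len(groups), dtype=bool)
    return D[~same].mean() - D[same & off].mean()

t_obs = between_minus_within(D, groups)
t_null = [between_minus_within(D, rng.permutation(groups)) for _ in range(4999)]
p = (1 + sum(t >= t_obs for t in t_null)) / (1 + len(t_null))
print(f"statistic = {t_obs:.3f}, randomization p = {p:.4f}")
```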

Journal ArticleDOI
TL;DR: Levels of disinfection by-products (DBPs) in drinking water samples were found to be higher in summer than in winter, and the concentrations of some DBPs tended to increase with distance (time) from the treatment plant while other DBPs increased in the first part of the distribution system and then decreased at a further distance from the treatment plant.
Abstract: Levels of disinfection by-products (DBPs) in drinking water samples tended to be higher in summer than in winter. The concentrations of some DBPs tended to increase with distance (time) from the treatment plant while other DBPs increased in the first part of the distribution system and then decreased at a further distance from the treatment plant. Samples taken near the end of the distribution system provided an estimation of maximum exposure for trihalomethanes but not for haloacetic acids. Maximum values for DCAA and TCAA usually occurred at some point within the distribution system and by the end of the distribution system DCAA and TCAA levels were below their respective maximum values. No single sampling location or season provided simultaneous maximum values for the trihalomethanes and haloacetic acids. The difficulties involved in determining current human exposures to DBPs in drinking water emphasize the problems involved in attempting a retrospective estimation of DBP exposure. It will be difficult to assess the risk of adverse health effects within an exposed population when the levels and speciation of the DBPs change between seasons and the level of exposure can depend on how far the consumer lives from the water treatment plant.

Journal ArticleDOI
TL;DR: In this paper, the authors used the "peaks over threshold" approach to estimate extreme wind loads calculated by taking into account the directional dependence of both the aerodynamic coefficients and the extreme wind climate.
Abstract: We use the ‘peaks over threshold’ approach to estimate extreme wind loads calculated by taking into account the directional dependence of both the aerodynamic coefficients and the extreme wind climate. Our interest is focused primarily on ultimate wind loads, that is, loads that are sufficiently large to cause member failure. For non-hurricane regions (1) we comment on issues raised by the fact that directional data published by the National Weather Service are incomplete, and (2) note that, owing to the relatively small sizes of the data samples, results on directional effects for mean recurrence intervals longer than a few hundred years are inconclusive. For hurricane-prone regions we show that, on average, the common practice of disregarding wind directionality effects is conservative for 50-year wind loads. However, according to our results, the degree of conservatism decreases as the mean recurrence interval increases. While individual estimates of speeds with very long mean recurrence intervals are unreliable, statistics based on estimates obtained from large numbers of records can provide useful indications of average trends and suggest that, for mean recurrence intervals associated with ultimate wind loads, the favorable effect of wind directionality tends to be marginal. © 1998 John Wiley & Sons, Ltd.
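A basic peaks-over-threshold fit, ignoring the directional refinements that are the paper's focus, might look like the following: exceedances over a high threshold are fitted with a generalized Pareto distribution and converted to a return level. The synthetic wind-speed data, threshold choice and 50-year horizon are assumptions for illustration.

```python
import numpy as np
from scipy.stats import genpareto

rng = np.random.default_rng(6)

# synthetic daily peak wind speeds (m/s); real directional station records
# would replace this series
speeds = rng.gumbel(loc=18.0, scale=4.0, size=5000)

threshold = np.quantile(speeds, 0.95)              # 'peaks over threshold'
excess = speeds[speeds > threshold] - threshold

# fit a generalized Pareto distribution to the threshold exceedances
shape, loc, scale = genpareto.fit(excess, floc=0.0)

# speed with a given mean recurrence interval (in observations)
rate = len(excess) / len(speeds)                   # exceedance rate per observation
T = 50 * 365                                       # ~50-year return period, daily data
return_level = threshold + genpareto.ppf(1 - 1 / (rate * T), shape, 0.0, scale)
print("50-year return level (m/s):", round(return_level, 1))
```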

Journal ArticleDOI
TL;DR: In this article, a complete Bayesian methodology to model compositional data is developed and is illustrated on a real data set comprising sand, silt and clay compositions taken at various water depths in an Arctic lake.
Abstract: Compositional data often result when raw data are normalized or when data are obtained as proportions of a certain heterogeneous quantity. These conditions are fairly common in geology, economics and biology. The result is, therefore, a vector of such observations per specimen. The usual multivariate procedures are seldom adequate for the analysis of compositional data and there is a relative dearth of alternative techniques suitable for the same. The presence of covariates further adds to the complexity of the situation. In this manuscript, a complete Bayesian methodology to model such data is developed and is illustrated on a real data set comprising sand, silt and clay compositions taken at various water depths in an Arctic lake. Alternative methods such as maximum likelihood estimates are compared with the proposed Bayesian estimates. A simulation-based approach is adopted to ascertain adequacy of the fit. Several models are finally compared via a posterior predictive loss measure. Copyright © 1998 John Wiley & Sons, Ltd.
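A common starting point for such data, and the maximum-likelihood baseline against which Bayesian estimates are typically compared, is a regression on additive log-ratios of the composition. The sketch below applies this to simulated sand/silt/clay fractions varying with water depth; the data are invented stand-ins for the Arctic-lake compositions analysed in the paper, and the paper's full Bayesian machinery is not reproduced.

```python
import numpy as np

rng = np.random.default_rng(7)

def alr(comp):
    """Additive log-ratio transform of a composition (rows sum to 1),
    using the last part (clay) as the reference."""
    comp = np.asarray(comp, dtype=float)
    return np.log(comp[:, :-1] / comp[:, -1:])

# synthetic sand/silt/clay compositions drifting with water depth
# (a stand-in for the Arctic-lake data analysed in the paper)
n = 39
depth = np.sort(rng.uniform(1, 100, size=n))
raw = np.column_stack([np.exp(2 - 0.02 * depth + rng.normal(0, 0.2, n)),
                       np.exp(1 + 0.00 * depth + rng.normal(0, 0.2, n)),
                       np.exp(0 + 0.01 * depth + rng.normal(0, 0.2, n))])
comp = raw / raw.sum(axis=1, keepdims=True)

# maximum-likelihood (least-squares) fit of each log-ratio on log-depth,
# the classical baseline against which a Bayesian fit would be compared
X = np.column_stack([np.ones(n), np.log(depth)])
coef, *_ = np.linalg.lstsq(X, alr(comp), rcond=None)
print("intercepts and log-depth slopes for the two log-ratios:\n", np.round(coef, 3))
```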

Journal ArticleDOI
TL;DR: In this paper, two marked point process models based on the Cox process are used to describe the probabilistic structure of the rainfall intensity process, and derived second-order properties of the accumulated rainfall amounts at different levels of aggregation are used to examine the model fit.
Abstract: We study two marked point process models based on the Cox process. These models are used to describe the probabilistic structure of the rainfall intensity process. Mathematical formulation of the models is described and some second-moment characteristics of the rainfall depth, and aggregated processes are considered. The derived second-order properties of the accumulated rainfall amounts at different levels of aggregation are used in order to examine the model fit. A brief data analysis is presented. Copyright © 1998 John Wiley & Sons, Ltd.

Journal ArticleDOI
TL;DR: In this article, predictive models for the mean ground concentrations of ozone and nitrogen dioxide are built on data consisting of 5 years of daily averages, and the smallest averaging time over which a model with a deterministic trend and an autoregressive error distribution passes a number of statistical tests is found.
Abstract: Predictive models are built for the mean ground concentration of ozone and nitrogen dioxide on data consisting of 5 years of daily averages. We find the smallest averaging time over which a model with a deterministic trend and an autoregressive error distribution passes a number of statistical tests. Such a model implies that the error variance conditioned on the past observations is constant. The fit is good for the series of weekly averages, but a model with heteroscedastic conditional variance has to be used for daily averages. The application of a generalized autoregressive heteroscedasticity model leads both to a satisfactory fit and a good predictive power for daily average data. © 1998 John Wiley & Sons, Ltd.
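The kind of model the abstract describes for daily averages, an autoregressive mean with conditionally heteroscedastic errors, can be sketched by maximising a Gaussian likelihood for an AR(1) + GARCH(1,1) process. The simulated series and starting values below are assumptions; the paper's actual trend terms and diagnostics are not reproduced.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(8)

# synthetic daily-average series standing in for the 5 years of data:
# AR(1) mean with GARCH(1,1) conditional variance
n = 1800
y = np.zeros(n)
h = np.ones(n)
for t in range(1, n):
    h[t] = 0.2 + 0.1 * y[t - 1] ** 2 + 0.8 * h[t - 1]      # conditional variance
    y[t] = 0.6 * y[t - 1] + np.sqrt(h[t]) * rng.normal()   # conditional mean + shock

def neg_loglik(params, y):
    """Gaussian negative log-likelihood of an AR(1) + GARCH(1,1) model."""
    phi, omega, alpha, beta = params
    eps = y[1:] - phi * y[:-1]
    h = np.empty_like(eps)
    h[0] = np.var(y)
    for t in range(1, len(eps)):
        h[t] = omega + alpha * eps[t - 1] ** 2 + beta * h[t - 1]
    return 0.5 * np.sum(np.log(2 * np.pi * h) + eps ** 2 / h)

res = minimize(neg_loglik, x0=[0.5, 0.1, 0.1, 0.7], args=(y,),
               bounds=[(-0.99, 0.99), (1e-6, None), (0.0, 1.0), (0.0, 1.0)],
               method="L-BFGS-B")
print("phi, omega, alpha, beta:", np.round(res.x, 3))
```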

Journal ArticleDOI
TL;DR: In this article, the sign of the correlation between the variables in the models is found to be non-negative for classical Gumbel models, non-positive for conditionally specified models.
Abstract: For modelling bivariate extremes, the classical bivariate Gumbel models are limited in their flexibility for fitting real world data sets. Alternative models, derived via conditional specification, are introduced in the current paper. A key difference between the classical models and the conditional specification models is to be found in the sign of the correlation between the variables in the models: non-negative for classical models, non-positive for conditionally specified models. Copyright © 1998 John Wiley & Sons, Ltd.


Journal ArticleDOI
Roy E. Kwiatkowski
TL;DR: In this paper, a modified version of Health Canada's model for human health risk assessment and risk management is used to provide a framework for risk assessment/risk management in the broader context of environmental assessment, and the role retrospective exposure assessment can play in environmental assessment is briefly indicated.
Abstract: There is an increasing awareness among environmental professionals and the public that a totally risk-free environment is an unattainable goal, and that the development of effective risk management strategies, involving a wide variety of scientific and societal considerations, is needed. The Department of Health Canada (HC; formerly Health and Welfare Canada, HWC) has considerable experience with risk assessment and risk management with regard to human health protection. This expertise and knowledge is usefully transferable to environmental assessment. This paper proposes using a modified version of Health Canada's model for human health risk assessment and risk management to provide a framework for risk assessment/risk management in the broader context of environmental assessment and briefly indicates the role retrospective exposure assessment can play in environmental assessment. In describing the framework, details on the following will be provided: the identification of environmental hazard (spatially as well as temporally); estimation of severity of risk; development of alternative management options; public perception of risk and risk communication; strategies for risk management; and risk monitoring and evaluation.


Journal ArticleDOI
TL;DR: In this paper, the authors show that if the data are conditioned to lie above a line with slope −λ on the log scale, then the weighted least squares estimate of λ is unbiased.
Abstract: Pharmacokinetic studies of biomarkers for environmental contaminants in humans are generally restricted to a few measurements per subject taken after the initial exposure. Subjects are selected for inclusion in the study if their measured body burden is above a threshold determined by the distribution of the biomarker in a control population. Such selection procedures introduce bias, caused by the truncation, into the ordinary weighted least squares estimate of the decay rate. We show that if the data are conditioned to lie above a line with slope −λ on the log scale, then the weighted least squares estimate of λ is unbiased. We give an iterative estimation algorithm that produces this unbiased estimate with commercially available software for fitting a repeated measures linear model. The estimate and its efficiency are discussed in the context of a pharmacokinetic study of 2,3,7,8-tetrachlorodibenzo-p-dioxin. Unbiasedness and efficiency are demonstrated with a simulation.

Journal ArticleDOI
James Argo
TL;DR: In this article, a case-control study design is used to estimate chemical exposure retrospectively for the purpose of studying individual chronic chemical exposure, and a procedure to partially validate the pollution estimate and the estimate of toxicity is shown.
Abstract: A procedure to estimate chemical exposure retrospectively for the purpose of studying individual chronic chemical exposure is described. We use the unique combination of a case-control study design which eliminates dilution by mobility, coupled with an emission inventory to estimate exposure nationally for up to 35 years latency. The retrospective exposure assessment (REA) methodology leads to an Exposure Index (EI) to describe the average historic exposure at a point. The approach is demonstrated with an estimate of the average chemical environment in each of 78 pulp producing communities in Canada in 1980 considering process-related emissions. A procedure to partially validate the pollution estimate and the estimate of toxicity is shown.

Journal ArticleDOI
TL;DR: Taking the Berndt-Wood study as an empirical example, the authors show that the estimation sensitivity noted by Ilmakunnas is due to one particular set of restrictions, known as symmetry restrictions, and provide a bootstrap analysis suggesting that the sensitivity lies almost entirely in the means of the sampling distributions and not in their shapes or degrees of dispersion.
Abstract: The aggregate production function approach is one way to forecast future energy demand (a step in forecasting carbon dioxide emissions, for example) and to analyze the aggregate economic effects of measures such as the increase of taxes on energy use. The results of such an approach tend to hinge on whether energy and capital are substitutes, implying that increases in energy prices will increase the demand for capital stock, or complements, implying that increases in energy prices will reduce the demand for capital stock. In a famous but controversial paper, Berndt and Wood (1975) find energy and capital are complements using aggregate time series manufacturing data for the United States, 1947-1971. Ilmakunnas (1986) shows that much of this analysis is sensitive to the imposition of theoretical economic restrictions and provides a range of point estimates in a sensitivity analysis. The current paper discusses these issues further and, taking the Berndt-Wood study as an empirical example, shows that the estimation sensitivity is due to one particular set of restrictions known as symmetry restrictions, and provides a bootstrap analysis which suggests that estimation sensitivity is almost entirely in the means of the sampling distributions and not in their shapes or degrees of dispersion.

Journal ArticleDOI
TL;DR: In this paper, a procedure for the analysis of seasonal variation in multiple groups is presented, which includes two tests: a multigroup test for seasonality that is capable of detecting simultaneously different seasonal variations in different groups and a test for homogeneity of multiple multinomial distributions with cyclically ordered categories.
Abstract: A procedure is presented for the analysis of seasonal variation in multiple groups. The proposed method includes two tests. One is a multigroup test for seasonality that is capable of detecting simultaneously different seasonal variations in different groups. The other is a test for homogeneity of multiple multinomial distributions with cyclically ordered categories. These tests are obtained by a geometric approach that helps both to understand the statistical analyses and to communicate the results. Mathematical expressions are given to facilitate the data analysis, and two examples are presented. © 1998 John Wiley & Sons, Ltd.

Journal ArticleDOI
TL;DR: In this article, the authors report on the design of an experiment to measure the individual contributions of banking permits (called coupons) and trading entitlements (which are called shares) under alternative conditions of certainty and uncertainty.
Abstract: Two important decisions in designing markets for tradable emissions permits are whether to allow banking and whether to allow trading in entitlements to future permits. Recent experiments suggest that banking will be particularly important when uncertainty about actual emissions requires trading in a reconciliation period after the quantity of emissions has been determined. This paper reports on the design of an experiment to measure the individual contributions of banking permits (which are called coupons) and trading entitlements (which are called shares) under alternative conditions of certainty and uncertainty. Banking, share trading and uncertainty conditions are introduced in a complete factorial design with three observations per cell. Preliminary analysis shows that banking and share trading both lead to greater efficiency. Banking is particularly important in reducing price instability when uncertainty is present. © 1998 John Wiley & Sons, Ltd.

Journal ArticleDOI
TL;DR: In this paper, the authors explore least-cost strategies for reducing CO2 emissions from the electric power industry, using an economic-engineering model for electric generation capacity expansion to investigate expansion plans which meet alternative CO2 emissions constraints at the lowest cost.
Abstract: This paper explores least-cost strategies for reducing CO2 emissions from the electric power industry. It uses an economic-engineering model for electric generation capacity expansion to investigate expansion plans which meet alternative CO2 emissions constraints at the lowest cost. The model selects the mix of various energy technologies, which are either presently in use or will possibly be in use in the future, to meet a specific carbon emissions limit and, furthermore, estimates the optimal tax required to achieve the least-cost strategy for reducing emissions to the desired level. Using Greece as a case study, the study suggests that to stabilize CO2 emissions at their 1990 level by the year 2005 and thereafter, the industry should move away from lignite generation to hydro and renewables and to coal or lignite technologies with CO2 removal capabilities. An optimal tax of $105 per ton of carbon is required to achieve this target at the lowest cost. © 1998 John Wiley & Sons, Ltd.
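The least-cost logic can be sketched as a small linear program: choose a generation mix that meets demand and an emissions cap at minimum cost, with the dual value of the cap playing the role of the optimal carbon tax. All costs, emission factors, capacities and the cap below are invented for illustration and bear no relation to the Greek data or the $105/tC estimate in the paper.

```python
import numpy as np
from scipy.optimize import linprog

# illustrative technologies: lignite, coal with CO2 removal, hydro/renewables
# (costs in $/MWh, emissions in tC/MWh and capacities are assumptions)
cost = np.array([30.0, 55.0, 70.0])        # generation cost per unit of output
emis = np.array([0.30, 0.05, 0.00])        # carbon emitted per unit of output
capacity = np.array([60.0, 50.0, 40.0])    # available output per technology
demand = 100.0                             # total output to be met
cap = 12.0                                 # emissions cap, e.g. a 1990 level

res = linprog(
    c=cost,                                 # minimise total generation cost
    A_ub=np.vstack([emis, np.eye(3)]),      # emissions cap + capacity limits
    b_ub=np.concatenate([[cap], capacity]),
    A_eq=[np.ones(3)], b_eq=[demand],       # generation must meet demand
    bounds=[(0, None)] * 3, method="highs")

print("least-cost generation mix:", np.round(res.x, 1))
# dual of the emissions cap (HiGHS solver): analogue of the optimal carbon tax
print("shadow price of the carbon cap ($/tC):", round(-res.ineqlin.marginals[0], 1))
```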

Journal ArticleDOI
TL;DR: In this article, the EM algorithm is extended to deal with datasets in which data screening has taken place, and the unified approach adopted appears new and, although tailored to a particular and important application, the method should have much wider application.
Abstract: Many meteorological datasets are mixtures in which components correspond to particular physical phenomena, the accurate identification of which is important from a meteorological standpoint. In particular, rainfall is generated by at least two processes—one convection, the other frontal systems—each characterised by its own distribution of rain rates and durations. The breakpoint data format, in which the timings of rain-rate changes and the steady rates between changes are recorded, captures the information required to parameterise these phenomena. Rainfall data has only recently become available in breakpoint format, which is both more compact and contains more information than older sources such as the fixed amount and fixed interval representation commonly used. Techniques such as the EM algorithm can be used to decompose the breakpoint data into its components. However, the quality of the currently available breakpoint data is poor for low rates and short durations, and these portions of the data need to be discarded, or screened out, and the EM algorithm modified. In this paper, the EM algorithm is extended to deal with datasets in which data screening has taken place. The unified approach adopted appears new and, although tailored to a particular and important application, the method should have much wider application. Furthermore, in this paper the extension is applied to a large scale breakpoint dataset of about 56,000 observations, with univariate and bivariate normal mixtures being fitted after censoring or truncation below a point or line respectively. The procedure was also applied to simulated breakpoint data, which showed that the procedure was relatively robust and gave excellent results in the majority of cases. For the actual data, the results at low truncation agreed with applications of the EM algorithm to non-truncated data, but a different picture arose at moderate truncation. An analysis of the dry times between periods of precipitation is also given as an example of censoring. Overall, four components were required to adequately represent the wet data and another four for the dry data, giving a total of 34 parameters to model the 56,000 breakpoints. Copyright © 1998 John Wiley & Sons, Ltd.
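For reference, the core EM iteration for a (non-truncated) two-component normal mixture is sketched below on simulated log rain-rates; the paper's contribution, handling screened (truncated or censored) breakpoint data within this iteration, is not reproduced here.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(9)

# synthetic log rain-rates from two "phenomena" (e.g. frontal vs convective);
# the paper additionally handles truncation/censoring of the low-rate,
# short-duration breakpoints, which this basic sketch omits
x = np.concatenate([rng.normal(-1.0, 0.5, 3000), rng.normal(1.5, 0.8, 1000)])

# EM for a two-component univariate normal mixture
pi, mu, sd = np.array([0.5, 0.5]), np.array([-2.0, 2.0]), np.array([1.0, 1.0])
for _ in range(200):
    # E-step: posterior probability that each observation came from each component
    dens = np.column_stack([p * norm.pdf(x, m, s) for p, m, s in zip(pi, mu, sd)])
    resp = dens / dens.sum(axis=1, keepdims=True)
    # M-step: update mixing proportions, means and standard deviations
    nk = resp.sum(axis=0)
    pi = nk / len(x)
    mu = (resp * x[:, None]).sum(axis=0) / nk
    sd = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk)

print("weights:", np.round(pi, 2), "means:", np.round(mu, 2), "sds:", np.round(sd, 2))
```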

Journal ArticleDOI
TL;DR: In this paper, an alternative notion of multiscaling is introduced, providing new theoretical basis for further consideration of multiplicative random cascades, as a class of models which potentially capture not only the spatial variability of spatially averaged rain rate fields, but also accounting for its evolution in the course of time.
Abstract: Recent research has shown that, conditionally on rain, probability moments of instantaneous spatial averages of rain rate, over differently scaled subregions of a rain field, possess an interesting scaling property called wide sense multiscaling, with respect to magnifying spatial scales. This empirical fact, along with some theoretical considerations, have led to implementation of the so called multiplicative random cascades in order to model the instantaneous spatial variability of rain rate, given that it rains. However, random cascades do not account for the evolution of spatial variability through time. In this article an alternative notion of multiscaling is introduced, providing new theoretical basis for further consideration of multiplicative random cascades, as a class of models which potentially capture not only the spatial variability of spatially averaged rain rate fields, but also accounting for its evolution in the course of time. The new notion of multiscaling introduced here is referred to as spectral multiscaling, is defined for second order stationary processes of spatially averaged rain rate, and (similar to probabilistic multiscaling) is discerned to that of strict and of wide sense, with regard to the normalized spectral distribution and its spectral moments, respectively. In fact, the validity of spectral multiscaling properties is verified statistically, using time series of spatially averaged rain rate data from a regularly observed (by radar) tropical rain field known with the acronym TOGA-COARE. The results of this empirical non-model-based analysis, point to the validity of multiscaling of the normalized spectral distribution across the entire spectrum of frequencies, and also of multiscaling of the corresponding spectral moments. Copyright © 1998 John Wiley & Sons, Ltd.

Journal ArticleDOI
TL;DR: In this paper, the probability density function of concentration for a simple model of the turbulent diffusion process presented by Zimmerman and Chatwin (1995) is derived, and relationships with other work are discovered, and some implications for future research are assessed.
Abstract: The paper derives the probability density function of concentration for a simple model of the turbulent diffusion process presented by Zimmerman and Chatwin (1995). Relationships with other work are discovered, and some implications for future research are assessed.

Journal ArticleDOI
TL;DR: In this paper, a method is proposed by which the parameters of a point rainfall model, the Compound Poisson model, are estimated by Maximum Likelihood to simulate monthly rainfall in Guarico state, located at the central plains of Venezuela.
Abstract: Modelling accumulated rainfall at a given time scale has always been an important problem in hydrology for many applications. In many parts of the world rainfall is highly seasonal, and parameter estimation of selected models is usually carried out by months or seasons at any particular location. A method is proposed by which the parameters of a point rainfall model, the Compound Poisson model, are estimated by Maximum Likelihood to simulate monthly rainfall in Guarico state, located at the central plains of Venezuela. Due to the marked seasonal pattern in the region, the parameters are modelled by using periodic functions. Two types of periodic functions were used: Fourier series and quadratic polynomial splines. A Conditional Maximum Likelihood method is proposed by which the coefficients of each periodic function are estimated for each parameter. At each step of the Conditional Maximum Likelihood method, an information criterion is used to select the number of Fourier harmonics or knot points to be used in the periodic representation of each model parameter. Results are presented for selected locations at the central plains of Venezuela. Regionalization of this method to simulate rainfall in locations without measurement records is also discussed. © 1998 John Wiley & Sons, Ltd.
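The structure described, a compound Poisson rainfall model whose parameters vary seasonally through periodic functions, can be sketched as below with a single Fourier harmonic per parameter and exponentially distributed event depths. The harmonic coefficients and the exponential depth assumption are illustrative; the paper estimates such coefficients by conditional maximum likelihood and also considers quadratic polynomial splines.

```python
import numpy as np

rng = np.random.default_rng(10)

def fourier(t, coefs):
    """Periodic parameter built from a constant plus one Fourier harmonic.
    t is the month (1..12); coefs = (a0, a1, b1) are illustrative values,
    not estimates obtained by the paper's conditional ML procedure."""
    a0, a1, b1 = coefs
    w = 2 * np.pi * t / 12.0
    return a0 + a1 * np.cos(w) + b1 * np.sin(w)

def simulate_monthly_rain(year_months=np.arange(1, 13)):
    """Compound Poisson simulation: Poisson number of storms per month,
    exponential depth per storm, both with seasonally varying parameters."""
    totals = []
    for m in year_months:
        lam = max(fourier(m, (6.0, -4.0, 1.0)), 0.1)   # mean storms per month
        mu = max(fourier(m, (15.0, -8.0, 2.0)), 1.0)   # mean depth per storm (mm)
        n_storms = rng.poisson(lam)
        totals.append(rng.exponential(mu, size=n_storms).sum())
    return np.array(totals)

print(np.round(simulate_monthly_rain(), 1))
```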