Journal ArticleDOI

Assessment of surface water quality using multivariate statistical techniques: A case study of the Fuji river basin, Japan

01 Apr 2007 - Environmental Modelling and Software - Vol. 22, Iss. 4, pp. 464-475
TL;DR: This study illustrates the usefulness of multivariate statistical techniques for the analysis and interpretation of complex data sets and, in water quality assessment, for identifying pollution sources/factors and understanding temporal/spatial variations in water quality for effective river water quality management.
Abstract: Multivariate statistical techniques, such as cluster analysis (CA), principal component analysis (PCA), factor analysis (FA) and discriminant analysis (DA), were applied for the evaluation of temporal/spatial variations and the interpretation of a large complex water quality data set of the Fuji river basin, generated during 8 years (1995–2002) monitoring of 12 parameters at 13 different sites (14 976 observations). Hierarchical cluster analysis grouped 13 sampling sites into three clusters, i.e., relatively less polluted (LP), medium polluted (MP) and highly polluted (HP) sites, based on the similarity of water quality characteristics. Factor analysis/principal component analysis, applied to the data sets of the three different groups obtained from cluster analysis, resulted in five, five and three latent factors explaining 73.18, 77.61 and 65.39% of the total variance in water quality data sets of LP, MP and HP areas, respectively. The varifactors obtained from factor analysis indicate that the parameters responsible for water quality variations are mainly related to discharge and temperature (natural), organic pollution (point source: domestic wastewater) in relatively less polluted areas; organic pollution (point source: domestic wastewater) and nutrients (non-point sources: agriculture and orchard plantations) in medium polluted areas; and organic pollution and nutrients (point sources: domestic wastewater, wastewater treatment plants and industries) in highly polluted areas in the basin. Discriminant analysis gave the best results for both spatial and temporal analysis. 
It provided an important data reduction, as it uses only six parameters (discharge, temperature, dissolved oxygen, biochemical oxygen demand, electrical conductivity and nitrate nitrogen), affording more than 85% correct assignations in temporal analysis, and seven parameters (discharge, temperature, biochemical oxygen demand, pH, electrical conductivity, nitrate nitrogen and ammoniacal nitrogen), affording more than 81% correct assignations in spatial analysis, of three different sampling sites of the basin. Therefore, DA allowed a reduction in the dimensionality of the large data set, delineating a few indicator parameters responsible for large variations in water quality. Thus, this study illustrates the usefulness of multivariate statistical techniques for analysis and interpretation of complex data sets, and in water quality assessment, identification of pollution sources/factors and understanding temporal/spatial variations in water quality for effective river water quality management.
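The PCA/FA step described in the abstract can be sketched on synthetic data. This is only an illustration, not the authors' procedure: the data are random, the 96 rows stand for 8 years of assumed monthly sampling, the function name `pca_explained_variance` is hypothetical, and retaining components with eigenvalue above 1 (the Kaiser criterion) is an assumption that the abstract does not state.

```python
import numpy as np

def pca_explained_variance(X):
    """Eigenvalues (descending) and cumulative % variance for PCA on the
    correlation matrix of X (rows = samples, columns = parameters)."""
    # PCA on the correlation matrix is equivalent to PCA on z-scored data,
    # which keeps parameters with different units on a comparable scale.
    R = np.corrcoef(X, rowvar=False)
    eigvals = np.linalg.eigvalsh(R)[::-1]  # eigvalsh returns ascending order
    cum_pct = 100.0 * np.cumsum(eigvals) / eigvals.sum()
    return eigvals, cum_pct

rng = np.random.default_rng(0)

# Synthetic stand-in for one group of sites: 96 samples x 12 parameters,
# generated from 3 hidden factors plus noise (all values are made up).
n_samples, n_params, n_factors = 96, 12, 3
factors = rng.normal(size=(n_samples, n_factors))
loadings = rng.normal(size=(n_factors, n_params))
X = factors @ loadings + 0.5 * rng.normal(size=(n_samples, n_params))

eigvals, cum_pct = pca_explained_variance(X)

# Assumed retention rule: keep components whose eigenvalue exceeds 1.
n_retained = int((eigvals > 1.0).sum())
print("components retained:", n_retained)
print("cumulative variance explained (%):", np.round(cum_pct[:n_retained], 2))
```

The cumulative percentage at the last retained component plays the same role as the 73.18, 77.61 and 65.39% figures quoted in the abstract, although the numbers produced here come from random data and carry no meaning.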
Citations
Journal ArticleDOI
TL;DR: Studies the effects of natural processes and human influences on rural and urban aquatic systems, including heavy metal pollution and bacterial and pathogenic contamination in both urban and rural areas.
Abstract: Although water constitutes 71% of the earth's surface, only 0.3% of it is available as fresh water for human use. Moreover, the quality of fresh water in ground and surface systems is of great concern, as potable water needs to have appropriate mineral content. Ground and surface water quality in rural and urban environments is affected by both natural processes and anthropogenic influences. Because of this, water is becoming scarcer as the population increases across the world. Natural processes leading to changes in water quality include weathering of rocks, evapotranspiration, depositions due to wind, leaching from soil, run-off due to hydrological factors, and biological processes in the aquatic environment. These natural processes cause changes in the pH and alkalinity of the water, and also phosphorus loading, increase in fluoride content and high concentrations of sulphates. Anthropogenic factors affecting water quality include impacts due to agriculture, use of fertilizers, manures and pesticides,...

523 citations


Cites background from "Assessment of surface water quality..."

  • ...…discharged from smelting and heavy industrial enterprises (Zhang et al. 2009); however, nutrient and organic pollution includes point source and nonpoint source pollution, such as domestic wastewater, effluent from wastewater treatment plants and agricultural run-off (Shrestha and Kazama 2007)....

Journal ArticleDOI
TL;DR: An evaluation of the water quality of the Aksu River, the main river recharging the Karacaören-1 Dam Lake and flowing approximately 145 km from Isparta province to the Mediterranean, shows that water quality is poor and very poor in the north and south of the river basin.

461 citations


Cites background from "Assessment of surface water quality..."

  • ...Similar temporal variations in concentration of nutrients have been reported by Shrestha and Kazama (2007)....

Journal ArticleDOI
01 May 2012-Catena
TL;DR: In this article, the authors used enrichment factor (EF) and geoaccumulation index (Igeo) to evaluate the level of contamination in sediment samples from the upper Tigris River and found that the sediments of sites downstream of the copper mine plant showed significant enrichment with Cd, Co, Cu, Pb and Zn, indicating metallic discharges from the Ergani Copper Mine Plant.
Abstract: The concentrations of total nitrogen (TN), total phosphorus (TP), As, Cd, Co, Cr, Cu, Fe, Mn, Ni, Pb and Zn in both surface water and sediment samples from the upper Tigris River were determined to evaluate the level of contamination. All metal concentrations in water samples, except Cu, were lower than the maximum permitted concentration for the protection of aquatic life. TN, TP and metal concentrations in sediment samples from the first three sites situated downstream of Ergani Copper Mine Plant were much higher than those at other sites. There was a significant decrease in the concentrations of heavy metals in sediment from the last site downstream of the Dicle Dam. Sediment pollution assessment was undertaken using enrichment factor (EF) and geoaccumulation index (Igeo). The sediments of sites downstream of the copper mine plant showed significant enrichment with Cd, Co, Cu, Pb and Zn, indicating metallic discharges from the Ergani Copper Mine Plant. The Igeo values revealed that Cu (5.09), Co (4.26) and Zn (3.18) were significantly accumulated in the study area. Based on the comparison with sediment quality guidelines, the concentrations of Cr, Cu, Ni, Pb and Zn at sites downstream of the copper mine plant are likely to result in harmful effects on sediment-dwelling organisms. Cluster analysis suggests that As, Cd, Co, Cu, Ni, Pb and Zn are derived from anthropogenic sources, particularly metallic discharges of the copper mine plant.
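
For orientation, the two indices named in this abstract are commonly defined as shown below. The abstract itself does not give the formulas, so the symbols here are assumptions: C_n is the measured concentration of a metal in sediment, B_n its geochemical background value, and the subscript "ref" denotes a normalizing reference element whose choice is not stated in the abstract.

```latex
% Geoaccumulation index; the factor 1.5 allows for natural
% variation in the background value.
I_{geo} = \log_2\!\left(\frac{C_n}{1.5\,B_n}\right)

% Enrichment factor: metal-to-reference ratio in the sample,
% divided by the same ratio in the background material.
EF = \frac{\left(C_n / C_{ref}\right)_{\text{sample}}}
          {\left(B_n / B_{ref}\right)_{\text{background}}}
```

Under this definition, the reported Igeo of 5.09 for Cu would correspond to C_n / (1.5 B_n) = 2^5.09 ≈ 34, that is, a sediment concentration of roughly 51 times the assumed background value.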

442 citations

Journal ArticleDOI
TL;DR: Indicates the necessity and usefulness of multivariate statistical techniques for evaluating and interpreting the data, with a view to obtaining better information about water quality and designing remedial techniques to prevent future pollution by hazardous toxic elements.

310 citations

Journal ArticleDOI
TL;DR: It is concluded that the application of environmetric methods can reveal meaningful information on the spatial variability of a large and complex river water quality data set.
Abstract: This study investigates the spatial water quality pattern of seven stations located along the main Langat River. Environmetric methods, namely, the hierarchical agglomerative cluster analysis (HACA), the discriminant analysis (DA), the principal component analysis (PCA), and the factor analysis (FA), were used to study the spatial variations of the most significant water quality variables and to determine the origin of pollution sources. Twenty-three water quality parameters were initially selected and analyzed. Three spatial clusters were formed based on HACA. These clusters are designated as downstream of Langat river, middle stream of Langat river, and upstream of Langat River regions. Forward and backward stepwise DA managed to discriminate six and seven water quality variables, respectively, from the original 23 variables. PCA and FA (varimax functionality) were used to investigate the origin of each water quality variable due to land use activities based on the three clustered regions. Seven principal components (PCs) were obtained with 81% total variation for the high-pollution source (HPS) region, while six PCs with 71% and 79% total variances were obtained for the moderate-pollution source (MPS) and low-pollution source (LPS) regions, respectively. The pollution sources for the HPS and MPS are of anthropogenic sources (industrial, municipal waste, and agricultural runoff). For the LPS region, the domestic and agricultural runoffs are the main sources of pollution. From this study, we can conclude that the application of environmetric methods can reveal meaningful information on the spatial variability of a large and complex river water quality data.

306 citations


Cites background or methods from "Assessment of surface water quality..."

  • ...The quotient is usually multiplied by 100 as a way to standardize the linkage distance represented by the y-axis (Singh et al. 2004, 2005; Shrestha and Kazama 2007)....

  • ...…methods have often been used in exploratory data analysis tools for classification (Brodnjak-Voncina et al. 2002; Kowalkowski et al. 2006) of samples (observations) or sampling stations and the identification of pollution sources (Massart et al. 1997; Vega et al. 1998; Shrestha and Kazama 2007)....

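The rescaling of linkage distance mentioned in the citing snippets can be sketched as follows. It assumes the quotient in question is each linkage distance divided by the maximum linkage distance (often written Dlink/Dmax), and it uses Ward's method on Euclidean distances, which is a common choice but is not confirmed by the text quoted here. The 13 site profiles are synthetic and the variable names are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(1)

# 13 hypothetical sites x 12 standardized parameters, built as three
# synthetic groups (5, 4 and 4 sites) with shifted means plus small noise.
group_means = np.repeat([0.0, 3.0, 6.0], [5, 4, 4])
site_profiles = group_means[:, None] + rng.normal(scale=0.3, size=(13, 12))

# Agglomerative clustering; column 2 of Z holds the linkage distance
# at which each merge happens.
Z = linkage(site_profiles, method="ward", metric="euclidean")
d_link = Z[:, 2]

# Assumed rescaling: (Dlink / Dmax) * 100, so the dendrogram y-axis
# runs from 0 to 100 regardless of the units of the raw distances.
rescaled = 100.0 * (d_link / d_link.max())

# Cut the tree into three groups of sites.
labels = fcluster(Z, t=3, criterion="maxclust")
print("rescaled linkage distances:", np.round(rescaled, 1))
print("cluster label per site:", labels)
```

Rescaling changes only the axis of the dendrogram, not the tree itself, so the grouping of sites is the same whether the raw or the rescaled distances are plotted.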

References
Book
01 Jan 1982
TL;DR: In this article, the authors present an overview of the basic concepts of multivariate analysis, including matrix algebra and random vectors, as well as a strategy for analyzing multivariate models.
Abstract: (NOTE: Each chapter begins with an Introduction, and concludes with Exercises and References.) I. GETTING STARTED. 1. Aspects of Multivariate Analysis. Applications of Multivariate Techniques. The Organization of Data. Data Displays and Pictorial Representations. Distance. Final Comments. 2. Matrix Algebra and Random Vectors. Some Basics of Matrix and Vector Algebra. Positive Definite Matrices. A Square-Root Matrix. Random Vectors and Matrices. Mean Vectors and Covariance Matrices. Matrix Inequalities and Maximization. Supplement 2A Vectors and Matrices: Basic Concepts. 3. Sample Geometry and Random Sampling. The Geometry of the Sample. Random Samples and the Expected Values of the Sample Mean and Covariance Matrix. Generalized Variance. Sample Mean, Covariance, and Correlation as Matrix Operations. Sample Values of Linear Combinations of Variables. 4. The Multivariate Normal Distribution. The Multivariate Normal Density and Its Properties. Sampling from a Multivariate Normal Distribution and Maximum Likelihood Estimation. The Sampling Distribution of X̄ and S. Large-Sample Behavior of X̄ and S. Assessing the Assumption of Normality. Detecting Outliers and Data Cleaning. Transformations to Near Normality. II. INFERENCES ABOUT MULTIVARIATE MEANS AND LINEAR MODELS. 5. Inferences About a Mean Vector. The Plausibility of μ0 as a Value for a Normal Population Mean. Hotelling's T² and Likelihood Ratio Tests. Confidence Regions and Simultaneous Comparisons of Component Means. Large Sample Inferences about a Population Mean Vector. Multivariate Quality Control Charts. Inferences about Mean Vectors When Some Observations Are Missing. Difficulties Due To Time Dependence in Multivariate Observations. Supplement 5A Simultaneous Confidence Intervals and Ellipses as Shadows of the p-Dimensional Ellipsoids. 6. Comparisons of Several Multivariate Means. Paired Comparisons and a Repeated Measures Design. Comparing Mean Vectors from Two Populations. Comparison of Several Multivariate Population Means (One-Way MANOVA). Simultaneous Confidence Intervals for Treatment Effects. Two-Way Multivariate Analysis of Variance. Profile Analysis. Repeated Measures Designs and Growth Curves. Perspectives and a Strategy for Analyzing Multivariate Models. 7. Multivariate Linear Regression Models. The Classical Linear Regression Model. Least Squares Estimation. Inferences About the Regression Model. Inferences from the Estimated Regression Function. Model Checking and Other Aspects of Regression. Multivariate Multiple Regression. The Concept of Linear Regression. Comparing the Two Formulations of the Regression Model. Multiple Regression Models with Time Dependent Errors. Supplement 7A The Distribution of the Likelihood Ratio for the Multivariate Regression Model. III. ANALYSIS OF A COVARIANCE STRUCTURE. 8. Principal Components. Population Principal Components. Summarizing Sample Variation by Principal Components. Graphing the Principal Components. Large-Sample Inferences. Monitoring Quality with Principal Components. Supplement 8A The Geometry of the Sample Principal Component Approximation. 9. Factor Analysis and Inference for Structured Covariance Matrices. The Orthogonal Factor Model. Methods of Estimation. Factor Rotation. Factor Scores. Perspectives and a Strategy for Factor Analysis. Structural Equation Models. Supplement 9A Some Computational Details for Maximum Likelihood Estimation. 10. Canonical Correlation Analysis. Canonical Variates and Canonical Correlations. Interpreting the Population Canonical Variables. The Sample Canonical Variates and Sample Canonical Correlations. Additional Sample Descriptive Measures. Large Sample Inferences. IV. CLASSIFICATION AND GROUPING TECHNIQUES. 11. Discrimination and Classification. Separation and Classification for Two Populations. Classifications with Two Multivariate Normal Populations. Evaluating Classification Functions. Fisher's Discriminant Function: Separation of Populations. Classification with Several Populations. Fisher's Method for Discriminating among Several Populations. Final Comments. 12. Clustering, Distance Methods and Ordination. Similarity Measures. Hierarchical Clustering Methods. Nonhierarchical Clustering Methods. Multidimensional Scaling. Correspondence Analysis. Biplots for Viewing Sample Units and Variables. Procrustes Analysis: A Method for Comparing Configurations. Appendix. Standard Normal Probabilities. Student's t-Distribution Percentage Points. χ² Distribution Percentage Points. F-Distribution Percentage Points. F-Distribution Percentage Points (α = .10). F-Distribution Percentage Points (α = .05). F-Distribution Percentage Points (α = .01). Data Index. Subject Index.

11,697 citations

Journal ArticleDOI
TL;DR: In this article, the authors present an overview of the basic concepts of multivariate analysis, including matrix algebra and random vectors, as well as a strategy for analyzing multivariate models.

10,148 citations

Book
01 Nov 1978
TL;DR: Describes the mathematical and logical foundations at a level which does not presume advanced mathematical or statistical skills, illustrating how to do factor analysis with several of the more popular packaged computer programmes.
Abstract: Describes the mathematical and logical foundations at a level which does not presume advanced mathematical or statistical skills, illustrating how to do factor analysis with several of the more popular packaged computer programmes.

1,455 citations

Journal ArticleDOI
TL;DR: This study presents the necessity and usefulness of multivariate statistical techniques for evaluating and interpreting large, complex data sets, with a view to obtaining better information about water quality and designing monitoring networks for effective management of water resources.

1,429 citations

Journal ArticleDOI
TL;DR: The over-extraction of groundwater is the major cause of groundwater salinization and arsenic pollution in the coastal area of Yun-Lin, Taiwan and this model explains over 77.8% of the total groundwater quality variation.

1,429 citations