Journal ArticleDOI

Multivariate statistical techniques for the evaluation of spatial and temporal variations in water quality of Gomti River (India)—a case study

TL;DR: This study presents the necessity and usefulness of multivariate statistical techniques for the evaluation and interpretation of large, complex data sets, with a view to getting better information about water quality and designing a monitoring network for effective management of water resources.
About: This article was published in Water Research on 2004-11-01. It has received 1429 citations to date. The article focuses on the topics: Multivariate statistics & Principal component analysis.
Citations
Journal ArticleDOI
TL;DR: This study illustrates the usefulness of multivariate statistical techniques for analysis and interpretation of complex data sets, and in water quality assessment, identification of pollution sources/factors and understanding temporal/spatial variations in water quality for effective river water quality management.
Abstract: Multivariate statistical techniques, such as cluster analysis (CA), principal component analysis (PCA), factor analysis (FA) and discriminant analysis (DA), were applied for the evaluation of temporal/spatial variations and the interpretation of a large complex water quality data set of the Fuji river basin, generated during 8 years (1995–2002) monitoring of 12 parameters at 13 different sites (14 976 observations). Hierarchical cluster analysis grouped 13 sampling sites into three clusters, i.e., relatively less polluted (LP), medium polluted (MP) and highly polluted (HP) sites, based on the similarity of water quality characteristics. Factor analysis/principal component analysis, applied to the data sets of the three different groups obtained from cluster analysis, resulted in five, five and three latent factors explaining 73.18, 77.61 and 65.39% of the total variance in water quality data sets of LP, MP and HP areas, respectively. The varifactors obtained from factor analysis indicate that the parameters responsible for water quality variations are mainly related to discharge and temperature (natural), organic pollution (point source: domestic wastewater) in relatively less polluted areas; organic pollution (point source: domestic wastewater) and nutrients (non-point sources: agriculture and orchard plantations) in medium polluted areas; and organic pollution and nutrients (point sources: domestic wastewater, wastewater treatment plants and industries) in highly polluted areas in the basin. Discriminant analysis gave the best results for both spatial and temporal analysis. 
It provided an important data reduction, as it uses only six parameters (discharge, temperature, dissolved oxygen, biochemical oxygen demand, electrical conductivity and nitrate nitrogen), affording more than 85% correct assignations in temporal analysis, and seven parameters (discharge, temperature, biochemical oxygen demand, pH, electrical conductivity, nitrate nitrogen and ammoniacal nitrogen), affording more than 81% correct assignations in spatial analysis, of three different sampling sites of the basin. Therefore, DA allowed a reduction in the dimensionality of the large data set, delineating a few indicator parameters responsible for large variations in water quality. Thus, this study illustrates the usefulness of multivariate statistical techniques for analysis and interpretation of complex data sets, and in water quality assessment, identification of pollution sources/factors and understanding temporal/spatial variations in water quality for effective river water quality management.
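
The site-grouping step described in the abstract above (hierarchical cluster analysis of sampling sites by similarity of water quality) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not state the linkage rule or distance measure, so average linkage on z-scored Euclidean distances is used, and the six site rows are invented numbers, not data from the study.

```python
from math import dist
from statistics import mean, pstdev


def zscore_columns(rows):
    """Standardize each parameter (column) to zero mean and unit variance."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]


def agglomerate(points, n_clusters):
    """Average-linkage agglomerative clustering; returns lists of row indices."""
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Mean pairwise Euclidean distance between two clusters.
        return mean(dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge them.
        _, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical site means: (BOD mg/L, nitrate-N mg/L, conductivity uS/cm).
sites = [
    (1.0, 0.5, 100.0), (1.2, 0.6, 110.0),    # low values
    (4.0, 2.0, 300.0), (4.5, 2.2, 320.0),    # intermediate values
    (12.0, 6.0, 800.0), (13.0, 6.5, 850.0),  # high values
]
groups = agglomerate(zscore_columns(sites), n_clusters=3)
print(groups)
```

Standardizing first keeps a parameter with large numeric values, such as conductivity, from dominating the distance; whether the cited study scaled its data this way is not stated in the abstract.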

1,481 citations

Journal ArticleDOI
TL;DR: In this paper, multivariate statistical techniques, such as cluster analysis, factor analysis, principal component analysis and discriminant analysis, were applied to the data set on water quality of the Gomti river.

839 citations

Journal ArticleDOI
TL;DR: The results revealed that the major causes of water quality deterioration were the inflow of effluent from industrial, domestic, agricultural and saline seeps into the lake at site 1, and people living in boats and fishing at sites 2 and 3.

587 citations

Journal ArticleDOI
TL;DR: In this article, two ANN models were identified, validated and tested for the computation of dissolved oxygen (DO) and biochemical oxygen demand (BOD) concentrations in the Gomti river water.

553 citations

Journal ArticleDOI
TL;DR: In this article, water samples were collected from twelve different locations along the course of the river and its tributaries in the summer and winter seasons, and the concentrations of trace metals such as cadmium, chromium, copper, cobalt, iron, manganese, nickel, lead, mercury and zinc were determined using an atomic absorption spectrophotometer.
Abstract: The objective of the study is to reveal the seasonal variations in river water quality with respect to heavy metal contamination. To assess the extent of trace metal contamination, water samples were collected from twelve different locations along the course of the river and its tributaries in the summer and winter seasons. The concentrations of trace metals such as cadmium, chromium, copper, cobalt, iron, manganese, nickel, lead, mercury and zinc were determined using an atomic absorption spectrophotometer. Most of the samples were found to be within the limits of the Indian drinking water standard (IS: 10500). The data generated were used to calculate the heavy metal pollution index (HPI) of the river water. The mean HPI values were 36.19 in summer and 32.37 in winter; these values are well below the critical index limit of 100 because of the sufficient flow in the river system. Mercury and chromium could not be traced in any of the samples in the study area.
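
The abstract above reports HPI values but not the formula behind them. As a rough sketch only, the code below assumes the commonly used weighted-mean form, HPI = sum(W_i * Q_i) / sum(W_i), with unit weight W_i = 1/S_i and sub-index Q_i = 100 * |M_i - I_i| / (S_i - I_i), where M_i is the measured concentration, S_i the permissible limit and I_i the ideal value. The function name and all numbers are hypothetical placeholders; they are not taken from the study or from IS 10500.

```python
def heavy_metal_pollution_index(measured, standard, ideal=None):
    """Weighted mean of per-metal sub-indices (assumed HPI form, see text above)."""
    ideal = ideal or {}
    weighted_sum = 0.0
    weight_total = 0.0
    for metal, m in measured.items():
        s = standard[metal]                # permissible limit S_i
        i = ideal.get(metal, 0.0)          # ideal value I_i (0 if not given)
        w = 1.0 / s                        # unit weight W_i, inverse of S_i
        q = 100.0 * abs(m - i) / (s - i)   # sub-index Q_i
        weighted_sum += w * q
        weight_total += w
    return weighted_sum / weight_total


# Placeholder values for illustration only (all in the same unit, e.g. mg/L).
standard = {"Cd": 0.01, "Pb": 0.1, "Zn": 5.0}
measured = {"Cd": 0.002, "Pb": 0.05, "Zn": 1.0}
hpi = heavy_metal_pollution_index(measured, standard)
print(round(hpi, 2))
```

Under this assumed form, a sample in which every metal sits exactly at its permissible limit scores 100, which is consistent with the critical index limit of 100 mentioned in the abstract.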

377 citations


Cites background from "Multivariate statistical techniques..."

  • ...However, the rivers play a major role in assimilation or transporting municipal and industrial wastewater and runoff from agricultural and mining land (Singh et al., 2004)....


References
Book
01 Jan 1992
TL;DR: This comprehensive reference, described as the most widely read publication in the water industry, covers all aspects of USEPA-approved water analysis methods.
Abstract: Set your standards with these standard methods. This is it: the most widely read publication in the water industry, your all-inclusive reference tool. This comprehensive reference covers all aspects of USEPA-approved water analysis methods. More than 400 methods, all detailed step by step; 8 vibrant, full-color pages of aquatic algae illustrations; never-before-seen figures that will help users with toxicity testing and the identification of apparatus used in the methods; over 300 superbly illustrated figures; a new analytical tool for a number of inorganic nonmetals; improved coverage of data evaluation, sample preservation, and reagent water; and much more!

78,324 citations

Book
01 Jan 1982
TL;DR: In this article, the authors present an overview of the basic concepts of multivariate analysis, including matrix algebra and random vectors, as well as a strategy for analyzing multivariate models.
Abstract: (NOTE: Each chapter begins with an Introduction, and concludes with Exercises and References.) I. GETTING STARTED. 1. Aspects of Multivariate Analysis. Applications of Multivariate Techniques. The Organization of Data. Data Displays and Pictorial Representations. Distance. Final Comments. 2. Matrix Algebra and Random Vectors. Some Basics of Matrix and Vector Algebra. Positive Definite Matrices. A Square-Root Matrix. Random Vectors and Matrices. Mean Vectors and Covariance Matrices. Matrix Inequalities and Maximization. Supplement 2A Vectors and Matrices: Basic Concepts. 3. Sample Geometry and Random Sampling. The Geometry of the Sample. Random Samples and the Expected Values of the Sample Mean and Covariance Matrix. Generalized Variance. Sample Mean, Covariance, and Correlation as Matrix Operations. Sample Values of Linear Combinations of Variables. 4. The Multivariate Normal Distribution. The Multivariate Normal Density and Its Properties. Sampling from a Multivariate Normal Distribution and Maximum Likelihood Estimation. The Sampling Distribution of X̄ and S. Large-Sample Behavior of X̄ and S. Assessing the Assumption of Normality. Detecting Outliers and Data Cleaning. Transformations to Near Normality. II. INFERENCES ABOUT MULTIVARIATE MEANS AND LINEAR MODELS. 5. Inferences About a Mean Vector. The Plausibility of μ0 as a Value for a Normal Population Mean. Hotelling's T² and Likelihood Ratio Tests. Confidence Regions and Simultaneous Comparisons of Component Means. Large Sample Inferences about a Population Mean Vector. Multivariate Quality Control Charts. Inferences about Mean Vectors When Some Observations Are Missing. Difficulties Due To Time Dependence in Multivariate Observations. Supplement 5A Simultaneous Confidence Intervals and Ellipses as Shadows of the p-Dimensional Ellipsoids. 6. Comparisons of Several Multivariate Means. Paired Comparisons and a Repeated Measures Design. Comparing Mean Vectors from Two Populations. 
Comparison of Several Multivariate Population Means (One-Way MANOVA). Simultaneous Confidence Intervals for Treatment Effects. Two-Way Multivariate Analysis of Variance. Profile Analysis. Repeated Measures Designs and Growth Curves. Perspectives and a Strategy for Analyzing Multivariate Models. 7. Multivariate Linear Regression Models. The Classical Linear Regression Model. Least Squares Estimation. Inferences About the Regression Model. Inferences from the Estimated Regression Function. Model Checking and Other Aspects of Regression. Multivariate Multiple Regression. The Concept of Linear Regression. Comparing the Two Formulations of the Regression Model. Multiple Regression Models with Time Dependent Errors. Supplement 7A The Distribution of the Likelihood Ratio for the Multivariate Regression Model. III. ANALYSIS OF A COVARIANCE STRUCTURE. 8. Principal Components. Population Principal Components. Summarizing Sample Variation by Principal Components. Graphing the Principal Components. Large-Sample Inferences. Monitoring Quality with Principal Components. Supplement 8A The Geometry of the Sample Principal Component Approximation. 9. Factor Analysis and Inference for Structured Covariance Matrices. The Orthogonal Factor Model. Methods of Estimation. Factor Rotation. Factor Scores. Perspectives and a Strategy for Factor Analysis. Structural Equation Models. Supplement 9A Some Computational Details for Maximum Likelihood Estimation. 10. Canonical Correlation Analysis. Canonical Variates and Canonical Correlations. Interpreting the Population Canonical Variables. The Sample Canonical Variates and Sample Canonical Correlations. Additional Sample Descriptive Measures. Large Sample Inferences. IV. CLASSIFICATION AND GROUPING TECHNIQUES. 11. Discrimination and Classification. Separation and Classification for Two Populations. Classifications with Two Multivariate Normal Populations. Evaluating Classification Functions. 
Fisher's Discriminant Function: Separation of Populations. Classification with Several Populations. Fisher's Method for Discriminating among Several Populations. Final Comments. 12. Clustering, Distance Methods and Ordination. Similarity Measures. Hierarchical Clustering Methods. Nonhierarchical Clustering Methods. Multidimensional Scaling. Correspondence Analysis. Biplots for Viewing Sample Units and Variables. Procrustes Analysis: A Method for Comparing Configurations. Appendix. Standard Normal Probabilities. Student's t-Distribution Percentage Points. χ² Distribution Percentage Points. F-Distribution Percentage Points. F-Distribution Percentage Points (α = .10). F-Distribution Percentage Points (α = .05). F-Distribution Percentage Points (α = .01). Data Index. Subject Index.

11,697 citations

Journal ArticleDOI
TL;DR: In this article, the authors present an overview of the basic concepts of multivariate analysis, including matrix algebra and random vectors, as well as a strategy for analyzing multivariate models.

10,148 citations

Journal ArticleDOI
TL;DR: From a review of the available scientific information, the authors conclude that nonpoint pollution of surface waters with P and N could be reduced by reducing surplus nutrient flows in agricultural systems and processes, reducing agricultural and urban runoff by diverse methods, and reducing N emissions from fossil fuel burning, and that eutrophication can be reversed, although rates of recovery are highly variable among water bodies.
Abstract: Agriculture and urban activities are major sources of phosphorus and nitrogen to aquatic ecosystems. Atmospheric deposition further contributes as a source of N. These nonpoint inputs of nutrients are difficult to measure and regulate because they derive from activities dispersed over wide areas of land and are variable in time due to effects of weather. In aquatic ecosystems, these nutrients cause diverse problems such as toxic algal blooms, loss of oxygen, fish kills, loss of biodiversity (including species important for commerce and recreation), loss of aquatic plant beds and coral reefs, and other problems. Nutrient enrichment seriously degrades aquatic ecosystems and impairs the use of water for drinking, industry, agriculture, recreation, and other purposes. Based on our review of the scientific literature, we are certain that (1) eutrophication is a widespread problem in rivers, lakes, estuaries, and coastal oceans, caused by overenrichment with P and N; (2) nonpoint pollution, a major source of P and N to surface waters of the United States, results primarily from agriculture and urban activity, including industry; (3) inputs of P and N to agriculture in the form of fertilizers exceed outputs in produce in the United States and many other nations; (4) nutrient flows to aquatic ecosystems are directly related to animal stocking densities, and under high livestock densities, manure production exceeds the needs of crops to which the manure is applied; (5) excess fertilization and manure production cause a P surplus to accumulate in soil, some of which is transported to aquatic ecosystems; and (6) excess fertilization and manure production on agricultural lands create surplus N, which is mobile in many soils and often leaches to downstream aquatic ecosystems, and which can also volatilize to the atmosphere, redepositing elsewhere and eventually reaching aquatic ecosystems. 
If current practices continue, nonpoint pollution of surface waters is virtually certain to increase in the future. Such an outcome is not inevitable, however, because a number of technologies, land use practices, and conservation measures are capable of decreasing the flow of nonpoint P and N into surface waters. From our review of the available scientific information, we are confident that: (1) nonpoint pollution of surface waters with P and N could be reduced by reducing surplus nutrient flows in agricultural systems and processes, reducing agricultural and urban runoff by diverse methods, and reducing N emissions from fossil fuel burning; and (2) eutrophication can be reversed by decreasing input rates of P and N to aquatic ecosystems, but rates of recovery are highly variable among water bodies. Often, the eutrophic state is persistent, and recovery is slow.

5,662 citations

Book
13 Mar 1991
TL;DR: This book covers principal component analysis from getting started through scaling of data, inferential procedures, rotation, multidimensional scaling, and linear models, with appendices that include a directory of symbols and definitions for PCA and some classic examples.
Abstract: Preface. Introduction. 1. Getting Started. 2. PCA with More Than Two Variables. 3. Scaling of Data. 4. Inferential Procedures. 5. Putting It All Together: Hearing Loss I. 6. Operations with Group Data. 7. Vector Interpretation I: Simplifications and Inferential Techniques. 8. Vector Interpretation II: Rotation. 9. A Case History: Hearing Loss II. 10. Singular Value Decomposition: Multidimensional Scaling I. 11. Distance Models: Multidimensional Scaling II. 12. Linear Models I: Regression PCA of Predictor Variables. 13. Linear Models II: Analysis of Variance PCA of Response Variables. 14. Other Applications of PCA. 15. Flatland: Special Procedures for Two Dimensions. 16. Odds and Ends. 17. What is Factor Analysis Anyhow? 18. Other Competitors. Conclusion. Appendix A. Matrix Properties. Appendix B. Matrix Algebra Associated with Principal Component Analysis. Appendix C. Computational Methods. Appendix D. A Directory of Symbols and Definitions for PCA. Appendix E. Some Classic Examples. Appendix F. Data Sets Used in This Book. Appendix G. Tables. Bibliography. Author Index. Subject Index.

3,534 citations