Journal ArticleDOI

Regionalization of patterns of flow intermittence from gauging station records

TL;DR: In this article, the authors used daily flow records from 628 gauging stations on rivers with minimally modified flows distributed throughout France to predict regional patterns of flow intermittence, and used random forest (RF) models to relate the flow-regime and intermittence classifications to several environmental characteristics of the gauging station catchments.
Abstract: Understanding large-scale patterns in flow intermittence is important for effective river management. The duration and frequency of zero-flow periods are associated with the ecological characteristics of rivers and have important implications for water resources management. We used daily flow records from 628 gauging stations on rivers with minimally modified flows distributed throughout France to predict regional patterns of flow intermittence. For each station we calculated two annual time series describing flow intermittence: the frequency of zero-flow periods (consecutive days of zero flow) in each year of record (FREQ; yr−1), and the total number of zero-flow days in each year of record (DUR; days). These time series were used to calculate two indices for each station: the mean annual frequency of zero-flow periods (mFREQ; yr−1) and the mean duration of zero-flow periods (mDUR; days). Approximately 20% of stations had recorded at least one zero-flow period in their record. Dissimilarities between pairs of gauges calculated from the annual time series (FREQ and DUR) and geographic distances were weakly correlated, indicating that there was little spatial synchronization of zero flow. A flow-regime classification for the gauging stations discriminated intermittent and perennial stations, and an intermittence classification grouped intermittent stations into three classes based on the values of mFREQ and mDUR. We used random forest (RF) models to relate the flow-regime and intermittence classifications to several environmental characteristics of the gauging station catchments. The RF model of the flow-regime classification had a cross-validated Cohen's kappa of 0.47, indicating fair performance, while the intermittence classification had poor performance (cross-validated Cohen's kappa of 0.35).
Both classification models identified significant environment-intermittence associations, in particular with regional-scale climate patterns as well as catchment area, shape, and slope. However, we suggest that the fair-to-poor performance of the classification models is because intermittence is also controlled by processes operating at scales smaller than catchments, such as groundwater-table fluctuations and seepage through permeable channels. We suggest that high spatial heterogeneity in these small-scale processes partly explains the low spatial synchronization of zero flows. While 20% of gauges were classified as intermittent, the flow-regime model predicted 39% of all river segments to be intermittent, indicating that the gauging station network under-represents intermittent river segments in France. Predictions of regional patterns in flow intermittence provide useful information for applications including environmental flow setting, estimating assimilative capacity for contaminants, designing bio-monitoring programs and making preliminary predictions of the effects of climate change on flow intermittence.
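As a concrete illustration of the two indices, here is a minimal sketch (not from the paper; the function name and data layout are assumptions) of how mFREQ and mDUR could be computed from a daily flow record, treating a zero-flow period as a run of consecutive zero-flow days:

```python
from itertools import groupby

def intermittence_indices(daily_flows_by_year):
    """Hypothetical helper: compute mFREQ (mean annual number of
    zero-flow periods) and mDUR (mean zero-flow period duration, in
    days) from a dict mapping year -> list of daily flows (m^3/s)."""
    freq_per_year = []    # FREQ: zero-flow periods in each year
    period_lengths = []   # lengths of individual zero-flow periods
    for flows in daily_flows_by_year.values():
        # group consecutive days by whether flow is zero; keep zero runs
        runs = [list(g) for is_zero, g in groupby(flows, key=lambda q: q == 0)
                if is_zero]
        freq_per_year.append(len(runs))
        period_lengths.extend(len(r) for r in runs)
    mfreq = sum(freq_per_year) / len(freq_per_year)
    mdur = (sum(period_lengths) / len(period_lengths)) if period_lengths else 0.0
    return mfreq, mdur
```

A station whose record contains no zero-flow day gets mFREQ = 0 under this sketch and would fall in the perennial class.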


Citations
Journal ArticleDOI
TL;DR: For many decades, river research has been focused on perennial rivers, and recent studies suggest that alternating dry and wet conditions alter virtually all biotic communities and biogeochemical processes in these rivers as mentioned in this paper.
Abstract: For many decades, river research has been focused on perennial rivers. Intermittent river research has a shorter history, and recent studies suggest that alternating dry and wet conditions alter virtually all biotic communities and biogeochemical processes in these rivers. Intermittent rivers constitute more than half of the length of the global river network and are increasing in number and length in response to climate change, landuse alteration, and water abstraction. Our views of the roles that rivers play in maintaining biodiversity and controlling material fluxes will change substantially when intermittent rivers are fully integrated into regional and global analyses. Concepts, questions, and methodologies from lotic, lentic, and terrestrial ecology need to be integrated and applied to intermittent rivers to increase our knowledge and effective management of these rivers.

469 citations

Journal Article

335 citations

Journal ArticleDOI
01 Apr 2019-Water
TL;DR: This work popularizes RF and their variants for the practicing water scientist, and discusses related concepts and techniques, which have received less attention from the water science and hydrologic communities.
Abstract: Random forests (RF) is a supervised machine learning algorithm, which has recently started to gain prominence in water resources applications. However, existing applications are generally restricted to the implementation of Breiman’s original algorithm for regression and classification problems, while numerous developments could be also useful in solving diverse practical problems in the water sector. Here we popularize RF and their variants for the practicing water scientist, and discuss related concepts and techniques, which have received less attention from the water science and hydrologic communities. In doing so, we review RF applications in water resources, highlight the potential of the original algorithm and its variants, and assess the degree of RF exploitation in a diverse range of applications. Relevant implementations of random forests, as well as related concepts and techniques in the R programming language, are also covered.
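For readers unfamiliar with the algorithm, a minimal classification sketch with scikit-learn (synthetic data; none of the predictors or labels come from the reviewed applications) looks like this, including the out-of-bag error estimate that makes a separate validation set optional:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
# synthetic predictors for 300 "catchments"; only column 0 carries signal
X = rng.normal(size=(300, 4))
y = (X[:, 0] + 0.5 * rng.normal(size=300) > 0).astype(int)

rf = RandomForestClassifier(n_estimators=200, oob_score=True, random_state=0)
rf.fit(X, y)
oob = rf.oob_score_                              # out-of-bag accuracy estimate
cv_acc = cross_val_score(rf, X, y, cv=5).mean()  # cross-validated accuracy
```

Each tree is grown on a bootstrap sample, so roughly a third of the observations are "out of bag" for any given tree and provide an internal error estimate at no extra cost.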

283 citations

Journal ArticleDOI
07 Mar 2014-Science
TL;DR: A proposed ruling by the U.S. Environmental Protection Agency (EPA) aimed at clarifying which bodies of water that flow intermittently are protected under law has provoked conflict between developers and environmental advocates as mentioned in this paper.
Abstract: A proposed ruling by the U.S. Environmental Protection Agency (EPA), aimed at clarifying which bodies of water that flow intermittently are protected under law (1), has provoked conflict between developers and environmental advocates. Some argue that temporary streams and rivers, defined as waterways that cease to flow at some points in space and time along their course (see the figure, left) (Fig. 1) (2), are essential to the integrity of entire river networks. Others argue that full protection will be too costly. Similar concerns extend far beyond the United States. Debate over how to treat temporary waterways in water-policy frameworks is ongoing (3), particularly because some large permanent rivers are shifting to temporary because of climate change and extraction of water (4). Even without human-induced changes, flow intermittency is part of the natural hydrology of streams and rivers globally.

279 citations

Journal ArticleDOI
TL;DR: In this article, a Special Issue on med-rivers synthesizes information presented in 21 articles covering the five med-regions worldwide: the Mediterranean Basin, coastal California, central Chile, the Cape region of South Africa, and southwest and southern Australia.
Abstract: Streams and rivers in mediterranean-climate regions (med-rivers in med-regions) are ecologically unique, with flow regimes reflecting precipitation patterns. Although timing of drying and flooding is predictable, seasonal and annual intensity of these events is not. Sequential flooding and drying, coupled with anthropogenic influences make these med-rivers among the most stressed riverine habitat worldwide. Med-rivers are hotspots for biodiversity in all med-regions. Species in med-rivers require different, often opposing adaptive mechanisms to survive drought and flood conditions or recover from them. Thus, metacommunities undergo seasonal differences, reflecting cycles of river fragmentation and connectivity, which also affect ecosystem functioning. River conservation and management is challenging, and trade-offs between environmental and human uses are complex, especially under future climate change scenarios. This overview of a Special Issue on med-rivers synthesizes information presented in 21 articles covering the five med-regions worldwide: Mediterranean Basin, coastal California, central Chile, Cape region of South Africa, and southwest and southern Australia. Research programs to increase basic knowledge in less-developed med-regions should be prioritized to achieve increased abilities to better manage med-rivers.

196 citations


Cites background from "Regionalization of patterns of flow..."

  • ...For example, 59% of the total river length in the United States and 39% in France is temporary (Nadeau & Rains, 2007; Snelder et al., 2013)....

  • ...The other med-regions received human impacts later, and these increased dramatically after the arrival and settlement of Europeans between the fifteenth and eighteenth centuries (Conacher & Sala, 1998)....

References
Journal ArticleDOI
01 Oct 2001
TL;DR: Internal estimates monitor error, strength, and correlation and these are used to show the response to increasing the number of features used in the forest, and are also applicable to regression.
Abstract: Random forests are a combination of tree predictors such that each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest. The generalization error for forests converges a.s. to a limit as the number of trees in the forest becomes large. The generalization error of a forest of tree classifiers depends on the strength of the individual trees in the forest and the correlation between them. Using a random selection of features to split each node yields error rates that compare favorably to Adaboost (Y. Freund & R. Schapire, Machine Learning: Proceedings of the Thirteenth International Conference, 148–156), but are more robust with respect to noise. Internal estimates monitor error, strength, and correlation and these are used to show the response to increasing the number of features used in the splitting. Internal estimates are also used to measure variable importance. These ideas are also applicable to regression.

79,257 citations


"Regionalization of patterns of flow..." refers background or methods in this paper

  • ...Importance represents the increase in the misclassification rate that could be expected for new cases (i.e., cases not used to fit the model) if the predictor was excluded from the model (Breiman, 2001)....

  • ...Importance measures indicate the contribution of the predictors to model accuracy and are calculated from the degradation in model performance (i.e., the increase in misclassification rate) when a predictor is randomly permuted (Breiman, 2001)....

  • ...The limitations in CART models can be reduced by using RF models (Breiman, 2001)....

  • ...RF models produce a limiting value of the generalization error (Breiman, 2001)....

  • ...For a detailed description of RF models see Breiman (2001) and Cutler et al. (2007)....
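The permutation-based variable importance described in the excerpts above can be sketched as follows (synthetic data; `permutation_importance` is scikit-learn's implementation of the idea, not the paper's code):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(1)
X = rng.normal(size=(400, 3))
y = (X[:, 0] > 0).astype(int)   # only feature 0 is informative

rf = RandomForestClassifier(n_estimators=100, random_state=1).fit(X, y)
# importance = mean drop in accuracy when one predictor is shuffled,
# which breaks its association with the response
result = permutation_importance(rf, X, y, n_repeats=10, random_state=1)
importances = result.importances_mean   # feature 0 should dominate
```

Permuting the noise features barely changes accuracy, so their importance scores stay near zero while the informative feature's score is large.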

Journal ArticleDOI
Jacob Cohen
TL;DR: In this article, the author presents a procedure for having two or more judges independently categorize a sample of units and for determining the degree and significance of their agreement, as a measure of how reproducible (i.e., reliable) such nominal-scale judgments are.
Abstract: CONSIDER Table 1. It represents in its formal characteristics a situation which arises in the clinical-social-personality areas of psychology, where it frequently occurs that the only useful level of measurement obtainable is nominal scaling (Stevens, 1951, pp. 25–26), i.e. placement in a set of k unordered categories. Because the categorizing of the units is a consequence of some complex judgment process performed by a "two-legged meter" (Stevens, 1958), it becomes important to determine the extent to which these judgments are reproducible, i.e., reliable. The procedure which suggests itself is that of having two (or more) judges independently categorize a sample of units and determine the degree, significance, and

34,965 citations


"Regionalization of patterns of flow..." refers background in this paper

  • ...There are several criteria that can be used to define the best threshold (Freeman and Moisen, 2008) including maximising the percent correctly classified (PCC) and maximising Cohen’s kappa (Cohen, 1960)....

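The threshold-selection criterion mentioned in the excerpt above can be sketched like this (toy probabilities; the variable names are illustrative): scan candidate thresholds and keep the one maximising Cohen's kappa between observed and predicted classes.

```python
import numpy as np
from sklearn.metrics import cohen_kappa_score

y_true = np.array([0, 0, 0, 1, 1, 1, 1, 0])                     # observed classes
probs = np.array([0.1, 0.4, 0.35, 0.8, 0.65, 0.9, 0.55, 0.2])   # model outputs

# evaluate kappa at each candidate threshold and keep the best
thresholds = np.linspace(0.05, 0.95, 19)
kappas = [cohen_kappa_score(y_true, (probs >= t).astype(int))
          for t in thresholds]
best_threshold = thresholds[int(np.argmax(kappas))]
```

Unlike the percent correctly classified (PCC), kappa corrects for agreement expected by chance, so it is a less flattering criterion when the classes are unbalanced.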

Journal ArticleDOI
TL;DR: A representation and interpretation of the area under a receiver operating characteristic (ROC) curve obtained by the "rating" method, or by mathematical predictions based on patient characteristics, is presented and it is shown that in such a setting the area represents the probability that a randomly chosen diseased subject is (correctly) rated or ranked with greater suspicion than a random chosen non-diseased subject.
Abstract: A representation and interpretation of the area under a receiver operating characteristic (ROC) curve obtained by the "rating" method, or by mathematical predictions based on patient characteristics, is presented. It is shown that in such a setting the area represents the probability that a randomly chosen diseased subject is (correctly) rated or ranked with greater suspicion than a randomly chosen non-diseased subject. Moreover, this probability of a correct ranking is the same quantity that is estimated by the already well-studied nonparametric Wilcoxon statistic. These two relationships are exploited to (a) provide rapid closed-form expressions for the approximate magnitude of the sampling variability, i.e., standard error that one uses to accompany the area under a smoothed ROC curve, (b) guide in determining the size of the sample required to provide a sufficiently reliable estimate of this area, and (c) determine how large sample sizes should be to ensure that one can statistically detect difference...

19,398 citations


"Regionalization of patterns of flow..." refers background in this paper

  • ...ROC plots show the true positive rate (sensitivity) against the false positive rate (1 − specificity) as the threshold varies from 0 to 1 (Hanley and McNeil, 1982)....

  • ...The area under the ROC plot (AUC) is a measure of overall model performance that is independent of the threshold, with good models having an AUC near 1, while a poor model will have an AUC near 0.5 (Hanley and McNeil, 1982)....
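The probabilistic interpretation in Hanley and McNeil (1982), AUC as the probability that a randomly chosen positive case is ranked above a randomly chosen negative one, can be checked directly (toy scores, not data from the paper):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

y = np.array([0, 0, 1, 1, 0, 1])
scores = np.array([0.2, 0.6, 0.7, 0.9, 0.3, 0.4])

auc = roc_auc_score(y, scores)

# equivalent rank-based (Wilcoxon/Mann-Whitney) form: the fraction of
# (positive, negative) pairs where the positive scores higher, ties = 1/2
pos, neg = scores[y == 1], scores[y == 0]
pairs = [(p > n) + 0.5 * (p == n) for p in pos for n in neg]
pairwise_prob = sum(pairs) / len(pairs)
```

Both routes give the same number, which is why the AUC's sampling variability can be derived from the well-studied Wilcoxon statistic.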

Book
28 Jul 2013
TL;DR: In this paper, the authors describe the important ideas in these areas in a common conceptual framework, and the emphasis is on concepts rather than mathematics, with a liberal use of color graphics.
Abstract: During the past decade there has been an explosion in computation and information technology. With it have come vast amounts of data in a variety of fields such as medicine, biology, finance, and marketing. The challenge of understanding these data has led to the development of new tools in the field of statistics, and spawned new areas such as data mining, machine learning, and bioinformatics. Many of these tools have common underpinnings but are often expressed with different terminology. This book describes the important ideas in these areas in a common conceptual framework. While the approach is statistical, the emphasis is on concepts rather than mathematics. Many examples are given, with a liberal use of color graphics. It is a valuable resource for statisticians and anyone interested in data mining in science or industry. The book's coverage is broad, from supervised learning (prediction) to unsupervised learning. The many topics include neural networks, support vector machines, classification trees and boosting—the first comprehensive treatment of this topic in any book. This major new edition features many topics not covered in the original, including graphical models, random forests, ensemble methods, least angle regression and path algorithms for the lasso, non-negative matrix factorization, and spectral clustering. There is also a chapter on methods for "wide" data (p bigger than n), including multiple testing and false discovery rates. Trevor Hastie, Robert Tibshirani, and Jerome Friedman are professors of statistics at Stanford University. They are prominent researchers in this area: Hastie and Tibshirani developed generalized additive models and wrote a popular book of that title. Hastie co-developed much of the statistical modeling software and environment in R/S-PLUS and invented principal curves and surfaces. Tibshirani proposed the lasso and is co-author of the very successful An Introduction to the Bootstrap. Friedman is the co-inventor of many data-mining tools including CART, MARS, projection pursuit and gradient boosting.

19,261 citations