Journal ArticleDOI

Machine Learning Algorithms for Chromophoric Dissolved Organic Matter (CDOM) Estimation Based on Landsat 8 Images

07 Sep 2021-Remote Sensing (Multidisciplinary Digital Publishing Institute)-Vol. 13, Iss: 18, pp 3560
TL;DR: This paper examined and improved different machine learning algorithms using extensive CDOM measurements and Landsat 8 images covering different trophic states to develop a robust CDOM estimation model.
Abstract: Chromophoric dissolved organic matter (CDOM) is crucial in the biogeochemical cycle and carbon cycle of aquatic environments. However, in inland waters, remotely sensed estimates of CDOM remain challenging due to the low optical signal of CDOM and complex optical conditions. Therefore, developing efficient, practical and robust models to estimate the CDOM absorption coefficient in inland waters is essential for successful water environment monitoring and management. We examined and improved different machine learning algorithms using extensive CDOM measurements and Landsat 8 images covering different trophic states to develop a robust CDOM estimation model. The algorithms were evaluated using 111 Landsat 8 images and 1708 field measurements covering CDOM light absorption coefficients a(254) from 2.64 to 34.04 m⁻¹. Overall, the four machine learning algorithms achieved more than 70% accuracy for CDOM absorption coefficient estimation. Based on model training, validation and application to Landsat 8 OLI images, we found that Gaussian process regression (GPR) had higher stability and estimation accuracy (R² = 0.74, mean relative error (MRE) = 22.2%) than the other models. The corresponding values were R² = 0.75 and MRE = 22.5% for the backpropagation (BP) neural network, R² = 0.71 and MRE = 24.4% for random forest regression (RFR), and R² = 0.71 and MRE = 24.4% for support vector regression (SVR). In contrast, the three best empirical models had R² values below 0.56. Model accuracies for Landsat images of Lake Qiandaohu (oligo-mesotrophic) were better than those for Lake Taihu (eutrophic) because of the more complex optical conditions in eutrophic lakes. Therefore, machine learning algorithms have great potential for CDOM monitoring in inland waters based on large datasets. Our study demonstrates that machine learning algorithms can be used to map CDOM spatiotemporal patterns in inland waters.
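The abstract reports R² and mean relative error (MRE) without defining them. The sketch below shows one common way to compute both, assuming R² is the coefficient of determination and MRE is the mean of |predicted − observed| / observed expressed as a percentage. The paper may use different definitions (for example, a squared Pearson correlation), and the function names and sample values are hypothetical illustrations, not data from the study.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def mean_relative_error(observed, predicted):
    """Mean of |predicted - observed| / observed, in percent (assumed definition)."""
    ratios = [abs(p - o) / o for o, p in zip(observed, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical a(254) values in m^-1, made up for illustration only.
observed = [2.0, 4.0, 8.0]
predicted = [2.0, 5.0, 6.0]

print(f"R2  = {r_squared(observed, predicted):.3f}")
print(f"MRE = {mean_relative_error(observed, predicted):.1f}%")
```

Under these definitions the two metrics answer different questions: R² measures how much of the variance in the measurements the model explains, while MRE measures the typical size of the error relative to each measurement, which is why a model can lead on one metric and trail on the other.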
Citations
Journal ArticleDOI
TL;DR: Five semi-empirical and four machine learning models are compared to estimate chlorophyll-a concentrations via simulated reflectance using fused Gaofen-6 and Sentinel-2 spectral response functions, and the results showed that the extreme gradient boosting tree model (one of the machine learning models) is the most accurate.
Abstract: Chlorophyll-a concentrations in water bodies are one of the most important environmental evaluation indicators in monitoring the water environment. Small water bodies include headwater streams, springs, ditches, flushes, small lakes, and ponds, which represent important freshwater resources. However, the relatively narrow and fragmented nature of small water bodies makes it difficult to monitor chlorophyll-a via medium-resolution remote sensing. In the present study, we first fused Gaofen-6 (a new Chinese satellite) images to obtain 2 m resolution images with 8 bands, which proved to be as good a data source for chlorophyll-a monitoring in small water bodies as Sentinel-2. Further, we compared five semi-empirical and four machine learning models to estimate chlorophyll-a concentrations via simulated reflectance using fused Gaofen-6 and Sentinel-2 spectral response functions. The results showed that the extreme gradient boosting tree model (one of the machine learning models) is the most accurate. The mean relative error (MRE) was 9.03%, and the root-mean-square error (RMSE) was 4.5 mg/m³ for the Sentinel-2 sensor, while for the fused Gaofen-6 image, MRE was 6.73%, and RMSE was 3.26 mg/m³. Thus, both fused Gaofen-6 and Sentinel-2 could estimate the chlorophyll-a concentrations in small water bodies. The fused Gaofen-6 exhibited a higher spatial resolution, whereas Sentinel-2 exhibited a higher temporal resolution.

18 citations

Journal ArticleDOI
TL;DR: In this paper, a machine learning retrieval model based on in situ data and a mixture density network (MDN) was developed to detect the variations in chromophoric dissolved organic matter (CDOM) concentration in the Arctic Ocean.

1 citation

Journal ArticleDOI
TL;DR: Wang et al. proposed a machine learning algorithm based on Moderate Resolution Imaging Spectroradiometer (MODIS) data to estimate algal biomass, which was successfully applied to a eutrophic lake in China, Lake Taihu.
Journal ArticleDOI
TL;DR: Wang et al. used the Landsat 8 OLI product embedded in Google Earth Engine (GEE) to derive the humification index (HIX), based on excitation-emission matrices (EEMs), in lakes across China.
Journal ArticleDOI
TL;DR: In this article, the authors presented the first attempt to estimate dissolved organic carbon (DOC) in inland waters over a large-scale area using satellite data and ML methods with the newly published open-source dataset AquaSat.
References
Journal ArticleDOI
01 Oct 2001
TL;DR: Internal estimates monitor error, strength, and correlation, and these are used to show the response to increasing the number of features used in the splitting; the ideas are also applicable to regression.
Abstract: Random forests are a combination of tree predictors such that each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest. The generalization error for forests converges a.s. to a limit as the number of trees in the forest becomes large. The generalization error of a forest of tree classifiers depends on the strength of the individual trees in the forest and the correlation between them. Using a random selection of features to split each node yields error rates that compare favorably to Adaboost (Y. Freund & R. Schapire, Machine Learning: Proceedings of the Thirteenth International conference, aaa, 148–156), but are more robust with respect to noise. Internal estimates monitor error, strength, and correlation and these are used to show the response to increasing the number of features used in the splitting. Internal estimates are also used to measure variable importance. These ideas are also applicable to regression.
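The two ingredients named in this abstract, trees grown on independently drawn random samples and a random selection of features at each split, can be sketched in a few lines. The example below is a deliberately simplified illustration, not Breiman's full algorithm: each "tree" is a single-split regression stump, and all function names, parameters, and data are hypothetical.

```python
import random
from statistics import mean


def fit_stump(X, y, feature):
    """Best single split on one feature, returned as
    (sse, feature, threshold, left_mean, right_mean)."""
    values = sorted(set(row[feature] for row in X))
    best = None
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [t for row, t in zip(X, y) if row[feature] <= threshold]
        right = [t for row, t in zip(X, y) if row[feature] > threshold]
        l_mean, r_mean = mean(left), mean(right)
        sse = (sum((t - l_mean) ** 2 for t in left)
               + sum((t - r_mean) ** 2 for t in right))
        if best is None or sse < best[0]:
            best = (sse, feature, threshold, l_mean, r_mean)
    if best is None:  # feature is constant in this sample: predict the mean
        m = mean(y)
        return (sum((t - m) ** 2 for t in y), feature, float("inf"), m, m)
    return best


def fit_forest(X, y, n_trees=50, max_features=1, seed=0):
    """Grow stumps on bootstrap samples, each choosing its split
    from a random subset of the features."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]       # bootstrap sample
        Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
        candidates = rng.sample(range(d), max_features)  # random feature subset
        stumps = [fit_stump(Xb, yb, f) for f in candidates]
        forest.append(min(stumps, key=lambda s: s[0]))
    return forest


def predict_forest(forest, row):
    """Average the predictions of all stumps (the regression case)."""
    return mean(l if row[f] <= thr else r for _, f, thr, l, r in forest)


# Hypothetical data: the target depends on feature 0 only.
X = [[float(i), float((i * 7) % 5), float((i * 3) % 4)] for i in range(20)]
y = [0.0 if i < 10 else 10.0 for i in range(20)]
forest = fit_forest(X, y, n_trees=50, max_features=2, seed=42)
print(predict_forest(forest, [2.0, 1.0, 3.0]))
print(predict_forest(forest, [17.0, 1.0, 3.0]))
```

A full random forest grows deep trees and repeats the random feature selection at every node. The stump version keeps only the structure that the abstract's argument rests on: individual predictors that are each reasonably strong but only weakly correlated with one another.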

79,257 citations

Journal ArticleDOI
TL;DR: Issues such as solving SVM optimization problems, theoretical convergence, multiclass classification, probability estimates, and parameter selection are discussed in detail.
Abstract: LIBSVM is a library for Support Vector Machines (SVMs). We have been actively developing this package since the year 2000. The goal is to help users to easily apply SVM to their applications. LIBSVM has gained wide popularity in machine learning and many other areas. In this article, we present all implementation details of LIBSVM. Issues such as solving SVM optimization problems, theoretical convergence, multiclass classification, probability estimates, and parameter selection are discussed in detail.

40,826 citations

Journal ArticleDOI
TL;DR: This tutorial gives an overview of the basic ideas underlying Support Vector (SV) machines for function estimation, and includes a summary of currently used algorithms for training SV machines, covering both the quadratic programming part and advanced methods for dealing with large datasets.
Abstract: In this tutorial we give an overview of the basic ideas underlying Support Vector (SV) machines for function estimation. Furthermore, we include a summary of currently used algorithms for training SV machines, covering both the quadratic (or convex) programming part and advanced methods for dealing with large datasets. Finally, we mention some modifications and extensions that have been applied to the standard SV algorithm, and discuss the aspect of regularization from an SV perspective.
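The abstract does not spell out the "basic ideas", but standard SV regression is built around the ε-insensitive loss, which ignores residuals smaller than ε and penalises larger ones linearly. The sketch below illustrates that loss only. The function name, the default ε, and the sample values are assumptions made for illustration and are not taken from the tutorial.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Zero inside the epsilon tube, linear in the excess residual outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


# Residuals within the tube cost nothing; larger ones are penalised linearly.
for residual in (0.05, 0.1, 0.5):
    loss = epsilon_insensitive_loss(1.0, 1.0 + residual)
    print(f"residual = {residual:.2f} -> loss = {loss:.2f}")
```

Because points inside the tube contribute no loss, only the points on or outside it end up determining the fitted function; these are the support vectors.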

10,696 citations

Journal ArticleDOI
Noel Gorelick, M. Hancher, Mike J. Dixon, Simon Ilyushchenko, David Thau, Rebecca Moore
TL;DR: Google Earth Engine is a cloud-based platform for planetary-scale geospatial analysis that brings Google's massive computational capabilities to bear on a variety of high-impact societal issues including deforestation, drought, disaster, disease, food security, water management, climate monitoring and environmental protection.

6,262 citations

Journal ArticleDOI
TL;DR: For open ocean and coastal waters, a multiband quasi-analytical algorithm is developed to retrieve absorption and backscattering coefficients, as well as absorption coefficients of phytoplankton pigments and gelbstoff, based on remote-sensing reflectance models derived from the radiative transfer equation.
Abstract: For open ocean and coastal waters, a multiband quasi-analytical algorithm is developed to retrieve absorption and backscattering coefficients, as well as absorption coefficients of phytoplankton pigments and gelbstoff. This algorithm is based on remote-sensing reflectance models derived from the radiative transfer equation, and values of total absorption and backscattering coefficients are analytically calculated from values of remote-sensing reflectance. In the calculation of total absorption coefficient, no spectral models for pigment and gelbstoff absorption coefficients are used. Actually those absorption coefficients are spectrally decomposed from the derived total absorption coefficient in a separate calculation. The algorithm is easy to understand and simple to implement. It can be applied to data from past and current satellite sensors, as well as to data from hyperspectral sensors. There are only limited empirical relationships involved in the algorithm, and they are for less important properties, which implies that the concept and details of the algorithm could be applied to many data for oceanic observations. The algorithm is applied to simulated data and field data, both non-case1, to test its performance, and the results are quite promising. More independent tests with field-measured data are desired to validate and improve this algorithm.

1,375 citations

Trending Questions (2)
How does CDOM absorption compare with other methods for predicting water pollution, in terms of accuracy and efficiency?

Machine learning algorithms, particularly Gaussian process regression, outperform empirical models for accurate CDOM absorption estimation in inland waters, enhancing water pollution prediction efficiency.

What are the limitations and challenges associated with using CDOM absorption for predicting water pollution?

Challenges include low CDOM optical signal and complex optical conditions in inland waters, impacting accurate CDOM absorption coefficient estimation for water pollution prediction.