Assessment Of Spatial Hazard And Impact Of PM10 Using Machine Learning

doi:10.1109/ICCCSP49186.2020.9315283

Citations

Journal Article

Spatial assessment of PM10 hotspots using Random Forest, K-Nearest Neighbour and Naïve Bayes

Abdulwaheed Tella¹, Abdul-Lateef Balogun¹, Naheem Adebisi¹, Samsuri Abdullah²

Universiti Teknologi Petronas¹, Universiti Malaysia Terengganu²

01 Oct 2021-Atmospheric Pollution Research

TL;DR: The authors used remote sensing data, including elevation, slope, road density, Soil Adjusted Vegetation Index (SAVI), Normalized Difference Vegetation Index (NDVI), built-up index, land surface temperature, and wind speed, in a spatial assessment of PM10 hotspots.

16 citations

Proceedings Article

XGBoost Prediction of Infection of Leukemia Patients with Fever of Unknown Origin

Yan Huai Li, Yanhui Song, Fei Ma

19 Aug 2022

TL;DR: The authors applied the XGBoost algorithm to predict pathogenic infections from a big data repository of leukemia patients with fever of unknown origin (FUO) and compared its performance with other machine learning algorithms.

Abstract: Discovering the source of a patient's fever without clinically localised signs can be a daunting task for doctors. For leukemia patients with fever of unknown origin (FUO) in particular, quickly discovering the source of the fever is a formidable challenge, as fever in this population can arise in many different situations. In this paper, we applied the XGBoost algorithm to predict pathogenic infections from a big data repository of leukemia patients with FUO and compared its performance with other machine learning algorithms. Our results illustrate that these machine learning algorithms achieve good performance. In particular, XGBoost obtains the best performance, with an area under the receiver operating characteristic curve (AUC) of 0.8376 and an F1-score of 0.7034. Compared with existing literature, our experiment provides new insights for doctors to determine the cause of fever in leukemia patients.
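
The two metrics reported in the abstract above, F1-score and AUC, can be illustrated with a minimal sketch. This is not the paper's evaluation code: the function names `f1_score` and `roc_auc`, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for illustration.

```python
def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, invented for this example only.
y_true = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.4, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

f1 = f1_score(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"F1 = {f1:.4f}, AUC = {auc:.4f}")
```

AUC depends only on how the scores rank positives against negatives, while F1 depends on the chosen threshold, which is why the two are usually reported together.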

1 citation

References

Journal Article

Building Predictive Models in R Using the caret Package

Max Kuhn

10 Nov 2008-Journal of Statistical Software

TL;DR: The caret package, short for classification and regression training, contains numerous tools for developing predictive models using the rich set of models available in R to simplify model training and tuning across a wide variety of modeling techniques.

Abstract: The caret package, short for classification and regression training, contains numerous tools for developing predictive models using the rich set of models available in R. The package focuses on simplifying model training and tuning across a wide variety of modeling techniques. It also includes methods for pre-processing training data, calculating variable importance, and model visualizations. An example from computational chemistry is used to illustrate the functionality on a real data set and to benchmark the benefits of parallel processing with several types of models.

5,144 citations

"Assessment Of Spatial Hazard And Im..." refers background or methods in this paper

...The caret kit, short for classification and regression training, includes numerous resources to use the rich collection of models available in R to create predictive models [9]....
[...]
...The goal is to (i) reduce syntactic discrepancies between many of the model building and prediction functions, and (ii) develop a collection of semi-automated, rational approaches to optimize tuning parameter values for most of these models [9]....
[...]
...The R language has a rich collection of modeling functions for both classification and regression, so many that keeping track of the syntactic complexities of each function becomes increasingly difficult [9]....

Journal Article

Random Forests for Classification in Ecology

D. Richard Cutler¹, Thomas C. Edwards¹,², Karen H. Beard¹, Adele Cutler¹, Kyle Hess¹, Jacob Gibson¹, Joshua J. Lawler³

Utah State University¹, United States Geological Survey², University of Washington³

01 Nov 2007-Ecology

TL;DR: High classification accuracy was observed in all applications, as measured by cross-validation and, in the case of the lichen data, by independent test data, when comparing RF to other common classification methods.

Abstract: Classification procedures are some of the most widely used statistical methods in ecology. Random forests (RF) is a new and powerful statistical classifier that is well established in other disciplines but is relatively unknown in ecology. Advantages of RF compared to other statistical classifiers include (1) very high classification accuracy; (2) a novel method of determining variable importance; (3) ability to model complex interactions among predictor variables; (4) flexibility to perform several types of statistical data analysis, including regression, classification, survival analysis, and unsupervised learning; and (5) an algorithm for imputing missing values. We compared the accuracies of RF and four other commonly used statistical classifiers using data on invasive plant species presence in Lava Beds National Monument, California, USA, rare lichen species presence in the Pacific Northwest, USA, and nest sites for cavity nesting birds in the Uinta Mountains, Utah, USA. We observed high classification accuracy in all applications as measured by cross-validation and, in the case of the lichen data, by independent test data, when comparing RF to other common classification methods. We also observed that the variables that RF identified as most important for classifying invasive plant species coincided with expectations based on the literature.

3,368 citations

"Assessment Of Spatial Hazard And Im..." refers methods in this paper

...Here, the RF model trains each decision tree on a randomly chosen subset of the data [3]....

Proceedings Article

Bagging, boosting, and C4.5

J. R. Quinlan¹

University of Sydney¹

04 Aug 1996

TL;DR: Applying Breiman's bagging and Freund and Schapire's boosting to a system that learns decision trees, and testing on a representative collection of datasets, shows that both improve predictive accuracy, with boosting giving the greater benefit.

Abstract: Breiman's bagging and Freund and Schapire's boosting are recent methods for improving the predictive power of classifier learning systems. Both form a set of classifiers that are combined by voting, bagging by generating replicated bootstrap samples of the data, and boosting by adjusting the weights of training instances. This paper reports results of applying both techniques to a system that learns decision trees and testing on a representative collection of datasets. While both approaches substantially improve predictive accuracy, boosting shows the greater benefit. On the other hand, boosting also produces severe degradation on some datasets. A small change to the way that boosting combines the votes of learned classifiers reduces this downside and also leads to slightly better results on most of the datasets considered.
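
The bagging procedure described in the abstract above (replicated bootstrap samples, classifiers combined by voting) can be sketched as follows. This is a toy illustration, not the paper's setup: it uses one-feature decision stumps rather than C4.5 trees, it does not implement boosting, and the helper names `fit_stump`, `bagging`, and `vote` and the toy dataset are assumptions made for this example.

```python
import random
from collections import Counter


def fit_stump(points):
    """Fit a one-feature threshold classifier by exhaustive search.

    points: list of (x, y) pairs with y in {0, 1}.
    Returns a function mapping x to a predicted label.
    """
    best = None
    for t in sorted({x for x, _ in points}):
        # Try both orientations: (label if x <= t, label if x > t).
        for left, right in ((0, 1), (1, 0)):
            errors = sum((left if x <= t else right) != y for x, y in points)
            if best is None or errors < best[0]:
                best = (errors, t, left, right)
    _, t, left, right = best
    return lambda x: left if x <= t else right


def bagging(points, n_models, rng):
    """Train n_models stumps, each on a bootstrap replicate of the data."""
    models = []
    for _ in range(n_models):
        # Bootstrap replicate: sample len(points) items with replacement.
        sample = [rng.choice(points) for _ in points]
        models.append(fit_stump(sample))
    return models


def vote(models, x):
    """Combine the ensemble by unweighted majority vote."""
    return Counter(m(x) for m in models).most_common(1)[0][0]


# Hypothetical toy data: label is 1 when x > 5.
rng = random.Random(0)
data = [(x, int(x > 5)) for x in range(11)]
ensemble = bagging(data, 25, rng)
print([vote(ensemble, x) for x in (0, 3, 8, 10)])
```

An odd number of models avoids ties in a two-class vote. Boosting differs in that it reweights training instances between rounds and weights the votes, which this sketch leaves out.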

1,597 citations

Journal Article

Estimating PM2.5 Concentrations in the Conterminous United States Using the Random Forest Approach

Xuefei Hu, Jessica H. Belle, Xia Meng, Avani Wildani, Lance A. Waller, Matthew J. Strickland¹, Yang Liu

University of Nevada, Reno¹

01 Jun 2017-Environmental Science & Technology

TL;DR: A random forest model incorporating aerosol optical depth data, meteorological fields, and land use variables to estimate daily 24 h averaged ground-level PM2.5 concentrations over the conterminous United States in 2011 is developed.

Abstract: To estimate PM2.5 concentrations, many parametric regression models have been developed, while nonparametric machine learning algorithms are used less often and national-scale models are rare. In this paper, we develop a random forest model incorporating aerosol optical depth (AOD) data, meteorological fields, and land use variables to estimate daily 24 h averaged ground-level PM2.5 concentrations over the conterminous United States in 2011. Random forests are an ensemble learning method that provides predictions with high accuracy and interpretability. Our results achieve an overall cross-validation (CV) R² value of 0.80. Mean prediction error (MPE) and root mean squared prediction error (RMSPE) for daily predictions are 1.78 and 2.83 μg/m³, respectively, indicating a good agreement between CV predictions and observations. The prediction accuracy of our model is similar to those reported in previous studies using neural networks or regression models on both national and regional scales. In addition, the

379 citations

"Assessment Of Spatial Hazard And Im..." refers methods in this paper

...Random forests are an ensemble learning approach that provides predictions with high accuracy and interpretability [11]. For better decision making, several classification algorithms are used to split each node....

Journal Article

A machine learning method to estimate PM2.5 concentrations across China with remote sensing, meteorological and land use information

Gongbo Chen¹, Shanshan Li¹, Luke D. Knibbs², Nicholas A. S. Hamm³, Wei Cao⁴, Tiantian Li⁵, Jianping Guo, Hongyan Ren⁴, Michael J. Abramson¹, Yuming Guo¹

Monash University¹, University of Queensland², The University of Nottingham Ningbo China³, Chinese Academy of Sciences⁴, Chinese Center for Disease Control and Prevention⁵

15 Sep 2018-Science of The Total Environment

TL;DR: Taking advantage of a novel application of modeling framework and the most recent ground-level PM2.5 observations, the machine learning method showed higher predictive ability than previous studies.

331 citations

"Assessment Of Spatial Hazard And Im..." refers methods in this paper

...The predictive performance of random forest models was much higher than the other two standard regression models, explaining the majority of spatial variation in daily PM10 [15]....
