Journal ArticleDOI
Estimating PM2.5 Concentrations in the Conterminous United States Using the Random Forest Approach.
Xuefei Hu, Jessica H. Belle, Xia Meng, Avani Wildani, Lance A. Waller, Matthew J. Strickland, Yang Liu
TLDR
A random forest model incorporating aerosol optical depth data, meteorological fields, and land use variables to estimate daily 24 h averaged ground-level PM2.5 concentrations over the conterminous United States in 2011 is developed.
Abstract:
To estimate PM2.5 concentrations, many parametric regression models have been developed, while nonparametric machine learning algorithms are used less often and national-scale models are rare. In this paper, we develop a random forest model incorporating aerosol optical depth (AOD) data, meteorological fields, and land use variables to estimate daily 24 h averaged ground-level PM2.5 concentrations over the conterminous United States in 2011. Random forests are an ensemble learning method that provides predictions with high accuracy and interpretability. Our results achieve an overall cross-validation (CV) R2 value of 0.80. Mean prediction error (MPE) and root mean squared prediction error (RMSPE) for daily predictions are 1.78 and 2.83 μg/m3, respectively, indicating a good agreement between CV predictions and observations. The prediction accuracy of our model is similar to those reported in previous studies using neural networks or regression models on both national and regional scales. In addition, the ...
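The three agreement statistics named in the abstract can be illustrated with a minimal sketch. It assumes that R2 is the coefficient of determination (1 minus residual over total sum of squares), that MPE is the mean absolute difference between observed and predicted values, and that RMSPE is the root of the mean squared difference; the abstract does not give these formulas, so the definitions, the function name `cv_metrics`, and the sample values are illustrative assumptions, not the authors' code or data.

```python
import math


def cv_metrics(observed, predicted):
    """Return R2, MPE, and RMSPE for paired observed/predicted values.

    Assumed definitions (not taken from the paper):
      R2    = 1 - SS_res / SS_tot
      MPE   = mean absolute prediction error
      RMSPE = square root of the mean squared prediction error
    """
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("observed and predicted must be non-empty and equal in length")

    mean_obs = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R2 is undefined")

    return {
        "r2": 1.0 - ss_res / ss_tot,
        "mpe": sum(abs(o - p) for o, p in zip(observed, predicted)) / n,
        "rmspe": math.sqrt(ss_res / n),
    }


# Made-up held-out PM2.5 values in ug/m3, for illustration only.
observed = [10.0, 12.0, 8.0, 15.0]
predicted = [11.0, 11.0, 9.0, 14.0]
metrics = cv_metrics(observed, predicted)
print(metrics)
```

In a cross-validation setting, `predicted` would hold the predictions made for each observation while it was in the held-out fold, so the statistics describe out-of-sample agreement.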
Citations
Journal ArticleDOI
An ensemble-based model of PM2.5 concentration across the contiguous United States with high spatiotemporal resolution
Qian Di, Heresh Amini, Liuhua Shi, Itai Kloog, Rachel F. Silvern, James T. Kelly, M. Benjamin Sabath, Christine Choirat, Petros Koutrakis, Alexei Lyapustin, Yujie Wang, Loretta J. Mickley, Joel Schwartz
TL;DR: An ensemble model that integrated multiple machine learning algorithms and predictor variables to estimate daily PM2.5 at a resolution of 1 km × 1 km across the contiguous United States allows epidemiologists to accurately estimate the adverse health effect of PM2.5.
Journal ArticleDOI
A machine learning method to estimate PM2.5 concentrations across China with remote sensing, meteorological and land use information.
Gongbo Chen, Shanshan Li, Luke D. Knibbs, Nicholas A. S. Hamm, Wei Cao, Tiantian Li, Jianping Guo, Hongyan Ren, Michael J. Abramson, Yuming Guo
TL;DR: Taking advantage of a novel application of modeling framework and the most recent ground-level PM2.5 observations, the machine learning method showed higher predictive ability than previous studies.
Journal ArticleDOI
Estimating 1-km-resolution PM2.5 concentrations across China using the space-time random forest approach
Journal ArticleDOI
Estimation of daily PM10 and PM2.5 concentrations in Italy, 2013-2015, using a spatiotemporal land-use random-forest model.
Massimo Stafoggia, Tom Bellander, Simone Bucci, Marina Davoli, Kees de Hoogh, Francesca De' Donato, Claudio Gariazzo, Alexei Lyapustin, Paola Michelozzi, Matteo Renzi, Matteo Scortichini, Alexandra Shtein, Giovanni Viegi, Itai Kloog, Joel Schwartz
TL;DR: Predictions were equally good in capturing annual and daily PM variability, therefore they can be used as reliable exposure estimates for investigating long-term and short-term health effects.
Journal ArticleDOI
Spatiotemporal continuous estimates of PM2.5 concentrations in China, 2000–2016: A machine learning method with inputs from satellites, chemical transport model, and ground observations
TL;DR: A new machine learning (ML) model with high-dimensional expansion (HD-expansion) of numerous predictors (including AOD and other satellite covariates, meteorological variables and CTM simulations) is developed to predict daily PM2.5 concentrations during 2000-2016 across China and estimate long-term trends in PM2.5 for the period.
References
Journal ArticleDOI
Random Forests
TL;DR: Internal estimates monitor error, strength, and correlation and these are used to show the response to increasing the number of features used in the forest, and are also applicable to regression.
Classification and Regression by randomForest
Andy Liaw, Matthew C. Wiener
TL;DR: Random forests are proposed, which add an additional layer of randomness to bagging and are robust against overfitting, and the randomForest package provides an R interface to the Fortran programs by Breiman and Cutler.
Journal ArticleDOI
Gene selection and classification of microarray data using random forest
TL;DR: It is shown that random forest has comparable performance to other classification methods, including DLDA, KNN, and SVM, and that the new gene selection procedure yields very small sets of genes (often smaller than alternative methods) while preserving predictive accuracy.
Journal ArticleDOI
Conditional variable importance for random forests
TL;DR: A new, conditional permutation scheme is developed for the computation of the variable importance measure that reflects the true impact of each predictor variable more reliably than the original marginal approach.
Journal ArticleDOI
The Collection 6 MODIS aerosol products over land and ocean
Robert C. Levy, Shana Mattoo, L. A. Munchak, Lorraine A. Remer, Andrew M. Sayer, Falguni Patadia, N. C. Hsu
TL;DR: This paper describes the Collection 6 (C6) algorithm, which retrieves aerosol optical depth (AOD) and aerosol size parameters from MODIS-observed spectral reflectance.