
Estimating PM2.5 Concentrations in the Conterminous United States Using the Random Forest Approach.

TLDR
A random forest model incorporating aerosol optical depth data, meteorological fields, and land use variables to estimate daily 24 h averaged ground-level PM2.5 concentrations over the conterminous United States in 2011 is developed.
Abstract
To estimate PM2.5 concentrations, many parametric regression models have been developed, while nonparametric machine learning algorithms are used less often and national-scale models are rare. In this paper, we develop a random forest model incorporating aerosol optical depth (AOD) data, meteorological fields, and land use variables to estimate daily 24 h averaged ground-level PM2.5 concentrations over the conterminous United States in 2011. Random forests are an ensemble learning method that provides predictions with high accuracy and interpretability. Our results achieve an overall cross-validation (CV) R2 value of 0.80. Mean prediction error (MPE) and root mean squared prediction error (RMSPE) for daily predictions are 1.78 and 2.83 μg/m3, respectively, indicating a good agreement between CV predictions and observations. The prediction accuracy of our model is similar to those reported in previous studies using neural networks or regression models on both national and regional scales. In addition, the

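The abstract reports three cross-validation statistics: R2, MPE, and RMSPE. As a rough illustration of how such figures can be computed from paired observations and held-out predictions, here is a minimal Python sketch. It is not the authors' code. The function name `cv_metrics` is hypothetical, R2 is assumed to be the coefficient of determination (1 minus residual over total sum of squares), and MPE is assumed to be the mean absolute difference between prediction and observation. The paper may define either quantity differently.

```python
import math


def cv_metrics(observed, predicted):
    """Summarise agreement between observations and held-out predictions.

    Assumed definitions (not confirmed by the abstract):
      r2    = 1 - SS_res / SS_tot
      mpe   = mean absolute prediction error
      rmspe = root mean squared prediction error
    """
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("observed and predicted must be non-empty and equal in length")

    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    mean_obs = sum(observed) / n

    ss_res = sum(e * e for e in errors)                  # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("observations have zero variance; R2 is undefined")

    return {
        "r2": 1.0 - ss_res / ss_tot,
        "mpe": sum(abs(e) for e in errors) / n,
        "rmspe": math.sqrt(ss_res / n),
    }


if __name__ == "__main__":
    # Made-up daily PM2.5 values in ug/m3, purely for illustration.
    obs = [10.0, 12.0, 8.0, 15.0]
    pred = [11.0, 11.0, 9.0, 13.0]
    print(cv_metrics(obs, pred))
```

In a real cross-validation, `predicted` would hold each monitor-day's prediction from the fold in which that record was withheld from training, so the statistics reflect out-of-sample performance.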

Citations

An ensemble-based model of PM2.5 concentration across the contiguous United States with high spatiotemporal resolution

TL;DR: An ensemble model that integrated multiple machine learning algorithms and predictor variables to estimate daily PM2.5 at a resolution of 1 km × 1 km across the contiguous United States allows epidemiologists to accurately estimate the adverse health effect of PM2.5.

Spatiotemporal continuous estimates of PM2.5 concentrations in China, 2000–2016: A machine learning method with inputs from satellites, chemical transport model, and ground observations

TL;DR: A new machine learning (ML) model with high-dimensional expansion (HD-expansion) of numerous predictors (including AOD and other satellite covariates, meteorological variables and CTM simulations) is developed to predict daily PM2.5 concentrations during 2000–2016 across China and estimate long-term trends in PM2.5 for the period.
References

Random Forests

TL;DR: Internal estimates monitor error, strength, and correlation and these are used to show the response to increasing the number of features used in the forest, and are also applicable to regression.

Classification and Regression by randomForest

TL;DR: Random forests are proposed, which add an additional layer of randomness to bagging and are robust against overfitting; the randomForest package provides an R interface to the Fortran programs by Breiman and Cutler.

Gene selection and classification of microarray data using random forest

TL;DR: It is shown that random forest has comparable performance to other classification methods, including DLDA, KNN, and SVM, and that the new gene selection procedure yields very small sets of genes (often smaller than alternative methods) while preserving predictive accuracy.

Conditional variable importance for random forests

TL;DR: A new, conditional permutation scheme is developed for the computation of the variable importance measure that reflects the true impact of each predictor variable more reliably than the original marginal approach.

The Collection 6 MODIS aerosol products over land and ocean

TL;DR: This paper describes the Collection 6 (C6) algorithm for retrieving aerosol optical depth (AOD) and aerosol size parameters from MODIS-observed spectral reflectance.