Open Access, Journal Article

A comparison of machine learning methods for ozone pollution prediction

Fouzi Harrou, +1 more
15 May 2023, Vol. 10, Iss. 1, pp. 1-31
TLDR
This paper evaluates the predictive performance of nineteen machine learning models for ozone pollution prediction and investigates using time-lagged measurements to improve prediction accuracy, showing that dynamic models using time-lagged data outperform static and reduced machine learning models.
Abstract
Precise and efficient ozone ($\mathrm{O}_3$) concentration prediction is crucial for weather monitoring and environmental policymaking due to the harmful effects of high $\mathrm{O}_3$ pollution levels on human health and ecosystems. However, the complexity of $\mathrm{O}_3$ formation mechanisms in the troposphere presents a significant challenge in modeling $\mathrm{O}_3$ accurately and quickly, especially in the absence of a process model. Data-driven machine-learning techniques have demonstrated promising performance in modeling air pollution, mainly when a process model is unavailable. This study evaluates the predictive performance of nineteen machine learning models for ozone pollution prediction. Specifically, we assess how incorporating features using Random Forest affects $\mathrm{O}_3$ concentration prediction and investigate using time-lagged measurements to improve prediction accuracy. Air pollution and meteorological data collected at King Abdullah University of Science and Technology are used. Results show that dynamic models using time-lagged data outperform static and reduced machine learning models. Incorporating time-lagged data improves the accuracy of machine learning models by 300% and 200%, respectively, compared to static and reduced models, under the RMSE metric. Importantly, the best dynamic model with time-lagged information requires only 0.01 s, indicating its practical use. The Diebold-Mariano test, a statistical test used to compare the forecasting accuracy of models, is also conducted.
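
The three ingredients named in the abstract (time-lagged inputs, the RMSE metric, and the Diebold-Mariano test) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names are hypothetical, the series is synthetic rather than the KAUST measurements, and the two forecasts are simple stand-ins (a constant training-mean forecast for a "static" model, a last-value persistence forecast for a "dynamic" model), not any of the nineteen models evaluated in the paper. The Diebold-Mariano statistic is the simplified one-step-ahead form with squared-error loss; it ignores autocorrelation in the loss differential and applies no small-sample correction.

```python
import math


def make_lagged(series, n_lags):
    """Pair each target value with the n_lags measurements that precede it."""
    X = [series[t - n_lags:t] for t in range(n_lags, len(series))]
    y = series[n_lags:]
    return X, y


def rmse(y_true, y_pred):
    """Root mean squared error between observations and predictions."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))


def diebold_mariano(y_true, pred_a, pred_b):
    """Simplified one-step-ahead DM test with squared-error loss.

    Returns (statistic, two-sided p-value from the normal approximation).
    A negative statistic means forecast A has the lower average loss.
    """
    d = [(y - a) ** 2 - (y - b) ** 2 for y, a, b in zip(y_true, pred_a, pred_b)]
    n = len(d)
    mean_d = sum(d) / n
    var_d = sum((x - mean_d) ** 2 for x in d) / n
    stat = mean_d / math.sqrt(var_d / n)
    p_value = math.erfc(abs(stat) / math.sqrt(2))
    return stat, p_value


# Synthetic signal with a 24-step cycle (illustrative only, not real ozone data).
series = [40 + 20 * math.sin(2 * math.pi * t / 24) + 3 * math.sin(1.7 * t)
          for t in range(240)]

X, y = make_lagged(series, n_lags=3)
split = 168  # first 168 samples for "training", the rest for evaluation
X_test, y_train, y_test = X[split:], y[:split], y[split:]

# Stand-in "static" forecast: the training mean, with no use of recent history.
static_pred = [sum(y_train) / len(y_train)] * len(y_test)
# Stand-in "dynamic" forecast: persistence, i.e. the most recent lagged value.
dynamic_pred = [row[-1] for row in X_test]

rmse_static = rmse(y_test, static_pred)
rmse_dynamic = rmse(y_test, dynamic_pred)
dm_stat, dm_p = diebold_mariano(y_test, dynamic_pred, static_pred)

print(f"RMSE static:  {rmse_static:.3f}")
print(f"RMSE dynamic: {rmse_dynamic:.3f}")
print(f"DM statistic: {dm_stat:.3f}, p-value: {dm_p:.3g}")
```

In practice the lagged rows returned by `make_lagged` would be fed to a regression model alongside the meteorological inputs, and the DM variance would normally be estimated with autocovariance terms when forecasts span more than one step.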

