Journal Article

Spatial and temporal air quality pattern recognition using environmetric techniques: a case study in Malaysia.

TL;DR: The necessity and usefulness of environmetric techniques for the interpretation of large datasets aiming to obtain better information about air quality patterns based on spatial and temporal characterizations at the selected air monitoring stations are presented.
Abstract: The objective of this study is to identify spatial and temporal patterns in the air quality at three selected Malaysian air monitoring stations based on an eleven-year database (January 2000-December 2010). Four statistical methods, Discriminant Analysis (DA), Hierarchical Agglomerative Cluster Analysis (HACA), Principal Component Analysis (PCA) and Artificial Neural Networks (ANNs), were selected to analyze the datasets of five air quality parameters, namely: SO2, NO2, O3, CO and particulate matter with a diameter size of below 10 μm (PM10). The three selected air monitoring stations share the characteristic of being located in highly urbanized areas and are surrounded by a number of industries. The DA results show that spatial characterizations allow successful discrimination between the three stations, while HACA shows the temporal pattern from the monthly and yearly factor analysis which correlates with severe haze episodes that have happened in this country at certain periods of time. The PCA results show that the major source of air pollution is mostly due to the combustion of fossil fuel in motor vehicles and industrial activities. The spatial pattern recognition (S-ANN) results show a better prediction performance in discriminating between the regions, with an excellent percentage of correct classification compared to DA. This study presents the necessity and usefulness of environmetric techniques for the interpretation of large datasets aiming to obtain better information about air quality patterns based on spatial and temporal characterizations at the selected air monitoring stations.

Keywords: Air quality; Multivariate analysis.
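
To illustrate how one of the techniques named in the abstract groups time periods, the following is a minimal sketch of hierarchical agglomerative cluster analysis (HACA) applied to monthly mean pollutant values. Everything in it is an assumption made for illustration: the monthly PM10 and CO values are synthetic (not the study's data), and the z-score standardization, Euclidean distance and average linkage are choices the abstract does not specify. The names `MONTHLY_MEANS`, `standardize`, `haca` and `cluster_months` are hypothetical.

```python
import math
import statistics

# Synthetic (PM10 in ug/m3, CO in ppm) monthly means, invented for illustration only.
MONTHLY_MEANS = {
    "Jan": (45.0, 0.8), "Feb": (47.0, 0.8), "Mar": (50.0, 0.9),
    "Apr": (48.0, 0.9), "May": (52.0, 0.9), "Jun": (70.0, 1.3),
    "Jul": (75.0, 1.4), "Aug": (80.0, 1.5), "Sep": (72.0, 1.3),
    "Oct": (50.0, 0.9), "Nov": (46.0, 0.8), "Dec": (44.0, 0.8),
}


def standardize(rows):
    """Z-score each column so that PM10 and CO contribute on the same scale."""
    columns = list(zip(*rows))
    means = [statistics.mean(col) for col in columns]
    stdevs = [statistics.pstdev(col) for col in columns]
    return [
        tuple((value - m) / s for value, m, s in zip(row, means, stdevs))
        for row in rows
    ]


def average_linkage(cluster_a, cluster_b, points):
    """Mean pairwise Euclidean distance between the members of two clusters."""
    total = sum(
        math.dist(points[i], points[j]) for i in cluster_a for j in cluster_b
    )
    return total / (len(cluster_a) * len(cluster_b))


def haca(points, n_clusters):
    """Start from singleton clusters and repeatedly merge the closest pair."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None  # (distance, index_a, index_b) of the closest pair so far
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = average_linkage(clusters[a], clusters[b], points)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


def cluster_months(monthly_means, n_clusters):
    """Return the month names grouped into n_clusters clusters."""
    names = list(monthly_means)
    points = standardize(list(monthly_means.values()))
    return [[names[i] for i in c] for c in haca(points, n_clusters)]


if __name__ == "__main__":
    for group in cluster_months(MONTHLY_MEANS, 2):
        print(group)
```

With these synthetic values, the months given elevated PM10 and CO (June to September) should fall into their own cluster, which is the kind of temporal grouping the abstract relates to haze episodes. A real analysis would use the measured station data and would typically inspect the full dendrogram rather than a fixed number of clusters.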
Citations
Journal Article
TL;DR: In this article, a combination of principal component analysis (PCA) and artificial neural networks (ANN) was developed to determine its predictive ability for the air pollutant index (API).
Abstract: This study focused on the pattern recognition of Malaysian air quality based on the data obtained from the Malaysian Department of Environment (DOE). Eight air quality parameters in ten monitoring stations in Malaysia for 7 years (2005–2011) were gathered. Principal component analysis (PCA) in the environmetric approach was used to identify the sources of pollution in the study locations. The combination of PCA and artificial neural networks (ANN) was developed to determine its predictive ability for the air pollutant index (API). The PCA has identified that CH4, NmHC, THC, O3, and PM10 are the most significant parameters. The PCA-ANN showed better predictive ability in the determination of API with fewer variables, with R2 and root mean square error (RMSE) values of 0.618 and 10.017, respectively. The work has demonstrated the importance of historical data in sampling plan strategies to achieve desired research objectives, as well as to highlight the possibility of determining the optimum number of sampling parameters, which in turn will reduce costs and time of sampling.

146 citations


Cites background or methods or result from "Spatial and temporal air quality pa..."

  • ...Previous studies done by Mutalib et al. (2013), Alkasassbeh et al. (2013), Brunelli et al. (2007), Tecer (2007), Perez and Reyes (2006), and Niska et al. (2004, 2005) prove that ANN is very well suited for solving environmental problems, especially in the analysis of air pollution....

    [...]

  • ...Once the lack of compliance is determined, the data can be used to advise or caution the decision makers or planners to avoid health effects (Kamal et al. 2006; Mutalib et al. 2013)....

    [...]

  • ...Two major air pollutants are PM10 and O3, particularly in the urban and suburban areas in Malaysia (Dominick et al. 2012; Latif et al. 2012; Mutalib et al. 2013), and have been recognized as two of the major concerns that have high potential for deleterious effects on health (Mahiyudin et al. 2013;…...

    [...]

  • ...…maintain air quality and protect public health, the Malaysian Department of Environment (DOE) has set up the API and established national air quality standards through the Recommended Malaysian Air Quality Guidelines (RMAQG) for each of these pollutants (Mutalib et al. 2013; Dominick et al. 2012)....

    [...]

  • ...The applications of different environmetric techniques such as cluster analysis (CA), principal component analysis (PCA), factor analysis (FA), and discriminant analysis (DA) have been extensively applied in many scientific studies over the last few years (Mutalib et al. 2013; Singh et al. 2004, 2005), especially in air quality monitoring....

    [...]

Journal Article
TL;DR: In this article, the authors established the definition of air pollution, the motivation to study it, and its impacts and sources of pollution and climate change in Malaysia, and discussed the air quality monitoring system in Malaysia and compared Malaysian ambient air quality standards with global standards.
Abstract: Air pollution is strongly tied to climate change. Industrialization and fossil fuel combustion are the main contributors leading to climate change, also being significant sources of air pollution. Malaysia is a developing country with a focus on industrialization. The preference for using private cars is a common practice in Malaysia, resulting in the after-effects of haze and transboundary air pollution. Hence, air pollution has become a severe issue in Malaysia in recent times. Exposure to air pollutants such as ozone and airborne particles is associated with increases in hospital admissions and mortality. Over the past few years, the focus of research has been moving towards air quality and the impacts of air pollution on health in Malaysia. In this study, we establish the definition of air pollution, the motivation to study it, and its impacts and sources of air pollution and climate change. We discuss the air quality monitoring system in Malaysia and compare Malaysian ambient air quality standards with global standards. We also look comprehensively at the health impacts of air pollution globally and in the Malaysian context. We discuss where the health impact studies in Malaysia are lacking and what the gaps in the research are. The role of the Malaysian government concerning air pollution and its impacts is discussed. Lastly, we look into the future work and research opportunities with a focus on engineering, estimation, predictive models and lack of research projects.

67 citations

Journal Article
21 Mar 2018 - Sensors
TL;DR: A reliable, efficient, and cost-effective internet of things (IoT) system for air quality monitoring with newly added features of assessment and pollutant prediction for mine environmental safety by quickly assessing and predicting mine air quality.
Abstract: The implementation of wireless sensor networks (WSNs) for monitoring the complex, dynamic, and harsh environment of underground coal mines (UCMs) is sought around the world to enhance safety. However, previously developed smart systems are limited to monitoring or, in a few cases, can report events. Therefore, this study introduces a reliable, efficient, and cost-effective internet of things (IoT) system for air quality monitoring with newly added features of assessment and pollutant prediction. This system is comprised of sensor modules, communication protocols, and a base station, running Azure Machine Learning (AML) Studio over it. Arduino-based sensor modules with eight different parameters were installed at separate locations of an operational UCM. Based on the sensed data, the proposed system assesses mine air quality in terms of the mine environment index (MEI). Principal component analysis (PCA) identified CH4, CO, SO2, and H2S as the most influencing gases significantly affecting mine air quality. The results of PCA were fed into the ANN model in AML studio, which enabled the prediction of MEI. An optimum number of neurons were determined for both actual input and PCA-based input parameters. The results showed a better performance of the PCA-based ANN for MEI prediction, with R2 and RMSE values of 0.6654 and 0.2104, respectively. Therefore, the proposed Arduino and AML-based system enhances mine environmental safety by quickly assessing and predicting mine air quality.

54 citations


Cites background from "Spatial and temporal air quality pa..."

  • ...However, the complex and non-linear behaviors of air quality variables are beyond the capabilities of a simple mathematical prediction formula [10]....

    [...]

  • ...Several air quality studies [10,22] have utilized ANN to simulate PM10 concentrations, air quality prediction, and other environmental issues....

    [...]

Journal Article
TL;DR: In this paper, the authors used principal component analysis (PCA) and ANN to predict the air pollutant index (API) at seven selected Malaysian air monitoring stations in the southern region of Peninsular Malaysia based on a seven-year database.
Abstract: This paper describes the application of principal component analysis (PCA) and artificial neural network (ANN) to predict the air pollutant index (API) at seven selected Malaysian air monitoring stations in the southern region of Peninsular Malaysia based on a seven-year database (2005-2011). Feed-forward ANN was used as a prediction method. The feed-forward ANN analysis demonstrated that the rotated principal component scores (RPCs) were the best input parameters to predict API. From the 4 RPCs, only 10 (CO, O3, PM10, NO2, CH4, NmHC, THC, wind direction, humidity and ambient temp) out of 12 prediction variables were the most significant parameters to predict API. The results showed that the ANN method can be applied successfully as a tool for decision making and problem solving for better atmospheric management.

48 citations

Journal Article
TL;DR: In this paper, the effectiveness of hierarchical agglomerative cluster analysis (HACA), discriminant analysis (DA), principal component analysis (PCA), factor analysis (FA), and multiple linear regressions (MLR) for assessing air quality data and recognizing air pollution source patterns is demonstrated.
Abstract: This study intends to show the effectiveness of hierarchical agglomerative cluster analysis (HACA), discriminant analysis (DA), principal component analysis (PCA), factor analysis (FA) and multiple linear regressions (MLR) for assessing the air quality data and air pollution sources pattern recognition. The data sets of air quality for 12 months (January–December) in 2007, consisting of 14 stations around Peninsular Malaysia with 14 parameters (168 datasets) were applied. Three significant clusters - low pollution source (LPS) region, moderate pollution source (MPS) region, and slightly high pollution source (SHPS) region were generated via HACA. Forward stepwise DA managed to discriminate 8 variables, whereas backward stepwise DA managed to discriminate 9 out of 14 variables. The method of PCA and FA has identified 8 pollutants in LPS and SHPS respectively, as well as 11 pollutants in MPS region, where most of the pollutants are expected to derive from industrial activities, transportation and agriculture systems. Four MLR models show that PM10 is categorized as the primary pollutant in Malaysia. From the study, it can be stipulated that the application of chemometric techniques can disclose meaningful information on the spatial variability of a large and complex air quality data set. A clearer view of the air quality and a novel design of air quality monitoring network for better management of air pollution can be achieved.

32 citations


Cites background from "Spatial and temporal air quality pa..."

  • ...However, the highest values of R2 (which was near to 1) will be declared as the best linear model (Norusis 1990; Mutalib et al., 2013; Azid et al., 2013, 2014a)....

    [...]

References
Journal Article
TL;DR: This study illustrates the usefulness of multivariate statistical techniques for analysis and interpretation of complex data sets, and in water quality assessment, identification of pollution sources/factors and understanding temporal/spatial variations in water quality for effective river water quality management.
Abstract: Multivariate statistical techniques, such as cluster analysis (CA), principal component analysis (PCA), factor analysis (FA) and discriminant analysis (DA), were applied for the evaluation of temporal/spatial variations and the interpretation of a large complex water quality data set of the Fuji river basin, generated during 8 years (1995–2002) monitoring of 12 parameters at 13 different sites (14 976 observations). Hierarchical cluster analysis grouped 13 sampling sites into three clusters, i.e., relatively less polluted (LP), medium polluted (MP) and highly polluted (HP) sites, based on the similarity of water quality characteristics. Factor analysis/principal component analysis, applied to the data sets of the three different groups obtained from cluster analysis, resulted in five, five and three latent factors explaining 73.18, 77.61 and 65.39% of the total variance in water quality data sets of LP, MP and HP areas, respectively. The varifactors obtained from factor analysis indicate that the parameters responsible for water quality variations are mainly related to discharge and temperature (natural), organic pollution (point source: domestic wastewater) in relatively less polluted areas; organic pollution (point source: domestic wastewater) and nutrients (non-point sources: agriculture and orchard plantations) in medium polluted areas; and organic pollution and nutrients (point sources: domestic wastewater, wastewater treatment plants and industries) in highly polluted areas in the basin. Discriminant analysis gave the best results for both spatial and temporal analysis. It provided an important data reduction as it uses only six parameters (discharge, temperature, dissolved oxygen, biochemical oxygen demand, electrical conductivity and nitrate nitrogen), affording more than 85% correct assignations in temporal analysis, and seven parameters (discharge, temperature, biochemical oxygen demand, pH, electrical conductivity, nitrate nitrogen and ammoniacal nitrogen), affording more than 81% correct assignations in spatial analysis, of three different sampling sites of the basin. Therefore, DA allowed a reduction in the dimensionality of the large data set, delineating a few indicator parameters responsible for large variations in water quality. Thus, this study illustrates the usefulness of multivariate statistical techniques for analysis and interpretation of complex data sets, and in water quality assessment, identification of pollution sources/factors and understanding temporal/spatial variations in water quality for effective river water quality management.

1,481 citations

Journal Article
TL;DR: This study presents necessity and usefulness of multivariate statistical techniques for evaluation and interpretation of large complex data sets with a view to get better information about the water quality and design of monitoring network for effective management of water resources.

1,429 citations

Journal Article
TL;DR: The necessity and usefulness of multivariate statistical assessment of large and complex databases in order to get better information about the quality of surface water, the design of sampling and analytical protocols and the effective pollution control/management of the surface waters is presented.

1,136 citations

Journal Article
TL;DR: In this paper, multivariate statistical techniques, such as cluster analysis, factor analysis, principal component analysis and discriminant analysis, were applied to the data set on water quality of the Gomti river.

839 citations

Journal Article
TL;DR: In this paper, the authors present a methodology for estimating the seasonal and interannual variation of biomass burning designed for use in global chemical transport models, using the Total Ozone Mapping Spectrometer (TOMS) Aerosol Index (AI) data set.
Abstract: We present a methodology for estimating the seasonal and interannual variation of biomass burning designed for use in global chemical transport models. The average seasonal variation is estimated from 4 years of fire-count data from the Along Track Scanning Radiometer (ATSR) and 1-2 years of similar data from the Advanced Very High Resolution Radiometer (AVHRR) World Fire Atlases. We use the Total Ozone Mapping Spectrometer (TOMS) Aerosol Index (AI) data product as a surrogate to estimate interannual variability in biomass burning for six regions: Southeast Asia, Indonesia and Malaysia, Brazil, Central America and Mexico, Canada and Alaska, and Asiatic Russia. The AI data set is available from 1979 to the present with an interruption in satellite observations from mid-1993 to mid-1996; this data gap is filled where possible with estimates of area burned from the literature for different regions. Between August 1996 and July 2000, the ATSR fire-counts are used to provide specific locations of emissions and a record of interannual variability throughout the world. We use our methodology to estimate mean seasonal and interannual variations for emissions of carbon monoxide from biomass burning, and we find that no trend is apparent in these emissions over the last two decades, but that there is significant interannual variability.

678 citations
