Ensemble Deep Learning For Classification Of Pollution Peaks
Dalarna University, School of Information and Engineering, Microdata Analysis. ORCID iD: 0000-0003-0525-7617
Grupo de Biodiversidad Medio Ambiente y Salud (BIOMAS), Universidad de Las Américas, Ecuador.
Dalarna University, School of Information and Engineering, Microdata Analysis. School of Information and Engineering, Dalarna University, Sweden.
2022 (English). In: Air and Water Pollution XXX / [ed] S. Mambretti, Polytechnic of Milan, Italy and Member of WIT Board of Directors; J. Longhurst and J. Barnes, University of the West of England, UK, 2022, p. 25-36. Conference paper, Published paper (Refereed)
Abstract [en]

The concentration peaks of atmospheric pollutants are the most challenging and important phenomena in air quality forecasting. The fact that these elevated levels of pollution do not seem to follow any specific pattern explains why current models still struggle to accurately predict these events, which are harmful to human health. The present study tackles this issue by testing several supervised learning methods to discriminate between peak and non-peak concentrations of five contaminants: NO2, CO, SO2, PM2.5, and O3. The classification performance of ensemble decision tree models (gradient boosting machine, GBM) and ensemble deep learning (EDL) models is compared. The results reveal that the EDL models outperform the GBM models. An analysis of variable importance using SHapley Additive exPlanations (SHAP) shows that both temporal and meteorological features have an impact on the proposed models. In particular, time of day and wind speed are the most important features for explaining the performance of the EDL models.
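The abstract does not state how the ensemble members are combined. As a minimal sketch, assuming a soft-voting scheme (averaging each member's predicted peak probability and thresholding the mean), the combination step could look as follows. The function name `soft_vote`, the 0.5 threshold, and the probability values are all hypothetical and are not taken from the paper.

```python
from statistics import mean


def soft_vote(member_probs, threshold=0.5):
    """Average peak probabilities across ensemble members, then threshold.

    member_probs[m][i] is the probability, predicted by member m,
    that sample i is a concentration peak.
    Returns (averaged probabilities, 0/1 labels).
    """
    if not member_probs:
        raise ValueError("need at least one ensemble member")
    n_samples = len(member_probs[0])
    if any(len(p) != n_samples for p in member_probs):
        raise ValueError("all members must score the same samples")
    # zip(*...) regroups the scores sample by sample.
    averaged = [mean(scores) for scores in zip(*member_probs)]
    labels = [1 if p >= threshold else 0 for p in averaged]
    return averaged, labels


# Hypothetical outputs of three trained networks for four hourly samples.
probs = [
    [0.9, 0.2, 0.6, 0.1],
    [0.8, 0.4, 0.4, 0.2],
    [0.7, 0.3, 0.2, 0.3],
]
averaged, labels = soft_vote(probs)
print(labels)
```

Averaging probabilities rather than counting hard votes keeps each member's confidence in play: the third sample is scored 0.6 by one member but is still labelled non-peak because the other two are doubtful.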

Place, publisher, year, edition, pages
2022. p. 25-36
Series
WIT Transactions on Ecology and the Environment, E-ISSN 1743-3541 ; 259
Keywords [en]
machine learning, deep learning, air pollution forecasting, data-driven modelling
National Category
Earth and Related Environmental Sciences
Identifiers
URN: urn:nbn:se:du-42820
DOI: 10.2495/AWP220031
Scopus ID: 2-s2.0-85147800542
ISBN: 978-1-78466-467-1 (print)
ISBN: 978-1-78466-468-8 (electronic)
OAI: oai:DiVA.org:du-42820
DiVA, id: diva2:1703079
Conference
30th International Conference on Modelling, Monitoring and Management of Air and Water Pollution
Available from: 2022-10-12 Created: 2022-10-12 Last updated: 2025-10-09
In thesis
1. Machine Learning Approaches to Develop Weather Normalize Models for Urban Air Quality
2024 (English). Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]

According to the World Health Organization, almost all of the human population (99%) lives in 117 countries with over 6000 cities where air pollutant concentrations exceed recommended thresholds. The most common air pollutants that affect human lives, the so-called criteria pollutants, are particulate matter (PM) and gas-phase pollutants (SO2, CO, NO2, O3 and others). Therefore, many countries or regions worldwide have imposed regulations or interventions to reduce these effects. Whenever an intervention occurs, air quality changes due to changes in ambient factors, such as weather characteristics and human activities. One approach for assessing the effects of interventions or events on air quality is the Weather Normalized Model (WNM). However, current deterministic models struggle to accurately capture the complex, non-linear relationship between pollutant concentrations and their emission sources. Hence, the primary objective of this thesis is to examine the power of machine learning (ML) and deep learning (DL) techniques to develop and improve WNMs. Subsequently, these enhanced WNMs are employed to assess the impact of events on air quality. Furthermore, these ML/DL-based WNMs can serve as valuable tools for conducting exploratory data analysis (EDA) to uncover the correlations between independent variables (meteorological and temporal features) and air pollutant concentrations within the models.

DL techniques have demonstrated their efficiency and high performance in different fields, such as natural language processing, image processing, biology, and environment. Therefore, several appropriate DL architectures (Long Short-Term Memory - LSTM, Recurrent Neural Network - RNN, Bidirectional Recurrent Neural Network - BiRNN, Convolutional Neural Network - CNN, and Gated Recurrent Unit - GRU) were tested to develop the WNMs presented in Paper I. In a comparison of these DL architectures and the Gradient Boosting Machine (GBM), the LSTM-based methods (LSTM, BiRNN) obtained superior results in developing WNMs. The study also showed that our DL-based WNMs could capture the correlations between input variables (meteorological and temporal variables) and five criteria contaminants (SO2, CO, NO2, O3 and PM2.5); the SHapley Additive exPlanations (SHAP) library allowed us to identify the significant factors in these models. Additionally, these WNMs were used to assess the air quality changes during COVID-19 lockdown periods in Ecuador.

The existing normalized models operate on the original units of pollutants and are designed for assessing pollutant concentrations under “average” or consistent weather conditions. Predicting pollution peaks presents an even greater challenge because they often lack discernible patterns. To address this, we enhanced the WNMs to boost their performance specifically during daily concentration peak conditions. In the second paper, we accomplished this by developing supervised learning techniques, including Ensemble Deep Learning methods, to distinguish between daily peak and non-peak pollutant concentrations. This approach offers flexibility in categorizing pollutant concentrations as either daily peaks or non-peaks. However, it is worth noting that this method may introduce potential bias when selecting non-peak values.

In the third paper, WNMs are directly applied to daily concentration peaks to predict them and to analyse the correlations between meteorological and temporal features and the daily concentration peaks of air pollutants.
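The labelling of daily peak versus non-peak concentrations described for the second paper can be illustrated with a short sketch. It assumes that the "daily peak" is simply the highest reading of each calendar day; the thesis abstract does not give the exact rule, so this definition, the function name `label_daily_peaks`, and the sample readings are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def label_daily_peaks(readings):
    """Return (timestamp, value, is_peak) tuples.

    readings: iterable of (datetime, concentration) pairs.
    is_peak is 1 when the value equals that calendar day's maximum.
    Tied maxima are all labelled as peaks (an assumption of this sketch).
    """
    readings = list(readings)
    daily_max = defaultdict(lambda: float("-inf"))
    for ts, value in readings:
        day = ts.date()
        daily_max[day] = max(daily_max[day], value)
    return [
        (ts, value, int(value == daily_max[ts.date()]))
        for ts, value in readings
    ]


# Hypothetical hourly NO2 readings over two days.
sample = [
    (datetime(2020, 3, 1, 7), 31.0),
    (datetime(2020, 3, 1, 8), 48.5),
    (datetime(2020, 3, 1, 19), 40.2),
    (datetime(2020, 3, 2, 8), 22.1),
    (datetime(2020, 3, 2, 18), 27.9),
]
labelled = label_daily_peaks(sample)
peak_flags = [flag for _, _, flag in labelled]
print(peak_flags)
```

Under this rule every day contributes one peak (more only on ties) and many non-peaks, so the classes are heavily imbalanced. Any subsampling of the non-peak class is the step where the selection bias mentioned in the abstract could enter.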

Place, publisher, year, edition, pages
Borlänge: Högskolan Dalarna, 2024
Series
Dalarna Licentiate Theses ; 20
Keywords
Weather Normalized Models (WNMs), Air Pollution, Data-Driven Modeling and Optimization, Deep Learning - Artificial Neural Network (DL-ANN), Machine Learning
National Category
Computer Sciences
Identifiers
urn:nbn:se:du-47290 (URN)
978-91-88679-56-7 (ISBN)
Presentation
2024-02-20, room Clas Ohlson, Campus Borlänge, 14:00 (English)
Available from: 2024-01-29 Created: 2023-11-21 Last updated: 2025-10-09. Bibliographically approved

Open Access in DiVA

fulltext (700 kB), 238 downloads
File information
File name: FULLTEXT01.pdf
File size: 700 kB
Checksum (SHA-512): 276de65d5482387f02d074233bf5059db6e0ab64d5b82c6dd01bbc266fa41a8e2078298371d0694a5be2618d38e159d6063e08005e3d4959328154902d90e763
Type: fulltext
Mimetype: application/pdf


Authority records

Ngoc Phuong, Chau
Rybarczyk, Yves
