AIR QUALITY FORECASTING ON TIME SERIES DATA WITH ANOMALIES USING THE LSTM-XGBOOST APPROACH

Bibliographic Details
Main Author: Layalia S.A.G., Aurell
Format: Theses
Language: Indonesian
Online Access: https://digilib.itb.ac.id/gdl/view/87887
Institution: Institut Teknologi Bandung
Description
Summary: Air pollution is a global issue that significantly affects human health and the environment. The COVID-19 pandemic introduced unexpected changes in air quality patterns, which calls for an approach that can handle pattern shifts and anomalies in the data. This study aims to demonstrate that LSTM-XGBoost outperforms standalone XGBoost and standalone LSTM in air quality prediction on time series data containing anomalies from the COVID-19 pandemic. LSTM-XGBoost combines an LSTM, which captures temporal patterns, with XGBoost, which models non-linear relationships. The dataset covers air pollutant concentrations and meteorological factors from 2020 to 2024, with case studies in Jakarta and Bandung. The approach adds temporal event indicators that mark the COVID-19 pandemic, PSBB (Pembatasan Sosial Berskala Besar, large-scale social restrictions), and PPKM (Pemberlakuan Pembatasan Kegiatan Masyarakat, enforcement of community activity restrictions) periods, and applies anomaly detection with Isolation Forest and an LSTM Autoencoder to identify and mitigate anomalies in the data. The results show that LSTM-XGBoost achieves the lowest MAPE, 8.56% for Jakarta and 8.74% for Bandung, outperforming both XGBoost and LSTM individually. The temporal event indicators improve the model's ability to recognize changes in data patterns, while anomaly detection lets the model identify anomalies and reduce their impact. Among the approaches compared, LSTM-XGBoost is the most effective for predicting air quality in time series data containing anomalies. The findings are expected to contribute to air quality management and to support the development of predictive models that respond to environmental changes.
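
The summary mentions two pieces that are easy to illustrate in isolation: temporal event indicators and the MAPE metric. The following is a minimal Python sketch, not the thesis implementation. The function names `event_indicators` and `mape`, the dictionary `EVENT_WINDOWS`, and in particular the date ranges are hypothetical placeholders chosen for illustration; the record does not state the actual event periods or how the indicators were encoded.

```python
from datetime import date

# Hypothetical event windows. These dates are placeholders for illustration
# only; they are NOT taken from the thesis.
EVENT_WINDOWS = {
    "psbb": (date(2020, 4, 1), date(2020, 6, 30)),
    "ppkm": (date(2021, 1, 1), date(2021, 12, 31)),
}


def event_indicators(day):
    """Return one 0/1 flag per event window for a single calendar day."""
    return {
        name: int(start <= day <= end)
        for name, (start, end) in EVENT_WINDOWS.items()
    }


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes both sequences have the same non-zero length and that no
    actual value is zero (MAPE is undefined in that case).
    """
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("actual and predicted must be non-empty and equal in length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


if __name__ == "__main__":
    # A day inside the placeholder PSBB window and outside the PPKM window.
    print(event_indicators(date(2020, 5, 1)))
    # Two observations, each off by 10 percent of the actual value.
    print(mape([100.0, 200.0], [90.0, 220.0]))
```

In a setup like this, the flags would be appended as extra feature columns alongside the pollutant and meteorological inputs, and a reported score such as 8.56% would correspond to the value returned by `mape` on the held-out period.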