Predictive data analytics for air pollutant data

Given government measures and public health alerts on air pollution, forecasting air pollutants is in demand. However, predicting a single air pollutant is not comprehensive, as air pollution is caused by various pollutants. Thus, this project uses the Air Quality Index (AQI) to identify th...

Bibliographic Details
Main Author:	Fu, Danli
Other Authors:	Wong Kin Shun, Terence
Format:	Final Year Project
Language:	English
Published:	Nanyang Technological University 2022
Subjects:	Engineering::Electrical and electronic engineering
Online Access:	https://hdl.handle.net/10356/163583
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-163583
record_format	dspace
spelling	sg-ntu-dr.10356-163583 2023-07-07T18:57:20Z Predictive data analytics for air pollutant data Fu, Danli Wong Kin Shun, Terence School of Electrical and Electronic Engineering EKSWONG@ntu.edu.sg Engineering::Electrical and electronic engineering Given government measures and public health alerts on air pollution, forecasting air pollutants is in demand. However, predicting a single air pollutant is not comprehensive, as air pollution is caused by various pollutants. Thus, this project uses the Air Quality Index (AQI) to identify the level of air quality. We use data provided by the Environmental Protection Department (EPD) in Hong Kong and the Hong Kong Observatory (HKO) to predict the AQI level from FSP, RSP, NOx, SO2, pressure, air temperature and dew point. Past AQI values are calculated from the major pollutants FSP, RSP, SO2 and NOx and then used to forecast the AQI level for the following day. In this project, we use both regression and classification strategies to predict the air quality level for the next day. For regression, we study the autoregressive integrated moving average (ARIMA) model and the multilayer perceptron (MLP) model. For classification, we study decision tree (DT), random forest (RF) and XGBoost. The experimental results show that there is still considerable error in identifying the level of air pollution by predicting the specific AQI value for the next day. With binary prediction, on the other hand, the experiments show that an imbalanced class distribution affects the accuracy on the minority class. This study also investigates feature importance in the RF and XGBoost models; the results suggest that the AQI value is strongly associated with FSP, RSP, SO2 and its own value on the previous day. Bachelor of Engineering (Electrical and Electronic Engineering) 2022-12-12T03:22:43Z 2022-12-12T03:22:43Z 2022 Final Year Project (FYP) Fu, D. (2022). Predictive data analytics for air pollutant data. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/163583 https://hdl.handle.net/10356/163583 en A2398-212 application/pdf Nanyang Technological University
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	Engineering::Electrical and electronic engineering
spellingShingle	Engineering::Electrical and electronic engineering Fu, Danli Predictive data analytics for air pollutant data
description	Given government measures and public health alerts on air pollution, forecasting air pollutants is in demand. However, predicting a single air pollutant is not comprehensive, as air pollution is caused by various pollutants. Thus, this project uses the Air Quality Index (AQI) to identify the level of air quality. We use data provided by the Environmental Protection Department (EPD) in Hong Kong and the Hong Kong Observatory (HKO) to predict the AQI level from FSP, RSP, NOx, SO2, pressure, air temperature and dew point. Past AQI values are calculated from the major pollutants FSP, RSP, SO2 and NOx and then used to forecast the AQI level for the following day. In this project, we use both regression and classification strategies to predict the air quality level for the next day. For regression, we study the autoregressive integrated moving average (ARIMA) model and the multilayer perceptron (MLP) model. For classification, we study decision tree (DT), random forest (RF) and XGBoost. The experimental results show that there is still considerable error in identifying the level of air pollution by predicting the specific AQI value for the next day. With binary prediction, on the other hand, the experiments show that an imbalanced class distribution affects the accuracy on the minority class. This study also investigates feature importance in the RF and XGBoost models; the results suggest that the AQI value is strongly associated with FSP, RSP, SO2 and its own value on the previous day.
author2	Wong Kin Shun, Terence
author_facet	Wong Kin Shun, Terence Fu, Danli
format	Final Year Project
author	Fu, Danli
author_sort	Fu, Danli
title	Predictive data analytics for air pollutant data
title_short	Predictive data analytics for air pollutant data
title_full	Predictive data analytics for air pollutant data
title_fullStr	Predictive data analytics for air pollutant data
title_full_unstemmed	Predictive data analytics for air pollutant data
title_sort	predictive data analytics for air pollutant data
publisher	Nanyang Technological University
publishDate	2022
url	https://hdl.handle.net/10356/163583
_version_	1772827519059230720
