Estimation of missing values for air pollution data using Interpolation technique

Air pollution data such as PM10, sulphur dioxide, ozone and carbon monoxide are usually obtained using automated machines located at different sites. Missing values in these data are usually due to mechanical failure, routine maintenance, changes in the siting of monitors and human error. The occurrence of missing values requir...

Bibliographic Details
Main Authors: Norazian, Mohamed Noor; Mohd Mustafa, Al Bakri Abdullah; Ahmad Shukri, Yahaya; Nor Azam, Ramli
Format: Article
Language: English
Published: Universiti Malaysia Perlis 2010
Subjects: Air pollution; Interpolation; Performance indicators; Missing values; Estimation theory
Online Access: http://dspace.unimap.edu.my/xmlui/handle/123456789/7459
Institution: Universiti Malaysia Perlis
id my.unimap-7459
record_format dspace
spelling my.unimap-7459 2010-01-03T01:33:37Z Estimation of missing values for air pollution data using Interpolation technique Norazian, Mohamed Noor Mohd Mustafa, Al Bakri Abdullah Ahmad Shukri, Yahaya Nor Azam, Ramli Air pollution Interpolation Performance indicators Missing values Estimation theory Air pollution data such as PM10, sulphur dioxide, ozone and carbon monoxide are usually obtained using automated machines located at different sites. Missing values in these data are usually due to mechanical failure, routine maintenance, changes in the siting of monitors and human error. The occurrence of missing values requires special attention when analyzing the data. Incomplete datasets can cause bias due to systematic differences between observed and unobserved data. It is therefore important to find the best way of estimating missing values so that the analyzed data are of high quality. In this study, four imputation techniques, namely linear, quadratic, cubic and nearest neighbour interpolation, were used to replace the missing values. Annual hourly monitoring data for PM10 were used to generate missing values. Five randomly simulated percentages of missing data, 5%, 10%, 15%, 25% and 40%, were evaluated in order to test the efficiency of the methods. Four performance indicators, namely mean absolute error (MAE), root mean square error (RMSE), coefficient of determination (R2) and prediction accuracy (PA), were calculated to describe the goodness of fit of each method. Of the methods applied, linear interpolation was found to be the best for estimating the data at all percentages of simulated missing values. 2010-01-03T01:33:12Z 2010-01-03T01:33:12Z 2006 Article http://hdl.handle.net/123456789/7459 en Universiti Malaysia Perlis
institution Universiti Malaysia Perlis
building UniMAP Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Malaysia Perlis
content_source UniMAP Library Digital Repository
url_provider http://dspace.unimap.edu.my/
language English
topic Air pollution
Interpolation
Performance indicators
Missing values
Estimation theory
description Air pollution data such as PM10, sulphur dioxide, ozone and carbon monoxide are usually obtained using automated machines located at different sites. Missing values in these data are usually due to mechanical failure, routine maintenance, changes in the siting of monitors and human error. The occurrence of missing values requires special attention when analyzing the data. Incomplete datasets can cause bias due to systematic differences between observed and unobserved data. It is therefore important to find the best way of estimating missing values so that the analyzed data are of high quality. In this study, four imputation techniques, namely linear, quadratic, cubic and nearest neighbour interpolation, were used to replace the missing values. Annual hourly monitoring data for PM10 were used to generate missing values. Five randomly simulated percentages of missing data, 5%, 10%, 15%, 25% and 40%, were evaluated in order to test the efficiency of the methods. Four performance indicators, namely mean absolute error (MAE), root mean square error (RMSE), coefficient of determination (R2) and prediction accuracy (PA), were calculated to describe the goodness of fit of each method. Of the methods applied, linear interpolation was found to be the best for estimating the data at all percentages of simulated missing values.
format Article
author Norazian, Mohamed Noor
Mohd Mustafa, Al Bakri Abdullah
Ahmad Shukri, Yahaya
Nor Azam, Ramli
title Estimation of missing values for air pollution data using Interpolation technique
publisher Universiti Malaysia Perlis
publishDate 2010
url http://dspace.unimap.edu.my/xmlui/handle/123456789/7459
_version_ 1643788816892297216
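
Illustrative sketch (not part of the catalogued record). The description field outlines an experiment: remove a random share of hourly PM10 values, fill the gaps by interpolation, and score the estimates against the withheld values. The Python below is a minimal sketch of that workflow under stated assumptions, not the authors' implementation. The hourly series is synthetic, and the names mask_random, fill, mae, rmse and r2 are hypothetical. Only linear and nearest neighbour interpolation are shown; quadratic and cubic are left out to keep the example dependency-free. R2 is assumed here to be the squared Pearson correlation, and prediction accuracy (PA) is omitted because the record does not give its formula.

```python
import bisect
import math
import random


def mask_random(values, fraction, seed=0):
    """Hide about `fraction` of the points; the two endpoints stay observed."""
    rng = random.Random(seed)
    interior = range(1, len(values) - 1)
    n_missing = round(fraction * len(values))
    missing = set(rng.sample(interior, n_missing))
    masked = [None if i in missing else v for i, v in enumerate(values)]
    return masked, sorted(missing)


def fill(series, method="linear"):
    """Replace each None using its nearest observed neighbours on either side."""
    known = [i for i, v in enumerate(series) if v is not None]
    filled = list(series)
    for i, v in enumerate(series):
        if v is not None:
            continue
        pos = bisect.bisect_left(known, i)
        left, right = known[pos - 1], known[pos]
        if method == "linear":
            weight = (i - left) / (right - left)
            filled[i] = series[left] + weight * (series[right] - series[left])
        elif method == "nearest":
            # Ties go to the earlier observation (an arbitrary choice).
            filled[i] = series[left] if i - left <= right - i else series[right]
        else:
            raise ValueError(f"unsupported method: {method}")
    return filled


def mae(pred, obs):
    return sum(abs(p - o) for p, o in zip(pred, obs)) / len(obs)


def rmse(pred, obs):
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def r2(pred, obs):
    """Squared Pearson correlation (assumed definition, see the note above)."""
    n = len(obs)
    mean_p, mean_o = sum(pred) / n, sum(obs) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(pred, obs))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_o = sum((o - mean_o) ** 2 for o in obs)
    return cov * cov / (var_p * var_o)


def demo():
    # Synthetic two-week hourly series with a daily cycle; NOT the study's data.
    rng = random.Random(1)
    observed = [50 + 20 * math.sin(2 * math.pi * h / 24) + rng.gauss(0, 3)
                for h in range(24 * 14)]
    for fraction in (0.05, 0.10, 0.15, 0.25, 0.40):
        masked, missing = mask_random(observed, fraction)
        truth = [observed[i] for i in missing]
        for method in ("linear", "nearest"):
            filled = fill(masked, method)
            est = [filled[i] for i in missing]
            print(f"{fraction:.0%} {method:8s} MAE={mae(est, truth):.2f} "
                  f"RMSE={rmse(est, truth):.2f} R2={r2(est, truth):.3f}")


if __name__ == "__main__":
    demo()
```

Scoring only the withheld positions keeps the indicators from being inflated by points that were never missing. The numbers printed for the synthetic series say nothing about the study's own results.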