A stacked ensemble deep learning model for water quality prediction / Wong Wen Yee

Water quality management is crucial to ensure water security for the sustainability of health, productivity, and livelihoods. Contamination of water sources often occurs due to illegal waste dumping, sewage, and industrial discharge. This causes hazardous substances such as pesticides, heavy metals,...

Bibliographic Details
Main Author: Wong, Wen Yee
Format: Thesis
Published: 2023
Subjects: TK Electrical engineering. Electronics Nuclear engineering
Online Access: http://studentsrepo.um.edu.my/15108/2/Wong_Wen_Yee.pdf
http://studentsrepo.um.edu.my/15108/1/Wong_Wen_Yee.pdf
http://studentsrepo.um.edu.my/15108/
Institution: Universiti Malaya
id my.um.stud.15108
record_format eprints
spelling my.um.stud.15108 2024-07-04T17:53:22Z A stacked ensemble deep learning model for water quality prediction / Wong Wen Yee Wong, Wen Yee TK Electrical engineering. Electronics Nuclear engineering 2023-02 Thesis NonPeerReviewed application/pdf http://studentsrepo.um.edu.my/15108/2/Wong_Wen_Yee.pdf application/pdf http://studentsrepo.um.edu.my/15108/1/Wong_Wen_Yee.pdf Wong, Wen Yee (2023) A stacked ensemble deep learning model for water quality prediction / Wong Wen Yee. PhD thesis, Universiti Malaya. http://studentsrepo.um.edu.my/15108/
institution Universiti Malaya
building UM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Malaya
content_source UM Student Repository
url_provider http://studentsrepo.um.edu.my/
topic TK Electrical engineering. Electronics Nuclear engineering
spellingShingle TK Electrical engineering. Electronics Nuclear engineering
Wong, Wen Yee
A stacked ensemble deep learning model for water quality prediction / Wong Wen Yee
description Water quality management is crucial to ensuring water security and sustaining health, productivity, and livelihoods. Water sources are often contaminated by illegal waste dumping, sewage, and industrial discharge, which allow hazardous substances such as pesticides, heavy metals, and pathogens to seep into waterways. Water quality indicators that detect the presence of pollutants are therefore very important. Conventional water quality index (WQI) assessment methods are limited to features such as water acidity or basicity (pH), dissolved oxygen (DO), biological oxygen demand (BOD), chemical oxygen demand (COD), ammoniacal nitrogen (NH3N), and suspended solids (SS). These common features are insufficient to represent the true nature of water quality: other significant parameters, including fecal coliform, heavy metals, and nutrients, are not part of the WQI formula. Hence, this study aims to bridge the research gap by using different water quality parameters in water quality assessment through artificial intelligence. In this work, the potential of other water quality parameters as input variables is investigated and discussed. Seventeen input features, namely conductivity (COND), salinity (SAL), turbidity (TUR), dissolved solids (DS), nitrate (NO3), chloride (Cl), phosphate (PO4), arsenic (As), chromium (Cr), zinc (Zn), calcium (Ca), iron (Fe), potassium (K), magnesium (Mg), sodium (Na), E. coli, and total coliform, are analyzed using five regression algorithms for preliminary model selection: random forest (RF), AdaBoost, support vector regression (SVR), decision tree regression (DTR), and multilayer perceptron (MLP). The results show that the RF algorithm exhibits better prediction performance than the other algorithms, with an R² of 0.798. The dataset is then validated with an RF classifier, and the results are improved by applying the synthetic minority oversampling technique (SMOTE) to address class imbalance. The proposed method achieves accuracy, precision, recall, and F1 score of 78.13%, 72.99%, 63.51%, and 66.85%, respectively. These results demonstrate that WQI can be predicted from other input features. The research further examines imbalanced data in water quality datasets. Classifiers often perform poorly on skewed data because of a bias toward the majority class. Therefore, this study explores the use of ensemble and deep learning techniques to simplify the classification of imbalanced data, and proposes a stacked ensemble deep learning framework for faster and more efficient water quality analysis. The stacked ensemble deep learning method proved robust, with accuracy, precision, recall, and F1 score of 95.69%, 94.96%, 92.92%, and 93.88%, respectively. The proposed deep learning model runs faster without SMOTE, and it does not require any resampling algorithm.
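The description names SMOTE as the oversampling step used with the RF classifier. As a rough illustration only, the core idea (creating synthetic minority-class samples by interpolating between a sample and one of its nearest minority neighbours) might look like the sketch below. This is a simplified stand-in, not the thesis's implementation: the function name `smote`, the parameters `n_new`, `k`, and `seed`, and the two-feature sample values are all hypothetical.

```python
import math
import random


def smote(minority, n_new, k=2, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Simplified sketch: each synthetic point lies on the segment between a
    randomly chosen minority sample and one of its k nearest minority
    neighbours. Requires at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority-class neighbours of the base sample (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical, already-scaled feature vectors for a rare water quality class.
minority_samples = [(0.10, 0.80), (0.15, 0.75), (0.20, 0.90), (0.12, 0.85)]
new_points = smote(minority_samples, n_new=6)
print(len(new_points))
```

Because every synthetic point is a convex combination of two real minority samples, the new points stay inside the region already covered by the minority class instead of duplicating existing rows.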
format Thesis
author Wong, Wen Yee
author_facet Wong, Wen Yee
author_sort Wong, Wen Yee
title A stacked ensemble deep learning model for water quality prediction / Wong Wen Yee
title_short A stacked ensemble deep learning model for water quality prediction / Wong Wen Yee
title_full A stacked ensemble deep learning model for water quality prediction / Wong Wen Yee
title_fullStr A stacked ensemble deep learning model for water quality prediction / Wong Wen Yee
title_full_unstemmed A stacked ensemble deep learning model for water quality prediction / Wong Wen Yee
title_sort stacked ensemble deep learning model for water quality prediction / wong wen yee
publishDate 2023
url http://studentsrepo.um.edu.my/15108/2/Wong_Wen_Yee.pdf
http://studentsrepo.um.edu.my/15108/1/Wong_Wen_Yee.pdf
http://studentsrepo.um.edu.my/15108/
_version_ 1805882087524270080