A stacked ensemble deep learning model for water quality prediction / Wong Wen Yee

Water quality management is crucial to ensure water security for the sustainability of health, productivity, and livelihoods. Contamination of water sources often occurs due to illegal waste dumping, sewage, and industrial discharge. This causes hazardous substances such as pesticides, heavy metals,...

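Bibliographic Details
Main Author: Wong, Wen Yee
Format: Thesis
Published: 2023
Subjects: TK Electrical engineering. Electronics Nuclear engineering
Online Access: http://studentsrepo.um.edu.my/15108/2/Wong_Wen_Yee.pdf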
http://studentsrepo.um.edu.my/15108/1/Wong_Wen_Yee.pdf
http://studentsrepo.um.edu.my/15108/
id my.um.stud.15108
record_format eprints
spelling my.um.stud.15108 2024-07-04T17:53:22Z A stacked ensemble deep learning model for water quality prediction / Wong Wen Yee Wong, Wen Yee TK Electrical engineering. Electronics Nuclear engineering 2023-02 Thesis NonPeerReviewed application/pdf http://studentsrepo.um.edu.my/15108/2/Wong_Wen_Yee.pdf application/pdf http://studentsrepo.um.edu.my/15108/1/Wong_Wen_Yee.pdf Wong, Wen Yee (2023) A stacked ensemble deep learning model for water quality prediction / Wong Wen Yee. PhD thesis, Universiti Malaya. http://studentsrepo.um.edu.my/15108/
institution Universiti Malaya
building UM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Malaya
content_source UM Student Repository
url_provider http://studentsrepo.um.edu.my/
topic TK Electrical engineering. Electronics Nuclear engineering
spellingShingle TK Electrical engineering. Electronics Nuclear engineering
Wong, Wen Yee
A stacked ensemble deep learning model for water quality prediction / Wong Wen Yee
description Water quality management is crucial to ensuring water security for the sustainability of health, productivity, and livelihoods. Water sources are often contaminated by illegal waste dumping, sewage, and industrial discharge, which allow hazardous substances such as pesticides, heavy metals, and pathogens to seep into waterways. Water quality indicators that detect the presence of pollutants are therefore very important. Conventional water quality index (WQI) assessment methods are limited to features such as water acidity or basicity (pH), dissolved oxygen (DO), biological oxygen demand (BOD), chemical oxygen demand (COD), ammoniacal nitrogen (NH3N), and suspended solids (SS). These features alone are insufficient to represent the true nature of water quality, and other significant parameters, including fecal coliform, heavy metals, and nutrients, are not part of the WQI formula. Hence, this study aims to bridge the research gap by using different water quality parameters for water quality assessment through artificial intelligence. The potential of these other parameters as input variables is investigated and discussed. Seventeen input features, namely conductivity (COND), salinity (SAL), turbidity (TUR), dissolved solids (DS), nitrate (NO3), chloride (Cl), phosphate (PO4), arsenic (As), chromium (Cr), zinc (Zn), calcium (Ca), iron (Fe), potassium (K), magnesium (Mg), sodium (Na), E. coli, and total coliform, are analyzed using five regression algorithms for preliminary model selection: random forest (RF), AdaBoost, support vector regression (SVR), decision tree regression (DTR), and multilayer perceptron (MLP). The results show that the RF algorithm exhibits better prediction performance, with an R² of 0.798. The dataset is then validated with an RF classifier, and the results are improved by applying the synthetic minority oversampling technique (SMOTE) to address class imbalance.
The proposed method achieves accuracy, precision, recall, and F1 score of 78.13%, 72.99%, 63.51%, and 66.85%, respectively. These results demonstrate that WQI can be predicted using other input features. The research then extends to imbalanced data in water quality datasets. Classifiers often perform poorly on skewed data because they are biased toward the majority class. The study therefore explores ensemble and deep learning techniques to simplify the classification of imbalanced data, and proposes a stacked ensemble deep learning framework for faster and more efficient water quality analysis. The stacked ensemble deep learning method proved robust, with accuracy, precision, recall, and F1 score of 95.69%, 94.96%, 92.92%, and 93.88%, respectively. The proposed deep learning model runs faster without the use of SMOTE; no resampling algorithm is required.
format Thesis
author Wong, Wen Yee
author_facet Wong, Wen Yee
author_sort Wong, Wen Yee
title A stacked ensemble deep learning model for water quality prediction / Wong Wen Yee
title_short A stacked ensemble deep learning model for water quality prediction / Wong Wen Yee
title_full A stacked ensemble deep learning model for water quality prediction / Wong Wen Yee
title_fullStr A stacked ensemble deep learning model for water quality prediction / Wong Wen Yee
title_full_unstemmed A stacked ensemble deep learning model for water quality prediction / Wong Wen Yee
title_sort stacked ensemble deep learning model for water quality prediction / wong wen yee
publishDate 2023
url http://studentsrepo.um.edu.my/15108/2/Wong_Wen_Yee.pdf
http://studentsrepo.um.edu.my/15108/1/Wong_Wen_Yee.pdf
http://studentsrepo.um.edu.my/15108/
_version_ 1805882087524270080
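
The description above reports accuracy alongside precision, recall, and F1 score, and notes that classifiers tend to be biased toward the majority class on skewed data. The following is a minimal sketch of why accuracy alone can mislead in that setting. It is not the thesis's code: the class names, the sample counts, and the choice of macro averaging are all assumptions made for illustration, and the record does not state which averaging scheme the thesis uses.

```python
def macro_metrics(y_true, y_pred):
    """Return (accuracy, macro precision, macro recall, macro F1).

    Macro averaging (an assumption here) gives every class equal weight,
    so a neglected minority class pulls the averages down.
    """
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return accuracy, sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Hypothetical imbalanced labels: 90 "clean" samples and 10 "polluted" ones.
y_true = ["clean"] * 90 + ["polluted"] * 10
# A classifier biased toward the majority class predicts "clean" every time.
y_pred = ["clean"] * 100

acc, prec, rec, f1 = macro_metrics(y_true, y_pred)
print(f"accuracy={acc:.2f} precision={prec:.2f} recall={rec:.2f} f1={f1:.2f}")
```

In this toy case the accuracy looks acceptable while macro recall is only one half, because every minority-class sample is missed. Gaps of this kind are what resampling methods such as SMOTE, or the ensemble approach named in the description, are meant to narrow.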