Process control of water treatment facilities using machine learning method

Bibliographic Details
Main Author: Tey, Shiyang
Other Authors: Law Wing-Keung, Adrian
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2022
Online Access: https://hdl.handle.net/10356/158278
Institution: Nanyang Technological University
Description
Summary: The water industry in Singapore is increasingly adopting Industrial Control Systems (ICS), which introduce cyber-physical systems (CPS) into water treatment plants. While automation makes processes highly efficient, the connectivity of these systems opens new avenues for cyber-attacks. To support research, the Secure Water Treatment (SWaT) testbed was jointly established by Singapore’s authorities and SUTD as a facility for studying the security of CPS. This study aims to improve and optimize previously developed anomaly detection scripts against possible attacks on the testbed. Previous studies used NGBoost (NGB), a gradient boosting model (GBM) that outputs probabilistic predictions, as the main algorithm for anomaly detection. The probabilistic predictions were used to estimate uncertainty and so aid in judging a model’s prediction. XGBoost-Distribution (XGBD) was found to be a more efficient gradient boosting model than NGB while also providing probabilistic predictions: XGBD performed predictions 30 times faster and trained 18 times faster than NGB. However, on the validation set, XGBD’s overall RMSE was 10% higher and its MAE 25% higher than NGB’s. Weighing the significant reduction in computational time against the slightly lower accuracy, XGBD was judged to be the more suitable model candidate for this project’s application. To maintain model performance during real-time prediction, factors that degrade performance were identified. In addition, mitigation methods were proposed to continuously improve the training data and to use the improved training data to update and retrain the models.
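
As an illustration only, the sketch below shows one way a probabilistic prediction (a predicted mean and standard deviation per reading) could be used to flag anomalous sensor values, together with the RMSE and MAE metrics named in the summary. It is not the project's script: the function names, the z-score rule, and the threshold of 3.0 are assumptions made for this example, and the mean and standard deviation are taken as given rather than produced by NGB or XGBD.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def mae(y_true, y_pred):
    """Mean absolute error between observed and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))


def flag_anomalies(observed, pred_mean, pred_std, z_threshold=3.0):
    """Flag readings that lie far outside the predicted distribution.

    Hypothetical rule (an assumption, not the study's method): a reading is
    anomalous when it is more than `z_threshold` predicted standard
    deviations away from the predicted mean.
    """
    observed = np.asarray(observed, dtype=float)
    pred_mean = np.asarray(pred_mean, dtype=float)
    # Guard against a zero predicted standard deviation.
    pred_std = np.maximum(np.asarray(pred_std, dtype=float), 1e-12)
    z_scores = np.abs(observed - pred_mean) / pred_std
    return z_scores > z_threshold


if __name__ == "__main__":
    # Made-up readings for demonstration; not data from the SWaT testbed.
    observed = [1.0, 5.0]
    pred_mean = [1.1, 1.0]
    pred_std = [0.1, 0.5]
    print(flag_anomalies(observed, pred_mean, pred_std))
    print(rmse(observed, pred_mean), mae(observed, pred_mean))
```

A wider predicted standard deviation makes the rule more tolerant for that reading, which is the sense in which an uncertainty estimate can aid in judging a prediction.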