Process control of water treatment facilities using machine learning method

Bibliographic Details
Main Author: Tey, Shiyang
Other Authors: Law Wing-Keung, Adrian
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2022
Online Access: https://hdl.handle.net/10356/158278
Description
Summary: The water industry in Singapore is increasingly adopting Industrial Control Systems (ICS), which introduce cyber-physical systems (CPS) into water treatment plants. Alongside highly efficient automated processes, the connectivity of these systems opens new avenues for cyber-attacks. For research purposes, a Secure Water Treatment (SWaT) testbed was jointly established by Singapore’s authorities and SUTD to provide a facility for studying the security of CPS. This study aims to improve and optimize previously developed anomaly detection scripts against possible attacks on the testbed. Previous studies used NGBoost (NGB), a gradient boosting model (GBM) that outputs probabilistic predictions, as the main algorithm for anomaly detection. The probabilistic predictions were used to estimate uncertainties, which aid in judging a model’s prediction. XGBoost-Distribution (XGBD) was found to be a more efficient gradient boosting model than NGB while also providing probabilistic predictions: XGBD performed predictions 30 times faster and trained 18 times faster than NGB. However, on the validation set, XGBD’s overall RMSE was 10% higher and its MAE 25% higher than NGB’s. Weighing the significant reduction in computational time against the slightly inferior accuracy, XGBD was judged to be the more suitable model candidate for this project’s application. To maintain model performance during real-time prediction, factors contributing to performance degradation were identified. In addition, mitigation methods were proposed to continuously improve the training data and to use the improved training data to update and retrain the models.
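The summary does not spell out how the probabilistic predictions are turned into anomaly decisions or how the error metrics were computed. The following is a minimal sketch, assuming a model that returns a predicted mean and standard deviation for each sensor reading, with a reading flagged when it falls outside mean +/- k standard deviations. The function names, the threshold k = 3, and the sample values are illustrative assumptions and are not taken from the project.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def flag_anomalies(observed, pred_mean, pred_std, k=3.0):
    """Flag readings outside mean +/- k*std of the predicted distribution.

    Assumption: the threshold rule and k=3 are illustrative, not the
    project's actual decision rule.
    """
    return [abs(o - m) > k * s for o, m, s in zip(observed, pred_mean, pred_std)]


# Hypothetical sensor readings and probabilistic model outputs.
observed = [2.50, 2.52, 2.49, 3.40, 2.51]
pred_mean = [2.50, 2.50, 2.50, 2.50, 2.50]
pred_std = [0.02, 0.02, 0.02, 0.02, 0.02]

flags = flag_anomalies(observed, pred_mean, pred_std)
print(flags)
print(rmse(observed, pred_mean), mae(observed, pred_mean))
```

In this sketch, the predicted standard deviation sets the width of the acceptance band, so a model that is uncertain about a reading tolerates larger deviations before raising a flag. RMSE penalizes large errors more heavily than MAE, which is one reason the two metrics can differ by different percentages between models.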