Probabilistic digital twin of water treatment facilities

In recent years, the implementation of digital twin (DT) as a digital replica of the physical asset has matured significantly in smart manufacturing with the advancement of digital technologies. At the same time, for water treatment facilities which are critical infrastructures, DT is still in its i...

Bibliographic Details
Main Author: Wei, Yuying
Other Authors: Law Wing-Keung, Adrian
Format: Thesis-Doctor of Philosophy
Language: English
Published: Nanyang Technological University 2024
Online Access: https://hdl.handle.net/10356/173687
Institution: Nanyang Technological University
id sg-ntu-dr.10356-173687
record_format dspace
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Engineering
Digital twin
Machine learning
Probabilistic assessment
Anomaly detection
Data assimilation
spellingShingle Engineering
Digital twin
Machine learning
Probabilistic assessment
Anomaly detection
Data assimilation
Wei, Yuying
Probabilistic digital twin of water treatment facilities
description In recent years, the implementation of digital twin (DT) as a digital replica of the physical asset has matured significantly in smart manufacturing with the advancement of digital technologies. At the same time, for water treatment facilities which are critical infrastructures, DT is still in its infancy for real-world applications. Therefore, there is a pressing need for research that can accelerate the DT development for these critical infrastructures. The aim of this study is to improve the performance of DT for water treatment facilities using probabilistic assessment. The scope of research focuses on two main directions: first, utilizing probabilistic machine learning (ML) models to assist in real-time anomaly detection of DT, and second, developing data assimilation methods for probabilistic ML models to obtain optimized system states for process control. Anomaly detection is crucial for water treatment facilities as they can be susceptible to cyber-physical attacks that negatively impact the system's functionality, while data assimilation can improve the robustness of real-time monitoring in noisy environments by assimilating ML predictions with observations. A combined anomaly detection framework (CADF) was first developed for DT applications using probabilistic ML models. A prototype water treatment testbed facility was utilized to verify the anomaly detection framework by simulating various types of security attacks. CADF utilizes a Programmable Logic Controller (PLC)-based whitelist system to detect anomalies targeting the actuators, and a probabilistic ML model with corresponding assessment to detect anomalies targeting the sensors in this fully operational, scaled-down testbed facility. The results showed that CADF could successfully detect various attacks on the DT and reduce false alarms significantly when compared with other methods. 
To further enhance the performance of CADF, a real-time data processing framework was developed to update ML models in the DT by selecting suitable update intervals and training datasets to maximize their overall accuracy. It was shown to synchronize model updates and predictions effectively, leading to a significant reduction in errors. A data assimilation approach named Probabilistic Optimal Interpolation (POI) was also developed to combine the predictions from probabilistic ML models with real-time observations. The quantification of the respective uncertainties is directly included within the probabilistic ML model itself. As an application example, the performance of POI was tested using a multi-scale Lorenz 96 chaotic system in both stationary and nonstationary environments. The POI implementation was able to reduce uncertainty in both environments and serve as a compromise in scenarios where the noise level of the environment was unclear. The performance of POI under scenarios with missing values was also evaluated by masking the test datasets with different missingness rates. The impact of randomly missing values was found to be negligible, and assimilation was still suggested at missing points. In summary, probabilistic approaches and frameworks were developed in this study for DTs of water treatment facilities, with a particular emphasis on anomaly detection and data assimilation. They were shown to be effective in enhancing DT performance, potentially leading to more robust and secure systems in the future.
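As an illustration of the general idea behind assimilating a probabilistic prediction with an observation, the following is a minimal sketch of textbook scalar optimal interpolation (inverse-variance weighting). It is not the POI method from the thesis, whose details are not given in this record. The function name `optimal_interpolation`, the assumption of independent Gaussian errors, the direct (identity) observation of the state, and the fallback to the prediction when an observation is missing are all assumptions made for this example.

```python
import math


def optimal_interpolation(pred_mean, pred_var, obs, obs_var):
    """Blend one prediction with one observation by inverse-variance weighting.

    Hypothetical helper for illustration only. Assumes independent Gaussian
    errors and that the observation measures the state directly.
    Returns (analysis_mean, analysis_var).
    """
    # Assumption of this sketch: with no usable observation, keep the prediction.
    if obs is None or math.isnan(obs):
        return pred_mean, pred_var

    # Gain: the fraction of the prediction-observation gap that is corrected.
    gain = pred_var / (pred_var + obs_var)
    analysis_mean = pred_mean + gain * (obs - pred_mean)
    # The analysis variance is never larger than the prediction variance.
    analysis_var = (1.0 - gain) * pred_var
    return analysis_mean, analysis_var


# Equal uncertainties: the analysis lands halfway and the variance is halved.
mean, var = optimal_interpolation(pred_mean=2.0, pred_var=1.0, obs=4.0, obs_var=1.0)

# A missing observation leaves the prediction untouched.
missing = optimal_interpolation(2.0, 1.0, float("nan"), 1.0)
```

The weighting depends only on the two variances, which is why a model that reports its own predictive uncertainty can be combined with noisy observations without a separately tuned error estimate for the model.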
author2 Law Wing-Keung, Adrian
author_facet Law Wing-Keung, Adrian
Wei, Yuying
format Thesis-Doctor of Philosophy
author Wei, Yuying
author_sort Wei, Yuying
title Probabilistic digital twin of water treatment facilities
title_short Probabilistic digital twin of water treatment facilities
title_full Probabilistic digital twin of water treatment facilities
title_fullStr Probabilistic digital twin of water treatment facilities
title_full_unstemmed Probabilistic digital twin of water treatment facilities
title_sort probabilistic digital twin of water treatment facilities
publisher Nanyang Technological University
publishDate 2024
url https://hdl.handle.net/10356/173687
_version_ 1794549334696525824
spelling sg-ntu-dr.10356-173687 2024-03-07T08:52:06Z Probabilistic digital twin of water treatment facilities Wei, Yuying Law Wing-Keung, Adrian Interdisciplinary Graduate School (IGS) Nanyang Environment and Water Research Institute CWKLAW@ntu.edu.sg Engineering Digital twin Machine learning Probabilistic assessment Anomaly detection Data assimilation
Doctor of Philosophy 2024-02-23T01:44:48Z 2024-02-23T01:44:48Z 2024 Thesis-Doctor of Philosophy Wei, Y. (2024). Probabilistic digital twin of water treatment facilities. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/173687 https://hdl.handle.net/10356/173687 10.32657/10356/173687 en This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). application/pdf Nanyang Technological University