Anomaly detection in multivariate time series using ensemble method
Main Author: | |
---|---|
Other Authors: | |
Format: | Thesis-Master by Research |
Language: | English |
Published: | Nanyang Technological University, 2022 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/155731 |
Institution: | Nanyang Technological University |
Summary: | Water distribution networks (WDNs) are essential to daily life and industrial production.
Identifying anomalies and mitigating cyber-attacks are crucial to ensuring uninterrupted
water service. Among the various approaches to anomaly detection, the matrix profile
is recognized as the most time-efficient distance-based approach. The matrix profile identifies
discords in univariate time series (UTS). Because the physical processes in water networks are
interdependent, the data obtained from different sensors are correlated to detect anomalies
in multivariate time series (MTS). However, this approach only collects positive predictions
and is limited in its ability to eliminate false-positive detections.
To address this limitation of the matrix profile, we propose
and demonstrate two methods: the matrix profile with autoencoder method and the boosting
method. An autoencoder, an artificial neural network trained to copy its input to its output, is
introduced to reduce false alarms. Moreover, the localization of anomalies is automated by
analyzing the UTS anomaly detection results. Boosting is an ensemble learning technique in which
each new model focuses on correcting the labels misclassified by the previous model,
converting weak learners into a strong learner sequentially. Three boosting methods,
XGBoost, LightGBM, and CatBoost, are studied for the classification of anomalies.
Specifically, the proposed ensemble model based on the matrix profile with autoencoder is applied
as a semi-supervised anomaly detection model, and the three boosting-based models are
proposed as supervised anomaly detection models.
To validate their effectiveness in the complex environment of a water distribution system (WDS), we
tested the two proposed methods on simulated datasets containing labeled cyber-attacks.
The matrix profile with autoencoder model and the CatBoost model achieve high accuracies
of 0.9645 and 0.9245, respectively, superior to the existing state-of-the-art models. In
addition, the boosting methods are applied to anomaly detection on a simulated leakage
dataset that contains detailed leakage information for the WDS. LightGBM provides outstanding
classification results, with accuracies of 0.945 and 0.985, which are competitive with
the frontier models. |
---|
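
The summary describes the matrix profile as a distance-based way to find discords in a univariate time series. As a rough illustration only, and not the thesis's implementation, the idea can be sketched with a brute-force computation in plain Python. The function names, the exclusion zone of half the window length, and the toy series with an injected spike are all assumptions made for this example; practical implementations use much faster algorithms.

```python
import math


def znorm(seq):
    """Z-normalize a subsequence; a constant subsequence maps to all zeros."""
    n = len(seq)
    mean = sum(seq) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in seq) / n)
    if std == 0:
        return [0.0] * n
    return [(x - mean) / std for x in seq]


def matrix_profile(series, m):
    """Brute-force matrix profile (hypothetical helper): for each length-m
    subsequence, the z-normalized Euclidean distance to its nearest
    neighbor outside an exclusion zone."""
    subs = [znorm(series[i:i + m]) for i in range(len(series) - m + 1)]
    excl = max(1, m // 2)  # assumed exclusion zone to skip trivial self-matches
    profile = []
    for i, a in enumerate(subs):
        best = math.inf
        for j, b in enumerate(subs):
            if abs(i - j) <= excl:
                continue
            dist = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
            best = min(best, dist)
        profile.append(best)
    return profile


def top_discord(series, m):
    """Return (start index, score) of the subsequence farthest from its nearest neighbor."""
    profile = matrix_profile(series, m)
    idx = max(range(len(profile)), key=profile.__getitem__)
    return idx, profile[idx]


# Toy data (synthetic, not from the thesis): a periodic signal with one injected spike.
series = [0.0, 1.0, 0.0, -1.0] * 10
series[21] = 5.0
idx, score = top_discord(series, m=4)
print(f"discord window starts at {idx}, score {score:.3f}")
```

In this toy case every unaffected window has an exact repeat elsewhere in the series, so only windows overlapping the spike receive a nonzero score. On real sensor data the scores would be thresholded, which is where false positives of the kind the summary mentions can arise.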