Deep learning techniques for anomaly detection in time series data using transformer
Main Author:
Other Authors:
Format: Final Year Project
Language: English
Published: Nanyang Technological University, 2022
Subjects:
Online Access: https://hdl.handle.net/10356/162811
Institution: Nanyang Technological University
Summary: Anomaly detection is becoming increasingly important as applications in many fields produce massive amounts of high-dimensional data. Beyond univariate anomalies, the correlations between sequences make multivariate anomalies harder to detect, and the proliferation of data modalities makes anomaly detection in large-scale databases harder still. The main objective of this project is therefore to identify irregularities in multivariate time series data using deep learning techniques.
To achieve this objective, Transformers, which have achieved state-of-the-art performance on various natural language processing and computer vision tasks, are used to detect anomalies in multivariate time series data. Further improvements are also proposed to address issues in the base model.
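As a concrete illustration (not the report's exact model), the sketch below shows a common reconstruction-based setup: a Transformer encoder reconstructs windows of a multivariate series, and timesteps with large reconstruction error are flagged as anomalous. All names, dimensions, and the thresholding rule are assumptions.

```python
# Hedged sketch: reconstruction-based anomaly detection with a Transformer
# encoder over windows of shape (batch, seq_len, n_channels).
# Positional encoding is omitted for brevity.
import torch
import torch.nn as nn

class TransformerReconstructor(nn.Module):
    def __init__(self, n_channels: int, d_model: int = 64,
                 n_heads: int = 4, n_layers: int = 2):
        super().__init__()
        self.embed = nn.Linear(n_channels, d_model)   # project channels to model dim
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, n_channels)    # reconstruct the input window

    def forward(self, x):                             # x: (B, T, C)
        return self.head(self.encoder(self.embed(x)))

model = TransformerReconstructor(n_channels=8)
window = torch.randn(32, 100, 8)                      # toy batch of windows
recon = model(window)
# Per-timestep anomaly score: mean squared reconstruction error. The
# 3-sigma threshold below is illustrative; in practice it would be fitted
# on normal validation data.
score = (window - recon).pow(2).mean(dim=-1)          # (B, T)
anomalous = score > score.mean() + 3 * score.std()
```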
The base model is relatively weak on extremely low-dimensional time series, where the granularity of the data is exceedingly fine. To overcome this, a 1D convolutional layer is used to extract more meaningful representations of the low-dimensional inputs.
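A minimal sketch of this idea, assuming the 1D convolution replaces a pointwise input projection so that each token summarises a local neighbourhood of timesteps; the kernel size and padding are illustrative choices.

```python
# Sketch: a Conv1d input embedding, so each token aggregates a local window
# of timesteps rather than a single point.
import torch
import torch.nn as nn

class ConvEmbedding(nn.Module):
    def __init__(self, n_channels: int, d_model: int = 64, kernel: int = 3):
        super().__init__()
        # padding = kernel // 2 keeps the sequence length unchanged
        self.conv = nn.Conv1d(n_channels, d_model, kernel, padding=kernel // 2)

    def forward(self, x):                        # x: (B, T, C)
        x = x.transpose(1, 2)                    # Conv1d expects (B, C, T)
        return self.conv(x).transpose(1, 2)      # back to (B, T, d_model)

emb = ConvEmbedding(n_channels=1)                # e.g. a univariate, fine-grained series
tokens = emb(torch.randn(4, 256, 1))             # -> (4, 256, 64)
```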
Since a multivariate time series is made up of many channels, it is important to capture both temporal and spatial information. In a two-tower framework, the encoder in each tower is specifically designed to capture step-wise or channel-wise correlations, and the two towers are combined to merge the features of the two encoders.
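The sketch below shows one plausible reading of this design: one encoder attends across timesteps (step-wise) and the other across channels (channel-wise), with a linear layer fusing the pooled features. The pooling and fusion details are assumptions, not the report's specification.

```python
# Sketch of a two-tower encoder: a step-wise tower whose tokens are
# timesteps, and a channel-wise tower whose tokens are channels.
import torch
import torch.nn as nn

class TwoTowerEncoder(nn.Module):
    def __init__(self, n_channels: int, seq_len: int, d_model: int = 64):
        super().__init__()
        self.step_embed = nn.Linear(n_channels, d_model)   # tokens = timesteps
        self.chan_embed = nn.Linear(seq_len, d_model)      # tokens = channels
        step_layer = nn.TransformerEncoderLayer(d_model, 4, batch_first=True)
        chan_layer = nn.TransformerEncoderLayer(d_model, 4, batch_first=True)
        self.step_tower = nn.TransformerEncoder(step_layer, 2)
        self.chan_tower = nn.TransformerEncoder(chan_layer, 2)
        self.fuse = nn.Linear(2 * d_model, d_model)        # merge the two views

    def forward(self, x):                                  # x: (B, T, C)
        step = self.step_tower(self.step_embed(x))                   # (B, T, d)
        chan = self.chan_tower(self.chan_embed(x.transpose(1, 2)))   # (B, C, d)
        # Mean-pool each tower to a global feature, then fuse.
        merged = torch.cat([step.mean(dim=1), chan.mean(dim=1)], dim=-1)
        return self.fuse(merged)                           # (B, d)

enc = TwoTowerEncoder(n_channels=8, seq_len=100)
feat = enc(torch.randn(16, 100, 8))                        # -> (16, 64)
```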
A convolutional neural network is not fully utilised when combined with a Transformer, as the two are loosely coupled. Two Tightly-Coupled Convolutional Transformer (TCCT) architectures, CSPAttention and the passthrough mechanism, are proposed to address this problem and to reduce computation costs. CSPAttention, which combines CSPNet with the self-attention mechanism, reduces the computation cost of self-attention by 30%. Applied to a stack of self-attention blocks, the passthrough mechanism enables Transformer-like models to obtain more precise information with a minimal increase in computational cost.
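The following sketch shows the CSPNet-style split at a high level: half of the feature dimension bypasses self-attention, so the attention projections run on a smaller dimension, and a loop then imitates the passthrough idea by concatenating feature maps from successive blocks. This is a hedged approximation of CSPAttention and the passthrough mechanism, not the TCCT implementation.

```python
# Hedged sketch of a CSPNet-style split applied to self-attention: one half
# of the features is attended over, the other half bypasses, and the two
# paths are concatenated back together.
import torch
import torch.nn as nn

class CSPAttention(nn.Module):
    def __init__(self, d_model: int = 64, n_heads: int = 4):
        super().__init__()
        assert d_model % 2 == 0
        half = d_model // 2
        # attention projections operate on d_model/2, so the block is cheaper
        self.attn = nn.MultiheadAttention(half, n_heads, batch_first=True)

    def forward(self, x):                          # x: (B, T, d_model)
        a, b = x.chunk(2, dim=-1)                  # split the feature dimension
        a, _ = self.attn(a, a, a)                  # attend over one half only
        return torch.cat([a, b], dim=-1)           # merge the two paths

# Passthrough idea, sketched: keep the feature maps of earlier blocks and
# concatenate them with the final block's output, so finer-grained
# information from shallower stages is retained.
blocks = nn.ModuleList([CSPAttention() for _ in range(3)])
x = torch.randn(8, 100, 64)
stages = []
for blk in blocks:
    x = blk(x)
    stages.append(x)
fused = torch.cat(stages, dim=-1)                  # (8, 100, 64 * 3)
```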
The models are evaluated on several benchmark datasets for multivariate time series regression and classification. Overall, the modelling approaches exceed the current state-of-the-art performance of supervised methods.