Time series clustering and anomaly detection in the financial markets

Bibliographic Details
Main Author: Lim, Wilbur Yong Wei
Other Authors: Ke Yiping, Kelly
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2021
Online Access: https://hdl.handle.net/10356/148473
Institution: Nanyang Technological University
Description
Summary: Time series clustering and anomaly detection provide researchers with useful domain insights, but they are also two of the most challenging problems in time series data mining. Because both tasks have high time complexity and high memory requirements, few studies have been conducted on large time series datasets. In this paper, we focus on unsupervised whole time series clustering, the identification of anomalous clusters, and point-outlier anomaly detection within time series from the financial markets. Using k-means++ and k-Shape, whole time series clustering is performed on the stock price performance of companies listed on the NYSE, NASDAQ and AMEX between 1 October 2005 and 1 October 2020, a 15-year period. The dataset covers 11 market sectors under the GICS classification methodology and includes over 2785 individual time series. This review is arguably one of the few, if not the first, to evaluate whole time series clustering on such a large scale. After the anomalous clusters found in the dataset are evaluated, two unsupervised point-outlier detection algorithms, namely Isolation Forest and One-Class Support Vector Machine, are applied to the same dataset, and the detection results of the two algorithms are compared.
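
To make the idea of whole time series clustering with k-means++ seeding more concrete, the following is a minimal, standard-library-only sketch. It is not the project's implementation: the data are synthetic, the helper names (`z_normalize`, `kmeans_pp_init`, `kmeans`, `make_series`) are hypothetical, and plain Euclidean distance on z-normalized series is assumed. k-Shape, which relies on a shape-based distance instead, and the two outlier detectors are not shown.

```python
import math
import random


def z_normalize(series):
    """Rescale one series to zero mean and unit variance."""
    mean = sum(series) / len(series)
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / len(series))
    if std == 0:
        return [0.0] * len(series)
    return [(x - mean) / std for x in series]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length series."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pp_init(data, k, rng):
    """k-means++ seeding: each new centre is drawn with probability
    proportional to its squared distance from the nearest chosen centre."""
    centres = [list(rng.choice(data))]
    while len(centres) < k:
        weights = [min(sq_dist(s, c) for c in centres) for s in data]
        centres.append(list(rng.choices(data, weights=weights, k=1)[0]))
    return centres


def kmeans(data, k, rng, n_iter=50):
    """Lloyd's algorithm started from k-means++ centres."""
    centres = kmeans_pp_init(data, k, rng)
    labels = None
    for _ in range(n_iter):
        new_labels = [
            min(range(k), key=lambda j: sq_dist(s, centres[j])) for s in data
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for j in range(k):
            members = [s for s, lab in zip(data, labels) if lab == j]
            if members:  # keep the old centre if a cluster is empty
                centres[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centres


def make_series(slope, length, rng):
    """Hypothetical price path: linear trend plus Gaussian noise."""
    return [100 + slope * t + rng.gauss(0, 1) for t in range(length)]


# Synthetic stand-in for stock price series: five rising, five falling.
rng = random.Random(0)
rising = [make_series(0.5, 60, rng) for _ in range(5)]
falling = [make_series(-0.5, 60, rng) for _ in range(5)]
data = [z_normalize(s) for s in rising + falling]

labels, centres = kmeans(data, k=2, rng=rng)
print(labels)  # one cluster label per whole series
```

Each series is treated as a single point, so the result is one cluster label per stock rather than per time step. Z-normalization is assumed here so that series are grouped by the shape of their price path and not by their price level.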