Pairs trading strategy with unsupervised clustering methods

Bibliographic Details
Main Author: Toh, Alenson Jun Wei
Other Authors: Heng Kok Hui, John Gerard
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2020
Online Access: https://hdl.handle.net/10356/137833
Institution: Nanyang Technological University
Description
Summary: Machine learning has gained momentum in recent years and has been applied in various fields, including finance. Most financial applications of machine learning are predictive tasks, such as predicting returns or risk, which can easily be framed as supervised learning or reinforcement learning problems. This paper proposes a framework for constructing a novel clustering-based pairs trading strategy, which is the first attempt to apply an unsupervised learning method in the finance literature. Three clustering methods, namely K-means clustering, Density-based Spatial Clustering of Applications with Noise (DBSCAN) and Agglomerative clustering, are explored for pairs trading on the US stock market. When the performance of equally-weighted long-short portfolios built from the three clustering methods is compared, DBSCAN significantly outperforms the other two, attaining an annualized Sharpe ratio of 2.141 and an annualized mean return of 26.5% before transaction costs from January 2016 to December 2019. The paper then proposes defining "pairs" from a new perspective: finding pairs in terms of data density in a high-dimensional data structure. An industry breakdown of the stocks chosen and traded by DBSCAN is also conducted to unveil the sources of profitability. Most of the stocks traded by the clustering strategies are found to be in the financial industry. This shows that financial institutions are very similar to one another in terms of financial performance and should give similar stock returns in an efficient market. The DBSCAN strategy also significantly outperforms existing pairs trading strategies such as the cointegration, distance, time series and supervised learning approaches.
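The density-based notion of "pairs" described in the summary can be illustrated with a minimal sketch. This is not the author's implementation: the tickers, the two-dimensional feature vectors, the `eps` and `min_samples` values, the zero risk-free rate and the helper names (`dbscan`, `candidate_pairs`, `annualized_sharpe`) are all assumptions made for illustration. DBSCAN is written from scratch here only to keep the example self-contained; in practice a library implementation such as scikit-learn's `sklearn.cluster.DBSCAN` would normally be used.

```python
import math
from itertools import combinations
from statistics import mean, stdev


def dbscan(points, eps, min_samples):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    n = len(points)
    # The neighborhood of point i includes i itself, as in the usual definition.
    neighbors = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n  # None means "not visited yet"
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_samples:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_samples:
                queue.extend(neighbors[j])  # j is a core point: keep expanding
    return labels


def candidate_pairs(tickers, labels):
    """All within-cluster ticker pairs; noise points (-1) are never paired."""
    clusters = {}
    for ticker, label in zip(tickers, labels):
        if label != -1:
            clusters.setdefault(label, []).append(ticker)
    return [
        pair
        for members in clusters.values()
        for pair in combinations(members, 2)
    ]


def annualized_sharpe(returns, periods_per_year=12):
    """Mean / sample std of periodic returns, scaled by sqrt(periods per year).

    Assumes a zero risk-free rate and, by default, monthly returns;
    both are illustrative assumptions, not taken from the paper.
    """
    return mean(returns) / stdev(returns) * math.sqrt(periods_per_year)


# Hypothetical tickers with made-up 2-D feature vectors.
features = {
    "BANK_A": (0.0, 0.0),
    "BANK_B": (0.1, 0.0),
    "BANK_C": (0.0, 0.1),
    "TECH_A": (5.0, 5.0),
    "TECH_B": (5.1, 5.0),
    "OUTLIER": (10.0, -10.0),
}
tickers = list(features)
labels = dbscan(list(features.values()), eps=0.5, min_samples=2)
pairs = candidate_pairs(tickers, labels)
print(labels)
print(pairs)
```

In this toy data the isolated stock is labeled as noise and left out of every pair, which is the property that distinguishes a density-based method from K-means or Agglomerative clustering, both of which assign every stock to some cluster.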