Predicting stock market index with gradient boosting machine ensemble, bayesian optimization, temporal consistency analysis, market sentiment analysis, game theory and novel holdout method
The potential of machine learning has sustained the interest of both academia and industry in stock market prediction for over a decade. This project aims to integrate modern techniques used in the field into a resource-efficient and accurate stock index predictor. While Gradient Boosting Ma...
Main Author: | |
---|---|
Other Authors: | |
Format: | Final Year Project |
Language: | English |
Published: |
Nanyang Technological University
2021
|
Subjects: | |
Online Access: | https://hdl.handle.net/10356/148294 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-148294 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-1482942023-05-19T05:44:54Z Predicting stock market index with gradient boosting machine ensemble, bayesian optimization, temporal consistency analysis, market sentiment analysis, game theory and novel holdout method Yeo, Jarrett Shan Wei Yeo Chai Kiat Nanyang Business School ASCKYEO@ntu.edu.sg Engineering::Computer science and engineering::Information systems The potential of machine learning has sustained the interest of both academia and industry in stock market prediction for over the past decade. This project aims to integrate modern techniques used in the field into a resource-efficient and accurate stock index predictor. While Gradient Boosting Machines (GBMs) have been around for more than twenty years, they have recently received a revival in popularity of modern gradient-boosted decision trees such as XGBoost in 2014, and LightGBM and CatBoost in 2017. Additionally, literature in stock market prediction field has been focused on the use of macro-economic metrics, the creation of technical financial indicators, and more recently, the analysis of social media big data as well. This project serves to unify such techniques into an efficient yet effective ensemble called CalixBoost Ensemble of the GBMs using the aforementioned data. The models are tuned with Bayesian Optimization, and temporal consistency analysis is also used for invariant feature selection over random trial-and-error. Market sentiment analysis is then conducted using a simple and fast but effective rule-based model tuned specifically for understanding social media posts. Finally, the feature importance and inter-feature relationships of every model will be explained using a unified game theory approach using Shapley values to better appreciate their inner workings. All models will be evaluated using a novel holdout method, viz. 
on two separate test datasets whose datapoints are collected under different conditions: first, normal economic activity; and second, during a black swan / financial downturn. Bachelor of Business Bachelor of Engineering (Computer Science) 2021-05-04T02:05:41Z 2021-05-04T02:05:41Z 2021 Final Year Project (FYP) Yeo, J. S. W. (2021). Predicting stock market index with gradient boosting machine ensemble, bayesian optimization, temporal consistency analysis, market sentiment analysis, game theory and novel holdout method. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/148294 https://hdl.handle.net/10356/148294 en application/pdf Nanyang Technological University |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
Engineering::Computer science and engineering::Information systems |
spellingShingle |
Engineering::Computer science and engineering::Information systems Yeo, Jarrett Shan Wei Predicting stock market index with gradient boosting machine ensemble, bayesian optimization, temporal consistency analysis, market sentiment analysis, game theory and novel holdout method |
description |
The potential of machine learning has sustained the interest of both academia and industry in stock market prediction for over a decade. This project aims to integrate modern techniques used in the field into a resource-efficient and accurate stock index predictor.
While Gradient Boosting Machines (GBMs) have been around for more than twenty years, they have recently seen a revival in popularity with modern gradient-boosted decision tree implementations such as XGBoost in 2014, and LightGBM and CatBoost in 2017.
Additionally, the literature in the stock market prediction field has focused on the use of macroeconomic metrics, the creation of technical financial indicators and, more recently, the analysis of social media big data.
This project unifies these techniques into an efficient yet effective ensemble of GBMs, called the CalixBoost Ensemble, using the aforementioned data. The models are tuned with Bayesian optimization, and temporal consistency analysis is used for invariant feature selection in place of random trial and error. Market sentiment analysis is then conducted using a simple, fast, and effective rule-based model tuned specifically for understanding social media posts. Finally, the feature importance and inter-feature relationships of every model are explained with a unified game-theoretic approach based on Shapley values, to better appreciate their inner workings.
All models are evaluated using a novel holdout method, namely on two separate test datasets whose data points are collected under different conditions: first, normal economic activity; and second, a black swan event or financial downturn. |
author2 |
Yeo Chai Kiat |
author_facet |
Yeo Chai Kiat Yeo, Jarrett Shan Wei |
format |
Final Year Project |
author |
Yeo, Jarrett Shan Wei |
author_sort |
Yeo, Jarrett Shan Wei |
title |
Predicting stock market index with gradient boosting machine ensemble, bayesian optimization, temporal consistency analysis, market sentiment analysis, game theory and novel holdout method |
title_short |
Predicting stock market index with gradient boosting machine ensemble, bayesian optimization, temporal consistency analysis, market sentiment analysis, game theory and novel holdout method |
title_full |
Predicting stock market index with gradient boosting machine ensemble, bayesian optimization, temporal consistency analysis, market sentiment analysis, game theory and novel holdout method |
title_fullStr |
Predicting stock market index with gradient boosting machine ensemble, bayesian optimization, temporal consistency analysis, market sentiment analysis, game theory and novel holdout method |
title_full_unstemmed |
Predicting stock market index with gradient boosting machine ensemble, bayesian optimization, temporal consistency analysis, market sentiment analysis, game theory and novel holdout method |
title_sort |
predicting stock market index with gradient boosting machine ensemble, bayesian optimization, temporal consistency analysis, market sentiment analysis, game theory and novel holdout method |
publisher |
Nanyang Technological University |
publishDate |
2021 |
url |
https://hdl.handle.net/10356/148294 |
_version_ |
1770566906594459648 |
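
The description above mentions evaluating models on two separate test datasets, one from normal economic activity and one from a downturn. The following is a minimal sketch of that idea only, not the project's implementation: the regime boundary dates, the synthetic index levels, the naive last-value predictor, and the choice of mean absolute error are all assumptions made for illustration.

```python
from datetime import date, timedelta
from statistics import mean

# Hypothetical regime boundaries (assumptions, not taken from the project).
NORMAL_TEST_START = date(2019, 7, 1)
DOWNTURN_TEST_START = date(2020, 2, 1)


def split_by_regime(rows):
    """Split chronologically ordered (day, value) rows into three sets:
    training data, a normal-regime test set, and a downturn test set."""
    train = [r for r in rows if r[0] < NORMAL_TEST_START]
    normal = [r for r in rows if NORMAL_TEST_START <= r[0] < DOWNTURN_TEST_START]
    downturn = [r for r in rows if r[0] >= DOWNTURN_TEST_START]
    return train, normal, downturn


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


def evaluate_last_value_baseline(history, test):
    """Score a naive 'next value equals the previous value' predictor.
    This stands in for a trained model purely to show the evaluation flow."""
    previous = history[-1][1]
    actual, predicted = [], []
    for _, value in test:
        predicted.append(previous)
        actual.append(value)
        previous = value
    return mean_absolute_error(actual, predicted)


# Synthetic index levels for illustration only:
# a steady +1 per day drift, then a -5 per day slide in the downturn period.
start = date(2019, 1, 1)
rows = []
level = 1000.0
for offset in range(500):
    day = start + timedelta(days=offset)
    level += 1.0 if day < DOWNTURN_TEST_START else -5.0
    rows.append((day, level))

train, normal_test, downturn_test = split_by_regime(rows)

# Report one score per regime instead of a single pooled score.
normal_mae = evaluate_last_value_baseline(train, normal_test)
downturn_mae = evaluate_last_value_baseline(train + normal_test, downturn_test)
print(f"normal MAE={normal_mae:.2f}, downturn MAE={downturn_mae:.2f}")
```

Reporting the two scores separately makes it visible when a predictor that looks adequate under normal conditions degrades during a downturn, which a single pooled test score would hide.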