Application of machine learning in stock index forecast

Bibliographic Details
Main Author: Suresh, Shet Swati
Other Authors: Yeo Chai Kiat
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2022
Online Access: https://hdl.handle.net/10356/156547
Institution: Nanyang Technological University
Description
Summary: Stock prediction has long been a popular area of research. It is challenging because stock data are dynamic, chaotic, and non-stationary. However, significant advances in machine learning have encouraged the use of these techniques for stock price prediction. This project focuses on the New York Stock Exchange (NYSE) Composite Index, forecasting both the opening price and the movement (direction) of the index. The NYSE index data are downloaded from Yahoo! Finance. The project leverages technical indicators as well as market sentiment analysis to facilitate the prediction. Technical indicators are obtained via feature engineering of the stock index data. Sentiment scores are obtained by pre-processing extracted Twitter tweets and applying VADER to them. Further, a Recursive Feature Addition (RFA) algorithm is implemented to identify impactful technical indicators and discard insignificant ones. The pre-processed features are fed into the proposed models: LSTM (Long Short-Term Memory), PCA-LSTM (Principal Component Analysis-Long Short-Term Memory) and CNN-LSTM (Convolutional Neural Network-Long Short-Term Memory). Model performance is evaluated and compared across the proposed models as well as against benchmark models, namely ARIMA (Autoregressive Integrated Moving Average) and SVR (Support Vector Regression). The results indicate that incorporating technical indicators, the market sentiment analysis score, PCA in the case of LSTM, and the RFA algorithm improves model performance in terms of RMSE, MAE, Accuracy and F1 Score. Further, the proposed models exceed benchmark model performance in terms of Accuracy and F1 Score and overall perform well in terms of the RMSE and MSE metrics.
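
The four evaluation metrics named in the summary can be illustrated with a minimal Python sketch. This is not the project's code: the function names, the rule that a day counts as "up" when its opening price exceeds the previous day's, and the sample prices are all assumptions made purely for illustration.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def direction_labels(prices):
    """Label each day 1 if the price rose versus the previous day, else 0.

    Assumed labelling rule; the project may define movement differently.
    """
    return [1 if curr > prev else 0 for prev, curr in zip(prices, prices[1:])]


def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """F1 score (harmonic mean of precision and recall) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up opening prices, for illustration only (not project data).
actual_open = [100.0, 102.0, 101.0, 103.0, 104.0]
predicted_open = [100.0, 101.0, 102.0, 102.0, 105.0]

# Regression metrics apply to the opening-price forecast.
price_rmse = rmse(actual_open, predicted_open)
price_mae = mae(actual_open, predicted_open)

# Classification metrics apply to the movement (direction) forecast.
true_dir = direction_labels(actual_open)
pred_dir = direction_labels(predicted_open)
dir_accuracy = accuracy(true_dir, pred_dir)
dir_f1 = f1_score(true_dir, pred_dir)
```

In this sketch RMSE and MAE score the opening-price forecast, while Accuracy and F1 score the up/down movement forecast, mirroring the split between the two prediction tasks described in the summary.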