A risk-sensitive stock trading system with the application of reinforcement learning (Q-learning)

Bibliographic Details
Main Author: Gupta, Shantanu
Other Authors: Quek Hiok Chai
Format: Final Year Project
Language: English
Published: 2017
Online Access: http://hdl.handle.net/10356/70414
Institution: Nanyang Technological University
Description
Summary: This research project develops a stock trading system using reinforcement learning (RL) techniques. What sets the system apart from existing work is that it is risk-sensitive as well as profit-maximizing: the preferred degree of risk-seeking can be set on a sliding scale, and this setting is incorporated directly into the reinforcement learning model. There is no single type of stock trader in the market. Different traders tolerate different amounts of risk, and risk-averse traders are often unwilling to enter the market in situations where risk-seekers are willing to. The system caters to the needs of these different categories of traders. Its behavior was validated against existing research in behavioral finance and actual trading data from human subjects. The system's trading pattern matched both the pattern predicted by psychological theories and the behavior shown by the human subjects, which supports the conclusion that the system exhibits the desired behavior. A further insight of the project is that different risk profiles suit different stock market conditions. A risk-adaptive trading system was therefore developed to meet this requirement. The results show that it adopts the appropriate risk strategies and outperforms systems with constant risk profiles.
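
This record does not say how the risk setting enters the Q-learning model. As an illustration only, the sketch below assumes one well-known formulation, the asymmetric weighting of the temporal-difference (TD) error used in risk-sensitive RL in the style of Mihatsch and Neuneier. The names `kappa`, `ACTIONS`, `transform_td_error` and `q_update`, the action set, and all default parameter values are hypothetical and are not taken from the project.

```python
from collections import defaultdict

# Hypothetical action set for a single-stock trader (an assumption).
ACTIONS = ("buy", "hold", "sell")


def transform_td_error(td_error: float, kappa: float) -> float:
    """Weight positive and negative TD errors asymmetrically.

    kappa in (-1, 1) is the assumed sliding risk scale:
    kappa > 0 shrinks good surprises and amplifies bad ones (risk-averse),
    kappa < 0 does the opposite (risk-seeking),
    kappa == 0 leaves the error unchanged (standard Q-learning).
    """
    if not -1.0 < kappa < 1.0:
        raise ValueError("kappa must lie strictly between -1 and 1")
    if td_error > 0:
        return (1.0 - kappa) * td_error
    return (1.0 + kappa) * td_error


def q_update(q, state, action, reward, next_state,
             alpha=0.1, gamma=0.95, kappa=0.0):
    """Apply one tabular Q-learning step with a risk-weighted TD error.

    q maps (state, action) pairs to values; a defaultdict(float) works,
    so unseen pairs start at 0.0. Returns the updated Q-value.
    """
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    td_error = reward + gamma * best_next - q[(state, action)]
    q[(state, action)] += alpha * transform_td_error(td_error, kappa)
    return q[(state, action)]


# Usage: the same losing trade is penalized more by a risk-averse learner.
q_averse = defaultdict(float)
q_seeking = defaultdict(float)
q_update(q_averse, "s0", "buy", -1.0, "s1", alpha=0.5, kappa=0.5)
q_update(q_seeking, "s0", "buy", -1.0, "s1", alpha=0.5, kappa=-0.5)
```

Under these assumptions, a single parameter moves the learner continuously between risk-averse and risk-seeking behavior. A risk-adaptive variant could, in principle, change `kappa` according to market conditions, though how the project does this is not described here.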