Effects of action masking on deep reinforcement learning for inventory management
Main Author: | |
---|---|
Other Authors: | |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2023 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/166091 |
Institution: | Nanyang Technological University |
Summary: | Inventory management has always been a crucial part of supply chain management, and managing it carelessly leads to unnecessary inventory costs such as lost sales and holding costs. Over the years, many operations research studies have investigated solutions and systems to manage inventory better and to lower inventory costs as much as possible. Owing to recent advances in reinforcement learning and deep neural networks, there has been rising interest in using deep reinforcement learning to train an artificial agent that can manage inventory and minimize inventory costs. This report develops a solution for a single-retailer, single-item inventory management environment with stochastic demand using a Deep Q-Network (DQN). Although recent works have applied DQN to inventory management, few have investigated the effects of action masking in this problem domain. This report therefore investigates different methods of action masking and analyzes their effects on the speed of convergence during the training phase and on additional metrics such as mean reward, fill rate and service level during the inference phase. It also analyzes the effects of different demand distributions and whether they affect the training of a DQN agent. |
---|
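The summary does not say which masking methods the report compares, so the following is only a minimal sketch of one common form of action masking for a value-based agent such as DQN: infeasible order quantities have their Q-values set to negative infinity before the greedy choice, and exploration samples only from feasible actions. The capacity-based feasibility rule and every name here (`feasible_order_mask`, `masked_greedy_action`, `masked_epsilon_greedy`, `capacity`) are illustrative assumptions, not taken from the report.

```python
import math
import random


def feasible_order_mask(inventory, capacity, n_actions):
    """Return a boolean mask over order quantities 0..n_actions-1.

    Assumed rule (hypothetical, not from the report): ordering `a` units
    is feasible only if the resulting stock does not exceed `capacity`.
    """
    return [inventory + a <= capacity for a in range(n_actions)]


def masked_greedy_action(q_values, mask):
    """Pick the feasible action with the highest Q-value.

    Infeasible actions are assigned -inf so they can never be the argmax.
    """
    if not any(mask):
        raise ValueError("at least one action must be feasible")
    masked = [q if ok else -math.inf for q, ok in zip(q_values, mask)]
    return max(range(len(masked)), key=masked.__getitem__)


def masked_epsilon_greedy(q_values, mask, epsilon, rng=random):
    """Epsilon-greedy selection restricted to feasible actions."""
    if rng.random() < epsilon:
        # Explore: sample uniformly among feasible actions only.
        return rng.choice([a for a, ok in enumerate(mask) if ok])
    # Exploit: greedy choice over the masked Q-values.
    return masked_greedy_action(q_values, mask)


# Example: 8 units in stock, assumed capacity of 10, order sizes 0..3.
q_values = [0.1, 0.5, 0.9, 2.0]
mask = feasible_order_mask(inventory=8, capacity=10, n_actions=4)
action = masked_greedy_action(q_values, mask)
```

In this example the order of 3 units has the largest Q-value but would exceed the assumed capacity, so the masked greedy choice falls back to ordering 2 units. Other masking variants, such as penalizing infeasible actions through the reward instead of excluding them, would change only the selection step.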