Deep reinforcement learning for inventory management
Main Author: | |
---|---|
Other Authors: | |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2022 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/156593 |
Institution: | Nanyang Technological University |
Summary: | Inventory control is one of the most important aspects of supply chain management, and inefficient inventory control can give rise to higher inventory costs. However, many solutions and new systems now support better inventory management. There is a wide range of inventory setups, each catering to the needs of a particular business. One approach that might be suitable for such inventory problems is Deep Reinforcement Learning (DRL). Reinforcement Learning (RL) has shown good results in past work, and DRL may be more powerful in complex inventory systems. In this project, we attempt to solve a simple inventory control problem using Deep Reinforcement Learning, namely Deep Q-Learning with a Multilayer Perceptron (MLP) network. The problem is a single-agent, single-item inventory control problem with constraints such as lead time. An MLP with 2 hidden linear layers and ReLU activation functions was built to approximate the Deep Q-Network (DQN) policy. The DQN we implemented was tested and analysed using grid search and seed analysis, and was compared with other techniques such as Q-Learning and Mixed Integer Linear Programming (MILP). Although the DQN model does not perform as well as the Q-Learning and MILP models, it shows great potential for further improvement and optimisation. |
---|
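
Illustrative sketch (not taken from the project itself): the summary describes an MLP with 2 hidden linear layers and ReLU activations that approximates the Q-function. The pure-Python forward pass below shows one way such a network could map an inventory state to one Q-value per order quantity. The state layout, layer widths, number of actions, weight initialisation, and all function names are assumptions made for illustration only. Reading "2 hidden layers" as two hidden layers followed by a separate linear output layer is also an assumption.

```python
import random


def relu(values):
    """Element-wise ReLU: negative entries become zero."""
    return [max(0.0, v) for v in values]


def linear(inputs, weights, bias):
    """Affine layer; `weights` holds one row of coefficients per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, bias)]


def init_layer(n_in, n_out, rng):
    """Small uniform random weights and zero biases (an assumed initialisation)."""
    bound = 1.0 / n_in ** 0.5
    weights = [[rng.uniform(-bound, bound) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def build_q_network(state_dim, hidden_dim, n_actions, seed=0):
    """Two hidden layers plus a linear output layer with one unit per action."""
    rng = random.Random(seed)
    return [
        init_layer(state_dim, hidden_dim, rng),
        init_layer(hidden_dim, hidden_dim, rng),
        init_layer(hidden_dim, n_actions, rng),
    ]


def q_values(network, state):
    """Forward pass: ReLU after each hidden layer, no activation on the output."""
    hidden = state
    for weights, bias in network[:-1]:
        hidden = relu(linear(hidden, weights, bias))
    weights, bias = network[-1]
    return linear(hidden, weights, bias)


def greedy_action(network, state):
    """Index of the action (order quantity) with the largest estimated Q-value."""
    q = q_values(network, state)
    return max(range(len(q)), key=q.__getitem__)


# Hypothetical state: on-hand stock followed by orders still in transit
# over an assumed 2-period lead time.
net = build_q_network(state_dim=3, hidden_dim=16, n_actions=5)
state = [10.0, 4.0, 0.0]
q = q_values(net, state)
action = greedy_action(net, state)
```

With untrained random weights the chosen action carries no meaning. In Deep Q-Learning, the weights would be fitted so that these outputs approximate the expected long-run return of each order quantity.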