Exploring tabular Q-learning for single machine job dispatching
Reinforcement Learning (RL) studies the problem of how an autonomous agent can learn, while interacting with its environment, to choose the appropriate actions for achieving its goals. In this report, we explore the application of tabular Q-Learning, a type of RL, to single machine job dispatching (SMJD) problems. In tabular Q-Learning, the agent learns the most appropriate action for any environment state by storing this information as state-action pair value functions, represented in a discretised state-action table. Although existing studies have demonstrated its application potential in these scheduling problems, the current Q-Learning implementation has one limitation identified in this report: the state-action tables were pre-determined by trial-and-error experiments before the actual learning was applied to the scheduling problems, which can be a very tedious process. For this project, we therefore proposed a clustering method, K-Means Clustering, to automate and dynamically determine the state-action tables instead. Under our proposed method, these state-action tables are specific to SMJD problems with different system objectives and also enabled the agent to achieve a higher rate of learning convergence. The project also investigated the feasibility of using Q-Learning to formulate composite dispatching rules for SMJD problems with multiple objectives, an area of research not yet widely studied. Overall, this study provides encouraging results for better future application of Q-Learning to more complex production scheduling.
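The abstract describes two components: a tabular Q-learning update over a discretised state-action table, and K-Means clustering used to derive that discretisation automatically. The report itself is not reproduced in this record, so the following is only an illustrative sketch of those two ideas, not the report's implementation; the function names, the 1-D queue-length feature, the reward, and the dispatching-rule action names ("SPT", "EDD") are all invented for this example:

```python
# Illustrative sketch (assumptions, not the report's code): 1-D K-Means
# builds the state bins, then a standard tabular Q-learning update is
# applied over the resulting (state, action) table.

def kmeans_1d(values, k, iters=20):
    """Plain 1-D K-Means: returns k sorted cluster centres."""
    centres = sorted(values)[:k]  # naive init: the k smallest observations
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centres[j]))
            buckets[nearest].append(v)
        # Recompute each centre as the mean of its bucket (keep old centre
        # if a bucket is empty).
        centres = [sum(b) / len(b) if b else centres[i]
                   for i, b in enumerate(buckets)]
    return sorted(centres)

def discretise(value, centres):
    """Map a continuous state feature to the index of the nearest centre."""
    return min(range(len(centres)), key=lambda i: abs(value - centres[i]))

def q_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(Q.get((s_next, a2), 0.0) for a2 in actions)
    old = Q.get((s, a), 0.0)
    Q[(s, a)] = old + alpha * (r + gamma * best_next - old)

# Hypothetical usage: cluster observed queue lengths into 2 state bins,
# then update the value of choosing one dispatching rule in that state.
centres = kmeans_1d([1.0, 2.0, 2.5, 8.0, 9.0, 9.5], k=2)
state = discretise(3.0, centres)  # falls in the short-queue cluster (bin 0)
Q = {}
q_update(Q, s=state, a="SPT", r=1.0, s_next=1, actions=("SPT", "EDD"))
```

Using clusters of observed feature values as the table's state bins, rather than a hand-tuned grid, is the kind of automatic discretisation the abstract attributes to K-Means.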
Saved in:
Main Author: Chai, Jeffrey Zhi Yang
Other Authors: Sivakumar A. I.; Tan Chin Sheng
Format: Final Year Project
Language: English
Published: 2019
Subjects: DRNTU::Engineering::Systems engineering
Online Access: http://hdl.handle.net/10356/77872
Institution: Nanyang Technological University
id |
sg-ntu-dr.10356-77872 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-77872 2023-03-04T18:46:35Z Exploring tabular Q-learning for single machine job dispatching Chai, Jeffrey Zhi Yang Sivakumar A. I. Tan Chin Sheng School of Mechanical and Aerospace Engineering A*STAR Singapore Institute of Manufacturing Technology DRNTU::Engineering::Systems engineering Reinforcement Learning (RL) studies the problem of how an autonomous agent can learn, while interacting with its environment, to choose the appropriate actions for achieving its goals. In this report, we explore the application of tabular Q-Learning, a type of RL, to single machine job dispatching (SMJD) problems. In tabular Q-Learning, the agent learns the most appropriate action for any environment state by storing this information as state-action pair value functions, represented in a discretised state-action table. Although existing studies have demonstrated its application potential in these scheduling problems, the current Q-Learning implementation has one limitation identified in this report: the state-action tables were pre-determined by trial-and-error experiments before the actual learning was applied to the scheduling problems, which can be a very tedious process. For this project, we therefore proposed a clustering method, K-Means Clustering, to automate and dynamically determine the state-action tables instead. Under our proposed method, these state-action tables are specific to SMJD problems with different system objectives and also enabled the agent to achieve a higher rate of learning convergence. The project also investigated the feasibility of using Q-Learning to formulate composite dispatching rules for SMJD problems with multiple objectives, an area of research not yet widely studied. Overall, this study provides encouraging results for better future application of Q-Learning to more complex production scheduling.
Bachelor of Engineering (Mechanical Engineering) 2019-06-07T05:28:42Z 2019-06-07T05:28:42Z 2019 Final Year Project (FYP) http://hdl.handle.net/10356/77872 en Nanyang Technological University 80 p. application/pdf |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
DRNTU::Engineering::Systems engineering |
description |
Reinforcement Learning (RL) studies the problem of how an autonomous agent can learn, while interacting with its environment, to choose the appropriate actions for achieving its goals. In this report, we explore the application of tabular Q-Learning, a type of RL, to single machine job dispatching (SMJD) problems. In tabular Q-Learning, the agent learns the most appropriate action for any environment state by storing this information as state-action pair value functions, represented in a discretised state-action table. Although existing studies have demonstrated its application potential in these scheduling problems, the current Q-Learning implementation has one limitation identified in this report: the state-action tables were pre-determined by trial-and-error experiments before the actual learning was applied to the scheduling problems, which can be a very tedious process. For this project, we therefore proposed a clustering method, K-Means Clustering, to automate and dynamically determine the state-action tables instead. Under our proposed method, these state-action tables are specific to SMJD problems with different system objectives and also enabled the agent to achieve a higher rate of learning convergence. The project also investigated the feasibility of using Q-Learning to formulate composite dispatching rules for SMJD problems with multiple objectives, an area of research not yet widely studied. Overall, this study provides encouraging results for better future application of Q-Learning to more complex production scheduling. |
author2 |
Sivakumar A. I. |
format |
Final Year Project |
author |
Chai, Jeffrey Zhi Yang |
title |
Exploring tabular Q-learning for single machine job dispatching |
publishDate |
2019 |
url |
http://hdl.handle.net/10356/77872 |