Exploring tabular Q-learning for single machine job dispatching

Reinforcement Learning (RL) studies how an autonomous agent can learn, through interaction with its environment, to choose appropriate actions for achieving its goals. In this report, we explore the application of tabular Q-Learning, a type of RL, to single machine job dispatching (SMJD) problems. In tabular Q-Learning, the agent learns the most appropriate action for any environment state by storing this information as state-action value functions held in a discretised state-action table. Although existing studies have demonstrated its potential for these scheduling problems, one limitation identified in this report is that current implementations rely on a state-action table pre-determined by trial-and-error experiments before learning is applied to the scheduling problem, which can be a very tedious process. For this project, we therefore propose a clustering method, K-Means Clustering, to automate and dynamically determine the state-action tables instead. Under the proposed method, the resulting state-action tables are specific to SMJD problems with different system objectives and also allow the agent to achieve a higher rate of learning convergence. The project also investigates the feasibility of using Q-Learning to formulate composite dispatching rules for SMJD problems with multiple objectives, an area of research that has not yet been studied extensively. Overall, this study provides encouraging results for future applications of Q-Learning to more complex production scheduling.
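
The short Python sketch below is purely illustrative and is not taken from the report; it shows, under assumed state features, an assumed set of candidate dispatching rules, and assumed hyperparameters, how a tabular Q-Learning agent of the kind described above can be combined with K-Means clustering so that the discrete state index (the table row) comes from cluster membership rather than a hand-tuned discretisation grid.

import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# 1. Determine the state table automatically with K-Means.
#    The features (e.g. queue length, mean remaining processing time, mean
#    slack) and the logged samples below are placeholders, not the report's data.
state_samples = rng.random((5000, 3))
n_states = 20                                    # number of table rows (clusters)
kmeans = KMeans(n_clusters=n_states, n_init=10, random_state=0).fit(state_samples)

# 2. Tabular Q-Learning over (cluster index, dispatching rule) pairs.
dispatch_rules = ["SPT", "EDD", "FIFO", "CR"]    # example action set, assumed here
n_actions = len(dispatch_rules)
Q = np.zeros((n_states, n_actions))
alpha, gamma, epsilon = 0.1, 0.95, 0.1           # learning rate, discount, exploration

def state_index(features):
    # Map a continuous state-feature vector to its K-Means cluster (table row).
    return int(kmeans.predict(np.asarray(features, dtype=float).reshape(1, -1))[0])

def choose_action(s):
    # Epsilon-greedy selection of a dispatching rule for the current table row.
    if rng.random() < epsilon:
        return int(rng.integers(n_actions))
    return int(Q[s].argmax())

def q_update(s, a, reward, s_next):
    # Standard one-step Q-Learning update on the discretised table.
    Q[s, a] += alpha * (reward + gamma * Q[s_next].max() - Q[s, a])

In this sketch the reward signal is left abstract; in the setting the report describes, it would reflect the chosen system objective (or a combination of objectives for composite dispatching rules), which is likewise an assumption here rather than the report's specification.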

Bibliographic Details
Main Author: Chai, Jeffrey Zhi Yang
Other Authors: Sivakumar A. I.; Tan Chin Sheng; School of Mechanical and Aerospace Engineering; A*STAR Singapore Institute of Manufacturing Technology
Format: Final Year Project
Language: English
Published: 2019
Degree: Bachelor of Engineering (Mechanical Engineering)
Physical Description: 80 p.
Subjects: DRNTU::Engineering::Systems engineering
Online Access: http://hdl.handle.net/10356/77872
Institution: Nanyang Technological University