Application of reinforcement learning to production system

Bibliographic Details
Main Author: Jiang, Zhijin
Other Authors: Rajesh Piplani
Format: Theses and Dissertations
Language: English
Published: 2018
Subjects:
Online Access: http://hdl.handle.net/10356/75935
Institution: Nanyang Technological University
Description
Summary: The primary goal of this research is to obtain an optimal or near-optimal joint production and maintenance scheduling policy by means of reinforcement learning. In this research, we adopted a reinforcement learning algorithm to control the feeding interval and the maintenance state of the upstream station in a production system. With the help of this algorithm, the work-in-process (WIP) in the production system can be limited to a reasonable level, and machines are preventively maintained so that they remain functional. By balancing the reward and cost arising from WIP, maintenance, and the idle loss of the bottleneck machine, the reinforcement learning algorithm is able to find an acceptable policy for adjusting the feeding rate and scheduling preventive maintenance for the upstream machine. However, reinforcement learning involves many parameters, and in practice these parameters may vary widely from case to case. Five experiments are performed in this research: the first and second are validation experiments, the third and fourth explain the properties of the algorithm, and the fifth describes how fast the algorithm can learn to achieve the target state of the upstream station. The developed model consists of reinforcement-learning-based decision-making agents coupled with a simulation model of the integrated production system. The smart agents determine the optimal or near-optimal action for each system state by interacting with their environment.
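
The trade-off described in the summary (WIP holding cost, maintenance cost, and idle loss at the bottleneck) can be illustrated with a minimal sketch. The record does not name the specific reinforcement learning algorithm, so tabular Q-learning is used here purely as an assumption. The toy environment, the state encoding (WIP level, machine health), the three actions, and every cost and probability value are hypothetical and are not taken from the thesis.

```python
import random

# All constants below are illustrative assumptions, not values from the thesis.
MAX_WIP = 5            # buffer capacity in front of the bottleneck
HEALTH_LEVELS = 3      # 0 = good, 1 = degraded, 2 = failed
ACTIONS = ("feed_slow", "feed_fast", "maintain")
WIP_COST, IDLE_COST, PM_COST, REPAIR_COST = 1.0, 10.0, 4.0, 20.0


def step(state, action, rng):
    """Advance the toy line by one period; return (next_state, reward)."""
    wip, health = state
    reward = 0.0
    if health == 2:
        # Failed machine: forced corrective repair, nothing is fed this period.
        reward -= REPAIR_COST
        health, fed = 0, 0
    elif action == 2:
        # Preventive maintenance restores the machine but feeds nothing.
        reward -= PM_COST
        health, fed = 0, 0
    else:
        fed = 1 if action == 0 else 2
        # Running the upstream machine wears it, faster at the high feed rate.
        if rng.random() < (0.1 if action == 0 else 0.25):
            health += 1
    # The bottleneck consumes one job per period if one is waiting, else it idles.
    if wip == 0:
        reward -= IDLE_COST
    else:
        wip -= 1
    wip = min(MAX_WIP, wip + fed)
    reward -= WIP_COST * wip   # holding cost on the remaining WIP
    return (wip, health), reward


def train(episodes=2000, horizon=50, alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = {(w, h): [0.0] * len(ACTIONS)
         for w in range(MAX_WIP + 1) for h in range(HEALTH_LEVELS)}
    for _ in range(episodes):
        state = (0, 0)
        for _ in range(horizon):
            if rng.random() < epsilon:
                action = rng.randrange(len(ACTIONS))
            else:
                action = max(range(len(ACTIONS)), key=lambda a: q[state][a])
            next_state, reward = step(state, action, rng)
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Map each state to the name of its highest-valued action."""
    return {s: ACTIONS[max(range(len(ACTIONS)), key=lambda a: v[a])]
            for s, v in q.items()}


if __name__ == "__main__":
    policy = greedy_policy(train())
    for state in sorted(policy):
        print(state, policy[state])
```

Running the script prints one action per (WIP, health) state. Because every number in the sketch is made up, the printed policy only illustrates how an agent can trade feeding rate against preventive maintenance and says nothing about the results reported in the thesis.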