IMPLEMENTATION OF REINFORCEMENT LEARNING BY QUANTUM COMPUTING APPROACH

Reinforcement learning works by trial and error, so it can take a long time to solve a problem. Unlike other types of machine learning, it also faces a particular challenge, namely the trade-off between exploration and e...

Bibliographic Details
Main Author: Wafa, Hani'ah
Format: Theses
Language: Indonesia
Online Access: https://digilib.itb.ac.id/gdl/view/58053
Institution: Institut Teknologi Bandung
id id-itb.:58053
spelling id-itb.:58053 2021-08-30T12:48:15Z IMPLEMENTATION OF REINFORCEMENT LEARNING BY QUANTUM COMPUTING APPROACH Wafa, Hani'ah Indonesia Theses reinforcement learning, quantum computing, Grover iteration, VQC, frozen lake INSTITUT TEKNOLOGI BANDUNG https://digilib.itb.ac.id/gdl/view/58053
institution Institut Teknologi Bandung
building Institut Teknologi Bandung Library
continent Asia
country Indonesia
content_provider Institut Teknologi Bandung
collection Digital ITB
language Indonesia
description Reinforcement learning works by trial and error, so it can take a long time to solve a problem. Unlike other types of machine learning, it also faces the challenge of balancing the trade-off between exploration and exploitation. Meanwhile, researchers have found that quantum computing can accelerate computation on various problems quadratically or even exponentially, and have therefore begun to apply it to machine learning, including reinforcement learning. A previous study by Dong, et al. (2008) solved reinforcement learning with a tabular method and an action selection scheme inspired by a quantum algorithm, namely Grover iteration. That study shows that quantum computing can balance the trade-off between exploration and exploitation; however, the tabular method makes the approach unscalable. Another study by Chen, et al. (2019) solved reinforcement learning with an approximation method using a Variational Quantum Circuit (VQC). That study, however, focuses mainly on using far fewer parameters and much less memory than classical reinforcement learning. To fill the gaps in both methods, this thesis combines the methods proposed by Dong, et al. (2008) and Chen, et al. (2019) with some modifications. It compares the performance of the proposed method with that of the method of Dong, et al. (2008), the method of Chen, et al. (2019), and classical reinforcement learning, namely the DQN algorithm as implemented by Stable Baselines. The comparison was carried out on the Frozen Lake environment from OpenAI Gym. On the 4x4 map, the Grover method performed best and the method proposed in this thesis performed second best.
On the larger 8x8 map, the proposed method generally gives the best performance; in other words, it is more scalable. On both the 4x4 and 8x8 maps, the VQC and classical RL methods generally perform worse than the proposed method. The test results also show that the proposed method succeeds in making the agent explore well. In terms of time, the Grover and classical RL methods require less time than the proposed method, but the proposed method takes less time than the VQC method. In terms of memory consumption or parameters, the Grover method requires storage for N values, where N is the number of states in the state space, and the classical RL method requires 64 x (N + 68) parameters, while the proposed method and the VQC method require only 3 log N parameters. However, the computation of the proposed method is slightly more complex than that of the VQC method because the proposed method combines the Grover and VQC methods and therefore requires more qubits.
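The parameter counts quoted in the abstract can be made concrete for the two Frozen Lake maps. The following is a minimal sketch, not code from the thesis: the function name `parameter_counts` is hypothetical, and the logarithm in "3 log N" is assumed to be base 2 (one qubit per bit of the state index), which the abstract does not state.

```python
import math


def parameter_counts(n_states: int) -> dict:
    """Return the storage/parameter counts quoted in the abstract.

    Assumption: "log" in "3 log N" is taken as base 2; the abstract
    does not specify the base.
    """
    return {
        # Grover (tabular) method: one stored value per state.
        "grover_tabular": n_states,
        # Classical RL (DQN) baseline: 64 x (N + 68) parameters.
        "classical_dqn": 64 * (n_states + 68),
        # VQC and proposed method: 3 log N parameters (base 2 assumed).
        "vqc_and_proposed": 3 * round(math.log2(n_states)),
    }


# Frozen Lake maps used in the comparison: 4x4 (N = 16) and 8x8 (N = 64).
for side in (4, 8):
    n = side * side
    print(f"{side}x{side} map (N={n}):", parameter_counts(n))
```

Under that base-2 assumption, moving from the 4x4 to the 8x8 map quadruples the tabular storage (16 to 64 values), while the VQC-style count grows only from 12 to 18 parameters.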
format Theses
author Wafa, Hani'ah
title IMPLEMENTATION OF REINFORCEMENT LEARNING BY QUANTUM COMPUTING APPROACH
url https://digilib.itb.ac.id/gdl/view/58053