IMPLEMENTATION OF REINFORCEMENT LEARNING BY QUANTUM COMPUTING APPROACH
Reinforcement learning relies on trial and error, so it can take a long time to solve a problem. Unlike other types of machine learning, it also faces a particular challenge: the trade-off between exploration and e...
Main Author: | Wafa, Hani'ah |
---|---|
Format: | Theses |
Language: | Indonesia |
Online Access: | https://digilib.itb.ac.id/gdl/view/58053 |
Institution: | Institut Teknologi Bandung |
id |
id-itb.:58053 |
---|---|
spelling |
id-itb.:58053 2021-08-30T12:48:15Z IMPLEMENTATION OF REINFORCEMENT LEARNING BY QUANTUM COMPUTING APPROACH Wafa, Hani'ah Indonesia Theses reinforcement learning, quantum computing, Grover iteration, VQC, frozen lake INSTITUT TEKNOLOGI BANDUNG https://digilib.itb.ac.id/gdl/view/58053 text |
institution |
Institut Teknologi Bandung |
building |
Institut Teknologi Bandung Library |
continent |
Asia |
country |
Indonesia |
content_provider |
Institut Teknologi Bandung |
collection |
Digital ITB |
language |
Indonesia |
description |
Reinforcement learning relies on trial and error, so it can take a long time to
solve a problem. Unlike other types of machine learning, it also faces a
particular challenge: the trade-off between exploration and exploitation.
Meanwhile, researchers have found that quantum computing can accelerate
computation on various problems quadratically or even exponentially. This has
led researchers to apply quantum computing to machine learning, including
reinforcement learning.
A previous study by Dong, et al. (2008) solved reinforcement learning with a
tabular method and an action selection scheme inspired by a quantum algorithm,
namely Grover iteration. That study shows that quantum computing can balance
the trade-off between exploration and exploitation. However, the tabular method
makes the approach unscalable. Another study by Chen, et al. (2019) solved
reinforcement learning with an approximation method based on a Variational
Quantum Circuit (VQC). That study, however, focuses mainly on using far fewer
parameters and much less memory than classical reinforcement learning. To fill
the gaps left by each of the two methods, this thesis combines the methods
proposed by Dong, et al. (2008) and Chen, et al. (2019), with some
modifications.
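The Grover-inspired action selection mentioned above can be illustrated with a small classical simulation. This is only a minimal sketch, not the implementation used in the thesis or in Dong, et al. (2008): the function names `grover_iteration` and `action_probabilities`, and the choice of which action is "marked", are hypothetical. It assumes a uniform initial superposition over actions and shows how each Grover iteration (a phase flip on the marked action followed by inversion about the mean) shifts probability toward that action.

```python
import numpy as np

def grover_iteration(amplitudes, marked):
    """Apply one Grover iteration to a real amplitude vector.

    Step 1 (oracle): flip the sign of the marked action's amplitude.
    Step 2 (diffusion): reflect every amplitude about the mean.
    """
    amps = np.asarray(amplitudes, dtype=float).copy()
    amps[marked] *= -1.0
    return 2.0 * amps.mean() - amps

def action_probabilities(n_actions, marked, n_iterations):
    """Measurement probabilities after n_iterations, starting from a uniform state.

    Hypothetical helper: how the marked action and the iteration count are
    derived from rewards or value estimates is outside this sketch.
    """
    amps = np.full(n_actions, 1.0 / np.sqrt(n_actions))
    for _ in range(n_iterations):
        amps = grover_iteration(amps, marked)
    return amps ** 2

if __name__ == "__main__":
    # Zero iterations leave the uniform distribution (pure exploration);
    # more iterations move probability toward the marked action.
    for k in range(3):
        print(k, np.round(action_probabilities(8, marked=3, n_iterations=k), 4))
```

With zero iterations every action stays equally likely, while additional iterations concentrate probability on the marked action, which is one way to picture the balance between exploration and exploitation. Too many iterations rotate past the marked action, so the iteration count has to be bounded in practice.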
This thesis compares the performance of the proposed method with that of the
method proposed by Dong, et al. (2008), the method proposed by Chen, et al.
(2019), and classical reinforcement learning, namely the DQN algorithm as
implemented in Stable Baselines. The comparison was carried out on the Frozen
Lake environment from OpenAI Gym. On the 4x4 map, the Grover method gave the
best performance and the proposed method gave the second best. On the larger
8x8 map, the proposed method generally gave the best performance; in other
words, it is more scalable. On both the 4x4 and 8x8 maps, the VQC and classical
RL methods generally performed worse than the proposed method. The test results
also show that the proposed method succeeds in making the agent explore well.
In terms of time, the Grover and classical RL methods require less time than
the proposed method, while the proposed method requires less time than the VQC
method. In terms of memory consumption or parameters, the Grover method needs
storage for N values, where N is the number of states in the state space, and
the classical RL method requires 64 x (N + 68) parameters. The proposed method
and the VQC method, by contrast, require only 3 log N parameters. However, the
computation of the proposed method is slightly more complex than that of the
VQC method, because the proposed method combines the Grover method with the VQC
method and therefore requires more qubits.
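The parameter counts quoted above can be worked through for the two map sizes used in the comparison. The sketch below is illustrative only: the function name `parameter_counts` is hypothetical, and the logarithm is assumed to be base 2 (the abstract does not state the base; base 2 is a guess motivated by log2 N being the number of qubits needed to encode N states).

```python
import math

def parameter_counts(n_states):
    """Evaluate the storage formulas quoted in the abstract for N = n_states.

    Assumption: "log" is taken as log base 2; the abstract does not specify it.
    """
    return {
        "grover_values": n_states,                      # N stored values
        "classical_rl_parameters": 64 * (n_states + 68),
        "vqc_or_proposed_parameters": 3 * math.log2(n_states),
    }

if __name__ == "__main__":
    for n in (16, 64):  # 4x4 and 8x8 Frozen Lake maps
        print(n, parameter_counts(n))
```

Under that assumption, the 4x4 map (N = 16) gives 16 values for the Grover method, 5376 parameters for classical RL, and 12 for the VQC and proposed methods; the 8x8 map (N = 64) gives 64, 8448, and 18 respectively.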
|
format |
Theses |
author |
Wafa, Hani'ah |
title |
IMPLEMENTATION OF REINFORCEMENT LEARNING BY QUANTUM COMPUTING APPROACH |
url |
https://digilib.itb.ac.id/gdl/view/58053 |
_version_ |
1822275106158673920 |