PARALLEL MONTE CARLO METHOD IN GRID WORLD (REINFORCEMENT LEARNING) USING CUDA DYNAMIC PARALLELISM
Main Author: | Socrates, Sandy |
---|---|
Format: | Theses |
Language: | Indonesia |
Online Access: | https://digilib.itb.ac.id/gdl/view/39712 |
Institution: | Institut Teknologi Bandung |
id |
id-itb.:39712 |
---|---|
spelling |
id-itb.:39712; 2019-06-27T14:25:39Z; Socrates, Sandy; Indonesia; Theses; HPC, NVIDIA CUDA, parallel programming, reinforcement learning; INSTITUT TEKNOLOGI BANDUNG; https://digilib.itb.ac.id/gdl/view/39712 |
institution |
Institut Teknologi Bandung |
building |
Institut Teknologi Bandung Library |
continent |
Asia |
country |
Indonesia |
content_provider |
Institut Teknologi Bandung |
collection |
Digital ITB |
language |
Indonesia |
description |
Parallel Monte Carlo methods for reinforcement learning have been shown to
accelerate the gain in agents' experience quality per episode as the number of
agents increases. Previous research has experimented with up to 16 parallel
agents. The rapid development of GPGPU, especially NVIDIA CUDA, has opened new
possibilities for using a larger number of parallel agents. However, this also
reveals a new problem: as the number of agents grows, so does the
experience-sharing load on each agent. In this research, we propose two
implementations using CUDA Dynamic Parallelism (CDP) to address this problem
in grid world. The two proposed solutions are asynchronous parallel Monte Carlo
and nested-asynchronous parallel Monte Carlo. The experiments showed that the
implemented solutions gave up to a 22% performance gain. However, as the number
of agents and episodes increases, the overhead caused by CDP kernel calls
overshadows the performance gained. |
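The idea of several agents feeding one shared experience table can be illustrated with a minimal sketch. This is not the thesis's CUDA or CDP implementation: it is a sequential Python stand-in, and the 4x4 grid, the reward scheme, the step cap, and every name (`step`, `run_episode`, `update_shared`, `train`) are assumptions made for illustration only.

```python
import random
from collections import defaultdict

SIZE = 4                                   # hypothetical 4x4 grid world
GOAL = (SIZE - 1, SIZE - 1)                # assumed goal in the bottom-right corner
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]


def step(state, action):
    """Move inside the grid; reward is -1 per step and 0 on reaching the goal."""
    row = min(max(state[0] + action[0], 0), SIZE - 1)
    col = min(max(state[1] + action[1], 0), SIZE - 1)
    nxt = (row, col)
    return nxt, (0.0 if nxt == GOAL else -1.0), nxt == GOAL


def run_episode(q, rng, epsilon=0.1, max_steps=50):
    """Collect one epsilon-greedy episode as [(state, action_index, reward)].

    Episodes are capped at max_steps (an assumption to keep the sketch bounded).
    """
    state, trajectory = (0, 0), []
    for _ in range(max_steps):
        if rng.random() < epsilon:
            a = rng.randrange(len(ACTIONS))
        else:
            a = max(range(len(ACTIONS)), key=lambda i: q[(state, i)])
        nxt, reward, done = step(state, ACTIONS[a])
        trajectory.append((state, a, reward))
        state = nxt
        if done:
            break
    return trajectory


def update_shared(q, counts, trajectory, gamma=0.9):
    """First-visit Monte Carlo update of the shared table (incremental mean)."""
    g = 0.0
    first_visit = {}
    for state, a, reward in reversed(trajectory):
        g = reward + gamma * g
        first_visit[(state, a)] = g        # overwritten until the earliest visit remains
    for key, ret in first_visit.items():
        counts[key] += 1
        q[key] += (ret - q[key]) / counts[key]


def train(num_agents=8, rounds=200, seed=0):
    """Each round, every agent plays one episode, then all experience is merged."""
    rng = random.Random(seed)
    q, counts = defaultdict(float), defaultdict(int)
    for _ in range(rounds):
        trajectories = [run_episode(q, rng) for _ in range(num_agents)]
        for trajectory in trajectories:    # the experience-sharing step
            update_shared(q, counts, trajectory)
    return q
```

In this sketch the merge loop in `train` performs one update pass per agent per round, so its cost grows with `num_agents`; that is the kind of sharing load the abstract refers to. The sketch makes no claim about how the thesis distributes this work across CUDA kernels.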
format |
Theses |
author |
Socrates, Sandy |
title |
PARALLEL MONTE CARLO METHOD IN GRID WORLD (REINFORCEMENT LEARNING) USING CUDA DYNAMIC PARALLELISM |