PARALLEL MONTE CARLO METHOD IN GRID WORLD (REINFORCEMENT LEARNING) USING CUDA DYNAMIC PARALLELISM

The parallel Monte Carlo method for reinforcement learning problems has been shown to accelerate the gain in agents' experience quality per episode as the number of agents increases. Previous research has tested this with up to 16 parallel agents. The rapid development of GPGPU, especiall...

Bibliographic Details
Main Author:	Socrates, Sandy
Format:	Theses
Language:	Indonesia
Online Access:	https://digilib.itb.ac.id/gdl/view/39712
Institution:	Institut Teknologi Bandung

id	id-itb.:39712
spelling	id-itb.:39712 2019-06-27T14:25:39Z PARALLEL MONTE CARLO METHOD IN GRID WORLD (REINFORCEMENT LEARNING) USING CUDA DYNAMIC PARALLELISM Socrates, Sandy Indonesia Theses HPC, NVIDIA CUDA, parallel programming, reinforcement learning INSTITUT TEKNOLOGI BANDUNG https://digilib.itb.ac.id/gdl/view/39712 text
institution	Institut Teknologi Bandung
building	Institut Teknologi Bandung Library
continent	Asia
country	Indonesia
content_provider	Institut Teknologi Bandung
collection	Digital ITB
language	Indonesia
description	The parallel Monte Carlo method for reinforcement learning problems has been shown to accelerate the gain in agents' experience quality per episode as the number of agents increases. Previous research has tested this with up to 16 parallel agents. The rapid development of GPGPU, especially NVIDIA CUDA, has opened new possibilities for using a larger number of parallel agents. However, this also reveals a new problem: as the number of agents increases, so does the experience-sharing load for each agent. In this research, we propose two implementations using CUDA Dynamic Parallelism (CDP) to address this problem in grid world. The two proposed solutions are asynchronous parallel Monte Carlo and nested-asynchronous parallel Monte Carlo. The experiments showed that the implemented solutions gave up to a 22% performance gain. However, as the number of agents and episodes increases, the overhead caused by CDP kernel calls overshadows the performance gained.
format	Theses
author	Socrates, Sandy
title	PARALLEL MONTE CARLO METHOD IN GRID WORLD (REINFORCEMENT LEARNING) USING CUDA DYNAMIC PARALLELISM
url	https://digilib.itb.ac.id/gdl/view/39712
_version_	1822925376269058048
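
The description in this record names the technique but gives no algorithmic detail, so the following is only a minimal sketch of the general idea: several agents run Monte Carlo episodes in a grid world and then pool their returns into one shared value table. It is not the thesis implementation and uses neither CUDA nor CDP; the agents run one after another in plain Python as a stand-in for GPU threads. The grid size, the reward of -1 per step, the goal in the bottom-right cell, the epsilon-greedy policy, and every function and parameter name are assumptions made for illustration.

```python
import random
from collections import defaultdict

ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action, size):
    """Move inside a size x size grid (assumed layout: goal in the bottom-right cell)."""
    row = min(max(state[0] + action[0], 0), size - 1)
    col = min(max(state[1] + action[1], 0), size - 1)
    nxt = (row, col)
    return nxt, -1.0, nxt == (size - 1, size - 1)  # assumed reward: -1 per step


def run_episode(q, size, epsilon, rng, max_steps=200):
    """Run one epsilon-greedy episode from the top-left cell and record it."""
    state, trajectory = (0, 0), []
    for _ in range(max_steps):
        if rng.random() < epsilon:
            a = rng.randrange(len(ACTIONS))
        else:
            a = max(range(len(ACTIONS)), key=lambda i: q[(state, i)])
        nxt, reward, done = step(state, ACTIONS[a], size)
        trajectory.append((state, a, reward))
        state = nxt
        if done:
            break
    return trajectory


def first_visit_returns(trajectory, gamma):
    """Map each (state, action) pair to the discounted return after its first visit."""
    g, out = 0.0, {}
    for state, a, reward in reversed(trajectory):
        g = reward + gamma * g
        out[(state, a)] = g  # walking backwards, the earliest visit is written last
    return out


def parallel_monte_carlo(n_agents=8, n_rounds=200, size=4,
                         gamma=0.9, epsilon=0.2, seed=0):
    """Hypothetical driver: n_agents episodes per round, then one sharing step."""
    rng = random.Random(seed)
    q = defaultdict(float)     # shared action-value table
    counts = defaultdict(int)  # number of returns averaged into each entry
    for _ in range(n_rounds):
        # Every agent acts on the same table; q is not updated until all are done.
        batches = [first_visit_returns(run_episode(q, size, epsilon, rng), gamma)
                   for _ in range(n_agents)]
        # Experience sharing: fold every agent's returns into the shared table.
        for returns in batches:
            for key, g in returns.items():
                counts[key] += 1
                q[key] += (g - q[key]) / counts[key]
    return q, counts


if __name__ == "__main__":
    q, counts = parallel_monte_carlo()
    print(len(counts), "state-action pairs visited")
```

In this sketch the sharing step does work proportional to the number of agents in every round, which is the kind of load the description says grows as agents are added.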

Similar Items