IMPLEMENTATION OF DEEP REINFORCEMENT LEARNING IN SOCCER SIMULATION 2D GAMES

Reinforcement learning is a branch of machine learning in which agents learn to choose the best action for a given state of their environment. Deep learning helps reinforcement learning represent large state spaces. By using deep reinforcement learning agents can play in their e...

Bibliographic Details
Main Author:	Adi Kuncoro, Azis
Format:	Final Project
Language:	Indonesia
Online Access:	https://digilib.itb.ac.id/gdl/view/40100
Institution:	Institut Teknologi Bandung

id	id-itb.:40100
spelling	id-itb.:40100; 2019-07-01T08:29:13Z; IMPLEMENTATION OF DEEP REINFORCEMENT LEARNING IN SOCCER SIMULATION 2D GAMES; Adi Kuncoro, Azis; Indonesia; Final Project; reinforcement learning, deep learning, multi agent, soccer simulation; INSTITUT TEKNOLOGI BANDUNG; https://digilib.itb.ac.id/gdl/view/40100; text
institution	Institut Teknologi Bandung
building	Institut Teknologi Bandung Library
continent	Asia
country	Indonesia
content_provider	Institut Teknologi Bandung
collection	Digital ITB
language	Indonesia
description	Reinforcement learning is a branch of machine learning in which agents learn to choose the best action for a given state of their environment. Deep learning helps reinforcement learning represent large state spaces. With deep reinforcement learning, agents can play in their environment without prior knowledge. Soccer Simulation 2D is a game environment that simulates soccer matches, and Half Field Offense (HFO) is one environment derived from it. HFO offers features that support reinforcement learning: episodic learning, a choice between high-level and low-level action and state spaces, hand-coded and random agents that serve as baselines, and interfaces in Python and C++. This final project uses the advantage actor-critic (A2C) method. The implementation has two deep neural networks, an actor network and a critic network. The actor network selects the agent's actions: it takes the HFO game state at a timestep as input and outputs the code of a discrete action. The critic network assesses how good the selected action is given the state: it takes the state and the action chosen by the agent as input and outputs an evaluation value for taking that action in that state. Two types of agents are trained, attacking agents and defending agents. The chosen game scenario is 5 vs 5, following futsal, which uses that number of players per side. Each agent has its own A2C model, and the agents learn their coordination strategy during the learning phase. Agents are trained for 10,000 epochs against hand-coded agents. The results show that A2C outperforms the random-agent baseline but remains slightly below the performance of the hand-coded agents.
format	Final Project
author	Adi Kuncoro, Azis
title	IMPLEMENTATION OF DEEP REINFORCEMENT LEARNING IN SOCCER SIMULATION 2D GAMES
url	https://digilib.itb.ac.id/gdl/view/40100
_version_	1822925634101313536
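
The actor and critic roles described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: single linear layers stand in for the deep networks, and every name (`Actor`, `Critic`, `advantage`), the state size of 4, and the 3 discrete actions are hypothetical choices made for illustration. Following the abstract, the critic scores a (state, action) pair; the advantage is then taken as that score minus its average under the actor's policy, which is one common way to form an advantage and is an assumption here.

```python
import math
import random


def softmax(logits):
    """Turn raw scores into a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def one_hot(index, size):
    """Encode a discrete action code as a 0/1 vector."""
    return [1.0 if i == index else 0.0 for i in range(size)]


class Actor:
    """Maps a state to a distribution over discrete actions (linear stand-in)."""

    def __init__(self, state_dim, n_actions, rng):
        self.w = [[rng.uniform(-0.1, 0.1) for _ in range(state_dim)]
                  for _ in range(n_actions)]
        self.b = [0.0] * n_actions

    def policy(self, state):
        logits = [sum(w * s for w, s in zip(row, state)) + b
                  for row, b in zip(self.w, self.b)]
        return softmax(logits)

    def act(self, state, rng):
        # Sample an action code according to the policy probabilities.
        probs = self.policy(state)
        return rng.choices(range(len(probs)), weights=probs, k=1)[0]


class Critic:
    """Scores a (state, action) pair with a single value (linear stand-in)."""

    def __init__(self, state_dim, n_actions, rng):
        self.n_actions = n_actions
        self.w = [rng.uniform(-0.1, 0.1) for _ in range(state_dim + n_actions)]
        self.b = 0.0

    def value(self, state, action):
        x = list(state) + one_hot(action, self.n_actions)
        return sum(w * xi for w, xi in zip(self.w, x)) + self.b


def advantage(actor, critic, state, action):
    """Critic score of the chosen action minus the policy-weighted average score."""
    probs = actor.policy(state)
    baseline = sum(p * critic.value(state, a) for a, p in enumerate(probs))
    return critic.value(state, action) - baseline


# Hypothetical usage with a made-up 4-feature state and 3 action codes.
rng = random.Random(0)
state = [0.2, -0.5, 0.9, 0.1]
actor = Actor(state_dim=4, n_actions=3, rng=rng)
critic = Critic(state_dim=4, n_actions=3, rng=rng)
action = actor.act(state, rng)
print("action code:", action)
print("advantage:", advantage(actor, critic, state, action))
```

In a training loop, a positive advantage would raise the probability of the chosen action and a negative one would lower it, while the critic is fitted to observed returns. Those update steps are omitted here because the abstract does not describe them.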

