IMPLEMENTATION OF DEEP REINFORCEMENT LEARNING IN SOCCER SIMULATION 2D GAMES

Reinforcement learning is a branch of machine learning in which an agent learns to choose the best action for each state of its environment. Deep learning helps reinforcement learning represent large state spaces. With deep reinforcement learning, agents can play in their e...

Bibliographic Details
Main Author: Adi Kuncoro, Azis
Format: Final Project
Language: Indonesia
Online Access: https://digilib.itb.ac.id/gdl/view/40100
Institution: Institut Teknologi Bandung
id id-itb.:40100
spelling id-itb.:40100
timestamp 2019-07-01T08:29:13Z
keywords reinforcement learning, deep learning, multi agent, soccer simulation
building Institut Teknologi Bandung Library
continent Asia
country Indonesia
content_provider Institut Teknologi Bandung
collection Digital ITB
description Reinforcement learning is a branch of machine learning in which an agent learns to choose the best action for each state of its environment. Deep learning helps reinforcement learning represent large state spaces. With deep reinforcement learning, agents can learn to play in their environment without prior knowledge. Soccer Simulation 2D is a game environment that simulates soccer matches. One environment derived from it is Half Field Offense (HFO). HFO provides features that support reinforcement learning: episodic learning, a choice between high-level and low-level action and state spaces, hand-coded and random agents as baselines, and interfaces in Python and C++. This final project uses the advantage actor-critic (A2C) method. In this implementation, A2C consists of two deep neural networks, an actor network and a critic network. The actor network selects actions for the agent: it receives the state of the HFO game at a timestep as input and outputs the code of a discrete action. The critic network assesses how good the selected action is given the state: it receives the state and the action chosen by the agent as input and outputs an evaluation value for taking that action in that state. Two types of agents are trained, attacking agents and defending agents. The chosen game scenario is 5 vs 5, modeled on futsal, which is played with five players per side. Each agent has its own A2C model. The agents learn how to coordinate with one another during the learning phase. Training runs for 10,000 epochs against hand-coded agents. The results show that A2C outperforms the random-agent baseline, but its performance remains slightly below that of the hand-coded agents.
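The actor and critic roles described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes linear models in place of deep neural networks, a hypothetical 4-feature state, 3 hypothetical discrete actions, and single-step episodes in which the reward itself is the critic's target. Following the abstract, the critic scores a (state, action) pair, and the advantage is taken as that score minus the policy's expected score in the same state.

```python
import math
import random

N_FEATURES = 4  # hypothetical state size; not taken from the thesis
N_ACTIONS = 3   # hypothetical number of discrete actions


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


class Actor:
    """Linear policy (assumption): state -> probability of each action."""

    def __init__(self, rng):
        self.w = [[rng.uniform(-0.1, 0.1) for _ in range(N_FEATURES)]
                  for _ in range(N_ACTIONS)]

    def probs(self, state):
        logits = [sum(w * s for w, s in zip(row, state)) for row in self.w]
        return softmax(logits)

    def update(self, state, action, adv, lr):
        # Gradient of log pi(a|s) w.r.t. logit k is 1[k == a] - pi(k|s);
        # scale it by the advantage and take one ascent step.
        p = self.probs(state)
        for k in range(N_ACTIONS):
            coeff = ((1.0 if k == action else 0.0) - p[k]) * adv
            for j in range(N_FEATURES):
                self.w[k][j] += lr * coeff * state[j]


class Critic:
    """Linear critic (assumption): (state, action) -> evaluation value."""

    def __init__(self):
        self.w = [[0.0] * N_FEATURES for _ in range(N_ACTIONS)]

    def value(self, state, action):
        return sum(w * s for w, s in zip(self.w[action], state))

    def update(self, state, action, target, lr):
        # One squared-error gradient step toward the observed target.
        error = target - self.value(state, action)
        for j in range(N_FEATURES):
            self.w[action][j] += lr * error * state[j]


def advantage(actor, critic, state, action):
    """Value of the chosen action minus the policy's expected value."""
    p = actor.probs(state)
    baseline = sum(p[k] * critic.value(state, k) for k in range(N_ACTIONS))
    return critic.value(state, action) - baseline


def train_step(actor, critic, state, action, reward, lr=0.1):
    """Update critic, then actor, on one single-step transition."""
    critic.update(state, action, reward, lr)
    adv = advantage(actor, critic, state, action)
    actor.update(state, action, adv, lr)
    return adv
```

Calling `train_step` repeatedly with a reward of 1 for one action and 0 for the others shifts the actor's probability mass toward the rewarded action, while the critic's estimate for that action approaches 1. In a multi-step setting such as HFO episodes, the critic's target would normally be a bootstrapped return rather than the raw reward.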