‘The six pac-men’: exploring the strength of advice provision and the impact of an adversarial advisor in reinforcement learning

Bibliographic Details
Main Author: Arun, Rakshitha
Other Authors: Zinovi Rabinovich
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2021
Online Access: https://hdl.handle.net/10356/148063
Institution: Nanyang Technological University
Description
Summary: Reinforcement learning in multi-agent scenarios has been gaining popularity, and the student-teacher framework has been put forward as an efficient approach to advice provision. This project studies a single-student, multi-teacher setting in a game environment. A new game titled ‘Pac-Man Lite’ was implemented for experimentation and is used to study advice provision between the student and teacher agents. First, the project studies the ability of the student agent to aggregate advice from multiple teachers, each with only partial visibility of the environment. Subsequently, an attacker is introduced into the setting: an adversarial teacher advisor with a full view of the environment, whose goal differs slightly from that of the existing agents. The significant impact that adversarial advice can have on an agent’s performance is the major motivation behind this project. The effectiveness of adversarial advice in degrading the student agent’s performance is studied from the adversarial teacher agent’s perspective. The results indicate that the student agent is able to aggregate advice and extract value from relevant advisors when multiple sources are present. They also indicate that the adversarial agent succeeds in negatively impacting the student agent’s performance by participating in advice provision. In addition to contributing to existing research, this work lays the ground for future research in multiple directions.
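The summary does not say how the student combines advice, so the following is only an illustrative sketch of one common approach, not the method used in the project. It assumes a weighted vote over advised actions, with per-teacher trust weights adjusted multiplicatively from the reward the student observes. The action set, the teacher names, the functions `aggregate_advice` and `update_weights`, and the step size `eta` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical action set for a grid game; not taken from the project.
ACTIONS = ("up", "down", "left", "right")


def aggregate_advice(advice, weights):
    """Pick the action with the largest total trust weight.

    advice maps teacher name -> advised action, or None when that teacher
    has nothing to say (for example, the student is outside its partial view).
    Returns None when no teacher advises anything.
    """
    scores = defaultdict(float)
    for teacher, action in advice.items():
        if action is not None:
            scores[action] += weights[teacher]
    if not scores:
        return None
    # max() keeps the first maximum, so ties resolve in ACTIONS order.
    return max(ACTIONS, key=lambda a: scores.get(a, 0.0))


def update_weights(weights, advice, taken, reward, eta=0.5):
    """Multiplicative trust update (assumes 0 < eta < 1).

    Teachers whose advice matched the action taken gain trust after a
    positive reward and lose trust after a negative one; weights are then
    renormalised to sum to 1.
    """
    new = dict(weights)
    for teacher, action in advice.items():
        if action == taken and reward != 0:
            new[teacher] *= (1 + eta) if reward > 0 else (1 - eta)
    total = sum(new.values())
    return {t: w / total for t, w in new.items()}


if __name__ == "__main__":
    weights = {"t1": 1 / 3, "t2": 1 / 3, "adv": 1 / 3}
    # Two teachers agree; the (hypothetical) adversary disagrees.
    print(aggregate_advice({"t1": "up", "t2": "up", "adv": "down"}, weights))
    # Only the adversary advises, the student follows it and is penalised.
    advice = {"t1": None, "t2": None, "adv": "down"}
    taken = aggregate_advice(advice, weights)
    print(update_weights(weights, advice, taken, reward=-1))
```

Under this assumed scheme, an advisor whose advice repeatedly precedes negative rewards loses influence over later votes. That is one way a student could "extract value from relevant advisors", and it shows why an adversary with a full view of the environment might offer helpful advice selectively in order to keep its weight high.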