Learning expensive coordination: An event-based deep RL approach

Existing works in deep Multi-Agent Reinforcement Learning (MARL) mainly focus on coordinating cooperative agents to complete certain tasks jointly. However, in many real-world cases, agents are self-interested, such as employees in a company or clubs in a league. Therefore, the leader, i.e., the manager of the company or the league, needs to provide bonuses to followers for efficient coordination, which we call expensive coordination. The main difficulties of expensive coordination are that i) the leader has to consider the long-term effect and predict the followers’ behaviors when assigning bonuses, and ii) the complex interactions between followers make the training process hard to converge, especially when the leader’s policy changes over time. In this work, we address this problem through an event-based deep RL approach. Our main contributions are threefold. (1) We model the leader’s decision-making process as a semi-Markov Decision Process and propose a novel multi-agent event-based policy gradient to learn the leader’s long-term policy. (2) We exploit the leader-follower consistency scheme to design a follower-aware module and a follower-specific attention module to predict the followers’ behaviors and respond accurately to them. (3) We propose an action abstraction-based policy gradient algorithm to reduce the followers’ decision space and thus accelerate the training process of followers. Experiments in resource collections, navigation, and the predator-prey game show that our approach dramatically outperforms state-of-the-art methods.
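
For readers who want a concrete picture of the event-based, semi-MDP formulation mentioned in the abstract, the short sketch below is a minimal illustration and not the authors' implementation: it assumes a toy setting in which a leader occasionally grants a bonus (the "event"), waits a random sojourn time, and updates a tabular softmax policy with a REINFORCE-style gradient discounted by the real elapsed time. The environment, the followers' response model, and every hyperparameter here are invented purely for illustration.

```python
# Minimal sketch (not the paper's code) of an event-based, semi-MDP policy gradient
# for a leader that grants bonuses to self-interested followers.
import numpy as np

rng = np.random.default_rng(0)
N_STATES, N_BONUS_LEVELS, GAMMA, LR = 5, 3, 0.95, 0.1
theta = np.zeros((N_STATES, N_BONUS_LEVELS))  # leader's tabular softmax policy over bonus levels


def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()


def rollout(theta, horizon=40):
    """One episode; the leader acts only at 'events' (moments it grants a bonus)."""
    s, t = int(rng.integers(N_STATES)), 0
    events, rewards = [], []                               # (state, bonus, event time) / (reward, reward time)
    while t < horizon:
        probs = softmax(theta[s])
        bonus = int(rng.choice(N_BONUS_LEVELS, p=probs))   # leader's event: chosen bonus level
        tau = int(rng.integers(1, 4))                      # sojourn time until the next event (semi-MDP)
        # Assumed follower response: a larger bonus makes useful work more likely,
        # but paying the bonus is a cost to the leader.
        work = rng.random() < 0.3 + 0.2 * bonus
        r = (2.0 if work else 0.0) - 0.5 * bonus
        events.append((s, bonus, t))
        rewards.append((r, t))
        s = int(rng.integers(N_STATES))                    # abstract, follower-driven state change
        t += tau
    return events, rewards


def update(theta, events, rewards):
    """REINFORCE over events, discounting returns by the real elapsed time between events."""
    for s, a, t_e in events:
        G = sum(r * GAMMA ** (t_r - t_e) for r, t_r in rewards if t_r >= t_e)
        probs = softmax(theta[s])
        grad_log_pi = -probs
        grad_log_pi[a] += 1.0                              # gradient of log softmax at the chosen bonus
        theta[s] += LR * (GAMMA ** t_e) * G * grad_log_pi
    return theta


for _ in range(200):
    ev, rw = rollout(theta)
    theta = update(theta, ev, rw)

print("leader's bonus distribution per state:")
print(np.round(np.array([softmax(row) for row in theta]), 2))
```

The per-event discounting by elapsed time is what distinguishes this semi-MDP treatment from a step-by-step MDP update; the paper additionally learns follower-aware and follower-specific attention modules and an action-abstracted follower policy, all of which this sketch omits.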

Bibliographic Details
Main Authors: YU, Runsheng; WANG, Xinrun; WANG, Rundong; ZHANG, Youzhi; AN, Bo; SHI, Zhen Yu; LAI, Hanjiang
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University, 2020
Subjects: Artificial Intelligence and Robotics
Online Access: https://ink.library.smu.edu.sg/sis_research/9147
https://ink.library.smu.edu.sg/context/sis_research/article/10150/viewcontent/108_learning_expensive_coordination_av.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-10150
record_format dspace
spelling sg-smu-ink.sis_research-10150 2024-08-01T09:18:46Z
Learning expensive coordination: An event-based deep RL approach
YU, Runsheng; WANG, Xinrun; WANG, Rundong; ZHANG, Youzhi; AN, Bo; SHI, Zhen Yu; LAI, Hanjiang
Existing works in deep Multi-Agent Reinforcement Learning (MARL) mainly focus on coordinating cooperative agents to complete certain tasks jointly. However, in many real-world cases, agents are self-interested, such as employees in a company or clubs in a league. Therefore, the leader, i.e., the manager of the company or the league, needs to provide bonuses to followers for efficient coordination, which we call expensive coordination. The main difficulties of expensive coordination are that i) the leader has to consider the long-term effect and predict the followers’ behaviors when assigning bonuses, and ii) the complex interactions between followers make the training process hard to converge, especially when the leader’s policy changes over time. In this work, we address this problem through an event-based deep RL approach. Our main contributions are threefold. (1) We model the leader’s decision-making process as a semi-Markov Decision Process and propose a novel multi-agent event-based policy gradient to learn the leader’s long-term policy. (2) We exploit the leader-follower consistency scheme to design a follower-aware module and a follower-specific attention module to predict the followers’ behaviors and respond accurately to them. (3) We propose an action abstraction-based policy gradient algorithm to reduce the followers’ decision space and thus accelerate the training process of followers. Experiments in resource collections, navigation, and the predator-prey game show that our approach dramatically outperforms state-of-the-art methods.
2020-05-01T07:00:00Z text application/pdf
https://ink.library.smu.edu.sg/sis_research/9147
https://ink.library.smu.edu.sg/context/sis_research/article/10150/viewcontent/108_learning_expensive_coordination_av.pdf
http://creativecommons.org/licenses/by-nc-nd/4.0/
Research Collection School Of Computing and Information Systems
eng
Institutional Knowledge at Singapore Management University
Artificial Intelligence and Robotics
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Artificial Intelligence and Robotics
description Existing works in deep Multi-Agent Reinforcement Learning (MARL) mainly focus on coordinating cooperative agents to complete certain tasks jointly. However, in many real-world cases, agents are self-interested, such as employees in a company or clubs in a league. Therefore, the leader, i.e., the manager of the company or the league, needs to provide bonuses to followers for efficient coordination, which we call expensive coordination. The main difficulties of expensive coordination are that i) the leader has to consider the long-term effect and predict the followers’ behaviors when assigning bonuses, and ii) the complex interactions between followers make the training process hard to converge, especially when the leader’s policy changes over time. In this work, we address this problem through an event-based deep RL approach. Our main contributions are threefold. (1) We model the leader’s decision-making process as a semi-Markov Decision Process and propose a novel multi-agent event-based policy gradient to learn the leader’s long-term policy. (2) We exploit the leader-follower consistency scheme to design a follower-aware module and a follower-specific attention module to predict the followers’ behaviors and respond accurately to them. (3) We propose an action abstraction-based policy gradient algorithm to reduce the followers’ decision space and thus accelerate the training process of followers. Experiments in resource collections, navigation, and the predator-prey game show that our approach dramatically outperforms state-of-the-art methods.
format text
author YU, Runsheng
WANG, Xinrun
WANG, Rundong
ZHANG, Youzhi
AN, Bo
SHI, Zhen Yu
LAI, Hanjiang
author_sort YU, Runsheng
title Learning expensive coordination: An event-based deep RL approach
title_sort learning expensive coordination: an event-based deep rl approach
publisher Institutional Knowledge at Singapore Management University
publishDate 2020
url https://ink.library.smu.edu.sg/sis_research/9147
https://ink.library.smu.edu.sg/context/sis_research/article/10150/viewcontent/108_learning_expensive_coordination_av.pdf
_version_ 1814047755700535296