Multiagent reinforcement learning with graphical mutual information maximization
Communication learning is an important research direction in the multiagent reinforcement learning (MARL) domain. Graph neural networks (GNNs) can aggregate the information of neighbor nodes for representation learning. In recent years, several MARL methods leverage GNN to model information interact...
Saved in:
Main Authors: | , , , , , |
---|---|
Other Authors: | |
Format: | Article |
Language: | English |
Published: |
2023
|
Subjects: | |
Online Access: | https://hdl.handle.net/10356/170576 |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Institution: | Nanyang Technological University |
Language: | English |
Summary: | Communication learning is an important research direction in the multiagent reinforcement learning (MARL) domain. Graph neural networks (GNNs) can aggregate the information of neighbor nodes for representation learning. In recent years, several MARL methods leverage GNN to model information interactions between agents to coordinate actions and complete cooperative tasks. However, simply aggregating the information of neighboring agents through GNNs may not extract enough useful information, and the topological relationship information is ignored. To tackle this difficulty, we investigate how to efficiently extract and utilize the rich information of neighbor agents as much as possible in the graph structure, so as to obtain high-quality expressive feature representation to complete the cooperation task. To this end, we present a novel GNN-based MARL method with graphical mutual information (MI) maximization to maximize the correlation between input feature information of neighbor agents and output high-level hidden feature representations. The proposed method extends the traditional idea of MI optimization from graph domain to multiagent system, in which the MI is measured from two aspects: agent features information and agent topological relationships. The proposed method is agnostic to specific MARL methods and can be flexibly integrated with various value function decomposition methods. Considerable experiments on various benchmarks demonstrate that the performance of our proposed method is superior to the existing MARL methods. |
---|