HogRider: Champion agent of Microsoft Malmo collaborative AI challenge

It has been an open challenge for self-interested agents to make optimal sequential decisions in complex multiagent systems, where agents might achieve higher utility via collaboration. The Microsoft Malmo Collaborative AI Challenge (MCAC), which is designed to encourage research relating to various...

Bibliographic Details
Main Authors: Xiong, Yanhai, Chen, Haipeng, Zhao, Mengchen, An, Bo
Other Authors: Interdisciplinary Graduate School (IGS)
Format: Conference or Workshop Item
Language: English
Published: 2018
Subjects: Opponent Modeling; Multiagent Learning
Online Access: https://hdl.handle.net/10356/87236
http://hdl.handle.net/10220/44897
https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/view/16385
Institution: Nanyang Technological University
id sg-ntu-dr.10356-87236
record_format dspace
spelling sg-ntu-dr.10356-87236 2020-11-01T04:43:19Z HogRider: Champion agent of Microsoft Malmo collaborative AI challenge Xiong, Yanhai Chen, Haipeng Zhao, Mengchen An, Bo Interdisciplinary Graduate School (IGS) School of Computer Science and Engineering The Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18) Opponent Modeling Multiagent Learning It has been an open challenge for self-interested agents to make optimal sequential decisions in complex multiagent systems, where agents might achieve higher utility via collaboration. The Microsoft Malmo Collaborative AI Challenge (MCAC), which is designed to encourage research relating to various problems in Collaborative AI, takes the form of a Minecraft mini-game in which players may work together to catch a pig or deviate from cooperation in pursuit of high scores to win the challenge. Various characteristics, such as complex interactions among agents, uncertainties, sequential decision making, and limited learning trials, all make it extremely challenging to find effective strategies. We present HogRider, the champion agent of MCAC in 2017 out of 81 teams from 26 countries. One key innovation of HogRider is a generalized agent type hypothesis framework to identify the behavior model of the other agents, which is demonstrated to be robust to observation uncertainty. On top of that, a second key innovation is a novel Q-learning approach to learn effective policies against each type of collaborating agent. Various ideas are proposed to adapt traditional Q-learning to handle complexities in the challenge, including state-action abstraction to reduce problem scale, a warm start approach using human reasoning for addressing limited learning trials, and an active greedy strategy to balance exploitation and exploration. Challenge results show that HogRider outperforms all the other teams by a significant margin, in terms of both optimality and stability.
NRF (Natl Research Foundation, S’pore) Accepted version 2018-05-25T05:15:24Z 2019-12-06T16:37:51Z 2018-05-25T05:15:24Z 2019-12-06T16:37:51Z 2018 Conference Paper Xiong, Y., Chen, H., Zhao, M., & An, B. (2018). HogRider: Champion agent of Microsoft Malmo collaborative AI challenge. The Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18), 4767-4774. https://hdl.handle.net/10356/87236 http://hdl.handle.net/10220/44897 https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/view/16385 en © 2018 Association for the Advancement of Artificial Intelligence. This is the author created version of a work that has been peer reviewed and accepted for publication by The Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18), Association for the Advancement of Artificial Intelligence. It incorporates referee’s comments but changes resulting from the publishing process, such as copyediting, structural formatting, may not be reflected in this document. The published version is available at: [https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/view/16385]. 8 p. application/pdf
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Opponent Modeling
Multiagent Learning
description It has been an open challenge for self-interested agents to make optimal sequential decisions in complex multiagent systems, where agents might achieve higher utility via collaboration. The Microsoft Malmo Collaborative AI Challenge (MCAC), which is designed to encourage research relating to various problems in Collaborative AI, takes the form of a Minecraft mini-game in which players may work together to catch a pig or deviate from cooperation in pursuit of high scores to win the challenge. Various characteristics, such as complex interactions among agents, uncertainties, sequential decision making, and limited learning trials, all make it extremely challenging to find effective strategies. We present HogRider, the champion agent of MCAC in 2017 out of 81 teams from 26 countries. One key innovation of HogRider is a generalized agent type hypothesis framework to identify the behavior model of the other agents, which is demonstrated to be robust to observation uncertainty. On top of that, a second key innovation is a novel Q-learning approach to learn effective policies against each type of collaborating agent. Various ideas are proposed to adapt traditional Q-learning to handle complexities in the challenge, including state-action abstraction to reduce problem scale, a warm start approach using human reasoning for addressing limited learning trials, and an active greedy strategy to balance exploitation and exploration. Challenge results show that HogRider outperforms all the other teams by a significant margin, in terms of both optimality and stability.
author2 Interdisciplinary Graduate School (IGS)
format Conference or Workshop Item
author Xiong, Yanhai
Chen, Haipeng
Zhao, Mengchen
An, Bo
author_sort Xiong, Yanhai
title HogRider: Champion agent of Microsoft Malmo collaborative AI challenge
publishDate 2018
url https://hdl.handle.net/10356/87236
http://hdl.handle.net/10220/44897
https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/view/16385
_version_ 1683493492779646976