Data-efficient multi-agent reinforcement learning

Bibliographic Details
Main Author: Wong, Reuben Yuh Sheng
Other Authors: Bo An
Format: Final Year Project
Language: English
Published: Nanyang Technological University, 2022
Subjects:
Online Access: https://hdl.handle.net/10356/163136
Description
Abstract: Given the great success of Reinforcement Learning across a suite of single-agent environments, it is natural to consider its application to environments that more closely mimic the real world. One such class is decentralised multi-agent environments, which mirror the many independent agents in the real world, each pursuing its own goals. The decentralisation of state information, together with the constraints that local observability imposes on agent behaviour, makes this a challenging problem domain. Fortunately, a handful of powerful algorithms already operate in the co-operative multi-agent space, such as QMIX, which enforces that the joint-action value is monotonic in the per-agent values, allowing the joint-action value to be maximised in linear time during off-policy learning. This work, however, explores a tangent to multi-agent reinforcement learning: the possibility of learning from the environment using fewer samples. We examine multiple approaches in this space, ranging from injecting new learning signals to learning better representations of the state space. Because of its greater potential for application to more learning algorithms, we then take a deeper dive into algorithms based on representation learning.
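
To make the monotonicity property mentioned in the abstract concrete, the standard QMIX condition can be sketched as follows (the notation below is a conventional illustration, not drawn from the thesis itself): for per-agent utilities Q_a and a joint action-value Q_tot,

\frac{\partial Q_{tot}(\boldsymbol{\tau}, \mathbf{u})}{\partial Q_a(\tau^a, u^a)} \ge 0, \qquad \forall a \in \{1, \dots, n\},

which guarantees that the greedy joint action decomposes into independent per-agent maximisations,

\arg\max_{\mathbf{u}} Q_{tot}(\boldsymbol{\tau}, \mathbf{u}) = \Big( \arg\max_{u^1} Q_1(\tau^1, u^1), \; \dots, \; \arg\max_{u^n} Q_n(\tau^n, u^n) \Big),

so maximising the joint-action value costs time linear in the number of agents rather than exponential in the size of the joint-action space.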