Knowledge Transfer for Deep Reinforcement Learning with Hierarchical Experience Replay

The process for transferring knowledge of multiple reinforcement learning policies into a single multi-task policy via distillation technique is known as policy distillation. When policy distillation is under a deep reinforcement learning setting, due to the giant parameter size and the huge state s...

وصف كامل

محفوظ في:

التفاصيل البيبلوغرافية
المؤلفون الرئيسيون:	Yin, Haiyan, Pan, Sinno Jialin
مؤلفون آخرون:	School of Computer Science and Engineering
التنسيق:	Conference or Workshop Item
اللغة:	English
منشور في:	2017
الموضوعات:	Deep reinforcement learning Policy distillation
الوصول للمادة أونلاين:	https://hdl.handle.net/10356/83043 http://hdl.handle.net/10220/42453 https://aaai.org/ocs/index.php/AAAI/AAAI17/paper/view/14478
الوسوم:	إضافة وسم لا توجد وسوم, كن أول من يضع وسما على هذه التسجيلة!

الوصف
الملخص:	The process for transferring knowledge of multiple reinforcement learning policies into a single multi-task policy via distillation technique is known as policy distillation. When policy distillation is under a deep reinforcement learning setting, due to the giant parameter size and the huge state space for each task domain, it requires extensive computational efforts to train the multi-task policy network. In this paper, we propose a new policy distillation architecture for deep reinforcement learning, where we assume that each task uses its task specific high-level convolutional features as the inputs to the multi-task policy network. Furthermore, we propose a new sampling framework termed hierarchical prioritized experience replay to selectively choose experiences from the replay memories of each task domain to perform learning on the network. With the above two attempts, we aim to accelerate the learning of the multi-task policy network while guaranteeing a good performance. We use Atari 2600 games as testing environment to demonstrate the efficiency and effectiveness of our proposed solution for policy distillation.

Knowledge Transfer for Deep Reinforcement Learning with Hierarchical Experience Replay

مواد مشابهة