Programmatic policies for interpretable reinforcement learning using pre-trained models

Bibliographic Details
Main Author: Tu, Xia Yang
Other Authors: Arvind Easwaran
Format: Final Year Project
Language: English
Published: Nanyang Technological University, 2024
Subjects:
LLM
Online Access: https://hdl.handle.net/10356/181169
Institution: Nanyang Technological University
Description
Abstract: Decision Trees (DTs) are widely used in machine learning due to their inherent interpretability. However, training DTs in a Reinforcement Learning (RL) setting is challenging. In this project, we present a framework that improves the interpretability of RL by generating programmatic policies with large language models (LLMs). We first examine decision trees over latent-space representations, then develop and apply a "code reflection" framework in the Karel environment, which provides a controlled setting for evaluating task performance. The framework combines prompt engineering with code reflection to optimize program synthesis, producing interpretable policies. The report discusses these findings and concludes with recommendations for future work.
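
As a rough illustration of the "code reflection" idea in the abstract, the Python sketch below shows one plausible shape of such a loop: an LLM proposes a policy program, the program is scored on a Karel task, and the evaluation feedback is folded back into the prompt for the next attempt. All names here (query_llm, run_in_karel, synthesize_policy) are hypothetical placeholders, not APIs from the report.

def query_llm(prompt: str) -> str:
    """Stub for an LLM call that returns candidate policy code as text."""
    raise NotImplementedError("Plug in an actual LLM client here.")

def run_in_karel(program: str) -> tuple[float, str]:
    """Stub evaluator: runs the program on a Karel task and returns
    (task reward in [0, 1], textual feedback such as a traceback)."""
    raise NotImplementedError("Plug in a Karel simulator here.")

def synthesize_policy(task_prompt: str, max_rounds: int = 5,
                      target_reward: float = 1.0) -> str:
    """Generate, evaluate, and reflect until the task is solved or the
    round budget runs out; return the best program seen so far."""
    prompt = task_prompt
    best_program, best_reward = "", float("-inf")
    for _ in range(max_rounds):
        program = query_llm(prompt)
        reward, feedback = run_in_karel(program)
        if reward > best_reward:
            best_program, best_reward = program, reward
        if reward >= target_reward:
            break
        # Reflection step: fold the evaluation feedback back into the
        # prompt so the next generation can repair the observed failure.
        prompt = (f"{task_prompt}\n\nPrevious attempt:\n{program}\n"
                  f"Reward: {reward:.2f}\nFeedback: {feedback}\n"
                  "Revise the program to fix the issues above.")
    return best_program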