Time-inconsistent objectives in reinforcement learning

In reinforcement learning, one of the most intriguing and long-standing problems is how to assign credit to past events efficiently and meaningfully. Within temporal credit assignment, time inconsistency is a challenging subdomain that was noticed long ago but still lacks sy...

Bibliographic Details
Main Author: Su, Huangyuan
Other Authors: PUN Chi Seng
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2021
Subjects: Science::Mathematics
Online Access: https://hdl.handle.net/10356/148520
Institution: Nanyang Technological University
id sg-ntu-dr.10356-148520
record_format dspace
spelling sg-ntu-dr.10356-148520; 2023-02-28T23:17:57Z; Time-inconsistent objectives in reinforcement learning; Su, Huangyuan; PUN Chi Seng; School of Physical and Mathematical Sciences; Nixie Sapphira Lesmana; cspun@ntu.edu.sg; Science::Mathematics; Bachelor of Science in Mathematical Sciences; 2021-05-05T08:34:19Z; 2021-05-05T08:34:19Z; 2021; Final Year Project (FYP); Su, H. (2021). Time-inconsistent objectives in reinforcement learning. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/148520; https://hdl.handle.net/10356/148520; en; application/pdf; Nanyang Technological University
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Science::Mathematics
spellingShingle Science::Mathematics
Su, Huangyuan
Time-inconsistent objectives in reinforcement learning
description In reinforcement learning, one of the most intriguing and long-standing problems is how to assign credit to past events efficiently and meaningfully. Within temporal credit assignment, time inconsistency is a challenging subdomain that was noticed long ago but still lacks systematic treatment. The goal of this work is to find efficient algorithms that converge to equilibrium policies in the presence of time-inconsistent objectives. We first give a brief introduction to reinforcement learning and control theory, then define the time-inconsistent problem both illustratively and formally. Next, we propose a general backward update framework based on game theory, which is theoretically proven to find the equilibrium control under time inconsistency. We also review and implement a forward update algorithm that finds the equilibrium control in the special case of hyperbolic discounting but has many limitations. The literature review introduces other time-inconsistent settings and algorithms that address efficient temporal credit assignment. Finally, we conclude the report and point out future directions.
author2 PUN Chi Seng
author_facet PUN Chi Seng
Su, Huangyuan
format Final Year Project
author Su, Huangyuan
author_sort Su, Huangyuan
title Time-inconsistent objectives in reinforcement learning
title_short Time-inconsistent objectives in reinforcement learning
title_full Time-inconsistent objectives in reinforcement learning
title_fullStr Time-inconsistent objectives in reinforcement learning
title_full_unstemmed Time-inconsistent objectives in reinforcement learning
title_sort time-inconsistent objectives in reinforcement learning
publisher Nanyang Technological University
publishDate 2021
url https://hdl.handle.net/10356/148520
_version_ 1759857446708314112