Improving deep reinforcement learning with advanced exploration and transfer learning techniques

Bibliographic Details
Main Author: Yin, Haiyan
Other Authors: Pan, Sinno Jialin
Format: Thesis-Doctor of Philosophy
Language: English
Published: Nanyang Technological University, 2020
Subjects:
Online Access:https://hdl.handle.net/10356/137772
Institution: Nanyang Technological University
Description
Summary: Deep reinforcement learning uses deep neural networks as function approximators to model the reinforcement learning policy, enabling the policy to be trained in an end-to-end manner. When applied to complex real-world problems such as video game playing and natural language processing, deep reinforcement learning algorithms often involve an enormous number of parameters and an intractable search space, which results from the low-level modelling of the state space or the complex nature of the problem. Constructing an effective exploration strategy to search the solution space is therefore crucial for deriving a policy that can tackle challenging problems. Furthermore, given the considerable computational resources and time consumed by policy training, it is also crucial to develop the transferability of the algorithm so as to obtain versatile and generalizable policies. In this thesis, I present a study on improving deep reinforcement learning algorithms from the perspectives of exploration and transfer learning. The study of exploration focuses mainly on solving hard exploration problems in the Atari 2600 games suite and in partially observable navigation domains with extremely sparse rewards. Three exploration algorithms are discussed: a planning-based algorithm with deep hashing techniques to improve search efficiency, a distributed framework with an exploration-incentivizing novelty model to increase sample throughput while gathering more novel experiences, and a sequence-level novelty model designed for partially observable domains with sparse rewards. To improve the generalization ability of the policy, I discuss two policy transfer algorithms, which address multi-task policy distillation and zero-shot policy transfer, respectively. The approaches above have been evaluated in video game playing domains with high-dimensional pixel-level inputs; the evaluation domains consist of the Atari 2600 games suite, ViZDoom and DeepMind Lab. The presented approaches demonstrate desirable properties for improving policy performance through the advanced exploration or transfer learning mechanisms. Finally, I conclude by discussing open questions and future directions for applying the presented exploration and transfer learning techniques in more general and practical scenarios.
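
The abstract names two families of techniques, hash-based novelty estimation for exploration and multi-task policy distillation for transfer, without implementation detail. The two sketches below are only rough illustrations of the general ideas, not the thesis's own implementations; all class, function, and parameter names (`SimHashCounter`, `bonus_scale`, `distillation_loss`, and so on) are invented for illustration.

A minimal sketch of a hash-based visit counter, in the spirit of SimHash-style count-based exploration: states are projected onto random hyperplanes, the resulting binary code indexes a count table, and the intrinsic reward decays with the visit count of that code.

```python
import numpy as np
from collections import defaultdict

class SimHashCounter:
    """Approximate state-visit counts via random-projection (SimHash) codes."""

    def __init__(self, state_dim, code_bits=32, bonus_scale=0.1, seed=0):
        rng = np.random.default_rng(seed)
        # Random hyperplanes: nearby states tend to receive the same code.
        self.projection = rng.standard_normal((code_bits, state_dim))
        self.counts = defaultdict(int)
        self.bonus_scale = bonus_scale

    def bonus(self, state):
        """Intrinsic reward that decays as the hashed state is revisited."""
        code = tuple((self.projection @ state > 0).astype(np.int8))
        self.counts[code] += 1
        return self.bonus_scale / np.sqrt(self.counts[code])

# Usage: add the bonus to the environment reward during training.
counter = SimHashCounter(state_dim=8)
shaped_reward = 1.0 + counter.bonus(np.zeros(8))  # env reward + novelty bonus
```

For the transfer side, multi-task policy distillation is commonly set up as a student network matching the temperature-softened action distributions of several task-specific teachers. A sketch of such a loss, again under my own naming:

```python
import numpy as np

def softmax(logits, temperature=1.0):
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(teacher_logits, student_logits, temperature=0.01):
    """Mean KL(teacher || student) over a batch of states; a low temperature
    sharpens the teacher policy, a common choice in policy distillation."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits)
    return float(np.mean(np.sum(p * (np.log(p + 1e-8) - np.log(q + 1e-8)), axis=-1)))

# A single student can be distilled from several teachers by summing
# this loss over tasks within each training batch.
rng = np.random.default_rng(0)
loss = distillation_loss(rng.standard_normal((4, 6)), rng.standard_normal((4, 6)))
```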