Advanced topics in deep reinforcement learning and its applications

Bibliographic Details
Main Author: Chen, Jianda
Other Authors: Sinno Jialin Pan
Format: Thesis-Doctor of Philosophy
Language: English
Published: Nanyang Technological University 2023
Online Access:https://hdl.handle.net/10356/164296
Institution: Nanyang Technological University
Description
Summary: The development of reinforcement learning has attracted increasing attention among researchers. By leveraging deep learning, i.e. embedding neural network function approximators, reinforcement learning has achieved great success on a broad range of tasks, including video games, control, Natural Language Processing (NLP), and data center cooling. However, applications of deep reinforcement learning in real-world settings usually suffer from overfitting: performance is likely to drop if the deployment environment differs even slightly from the training environment. It is therefore crucial to improve the generalization ability of deep reinforcement learning so that its techniques can be applied to real-world scenarios more effectively. My major research work investigates how to perform reinforcement learning on an important and widely used application: control tasks that take high-dimensional pixels as input. Images are informative, but they can also introduce noisy visual features, such as lighting or color shifts in the real scene. Reinforcement learning agents that take pixel input are easily distracted by task-irrelevant features, resulting in a significant drop in performance in environments that differ slightly from the training environment. Moreover, compared to low-dimensional inputs, learning to control from pixels suffers from sample inefficiency, requiring more interactions with the environment to learn behaviors. My research aims to improve data efficiency to accelerate policy training, and to improve the generalization ability of reinforcement learning agents so that they perform consistently well across environments, even ones unseen during training. I propose novel methods that map high-dimensional visual observations to a low-dimensional representation space in which state abstractions are learned.
Behavioral metrics, which compute state-wise similarities according to properties of the Markov decision process (MDP), e.g. reward and transition probability, are measured in the representation space in order to learn state representations that capture task-relevant features. By learning representations with a behavioral metric, task-irrelevant features in the pixels are discarded and task-specific information is retained; the generalization ability of the agent is therefore enhanced and its data efficiency is improved. In addition, another line of my research extends reinforcement learning to deep model compression. Deep neural networks are often overparameterized, which makes it challenging to deploy them on computationally constrained devices. I propose a reinforcement learning-based method for pruning convolutional neural networks (CNNs). The method combines runtime channel pruning, in which the pruning result depends on the input data instance, with static pruning, i.e. conventional channel pruning, and develops a trade-off strategy to balance flexibility against storage efficiency.
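The idea of learning representations with a behavioral metric can be sketched in a few lines. The following is a minimal, hypothetical illustration (not the thesis's actual method): a linear encoder `W` maps observations to a latent space, and the loss pushes the latent distance between two states toward a bisimulation-style target that combines their reward difference with the discounted latent distance between their successor states. All names and the linear encoder are illustrative assumptions.

```python
import numpy as np

def behavioral_metric_loss(W, obs_i, obs_j, r_i, r_j,
                           next_obs_i, next_obs_j, gamma=0.99):
    """Hypothetical bisimulation-style representation loss.

    W encodes high-dimensional observations into a low-dimensional
    latent space. The behavioral target distance between two states
    combines their reward difference with the discounted distance
    between their successor states' latent representations.
    """
    z_i, z_j = W @ obs_i, W @ obs_j            # latent codes of the two states
    z_ni, z_nj = W @ next_obs_i, W @ next_obs_j  # latent codes of successors
    # Behavioral target: |r_i - r_j| + gamma * d(successors in latent space)
    target = abs(r_i - r_j) + gamma * np.linalg.norm(z_ni - z_nj)
    # Penalize mismatch between the latent distance and the behavioral target
    return (np.linalg.norm(z_i - z_j) - target) ** 2
```

Minimizing such a loss over sampled state pairs makes latent distances track behavioral similarity: two observations that differ only in task-irrelevant pixels (same rewards, same transition behavior) are driven to the same latent code, which is exactly the generalization effect described above.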
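The trade-off between static and runtime channel pruning can also be sketched. The snippet below is a simplified assumption of how such a mixture might look, not the method proposed in the thesis: a coefficient `alpha` blends per-channel importance scores averaged over the dataset (static, storage-friendly) with scores computed for the current input (runtime, flexible), and the top-scoring channels are kept. All function and parameter names are hypothetical.

```python
import numpy as np

def prune_channels(static_scores, runtime_scores, alpha=0.5, keep_ratio=0.5):
    """Hypothetical mixture of static and runtime channel pruning.

    static_scores:  per-channel importance averaged over the dataset.
    runtime_scores: per-channel importance for the current input instance.
    alpha=1.0 recovers purely static (conventional) channel pruning;
    alpha=0.0 recovers purely input-dependent runtime pruning.
    Returns a boolean mask of channels to keep.
    """
    combined = alpha * static_scores + (1.0 - alpha) * runtime_scores
    k = max(1, int(len(combined) * keep_ratio))
    keep = np.argsort(combined)[-k:]          # indices of the top-k channels
    mask = np.zeros(len(combined), dtype=bool)
    mask[keep] = True
    return mask
```

Under this sketch, channels pruned statically can be removed from storage entirely, while runtime pruning only skips their computation per input; `alpha` trades the storage savings of the former against the flexibility of the latter.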