Leveraging deep generative models for non-parametric distributions in reinforcement learning
Main Author:
Other Authors:
Format: Thesis-Doctor of Philosophy
Language: English
Published: Nanyang Technological University, 2024
Subjects:
Online Access: https://hdl.handle.net/10356/173455
Institution: Nanyang Technological University
Summary: This thesis explores the use of deep generative models to enrich distribution representations in reinforcement learning (RL), with the aim of improving exploration, stability, and performance. It focuses on two roles of distributions in RL: policy distributions and action distributions. For policy distributions, an adversarial hypernetwork (AH) architecture enables multi-policy learning, allowing algorithms to converge to diverse local optima; the AH framework is then generalized to single-policy RL algorithms, with a self-distillation mechanism to improve learning efficiency. For action distributions, the thesis investigates the benefits of using the deep generative Fully parameterized Quantile Function (FQF) as the actor of Soft Actor-Critic (SAC) to overcome the uni-modality assumption common in stochastic policy implementations, and demonstrates that the entropy regularization term under FQF is theoretically bounded. Overall, this work proposes leveraging deep generative models to address performance inconsistencies and the limitations of traditional modality assumptions on RL distributions.
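To make the hypernetwork idea behind the AH architecture concrete, here is a minimal PyTorch sketch: a generator maps a sampled latent code to the full weight vector of a small policy network, so different latent codes instantiate different policies over the same observations. All names and layer sizes (`PolicyHypernetwork`, `latent_dim`, the one-hidden-layer policy) are illustrative assumptions, not the thesis's actual architecture, which additionally trains the generator with an adversarial objective to push the produced policies toward diverse local optima.

```python
# Sketch only: a hypernetwork that generates policy weights from a latent code.
import torch
import torch.nn as nn

class PolicyHypernetwork(nn.Module):
    def __init__(self, latent_dim, obs_dim, act_dim, hidden=64):
        super().__init__()
        # Parameter shapes of a one-hidden-layer policy: obs -> hidden -> act.
        self.shapes = [(hidden, obs_dim), (hidden,), (act_dim, hidden), (act_dim,)]
        n_params = sum(torch.Size(s).numel() for s in self.shapes)
        self.generator = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, n_params)
        )

    def forward(self, z, obs):
        # Generate one flat parameter vector, then run the policy "functionally".
        flat = self.generator(z)
        params, i = [], 0
        for s in self.shapes:
            n = torch.Size(s).numel()
            params.append(flat[i:i + n].view(s))
            i += n
        w1, b1, w2, b2 = params
        h = torch.relu(obs @ w1.T + b1)
        return torch.tanh(h @ w2.T + b2)  # bounded action in [-1, 1]

# Two latent codes yield two distinct policies for the same observation.
hyper = PolicyHypernetwork(latent_dim=8, obs_dim=4, act_dim=2)
obs = torch.randn(4)
a1 = hyper(torch.randn(8), obs)
a2 = hyper(torch.randn(8), obs)
```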
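The quantile-function actor can be sketched in the same spirit: a network maps a state and a quantile fraction tau to the tau-quantile of the action distribution, so drawing tau ~ U(0, 1) and evaluating the network performs inverse-CDF sampling and can represent multi-modal action distributions, unlike a Gaussian policy head. This is a simplified IQN/FQF-style sketch under assumed names (`QuantileActor`, `n_cos`); the actual FQF method also learns the quantile fractions with a proposal network, and the thesis's entropy-boundedness analysis is not reproduced here.

```python
# Sketch only: a quantile-function actor that samples actions by inverse CDF.
import math
import torch
import torch.nn as nn

class QuantileActor(nn.Module):
    """Maps (state, tau) to the tau-quantile of a 1-D action distribution."""
    def __init__(self, obs_dim, hidden=64, n_cos=16):
        super().__init__()
        self.n_cos = n_cos
        self.state_net = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.tau_net = nn.Sequential(nn.Linear(n_cos, hidden), nn.ReLU())
        self.head = nn.Linear(hidden, 1)

    def forward(self, obs, tau):
        # Cosine embedding of tau, as used in IQN/FQF-style quantile networks.
        i = torch.arange(1, self.n_cos + 1, dtype=torch.float32)
        tau_emb = self.tau_net(torch.cos(math.pi * i * tau.unsqueeze(-1)))
        h = self.state_net(obs) * tau_emb  # fuse state and tau embeddings
        return torch.tanh(self.head(h)).squeeze(-1)  # bounded action

# Sampling: tau ~ U(0, 1), then evaluate the learned inverse CDF.
actor = QuantileActor(obs_dim=4)
obs = torch.randn(32, 4)
tau = torch.rand(32)
actions = actor(obs, tau)
```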