Energy efficient resource allocation in wireless communications with deep reinforcement learning
Format: Thesis-Master by Coursework
Language: English
Published: Nanyang Technological University, 2024
Online Access: https://hdl.handle.net/10356/173220
Institution: Nanyang Technological University
Summary: The rapid development and wide application of mobile communication technology have facilitated communication among people and stimulated technological innovation. Cellular networks are the foundation of mobile communication, and their power allocation is a problem worth studying: an effective power allocation scheme can improve system performance and communication quality while reducing energy consumption and operating costs. The problem investigated in this dissertation is downlink power allocation in cellular networks using deep reinforcement learning (DRL). First, the problem is modelled as a constrained optimization problem that dynamically selects the transmit power of each base station (BS), subject to the maximum transmit power of the BSs, with the objective of maximizing the spectral efficiency of the system. Next, three DRL algorithms are applied to solve this problem, with the wireless communication parameters mapped to the elements of the DRL formulation (states, actions and rewards). The algorithm framework and network structure of Policy Gradient, Deep Q-Network (DQN) and Deep Deterministic Policy Gradient (DDPG) are then designed for this power allocation task. Finally, detailed experiments are carried out. Across multiple simulations, the average spectral efficiency is 1.473 bps/Hz for Policy Gradient, 1.743 bps/Hz for DQN and 1.908 bps/Hz for DDPG, so DDPG outperforms Policy Gradient by 29.530% and DQN by 9.466% in average spectral efficiency. In addition, DDPG achieves the smallest variance and the fastest convergence, while DQN has the shortest training time. The proposed DRL algorithms also show good generalization ability when the cellular network environment settings are changed.
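The formulation described in the summary can be sketched concretely. Below is a minimal version of the constrained optimization problem, assuming N base stations, downlink transmit powers p_i, channel gains g_{ji} from BS j to the user served by BS i, and noise power sigma^2 under the standard SINR model; these symbols are illustrative assumptions, and the dissertation's exact system model may differ:

\[
\max_{p_1,\dots,p_N} \; \sum_{i=1}^{N} \log_2\!\left(1 + \frac{g_{ii}\,p_i}{\sum_{j \neq i} g_{ji}\,p_j + \sigma^2}\right)
\quad \text{s.t.} \quad 0 \le p_i \le P_{\max}, \; i = 1,\dots,N,
\]

where the objective is the sum spectral efficiency in bps/Hz and P_max is the per-BS maximum transmit power. In the DRL formulation, the state would encode the channel conditions, the action is the transmit power vector, and the reward is the achieved spectral efficiency.

Of the three algorithms, DDPG is the natural fit for a continuous power variable, whereas DQN requires discretizing the power levels. The following PyTorch sketch shows a hypothetical DDPG actor network for this task; the PowerActor name, layer sizes and state encoding are assumptions for illustration, not the network structure designed in the dissertation:

import torch
import torch.nn as nn

class PowerActor(nn.Module):
    """Hypothetical DDPG actor: channel-state observation -> transmit powers."""
    def __init__(self, state_dim: int, n_bs: int, p_max: float):
        super().__init__()
        self.p_max = p_max
        self.net = nn.Sequential(
            nn.Linear(state_dim, 128),
            nn.ReLU(),
            nn.Linear(128, 128),
            nn.ReLU(),
            nn.Linear(128, n_bs),
            nn.Sigmoid(),  # squash each output to (0, 1)
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        # Scale to (0, p_max) so every action satisfies 0 <= p_i <= P_max.
        return self.p_max * self.net(state)

# Example usage: 5 BSs, state = flattened 5 x 5 channel-gain matrix.
actor = PowerActor(state_dim=25, n_bs=5, p_max=1.0)
powers = actor(torch.randn(1, 25))  # shape (1, 5), each entry in (0, 1.0)

Bounding the output with a Sigmoid scaled by p_max enforces the per-BS power constraint inside the network itself, so the agent never proposes an infeasible action.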