VLC and D2D heterogeneous network optimization : a reinforcement learning approach based on equilibrium problems with equilibrium constraints

Bibliographic Details
Main Authors: Raveendran, Neetu, Zhang, Huaqing, Niyato, Dusit, Yang, Fang, Song, Jian, Han, Zhu
Other Authors: School of Computer Science and Engineering
Format: Article
Language: English
Published: 2021
Subjects:
Online Access: https://hdl.handle.net/10356/150746
Institution: Nanyang Technological University
Description
Summary: The radio frequency spectrum crunch has triggered the harnessing of other sources of bandwidth, for which visible light is a promising candidate. Even though visible light communication (VLC) ensures high capacity, its coverage is limited. This necessitates the integration of VLC and device-to-device (D2D) technologies into heterogeneous networks. In particular, mobile users that are accessible by the VLC transmitters can relay data to mobile users that are not, by means of D2D communication. However, due to the distributed behaviors of mobile users, determining optimal data transmission routes from VLC transmitters to end mobile devices is a major challenge. In this paper, we propose a reinforcement learning (RL)-based approach to determine multi-hop data transmission routes in an indoor VLC-D2D heterogeneous network. We obtain the rewards for the RL-based method dynamically, by formulating the interactions between the mobile users relaying the data as an equilibrium problem with equilibrium constraints (EPEC) and solving it with the alternating direction method of multipliers (ADMM). The proposed technique can achieve optimal data transmission routes in a distributed manner. The simulation results demonstrate the effectiveness of the proposed approach, showing that transmission routes with low delays and high capacities can be achieved through the learning algorithm.
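
As a rough illustration of the general idea of learning multi-hop routes with RL, the following is a minimal tabular Q-learning sketch. It is not the authors' algorithm: the topology, the link delays, the hyperparameters, and the names `LINKS`, `learn_routes`, and `greedy_route` are all hypothetical, and the reward is simply a fixed negative link delay, whereas the paper derives its rewards dynamically from an EPEC solved with ADMM.

```python
import random

# Hypothetical topology (an assumption, not from the paper): node 0 stands in
# for a VLC transmitter, node 4 for an end user outside VLC coverage, and
# nodes 1-3 for D2D relays. Edge weights are made-up link delays.
LINKS = {
    0: {1: 1.0, 2: 4.0},
    1: {2: 1.0, 3: 5.0},
    2: {3: 1.0},
    3: {4: 1.0},
    4: {},
}
SOURCE, DEST = 0, 4


def learn_routes(links, dest, episodes=2000, alpha=0.5, gamma=1.0,
                 epsilon=0.2, seed=0):
    """Tabular Q-learning where the action at a node is the next hop."""
    rng = random.Random(seed)
    q = {u: {v: 0.0 for v in nbrs} for u, nbrs in links.items()}
    starts = [u for u in links if links[u]]  # nodes with at least one next hop
    for _ in range(episodes):
        node = rng.choice(starts)
        while node != dest:
            nbrs = list(links[node])
            if rng.random() < epsilon:
                nxt = rng.choice(nbrs)  # explore a random next hop
            else:
                nxt = max(nbrs, key=lambda v: q[node][v])  # exploit
            reward = -links[node][nxt]  # assumed reward: negative link delay
            future = max(q[nxt].values()) if q[nxt] else 0.0
            q[node][nxt] += alpha * (reward + gamma * future - q[node][nxt])
            node = nxt
    return q


def greedy_route(q, source, dest):
    """Follow the highest-valued next hop from source until dest is reached."""
    route = [source]
    while route[-1] != dest:
        node = route[-1]
        route.append(max(q[node], key=q[node].get))
    return route


if __name__ == "__main__":
    q_table = learn_routes(LINKS, DEST)
    print(greedy_route(q_table, SOURCE, DEST))
```

On this toy graph the learned greedy route should settle on the lowest-delay path 0 → 1 → 2 → 3 → 4 (total delay 4, against 6 or 7 for the alternatives). Each node only needs the Q-values of its own next hops and of its neighbors, which loosely mirrors the distributed setting described in the summary.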