Multi-armed linear bandits with latent biases

In a linear stochastic bandit model, each arm corresponds to a vector in Euclidean space, and the expected return observed at each time step is determined by an unknown linear function of the selected arm. This paper addresses the challenge of identifying the optimal arm in a linear stochastic bandi...


Bibliographic Details
Main Authors: Kang, Qiyu; Tay, Wee Peng; She, Rui; Wang, Sijie; Liu, Xiaoqian; Yang, Yuan-Rui
Other Authors: School of Electrical and Electronic Engineering
Format: Article
Language: English
Published: 2024
Online Access: https://hdl.handle.net/10356/175416
Institution: Nanyang Technological University