Multi-armed linear bandits with latent biases
In a linear stochastic bandit model, each arm corresponds to a vector in Euclidean space, and the expected return observed at each time step is determined by an unknown linear function of the selected arm. This paper addresses the challenge of identifying the optimal arm in a linear stochastic bandi...
Main Authors: | Kang, Qiyu; Tay, Wee Peng; She, Rui; Wang, Sijie; Liu, Xiaoqian; Yang, Yuan-Rui |
---|---|
Other Authors: | School of Electrical and Electronic Engineering |
Format: | Article |
Language: | English |
Published: | 2024 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/175416 |
Institution: | Nanyang Technological University |
Similar Items
- Multi-arm bandit-led clustering in federated learning
  by: Zhao, Joe Chen Xuan
  Published: (2024)
- PERFORMANCE GUARANTEES FOR ONLINE LEARNING: CASCADING BANDITS AND ADVERSARIAL CORRUPTIONS
  by: ZHONG ZIXIN
  Published: (2021)
- THE CONFIDENCE BOUND METHOD FOR THE MULTI-ARMED BANDIT PROBLEM WITH LARGE ARM SIZE
  by: HU SHOURI
  Published: (2020)
- Dynamic Clustering of Contextual Multi-Armed Bandits
  by: NGUYEN, Trong T., et al.
  Published: (2014)
- Efficient resource allocation with fairness constraints in restless multi-armed bandits
  by: LI, Dexun, et al.
  Published: (2022)