Multi-armed linear bandits with latent biases
In a linear stochastic bandit model, each arm corresponds to a vector in Euclidean space, and the expected return observed at each time step is determined by an unknown linear function of the selected arm. This paper addresses the challenge of identifying the optimal arm in a linear stochastic bandi...
Main Authors: | , , , , , |
---|---|
Other Authors: | |
Format: | Article |
Language: | English |
Published: | 2024 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/175416 |