Multi-armed linear bandits with latent biases
In a linear stochastic bandit model, each arm corresponds to a vector in Euclidean space, and the expected return observed at each time step is determined by an unknown linear function of the selected arm. This paper addresses the challenge of identifying the optimal arm in a linear stochastic bandi...
Saved in:
Main Authors: | , , , , , |
---|---|
Other Authors: | |
Format: | Article |
Language: | English |
Published: |
2024
|
Subjects: | |
Online Access: | https://hdl.handle.net/10356/175416 |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Institution: | Nanyang Technological University |
Language: | English |