Pricing problems with Thompson sampling
In 1933, William R. Thompson proposed an algorithm, now known as Thompson sampling, to maximise cumulative payoff in a multi-armed bandit (MAB) problem. MAB problems have frequently been used to model real-life decision-making scenarios. This paper explores the extension of Thompson sampl...
Main Author: | |
---|---|
Other Authors: | |
Format: | Final Year Project |
Language: | English |
Published: | 2019 |
Subjects: | |
Online Access: | http://hdl.handle.net/10356/77144 |