Non-repetitive gaming experience as a curriculum design problem
Main Author: | |
---|---|
Other Authors: | |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2020 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/138141 |
Institution: | Nanyang Technological University |
Summary: | In this project I investigate problems where the distribution of an agent's trajectories through the environment is to be optimized, as opposed to optimizing for an end state. As part of this effort I developed an environment modelling a simulated player at a Lottery Game in a casino. A Drama Manager can take certain actions that affect the environment and thus indirectly affect the player's experience. Formulating the player's experience (a trajectory, or sequence of states) through the lottery game as a Markov Decision Problem gives a well-studied framework to which existing algorithms can be applied. Alongside the environment, I built a Drama Manager that learns to solve it, as a proof of concept. The complexity of the environment and of the Drama Manager is increased in step, arriving at a complex stochastic environment that supports trajectory-based learning approaches and provides conflicting optimization goals. The Drama Manager, in turn, makes progress towards solving the final form of the environment. |
---|