Non-repetitive gaming experience as a curriculum design problem


Bibliographic Details
Main Author: Quek, Yufei
Other Authors: Zinovi Rabinovich
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2020
Online Access: https://hdl.handle.net/10356/138141
Institution: Nanyang Technological University
Description
Summary: In this project I investigate problems in which the distribution of an agent's trajectories through the environment is to be optimized, rather than an end state. As part of this effort I developed an environment that models a simulated player at a Lottery Game in a casino. A Drama Manager can take certain actions that affect the environment and thus indirectly affect the player's experience. Formulating the player's experience (a trajectory, or sequence of states) through the lottery game as a Markov Decision Problem gives a well-studied framework to which existing algorithms can be applied. Alongside the environment, I built a Drama Manager that learns to solve the environment as a proof of concept. The complexity of the environment and of the Drama Manager is increased in step, arriving at a complex stochastic environment that supports trajectory-based learning approaches and presents conflicting optimization goals. Correspondingly, the Drama Manager is able to make progress towards solving the final form of the environment.