Adaptive goal selection for agents in dynamic environments

Bibliographic Details
Main Authors: Zhang, Huiliang; Luo, Xudong; Shen, Zhiqi; You, Jin; Miao, Chun Yan
Other Authors: School of Computer Engineering
Format: Article
Language: English
Published: 2013
Online Access: https://hdl.handle.net/10356/96478
http://hdl.handle.net/10220/18110
Institution: Nanyang Technological University
Description
Summary: In psychology, goal-setting theory, which has been studied for over 35 years, shows that goals play a significant role in human motivation, action and performance. Based on this theory, a goal net model has been proposed for designing intelligent agents that can be viewed, in some sense, as a soft copy of a human being. The goal net model has been applied successfully in many agents, especially non-player-character agents in computer games. Such an agent selects the optimal solution from all possible solutions, which it finds using a recursive algorithm. However, if a goal net is very complex, selection may take too long for the agent to respond quickly when it needs to re-select a solution after the world changes. Moreover, in some dynamic environments it is impossible to know in advance the exact outcome of choosing a solution, so the possible solutions cannot be evaluated precisely. To address these problems, this paper applies a learning algorithm to goal selection in dynamic environments. More specifically, we first develop a reorganization algorithm that converts a goal net into an equivalent counterpart on which a Q-learning algorithm can operate; we then define the key component of Q-learning, the reward function, according to the features of goal nets; and finally we conduct extensive experiments showing that, in dynamic environments, the agent with the learning algorithm significantly outperforms the one with the recursive search algorithm. Our work therefore suggests an agent model that can be applied effectively in dynamic, time-sensitive domains such as computer games and P2P systems for online movie watching.
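
To illustrate the general idea of learning which goal to pursue, here is a minimal sketch of standard tabular Q-learning on a tiny hand-written goal graph. It is not the paper's method: this record does not describe the reorganization algorithm or the reward function, so the graph, the goal and action names (`start`, `fast_path`, and so on), the reward values, and the hyperparameters are all hypothetical and chosen only for illustration.

```python
import random

# Hypothetical goal graph (an assumption, not taken from the paper):
# each goal maps its available actions to (next_goal, reward).
GOAL_GRAPH = {
    "start": {"fast_path": ("mid_a", 1.0), "safe_path": ("mid_b", 2.0)},
    "mid_a": {"finish": ("done", 10.0)},
    "mid_b": {"finish": ("done", 3.0)},
}
TERMINAL = "done"


def q_learning(graph, start="start", episodes=500,
               alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Standard tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    # One Q-value per (goal, action) pair, initialised to zero.
    q = {s: {a: 0.0 for a in acts} for s, acts in graph.items()}
    for _ in range(episodes):
        state = start
        while state != TERMINAL:
            actions = list(graph[state])
            if rng.random() < epsilon:
                action = rng.choice(actions)  # explore
            else:
                action = max(actions, key=lambda a: q[state][a])  # exploit
            next_state, reward = graph[state][action]
            # The terminal goal has no further value.
            future = 0.0 if next_state == TERMINAL else max(q[next_state].values())
            # Q-learning update rule.
            q[state][action] += alpha * (reward + gamma * future - q[state][action])
            state = next_state
    return q


def greedy_plan(graph, q, start="start"):
    """Follow the highest-valued action from each goal until the terminal goal."""
    plan, state = [], start
    while state != TERMINAL:
        action = max(q[state], key=q[state].get)
        plan.append(action)
        state = graph[state][action][0]
    return plan


q_table = q_learning(GOAL_GRAPH)
plan = greedy_plan(GOAL_GRAPH, q_table)
```

In this toy graph the immediately better-rewarded `safe_path` leads to a lower total return, so the learned values should favour `fast_path`. Once the table is learned, choosing the next goal is a lookup rather than a search over all solutions, which is the kind of response-time benefit the summary points to.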