Scalable transfer learning in heterogeneous, dynamic environments

Bibliographic Details
Main Authors:	Nguyen, Trung Thanh; Silander, Tomi; Li, Zhuoru; Leong, Tze-Yun
Format:	text
Language:	English
Published:	Institutional Knowledge at Singapore Management University 2017
Subjects:	Model-based reinforcement learning; Online feature selection; Transfer learning; Artificial Intelligence and Robotics
Online Access:	https://ink.library.smu.edu.sg/sis_research/3039 https://ink.library.smu.edu.sg/context/sis_research/article/4039/viewcontent/ScalableTransferLearningHeterogeneousDynamicEnvironments_2015.pdf
Institution:	Singapore Management University

Description
Summary:	Reinforcement learning is a plausible theoretical basis for developing self-learning, autonomous agents or robots that can effectively represent the world dynamics and efficiently learn the problem features to perform different tasks in different environments. The computational costs and complexities involved, however, are often prohibitive for real-world applications. This study introduces a scalable methodology to learn and transfer knowledge of the transition (and reward) models for model-based reinforcement learning in a complex world. We propose a variant formulation of Markov decision processes that supports efficient online learning of the relevant problem features to approximate the world dynamics. We apply the new feature selection and dynamics approximation techniques in heterogeneous transfer learning, where the agent automatically maintains and adapts multiple representations of the world to cope with the different environments it encounters during its lifetime. We prove regret bounds for our approach and empirically demonstrate its capability to quickly converge to a near-optimal policy in both real and simulated environments.

Similar Items