Continual learning with neural networks
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University, 2022
Online Access: https://ink.library.smu.edu.sg/etd_coll/449
https://ink.library.smu.edu.sg/context/etd_coll/article/1447/viewcontent/GPIS_AY2017_PhD_Pham_Hong_Quang.pdf
Institution: Singapore Management University
Summary: Recent years have witnessed tremendous successes of artificial neural networks in many applications, ranging from visual perception to language understanding. However, such achievements have mostly been demonstrated on large amounts of labeled data that remain static throughout learning. In contrast, real-world environments are always evolving: new patterns emerge and older ones become inactive before reappearing in the future. In this respect, continual learning aims to achieve a higher level of intelligence by learning online from a data stream of several tasks. As it turns out, neural networks are not equipped to learn continually: they lack the ability to facilitate knowledge transfer and to remember learned skills. Therefore, this thesis is dedicated to developing effective continual learning methods and investigating their broader impacts on other research disciplines.
Towards this end, we have made several contributions to facilitate continual learning research. First, we contributed to the classical continual learning framework by analyzing how Batch Normalization affects different replay strategies. We discovered that, although Batch Normalization facilitates continual learning, it also hinders performance on older tasks. We named this the cross-task normalization phenomenon and conducted a comprehensive analysis to investigate and alleviate its negative effects.
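To make the setting concrete, the sketch below (our own illustration with a toy model and a hypothetical `replay_buffer`, not the thesis code) shows how experience replay mixes current and past examples in one mini-batch, so a Batch Normalization layer computes and updates a single set of statistics across tasks; this is the situation in which cross-task normalization arises.

```python
import torch
import torch.nn as nn

# Toy classifier with a BatchNorm layer shared by all tasks.
model = nn.Sequential(
    nn.Linear(32, 64),
    nn.BatchNorm1d(64),
    nn.ReLU(),
    nn.Linear(64, 10),
)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

replay_buffer = []  # (x, y) pairs kept from earlier tasks

def replay_train_step(x_new, y_new):
    # Mix the current mini-batch with examples replayed from older tasks.
    if replay_buffer:
        idx = torch.randint(len(replay_buffer), (min(16, len(replay_buffer)),))
        x_old = torch.stack([replay_buffer[i][0] for i in idx])
        y_old = torch.stack([replay_buffer[i][1] for i in idx])
        x, y = torch.cat([x_new, x_old]), torch.cat([y_new, y_old])
    else:
        x, y = x_new, y_new
    # BatchNorm1d normalizes this mixed batch with a single set of statistics
    # and updates its running mean/variance from it, so normalization is
    # effectively shared between the current and the replayed tasks.
    loss = criterion(model(x), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```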
Then, we developed a novel fast-and-slow learning framework for continual learning based on the Complementary Learning Systems theory of human learning. In particular, the fast-and-slow learning principle suggests modeling continual learning at two levels: general representation learning and learning of individual experiences. This principle has been our main tool for addressing the challenge of learning new skills while remembering old knowledge in continual learning. We first realized the fast-and-slow learning principle in Contextual Transformation Networks (CTN), an efficient and effective online continual learning algorithm. We then proposed DualNets, which incorporates representation learning into continual learning together with an effective strategy for utilizing general representations for better supervised learning. DualNets not only addresses CTN's limitations but is also applicable to general continual learning settings. Through extensive experiments, we found that DualNets is effective and achieves strong results in several challenging continual learning settings, even in complex scenarios with limited training samples or distribution shifts.
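As a rough illustration of the fast-and-slow principle (a minimal sketch under assumed architectures, not the published CTN or DualNets code), one can pair a slowly updated representation network with a quickly updated task learner. In the thesis the two components serve distinct roles for general representations and individual experiences; here both simply receive supervised gradients at different learning rates to keep the sketch self-contained.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SlowLearner(nn.Module):
    """Representation network, updated slowly to retain general knowledge."""
    def __init__(self, dim_in=32, dim_feat=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(dim_in, 128), nn.ReLU(), nn.Linear(128, dim_feat)
        )
    def forward(self, x):
        return self.encoder(x)

class FastLearner(nn.Module):
    """Task network, updated quickly to fit the current experience."""
    def __init__(self, dim_feat=64, n_classes=10):
        super().__init__()
        self.adapter = nn.Linear(dim_feat, dim_feat)
        self.head = nn.Linear(dim_feat, n_classes)
    def forward(self, features):
        return self.head(F.relu(self.adapter(features)))

slow, fast = SlowLearner(), FastLearner()
slow_opt = torch.optim.SGD(slow.parameters(), lr=1e-3)  # small, conservative steps
fast_opt = torch.optim.SGD(fast.parameters(), lr=1e-1)  # rapid adaptation

def online_step(x, y):
    # The fast learner adapts the slow features to the incoming supervised batch;
    # the slow learner receives only a gentle update, preserving its features.
    logits = fast(slow(x))
    loss = F.cross_entropy(logits, y)
    slow_opt.zero_grad()
    fast_opt.zero_grad()
    loss.backward()
    fast_opt.step()
    slow_opt.step()
    return loss.item()
```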
Furthermore, we went beyond traditional image benchmarks to test the proposed fast-and-slow continual learning framework on the online time series forecasting problem. We proposed Fast and Slow Networks (FSNet), a radical approach to online time series forecasting that formulates it as a continual learning problem. FSNet leverages and improves upon the fast-and-slow learning principle to address two major challenges of time series forecasting: fast adaptation to concept drifts and learning of recurring concepts. In experiments on both real and synthetic datasets, we found that FSNet shows promising capabilities in dealing with concept drifts and recurring patterns.
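The following sketch (an assumed minimal setup, not the published FSNet implementation) illustrates the formulation of online forecasting as continual learning: the model forecasts each incoming value before observing it, then immediately updates on the revealed ground truth, which is what lets it react as concept drifts occur.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

lookback, horizon = 24, 1
model = nn.Sequential(nn.Linear(lookback, 64), nn.ReLU(), nn.Linear(64, horizon))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

def online_forecast(stream):
    """`stream` is an iterable of floats arriving one observation at a time."""
    history, losses = [], []
    for value in stream:
        value = float(value)
        if len(history) >= lookback:
            window = torch.tensor(history[-lookback:])
            pred = model(window)              # forecast before seeing the truth
            target = torch.tensor([value])
            loss = F.mse_loss(pred, target)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()                  # then learn from the revealed value
            losses.append(loss.item())
        history.append(value)
    return losses

# Example: a synthetic sine stream; the loss trace reflects how quickly the model adapts.
errors = online_forecast(torch.sin(torch.linspace(0, 30, 500)).tolist())
```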
Finally, we conclude the dissertation with a summary of our contributions and an outline of potential future directions in continual learning research.