Design continual learning algorithms for time series data
Main Author:
Other Authors:
Format: Thesis-Master by Coursework
Language: English
Published: Nanyang Technological University, 2024
Subjects:
Online Access: https://hdl.handle.net/10356/181857
Institution: Nanyang Technological University
Summary: In recent years, significant achievements have been made in machine learning methods across various domains, including image classification, clustering, object detection, and product recommendation. However, traditional machine learning models typically cannot be updated after training, leading to the phenomenon known as "catastrophic forgetting" in dynamic environments where data streams change continuously: classification performance declines sharply as the model forgets the features of old data while learning new information. To address this challenge, this dissertation tests and validates several continual learning algorithms for time series data, enabling models to retain recognition capabilities for old data while learning new information.
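To make the class-incremental setting concrete, here is a minimal sketch (not from the dissertation; the function name and the two-classes-per-task split are illustrative assumptions) of how a labeled time series dataset can be partitioned into a stream of tasks, each introducing previously unseen classes:

```python
import numpy as np

def make_cil_tasks(X, y, classes_per_task=2):
    """Split (X, y) into a class-incremental stream of tasks.

    Each task contains only the samples of a disjoint group of classes,
    so a model trained sequentially on the tasks never revisits old data.
    """
    classes = np.unique(y)
    tasks = []
    for i in range(0, len(classes), classes_per_task):
        cls = classes[i:i + classes_per_task]
        idx = np.isin(y, cls)
        tasks.append((X[idx], y[idx]))
    return tasks
```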
This study systematically classifies and analyzes existing deep learning-based class-incremental learning (CIL) algorithms at three levels: data, parameter, and algorithm. Three representative algorithms—Learning without Forgetting (LwF), Memory Aware Synapses (MAS), and Soft Dynamic Time Warping (SDTW)—are selected for experimentation on time series datasets involving human activity and smart device sensor data. Experimental results demonstrate that MAS and SDTW excel in handling large-scale and multi-task incremental learning scenarios, effectively preserving knowledge from previous tasks. In contrast, the improved iCaRL algorithm based on LwF performs well on smaller tasks but struggles as the number of tasks increases.
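As a rough illustration of the regularization-based families named above, the following PyTorch sketch shows the standard formulations of the LwF and MAS loss terms (an assumption based on the original papers, not code from the dissertation; `T`, `lam`, and the assumption that old classes occupy the first output units are illustrative):

```python
import torch
import torch.nn.functional as F

def lwf_loss(new_logits, old_logits_target, labels, T=2.0, lam=1.0):
    """LwF: cross entropy on new labels + distillation toward the old model."""
    ce = F.cross_entropy(new_logits, labels)
    n_old = old_logits_target.size(1)  # assumes old classes come first
    p_old = F.softmax(old_logits_target / T, dim=1)
    log_p_new = F.log_softmax(new_logits[:, :n_old] / T, dim=1)
    kd = F.kl_div(log_p_new, p_old, reduction="batchmean") * (T * T)
    return ce + lam * kd

def mas_importance(model, loader, device="cpu"):
    """MAS: Omega_i = mean over x of |d ||f(x)||^2 / d theta_i|."""
    omega = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    n_batches = 0
    for x, _ in loader:  # labels are not needed for the importance estimate
        model.zero_grad()
        out = model(x.to(device))
        out.pow(2).sum().backward()  # squared L2 norm of the output
        for n, p in model.named_parameters():
            if p.grad is not None:
                omega[n] += p.grad.abs()
        n_batches += 1
    return {n: w / max(n_batches, 1) for n, w in omega.items()}

def mas_penalty(model, omega, old_params, lam=1.0):
    """Penalize drift of important parameters away from their old values."""
    loss = 0.0
    for n, p in model.named_parameters():
        loss = loss + (omega[n] * (p - old_params[n]).pow(2)).sum()
    return lam * loss
```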
Additionally, the effectiveness of Batch Normalization (BN) and Layer Normalization (LN) in time series CIL tasks is explored. Results indicate that LN performs better in multi-task and complex data distribution scenarios, reducing forgetting and improving model generalization, while BN is more suitable for simpler tasks with stable feature distributions. The influence of different loss functions, including Cross Entropy (CE) and Binary Cross Entropy (BCE), on model performance is also analyzed. Findings reveal that BCE better handles class imbalance, particularly when used with regularization-based algorithms, significantly improving model performance. In addition, each algorithm is evaluated both with the parameter combinations reported in its original paper and with optimized combinations found using Ray Tune.
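A minimal sketch of the CE-versus-BCE distinction in PyTorch (standard formulations, not the dissertation's code): with BCE, each class becomes an independent sigmoid decision over a one-hot target, which softens the direct competition between old and new classes under imbalance.

```python
import torch.nn.functional as F

def ce_loss(logits, labels):
    # softmax over all classes: old and new classes compete directly
    return F.cross_entropy(logits, labels)

def bce_loss(logits, labels, num_classes):
    # per-class sigmoid + binary cross entropy (iCaRL-style head):
    # each class is scored as an independent binary decision
    targets = F.one_hot(labels, num_classes).float()
    return F.binary_cross_entropy_with_logits(logits, targets)
```

And a hypothetical Ray Tune search over learning rate and regularization strength, assuming a `train_and_eval` function that runs one full continual-learning experiment and returns an average accuracy (both names and the search ranges are placeholders, not the dissertation's setup):

```python
from ray import tune

def objective(config):
    # train_and_eval is a hypothetical stand-in for one full CIL run
    acc = train_and_eval(lr=config["lr"], reg_lambda=config["reg_lambda"])
    return {"avg_accuracy": acc}

tuner = tune.Tuner(
    objective,
    param_space={
        "lr": tune.loguniform(1e-4, 1e-1),
        "reg_lambda": tune.uniform(0.1, 10.0),
    },
    tune_config=tune.TuneConfig(metric="avg_accuracy", mode="max", num_samples=20),
)
best = tuner.fit().get_best_result()
```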
Overall, this dissertation validates the effectiveness of several incremental learning algorithms on time series datasets and proposes strategies for selecting algorithms and normalization methods in practical scenarios, providing valuable insights into addressing incremental learning challenges. However, due to hardware and computational limitations, advanced model architectures and larger-scale datasets were not investigated, and potential improvements such as integrating Generative Adversarial Networks, experience replay, or generative replay techniques remain unexplored. Future research should delve further into deep learning methods for time series to broaden perspectives and deepen understanding in this field.