Deep representation learning for time series forecasting

Time series forecasting has critical applications across business and scientific domains, such as demand forecasting, capacity planning and management, and anomaly detection. Being able to predict the future yields immense value, allowing us to make downstream decisions with more confidence. Deep...

Full description

Saved in:
Bibliographic Details
Main Author: WOO, Gerald
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2024
Subjects:
Online Access:https://ink.library.smu.edu.sg/etd_coll/650
https://ink.library.smu.edu.sg/context/etd_coll/article/1648/viewcontent/GPIS_AY2020_PhD_Woo_Jiale_Gerald.pdf
Institution: Singapore Management University
Language: English
id sg-smu-ink.etd_coll-1648
record_format dspace
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Deep learning
Time series forecasting
Neural network architecture design
Time-index models
Machine learning
Artificial Intelligence and Robotics
spellingShingle Deep learning
Time series forecasting
Neural network architecture design
Time-index models
Machine learning
Artificial Intelligence and Robotics
WOO, Gerald
Deep representation learning for time series forecasting
description Time series forecasting has critical applications across business and scientific domains, such as demand forecasting, capacity planning and management, and anomaly detection. Being able to predict the future yields immense value, allowing us to make downstream decisions with more confidence. Deep learning for time series forecasting is a burgeoning area of research, moving away from the simple linear models found in the classical time series analysis literature towards more expressive, data-hungry neural network architectures. In this thesis, we develop methods leveraging deep representation learning for time series forecasting, from neural network architecture designs which encode inductive biases specific to time series data, to scalable architectures which learn powerful representations from large-scale data. On the one hand, hand-crafting architectures tailored to specific data modalities allows us to learn representations which encode our priors. Such methods are efficient and less prone to overfitting on small to medium data. On the other hand, scalable architectures combined with large-scale data enable us to avoid manually specifying such priors, which could potentially be incorrect. Instead, we learn representations purely from data with models that have strong scaling capabilities. This thesis consists of three broad themes.

In the first part, we explore learning deep representations for time series data with seasonal-trend inductive biases. Neural networks, being highly expressive models, tend to overfit on small to medium scale data. Encoding the inductive biases of seasonality and trend can help retain the expressiveness of these models while avoiding overfitting. We embed these ideas in a two-stage self-supervised learning framework: first learning seasonal-trend disentangled representations with contrastive learning, and subsequently feeding these representations to a downstream predictor.
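As a hedged illustration of the seasonal-trend inductive bias (a classical additive decomposition, not the thesis's contrastive method), a series can be split into a trend component and a repeating seasonal component; the function name and period below are illustrative assumptions:

```python
import numpy as np

def seasonal_trend_decompose(y, period):
    """Split a 1-D series into additive trend and seasonal components:
    y[t] ~ trend[t] + seasonal[t] + residual[t]."""
    # Trend: moving average over one full period, so the cycle averages out.
    kernel = np.ones(period) / period
    trend = np.convolve(y, kernel, mode="same")
    # Seasonality: mean detrended value at each phase of the period,
    # tiled back out to the length of the series.
    detrended = y - trend
    phase_means = np.array([detrended[p::period].mean() for p in range(period)])
    seasonal = np.tile(phase_means, len(y) // period + 1)[: len(y)]
    return trend, seasonal

# Toy series: linear trend plus a period-12 cycle.
t = np.arange(120)
y = 0.05 * t + np.sin(2 * np.pi * t / 12)
trend, seasonal = seasonal_trend_decompose(y, period=12)
```

Encoding such a decomposition into the architecture, rather than computing it with fixed moving averages, is what lets an expressive model keep its capacity while resisting overfitting on small data.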
The second part delves into time-index models and how to adapt them to the deep learning setting. These models are functions of time, rather than functions of historical data. While classical time-index models use a predefined functional form to generate predictions, deep time-index models face the debilitating problem of being unable to extrapolate across the forecast horizon, because they are extremely expressive, with a strong capability to approximate complex functions. To overcome this issue, we introduce a meta-optimization framework which aims to achieve the best of both worlds: retaining the flexibility and expressiveness of the neural network, while learning an inductive bias which guarantees extrapolation capability over unseen time steps.

Finally, the third part of this dissertation pushes towards large-scale pre-training. Existing work in the academic domain relies on time series datasets with at most millions of observations. In comparison, Large Language Models are trained on trillions of tokens, and Large Vision Models on billions of images. We first tackle large-scale pre-training for the cloud operations domain, introducing three large-scale datasets of up to billions of observations, recording various performance metrics for cloud systems. Using these datasets, we perform an empirical study on pre-training scalable Transformer architectures, ultimately introducing a strong candidate architecture for future work on foundation models. Next, we move towards foundation models for time series forecasting, performing large-scale pre-training on time series from diverse domains. Leveraging the insights from the empirical study, we adapt the base architecture to this new setting, tackling the challenges of varying frequencies, numbers of variates, and distributions that stem from the heterogeneity of time series data.
To support this, we also introduce the Large-scale Open Time Series Archive, the largest collection of open time series datasets.
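To make the time-index idea from the second part concrete, here is a hypothetical minimal sketch: a model fit directly on time features, f(t) -> y, which can then be queried at any future time stamp. A least-squares linear fit stands in for the far more expressive neural networks the thesis studies; the feature choices are illustrative assumptions:

```python
import numpy as np

def time_features(t, period=12):
    """Encode a raw time index as (normalized time, sin, cos) features."""
    return np.stack([t / 100.0,
                     np.sin(2 * np.pi * t / period),
                     np.cos(2 * np.pi * t / period)], axis=-1)

# Toy target whose functional form lies in the span of the features.
t_train = np.arange(100)
y_train = 0.03 * t_train + np.sin(2 * np.pi * t_train / 12)

# Fit the time-index model: no lagged inputs, only the time stamp.
X = time_features(t_train)
w, *_ = np.linalg.lstsq(X, y_train, rcond=None)

# Forecast by evaluating the model at future time indices directly.
t_future = np.arange(100, 124)
y_hat = time_features(t_future) @ w
```

With a predefined functional form like this, extrapolation is guaranteed by construction; a free-form neural network enjoys no such guarantee, which is the gap the thesis's meta-optimization framework is designed to close.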
format text
author WOO, Gerald
author_facet WOO, Gerald
author_sort WOO, Gerald
title Deep representation learning for time series forecasting
title_short Deep representation learning for time series forecasting
title_full Deep representation learning for time series forecasting
title_fullStr Deep representation learning for time series forecasting
title_full_unstemmed Deep representation learning for time series forecasting
title_sort deep representation learning for time series forecasting
publisher Institutional Knowledge at Singapore Management University
publishDate 2024
url https://ink.library.smu.edu.sg/etd_coll/650
https://ink.library.smu.edu.sg/context/etd_coll/article/1648/viewcontent/GPIS_AY2020_PhD_Woo_Jiale_Gerald.pdf
_version_ 1827070754825437184
spelling sg-smu-ink.etd_coll-1648 2025-02-13T06:10:12Z Deep representation learning for time series forecasting WOO, Gerald Time series forecasting has critical applications across business and scientific domains, such as demand forecasting, capacity planning and management, and anomaly detection. Being able to predict the future yields immense value, allowing us to make downstream decisions with more confidence. Deep learning for time series forecasting is a burgeoning area of research, moving away from the simple linear models found in the classical time series analysis literature towards more expressive, data-hungry neural network architectures. In this thesis, we develop methods leveraging deep representation learning for time series forecasting, from neural network architecture designs which encode inductive biases specific to time series data, to scalable architectures which learn powerful representations from large-scale data. On the one hand, hand-crafting architectures tailored to specific data modalities allows us to learn representations which encode our priors. Such methods are efficient and less prone to overfitting on small to medium data. On the other hand, scalable architectures combined with large-scale data enable us to avoid manually specifying such priors, which could potentially be incorrect. Instead, we learn representations purely from data with models that have strong scaling capabilities. This thesis consists of three broad themes. In the first part, we explore learning deep representations for time series data with seasonal-trend inductive biases. Neural networks, being highly expressive models, tend to overfit on small to medium scale data. Encoding the inductive biases of seasonality and trend can help retain the expressiveness of these models while avoiding overfitting.
We embed these ideas in a two-stage self-supervised learning framework: first learning seasonal-trend disentangled representations with contrastive learning, and subsequently feeding these representations to a downstream predictor. The second part delves into time-index models and how to adapt them to the deep learning setting. These models are functions of time, rather than functions of historical data. While classical time-index models use a predefined functional form to generate predictions, deep time-index models face the debilitating problem of being unable to extrapolate across the forecast horizon, because they are extremely expressive, with a strong capability to approximate complex functions. To overcome this issue, we introduce a meta-optimization framework which aims to achieve the best of both worlds: retaining the flexibility and expressiveness of the neural network, while learning an inductive bias which guarantees extrapolation capability over unseen time steps. Finally, the third part of this dissertation pushes towards large-scale pre-training. Existing work in the academic domain relies on time series datasets with at most millions of observations. In comparison, Large Language Models are trained on trillions of tokens, and Large Vision Models on billions of images. We first tackle large-scale pre-training for the cloud operations domain, introducing three large-scale datasets of up to billions of observations, recording various performance metrics for cloud systems. Using these datasets, we perform an empirical study on pre-training scalable Transformer architectures, ultimately introducing a strong candidate architecture for future work on foundation models. Next, we move towards foundation models for time series forecasting, performing large-scale pre-training on time series from diverse domains.
Leveraging the insights from the empirical study, we adapt the base architecture to this new setting, tackling the challenges of varying frequencies, numbers of variates, and distributions that stem from the heterogeneity of time series data. To support this, we also introduce the Large-scale Open Time Series Archive, the largest collection of open time series datasets. 2024-08-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/etd_coll/650 https://ink.library.smu.edu.sg/context/etd_coll/article/1648/viewcontent/GPIS_AY2020_PhD_Woo_Jiale_Gerald.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Dissertations and Theses Collection (Open Access) eng Institutional Knowledge at Singapore Management University Deep learning Time series forecasting Neural network architecture design Time-index models Machine learning Artificial Intelligence and Robotics