Deep representation learning for time series forecasting

Time series forecasting has critical applications across business and scientific domains, such as demand forecasting, capacity planning and management, and anomaly detection. Being able to predict the future yields immense value, allowing us to make downstream decisions with more confidence. Deep...

Full description

Saved in:
Bibliographic Details
Main Author: WOO, Gerald
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2024
Subjects:
Online Access:https://ink.library.smu.edu.sg/etd_coll/650
https://ink.library.smu.edu.sg/context/etd_coll/article/1648/viewcontent/GPIS_AY2020_PhD_Woo_Jiale_Gerald.pdf
Institution: Singapore Management University
Language: English
id sg-smu-ink.etd_coll-1648
record_format dspace
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Deep learning
Time series forecasting
Neural network architecture design
Time-index models
Machine learning
Artificial Intelligence and Robotics
spellingShingle Deep learning
Time series forecasting
Neural network architecture design
Time-index models
Machine learning
Artificial Intelligence and Robotics
WOO, Gerald
Deep representation learning for time series forecasting
description Time series forecasting has critical applications across business and scientific domains, such as demand forecasting, capacity planning and management, and anomaly detection. Being able to predict the future yields immense value, allowing us to make downstream decisions with more confidence. Deep learning for time series forecasting is a burgeoning area of research, moving away from the simple linear models found in the classical time series analysis literature towards more expressive, data-hungry neural network architectures. In this thesis, we develop methods leveraging deep representation learning for time series forecasting, from neural network architecture designs which encode inductive biases specific to time series data, to scalable architectures which learn powerful representations from large-scale data. On the one hand, hand-crafting architectures tailored to specific data modalities allows us to learn representations which encode our priors. Such methods are efficient and less prone to overfitting on small to medium data. On the other hand, scalable architectures combined with large-scale data enable us to avoid manually specifying such priors, which could potentially be incorrect. Instead, we learn representations purely from data with models that have strong scaling capabilities. This thesis consists of three broad themes.

In the first part, we explore learning deep representations for time series data with seasonal-trend inductive biases. Neural networks, being highly expressive models, tend to overfit on small to medium scale data. Encoding the inductive biases of seasonality and trend can help retain the expressiveness of these models while avoiding overfitting. We embed these ideas in a two-stage self-supervised learning framework: first learning seasonal-trend disentangled representations with contrastive learning, and subsequently feeding these representations to a downstream predictor.
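As a hedged illustration of the seasonal-trend inductive bias (a classical additive decomposition, not the thesis's contrastive method), a series can be split into a trend component and a repeating seasonal component; the function name and period below are illustrative assumptions:

```python
import numpy as np

def seasonal_trend_decompose(y, period):
    """Split a 1-D series into additive trend and seasonal components:
    y[t] ~ trend[t] + seasonal[t] + residual[t]."""
    # Trend: moving average over one full period, so the cycle averages out.
    kernel = np.ones(period) / period
    trend = np.convolve(y, kernel, mode="same")
    # Seasonality: mean detrended value at each phase of the period,
    # tiled back out to the length of the series.
    detrended = y - trend
    phase_means = np.array([detrended[p::period].mean() for p in range(period)])
    seasonal = np.tile(phase_means, len(y) // period + 1)[: len(y)]
    return trend, seasonal

# Toy series: linear trend plus a period-12 cycle.
t = np.arange(120)
y = 0.05 * t + np.sin(2 * np.pi * t / 12)
trend, seasonal = seasonal_trend_decompose(y, period=12)
```

Encoding such a decomposition into the architecture, rather than computing it with fixed moving averages, is what lets an expressive model keep its capacity while resisting overfitting on small data.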
The second part delves into time-index models and how to adapt them to the deep learning setting. These models are functions of time, rather than functions of historical data. While classical time-index models use a predefined functional form to generate predictions, deep time-index models face the debilitating problem of being unable to extrapolate across the forecast horizon, because they are extremely expressive, with a strong capability to approximate complex functions. To overcome this issue, we introduce a meta-optimization framework which aims to achieve the best of both worlds: retaining the flexibility and expressiveness of the neural network, while learning an inductive bias which guarantees extrapolation capability over unseen time steps.

Finally, the third part of this dissertation pushes towards large-scale pre-training. Existing work in the academic domain relies on time series datasets with at most millions of observations. In comparison, Large Language Models are trained on trillions of tokens, and Large Vision Models on billions of images. We first tackle large-scale pre-training for the cloud operations domain, introducing three large-scale datasets of up to billions of observations, recording various performance metrics for cloud systems. Using these datasets, we perform an empirical study on pre-training scalable Transformer architectures, ultimately introducing a strong candidate architecture for future work on foundation models. Next, we move towards foundation models for time series forecasting, performing large-scale pre-training on time series from diverse domains. Leveraging the insights from the empirical study, we adapt the base architecture to this new setting, tackling the challenges of varying frequencies, numbers of variates, and distributions that stem from the heterogeneity of time series data.
To support this, we also introduce the Large-scale Open Time Series Archive, the largest collection of open time series datasets.
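To make the time-index idea from the second part concrete, here is a hypothetical minimal sketch: a model fit directly on time features, f(t) -> y, which can then be queried at any future time stamp. A least-squares linear fit stands in for the far more expressive neural networks the thesis studies; the feature choices are illustrative assumptions:

```python
import numpy as np

def time_features(t, period=12):
    """Encode a raw time index as (normalized time, sin, cos) features."""
    return np.stack([t / 100.0,
                     np.sin(2 * np.pi * t / period),
                     np.cos(2 * np.pi * t / period)], axis=-1)

# Toy target whose functional form lies in the span of the features.
t_train = np.arange(100)
y_train = 0.03 * t_train + np.sin(2 * np.pi * t_train / 12)

# Fit the time-index model: no lagged inputs, only the time stamp.
X = time_features(t_train)
w, *_ = np.linalg.lstsq(X, y_train, rcond=None)

# Forecast by evaluating the model at future time indices directly.
t_future = np.arange(100, 124)
y_hat = time_features(t_future) @ w
```

With a predefined functional form like this, extrapolation is guaranteed by construction; a free-form neural network enjoys no such guarantee, which is the gap the thesis's meta-optimization framework is designed to close.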
format text
author WOO, Gerald
author_facet WOO, Gerald
author_sort WOO, Gerald
title Deep representation learning for time series forecasting
title_short Deep representation learning for time series forecasting
title_full Deep representation learning for time series forecasting
title_fullStr Deep representation learning for time series forecasting
title_full_unstemmed Deep representation learning for time series forecasting
title_sort deep representation learning for time series forecasting
publisher Institutional Knowledge at Singapore Management University
publishDate 2024
url https://ink.library.smu.edu.sg/etd_coll/650
https://ink.library.smu.edu.sg/context/etd_coll/article/1648/viewcontent/GPIS_AY2020_PhD_Woo_Jiale_Gerald.pdf
_version_ 1827070754825437184
spelling sg-smu-ink.etd_coll-1648 2025-02-13T06:10:12Z Deep representation learning for time series forecasting WOO, Gerald Time series forecasting has critical applications across business and scientific domains, such as demand forecasting, capacity planning and management, and anomaly detection. Being able to predict the future yields immense value, allowing us to make downstream decisions with more confidence. Deep learning for time series forecasting is a burgeoning area of research, moving away from the simple linear models found in the classical time series analysis literature towards more expressive, data-hungry neural network architectures. In this thesis, we develop methods leveraging deep representation learning for time series forecasting, from neural network architecture designs which encode inductive biases specific to time series data, to scalable architectures which learn powerful representations from large-scale data. On the one hand, hand-crafting architectures tailored to specific data modalities allows us to learn representations which encode our priors. Such methods are efficient and less prone to overfitting on small to medium data. On the other hand, scalable architectures combined with large-scale data enable us to avoid manually specifying such priors, which could potentially be incorrect. Instead, we learn representations purely from data with models that have strong scaling capabilities. This thesis consists of three broad themes. In the first part, we explore learning deep representations for time series data with seasonal-trend inductive biases. Neural networks, being highly expressive models, tend to overfit on small to medium scale data. Encoding the inductive biases of seasonality and trend can help retain the expressiveness of these models while avoiding overfitting.
We embed these ideas in a two-stage self-supervised learning framework: first learning seasonal-trend disentangled representations with contrastive learning, and subsequently feeding these representations to a downstream predictor. The second part delves into time-index models and how to adapt them to the deep learning setting. These models are functions of time, rather than functions of historical data. While classical time-index models use a predefined functional form to generate predictions, deep time-index models face the debilitating problem of being unable to extrapolate across the forecast horizon, because they are extremely expressive, with a strong capability to approximate complex functions. To overcome this issue, we introduce a meta-optimization framework which aims to achieve the best of both worlds: retaining the flexibility and expressiveness of the neural network, while learning an inductive bias which guarantees extrapolation capability over unseen time steps. Finally, the third part of this dissertation pushes towards large-scale pre-training. Existing work in the academic domain relies on time series datasets with at most millions of observations. In comparison, Large Language Models are trained on trillions of tokens, and Large Vision Models on billions of images. We first tackle large-scale pre-training for the cloud operations domain, introducing three large-scale datasets of up to billions of observations, recording various performance metrics for cloud systems. Using these datasets, we perform an empirical study on pre-training scalable Transformer architectures, ultimately introducing a strong candidate architecture for future work on foundation models. Next, we move towards foundation models for time series forecasting, performing large-scale pre-training on time series from diverse domains.
Leveraging the insights from the empirical study, we adapt the base architecture to this new setting, tackling the challenges of varying frequencies, numbers of variates, and distributions that stem from the heterogeneity of time series data. To support this, we also introduce the Large-scale Open Time Series Archive, the largest collection of open time series datasets. 2024-08-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/etd_coll/650 https://ink.library.smu.edu.sg/context/etd_coll/article/1648/viewcontent/GPIS_AY2020_PhD_Woo_Jiale_Gerald.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Dissertations and Theses Collection (Open Access) eng Institutional Knowledge at Singapore Management University Deep learning Time series forecasting Neural network architecture design Time-index models Machine learning Artificial Intelligence and Robotics