Data-driven sales/demand forecasting in supply chain 4.0 system

Bibliographic Details
Main Author: Chen, Weizheng
Other Authors: Lihui Chen
Format: Thesis-Master by Coursework
Language: English
Published: Nanyang Technological University 2022
Online Access: https://hdl.handle.net/10356/158888
Institution: Nanyang Technological University
Description
Summary: This dissertation proposes a new deep network structure, the Global-Local Fusion Network, together with two attention mechanisms, Spatial Fusion Attention and Cross Direction Attention, for univariate time-series prediction, particularly demand and sales forecasting. The network runs a convolutional neural network (CNN) and a long short-term memory (LSTM) network with an attention mechanism in parallel: the CNN is designed to capture local features, and the LSTM with attention is designed to capture global features. Their outputs are passed to an LSTM decoder and a Luong attention module, which integrate the information and produce the final output. Spatial Fusion Attention applies one-dimensional convolution filtering along the spatial dimension to the LSTM hidden states to produce extraction vectors. The extraction vectors and the final hidden state are then used to compute scoring values, and a self-attention operation yields the context vector. Cross Direction Attention is similar but uses information from both the spatial and temporal dimensions: convolution filtering along the temporal dimension produces the extraction vectors, while filtering along the spatial dimension followed by a dot product produces the scoring values. The extraction vectors are then multiplied by their corresponding scoring values and combined with the last hidden state to produce the context vector. The proposed model outperforms all candidate models on three datasets: the Orange Juice dataset, which is specific to demand forecasting, and two time-series prediction benchmarks, the Favorita and Electricity datasets. It performs especially well on the Orange Juice dataset, which has the lowest data frequency (weekly), in both long-range and short-range prediction. The better performance on all three datasets, which differ in data density, indicates that the proposed model has high potential for this task.
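
The Spatial Fusion Attention steps described in the summary can be illustrated with a minimal pure-Python sketch. This is one possible reading of the abstract, not the dissertation's implementation: the function names, the zero-padded "same" convolution, the fixed kernel, the dot-product scoring against the final hidden state, and the softmax normalization are all assumptions made for illustration.

```python
import math

def conv1d_same(vec, kernel):
    """Slide a 1-D kernel along one vector with zero padding, keeping its length.
    Assumes an odd kernel length so the window is centered."""
    k = len(kernel)
    pad = k // 2
    padded = [0.0] * pad + list(vec) + [0.0] * pad
    return [sum(kernel[j] * padded[i + j] for j in range(k))
            for i in range(len(vec))]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def spatial_attention_sketch(hidden_states, kernel):
    """hidden_states: T vectors of size d, standing in for LSTM hidden states.
    Returns (attention weights over the T steps, context vector of size d)."""
    final = hidden_states[-1]
    # Assumed reading of "filtering along the spatial dimension":
    # convolve across the feature axis of each hidden state.
    extraction = [conv1d_same(h, kernel) for h in hidden_states]
    # Assumed scoring: dot product of each extraction vector with the final state.
    scores = [sum(e_i * f_i for e_i, f_i in zip(e, final)) for e in extraction]
    weights = softmax(scores)
    # Context vector: attention-weighted sum of the extraction vectors.
    d = len(final)
    context = [sum(w * e[i] for w, e in zip(weights, extraction))
               for i in range(d)]
    return weights, context

# Toy hidden states (T = 3 steps, d = 3 features) and a smoothing kernel.
H = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
weights, context = spatial_attention_sketch(H, [0.25, 0.5, 0.25])
```

In a trained model the kernel and any scoring parameters would be learned rather than fixed, and the context vector would feed the decoder stage described in the summary.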