Data-driven sales/demand forecasting in supply chain 4.0 system
Main Author: | |
---|---|
Other Authors: | |
Format: | Thesis-Master by Coursework |
Language: | English |
Published: | Nanyang Technological University, 2022 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/158888 |
Institution: | Nanyang Technological University |
Summary: | This dissertation proposes a new deep network structure, the Global-Local Fusion Network,
together with two attention mechanisms, Spatial Fusion Attention and Cross Direction
Attention, for univariate time-series prediction, in particular for demand
forecasting and sales forecasting problems.
The proposed network runs a convolutional neural network (CNN) and a long short-term memory (LSTM) network with an attention mechanism in parallel: the former is
designed to capture local features, while the latter is designed to capture global features. Their
outputs are then sent to an LSTM decoder and a Luong Attention Module, which
integrate them to yield the final output.
Of the two proposed attention mechanisms, Spatial Fusion Attention applies one-dimensional convolution filtering along the spatial dimension to produce extraction
vectors from the hidden states of the LSTM. The extraction vectors and the final hidden state
are then used to compute scoring values, and the context vector is obtained through a self-attention
operation. Cross Direction Attention is similar to Spatial Fusion Attention, but it uses
information from both the spatial and the temporal dimensions: convolution
filtering along the temporal dimension produces the extraction vectors, while filtering
along the spatial dimension followed by a dot product produces the scoring values. Each
extraction vector is then multiplied by its corresponding scoring value. Finally, the results are combined
with the last hidden state to produce the context vector.
The proposed model outperforms all candidate models on three datasets: the Orange Juice
dataset, which is specific to demand forecasting, and two benchmark datasets for time-series
prediction, the Favorita dataset and the Electricity dataset. It works especially well on
the Orange Juice dataset, which has the lowest data frequency (weekly), in both long-range
and short-range prediction. The better performance on all three datasets, which differ in
data density, indicates that the proposed model has high potential for this task. |
---|
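
The summary describes Spatial Fusion Attention only in words, so the following is a minimal sketch of one possible reading, not the dissertation's implementation. It assumes that "filtering along the spatial dimension" means a zero-padded 1-D filter over the feature axis of each LSTM hidden state, that scoring is a plain dot product with the final hidden state (Luong-style), and that the context vector is the softmax-weighted sum of the extraction vectors. The names `conv1d_same`, `attend`, and `spatial_conv_attention`, as well as the kernel values, are hypothetical.

```python
import math


def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    # Inner product of two equal-length vectors.
    return sum(a * b for a, b in zip(u, v))


def conv1d_same(vec, kernel):
    # Zero-padded 1-D filter (cross-correlation) that keeps the input length.
    # Assumption: the kernel has odd length.
    half = len(kernel) // 2
    out = []
    for i in range(len(vec)):
        acc = 0.0
        for j, w in enumerate(kernel):
            k = i + j - half
            if 0 <= k < len(vec):
                acc += w * vec[k]
        out.append(acc)
    return out


def attend(keys, query):
    # Dot-product scoring, softmax weights, then a weighted sum of the keys.
    weights = softmax([dot(k, query) for k in keys])
    dim = len(keys[0])
    context = [sum(w * k[d] for w, k in zip(weights, keys)) for d in range(dim)]
    return context, weights


def spatial_conv_attention(hidden_states, kernel):
    # Hypothetical reading of the summary: filter every hidden state along its
    # feature axis to get "extraction vectors", then score them against the
    # final hidden state and build the context vector from them.
    extraction = [conv1d_same(h, kernel) for h in hidden_states]
    return attend(extraction, hidden_states[-1])


# Toy input: 3 time steps, hidden size 4 (made-up values).
hidden = [[0.1, 0.2, 0.3, 0.4], [0.0, 0.5, 0.1, 0.2], [0.3, 0.3, 0.3, 0.3]]
context, weights = spatial_conv_attention(hidden, [0.25, 0.5, 0.25])
```

With the identity kernel `[1.0]` the sketch reduces to ordinary dot-product attention over the raw hidden states, which is a convenient sanity check. Under the same assumptions, a Cross Direction variant would apply the filter across time steps rather than across features when building the extraction vectors.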