DisMASTD: An efficient distributed multi-aspect streaming tensor decomposition

Bibliographic Details
Main Authors: YANG, Keyu, GAO, Yunjun, SHEN, Yifeng, ZHENG, Baihua, CHEN, Lu
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2021
Online Access: https://ink.library.smu.edu.sg/sis_research/6124
https://ink.library.smu.edu.sg/context/sis_research/article/7127/viewcontent/DisMASTD_ICDE21_CR.pdf
Institution: Singapore Management University
Description
Summary: Tensor decomposition is a fundamental multidimensional data analysis tool for many data-driven applications, such as social computing, computer vision, and bioinformatics. However, rapidly growing streaming data poses new challenges for traditional static tensor decomposition: handling it calls for an efficient, distributed, dynamic tensor decomposition that does not recompute the decomposition of the whole tensor from scratch. In this paper, we propose DisMASTD, an efficient distributed multi-aspect streaming tensor decomposition. First, we prove that the optimal tensor partitioning problem is NP-hard. Second, we present two heuristic tensor partitioning approaches to ensure load balancing. Third, we develop a distributed multi-aspect streaming tensor decomposition computation method, which avoids repetitive computation and reduces network communication by maintaining and reusing intermediate results. Finally, we perform extensive experiments on both real and synthetic datasets to demonstrate the efficiency and scalability of DisMASTD.