A unified framework for sparse online learning

The amount of data in our society has been exploding in the era of big data. This article aims to address several open challenges in big data stream classification. Many existing studies in data mining literature follow the batch learning setting, which suffers from low efficiency and poor scalabili...

Full description

Saved in:
Bibliographic Details
Main Authors: ZHAO, Peilin, WONG, Dayong, WU, Pengcheng, HOI, Steven C. H.
Format: text
Language:English
Published: Institutional Knowledge at Singapore Management University 2020
Subjects:
Online Access:https://ink.library.smu.edu.sg/sis_research/5957
https://ink.library.smu.edu.sg/context/sis_research/article/6960/viewcontent/UnitedFrameworkSOL_2020_av.pdf
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Singapore Management University
Language: English
Description
Summary:The amount of data in our society has been exploding in the era of big data. This article aims to address several open challenges in big data stream classification. Many existing studies in data mining literature follow the batch learning setting, which suffers from low efficiency and poor scalability. To tackle these challenges, we investigate a unified online learning framework for the big data stream classification task. Different from the existing online data stream classification techniques, we propose a unified Sparse Online Classification (SOC) framework. Based on SOC, we derive a second-order online learning algorithm and a cost-sensitive sparse online learning algorithm, which could successfully handle online anomaly detection tasks with the extremely unbalanced class distribution. As the performance evaluation, we analyze the theoretical bounds of the proposed algorithms and conduct an extensive set of experiments. The encouraging experimental results demonstrate the efficacy of the proposed algorithms over the state-of-the-art techniques on multiple data stream classification tasks.