Online Passive Aggressive Active Learning and its Applications
Main Authors: | , , |
---|---|
Format: | text |
Language: | English |
Published: | Institutional Knowledge at Singapore Management University, 2014 |
Subjects: | |
Online Access: | https://ink.library.smu.edu.sg/sis_research/2640 https://ink.library.smu.edu.sg/context/sis_research/article/3640/viewcontent/lu_HOI_14.pdf |
Institution: | Singapore Management University |
Summary: | We investigate online active learning techniques for classification tasks in data stream mining applications. Unlike traditional learning approaches (either batch or online learning), which typically request the class label of every incoming instance, online active learning queries only a subset of informative incoming instances to update the classification model, with the aim of maximizing classification performance while using minimal human labeling effort over the entire online stream data mining task. In this paper, we present a new family of algorithms for online active learning, called Passive-Aggressive Active (PAA) learning algorithms, obtained by adapting the popular Passive-Aggressive algorithms to an online active learning setting. Unlike the conventional Perceptron-based approach, which updates the model only on misclassified instances, the proposed PAA learning algorithms update the classifier on misclassified instances and also exploit correctly classified examples with low prediction confidence. We theoretically analyse the mistake bounds of the proposed algorithms and conduct extensive experiments to examine their empirical performance; the encouraging results show clear advantages of our algorithms over the baselines. |
---|
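
The idea in the summary can be illustrated with a minimal Python sketch. This is not the authors' implementation, and the record does not give the exact rules, so several pieces here are assumptions: the class name `PassiveAggressiveActive`, the parameters `delta` and `C`, the randomized query rule with probability `delta / (delta + |margin|)`, and the capped step size in the style of the PA-I variant of Passive-Aggressive learning. What it shows is the behaviour described above: labels are requested for only some instances, and a queried instance triggers an update whenever its hinge loss is positive, which covers both mistakes and correct predictions with a small margin.

```python
import random


def dot(u, v):
    """Inner product of two equal-length sequences of floats."""
    return sum(a * b for a, b in zip(u, v))


class PassiveAggressiveActive:
    """Illustrative binary linear classifier with labels in {-1, +1}.

    Hypothetical sketch: the query rule and step size are assumptions,
    not details taken from the catalog record.
    """

    def __init__(self, dim, delta=1.0, C=1.0, seed=0):
        self.w = [0.0] * dim            # weight vector
        self.delta = delta              # assumed: larger delta means more queries
        self.C = C                      # assumed: cap on the step size (PA-I style)
        self.rng = random.Random(seed)
        self.queries = 0                # number of labels requested so far

    def query_probability(self, margin):
        # Assumed rule: low-confidence predictions (small |margin|)
        # are more likely to be sent for labeling.
        return self.delta / (self.delta + abs(margin))

    def update(self, x, y):
        # Hinge loss is positive for mistakes AND for correct
        # predictions whose margin is below 1 (low confidence).
        loss = max(0.0, 1.0 - y * dot(self.w, x))
        norm_sq = dot(x, x)
        if loss > 0.0 and norm_sq > 0.0:
            tau = min(self.C, loss / norm_sq)
            self.w = [wi + tau * y * xi for wi, xi in zip(self.w, x)]

    def step(self, x, get_label):
        """Predict for x; call get_label() only if the instance is queried."""
        margin = dot(self.w, x)
        y_hat = 1 if margin >= 0 else -1
        if self.rng.random() < self.query_probability(margin):
            self.queries += 1
            self.update(x, get_label())
        return y_hat
```

Passing the label as a callable (`get_label`) keeps the labeling cost explicit: the oracle is consulted only for queried instances, so `queries` counts the human labeling effort that the summary aims to minimize.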