Second-order online active learning and its applications

The goal of online active learning is to learn predictive models from a sequence of unlabeled data given a limited label query budget. Unlike conventional online learning tasks, online active learning is considerably more challenging for two reasons. Firstly, it is difficult to design an effectiv...

Bibliographic Details
Main Authors: HAO, Shuji; LU, Jing; ZHAO, Peilin; ZHANG, Chi; HOI, Steven C. H.; MIAO, Chunyan
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2017
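Subjects: Algorithm design and analysis; Labeling; Active Learning; Online Learning; Prediction algorithms; Machine learning algorithms; Malicious websites detection; Predictive models; Training; Databases and Information Systems; Numerical Analysis and Scientific Computing; Theory and Algorithms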
Online Access: https://ink.library.smu.edu.sg/sis_research/4132
https://ink.library.smu.edu.sg/context/sis_research/article/5135/viewcontent/Second_order_Online_Active_Learning_and_Its.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-5135
record_format dspace
spelling sg-smu-ink.sis_research-5135 2019-06-04T06:28:01Z 2017-11-01T07:00:00Z text application/pdf info:doi/10.1109/TKDE.2017.2778097 http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Algorithm design and analysis
Labeling
Active Learning
Online Learning
Prediction algorithms
Machine learning algorithms
Malicious websites detection
Predictive models
Training
Databases and Information Systems
Numerical Analysis and Scientific Computing
Theory and Algorithms
description The goal of online active learning is to learn predictive models from a sequence of unlabeled data given a limited label query budget. Unlike conventional online learning tasks, online active learning is considerably more challenging for two reasons. Firstly, it is difficult to design an effective query strategy that decides when it is appropriate to query the label of an incoming instance given a limited query budget. Secondly, it is also challenging to decide how to update the predictive models effectively whenever the true label of an instance is queried. Most existing approaches for online active learning are based on a family of first-order online learning algorithms, which are simple and efficient but suffer from slow convergence and sub-optimal solutions when exploiting the labeled training data. To solve these issues, this paper presents a novel framework of Second-order Online Active Learning (SOAL) that fully exploits both first-order and second-order information. The proposed algorithms are able to achieve effective online learning efficacy, maximize the predictive accuracy and minimize the labeling cost. To make SOAL more practical for real-world applications, especially for class-imbalanced online classification tasks (e.g., malicious web detection), we extend the SOAL framework by proposing the Cost-sensitive Second-order Online Active Learning algorithm named “SOALCS”, which is devised by maximizing the sum of weighted sensitivity and specificity or minimizing the cost of weighted mistakes of different classes. We conducted both theoretical analysis and empirical studies, including an extensive set of experiments on a variety of large-scale real-world datasets, in which the promising empirical results validate the efficacy and scalability of the proposed algorithms towards large-scale online learning tasks.
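Illustrative note: the description names two design decisions (when to query a label, and how to update the model once a label arrives) plus a cost-sensitive measure, but this record does not give the algorithms themselves. The sketch below is therefore not SOAL or SOALCS. It assumes a generic stand-in: a randomized margin-based query rule with probability delta / (delta + |margin|), paired with an AROW-style update that keeps a weight mean (first-order information) and a diagonal covariance (second-order information). The class name, the parameters `r`, `delta` and `eta_p`, and the convention that the two class weights sum to 1 are all hypothetical choices made for illustration.

```python
import numpy as np


class SecondOrderActiveLearner:
    """Generic second-order online active learner (illustrative, not SOAL)."""

    def __init__(self, dim, r=1.0, delta=1.0, seed=0):
        self.w = np.zeros(dim)      # first-order information: weight mean
        self.sigma = np.ones(dim)   # second-order information: diagonal covariance
        self.r = r                  # regularizer of the AROW-style update (assumed)
        self.delta = delta          # larger delta means more label queries (assumed)
        self.rng = np.random.default_rng(seed)
        self.queries = 0            # number of labels requested so far

    def predict(self, x):
        """Return the predicted label in {-1, +1}."""
        return 1 if self.w @ x >= 0 else -1

    def should_query(self, x):
        """Randomized margin rule: uncertain (low-margin) instances are queried more often."""
        margin = abs(self.w @ x)
        return self.rng.random() < self.delta / (self.delta + margin)

    def update(self, x, y):
        """AROW-style update with a diagonal covariance, applied when hinge loss is positive."""
        self.queries += 1
        margin = y * (self.w @ x)
        if margin >= 1.0:
            return
        v = np.sum(self.sigma * x * x)               # confidence term x^T Sigma x
        beta = 1.0 / (v + self.r)
        alpha = (1.0 - margin) * beta
        self.w += alpha * y * self.sigma * x         # move the mean along Sigma x
        self.sigma -= beta * (self.sigma * x) ** 2   # shrink variance on seen directions


def weighted_sensitivity_specificity(y_true, y_pred, eta_p=0.5):
    """Weighted sum eta_p * sensitivity + (1 - eta_p) * specificity for labels in {-1, +1}."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    pos = y_true == 1
    neg = y_true == -1
    sensitivity = np.mean(y_pred[pos] == 1) if pos.any() else 0.0
    specificity = np.mean(y_pred[neg] == -1) if neg.any() else 0.0
    return eta_p * sensitivity + (1.0 - eta_p) * specificity


def run_stream(learner, X, y):
    """Process a stream once: predict, decide whether to query, update only on queried labels."""
    predictions = []
    for x_t, y_t in zip(X, y):
        predictions.append(learner.predict(x_t))
        if learner.should_query(x_t):
            learner.update(x_t, y_t)
    return np.array(predictions)
```

In this sketch the label `y_t` is read only inside the `should_query` branch, so `learner.queries` is the labeling cost, and raising `eta_p` above 0.5 weights the positive (for example, malicious) class more heavily in the evaluation.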
format text
author HAO, Shuji
LU, Jing
ZHAO, Peilin
ZHANG, Chi
HOI, Steven C. H.
MIAO, Chunyan
author_sort HAO, Shuji
title Second-order online active learning and its applications
title_sort second-order online active learning and its applications
publisher Institutional Knowledge at Singapore Management University
publishDate 2017
url https://ink.library.smu.edu.sg/sis_research/4132
https://ink.library.smu.edu.sg/context/sis_research/article/5135/viewcontent/Second_order_Online_Active_Learning_and_Its.pdf
_version_ 1770574346472914944