Second-order online active learning and its applications

The goal of online active learning is to learn predictive models from a sequence of unlabeled data given a limited label query budget. Unlike conventional online learning tasks, online active learning is considerably more challenging for two reasons. First, it is difficult to design an effectiv...

Bibliographic Details
Main Authors: Hao, Shuji, Lu, Jing, Zhao, Peilin, Zhang, Chi, Hoi, Steven C. H., Miao, Chunyan
Other Authors: School of Computer Science and Engineering
Format: Article
Language: English
Published: 2020
Subjects: Engineering::Computer science and engineering; Online Learning; Active Learning
Online Access: https://hdl.handle.net/10356/140784
Institution: Nanyang Technological University
id sg-ntu-dr.10356-140784
record_format dspace
spelling sg-ntu-dr.10356-140784 2020-06-02T03:28:33Z
Second-order online active learning and its applications
Hao, Shuji; Lu, Jing; Zhao, Peilin; Zhang, Chi; Hoi, Steven C. H.; Miao, Chunyan
School of Computer Science and Engineering; Interdisciplinary Graduate School (IGS)
Engineering::Computer science and engineering; Online Learning; Active Learning
NRF (Natl Research Foundation, S’pore)
2020-06-02T03:28:33Z 2020-06-02T03:28:33Z 2017
Journal Article
Hao, S., Lu, J., Zhao, P., Zhang, C., Hoi, S. C. H., & Miao, C. (2018). Second-order online active learning and its applications. IEEE Transactions on Knowledge and Data Engineering, 30(7), 1338-1351. doi:10.1109/tkde.2017.2778097
1041-4347
https://hdl.handle.net/10356/140784
10.1109/TKDE.2017.2778097
2-s2.0-85037623896
7 30 1338 1351
en
IEEE Transactions on Knowledge and Data Engineering
© 2017 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The published version is available at: https://doi.org/10.1109/TKDE.2017.2778097
institution Nanyang Technological University
building NTU Library
country Singapore
collection DR-NTU
language English
topic Engineering::Computer science and engineering
Online Learning
Active Learning
description The goal of online active learning is to learn predictive models from a sequence of unlabeled data given a limited label query budget. Unlike conventional online learning tasks, online active learning is considerably more challenging for two reasons. First, it is difficult to design an effective query strategy that decides when it is appropriate to query the label of an incoming instance given the limited query budget. Second, it is also challenging to decide how to update the predictive models effectively whenever the true label of an instance is queried. Most existing approaches to online active learning are based on a family of first-order online learning algorithms, which are simple and efficient but suffer from slow convergence and make sub-optimal use of the labeled training data. To solve these issues, this paper presents a novel framework of Second-order Online Active Learning (SOAL) that fully exploits both first-order and second-order information. The proposed algorithms achieve effective online learning, maximizing predictive accuracy while minimizing labeling cost. To make SOAL more practical for real-world applications, especially for class-imbalanced online classification tasks (e.g., malicious web detection), we extend the SOAL framework by proposing a Cost-sensitive Second-order Online Active Learning algorithm named 'SOALCS', which is devised by maximizing the sum of weighted sensitivity and specificity or minimizing the cost of weighted mistakes of different classes. We conducted both theoretical analysis and empirical studies, including an extensive set of experiments on a variety of large-scale real-world datasets; the promising empirical results validate the efficacy and scalability of the proposed algorithms for large-scale online learning tasks.
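The two decisions named in the abstract (when to query a label, and how to update the model once a label arrives) can be illustrated with a small sketch. This is not the paper's SOAL algorithm, whose exact query rule and update are not given in this record. It swaps in two well-known stand-ins: a randomized margin-based query rule, and a diagonal AROW-style second-order update. The class name, the parameters `r` and `delta`, and the helper `weighted_sum` are all hypothetical names chosen for illustration.

```python
import random


class DiagonalSecondOrderLearner:
    """Illustrative online binary classifier (NOT the paper's SOAL).

    Keeps a weight vector (first-order information) and a diagonal
    covariance (second-order information), and asks for labels at random
    with a probability that grows as the margin shrinks.
    """

    def __init__(self, dim, r=1.0, delta=1.0, seed=0):
        self.w = [0.0] * dim        # weight vector
        self.sigma = [1.0] * dim    # diagonal covariance, one entry per feature
        self.r = r                  # assumed regularization of the update
        self.delta = delta          # assumed knob: larger means more queries
        self.rng = random.Random(seed)
        self.queries = 0            # number of labels requested so far

    def margin(self, x):
        return sum(wi * xi for wi, xi in zip(self.w, x))

    def should_query(self, x):
        # Small |margin| means an uncertain prediction, so the label is
        # requested with higher probability: delta / (delta + |margin|).
        p = self.delta / (self.delta + abs(self.margin(x)))
        return self.rng.random() < p

    def update(self, x, y):
        # y is +1 or -1; update only when the hinge loss is positive.
        loss = max(0.0, 1.0 - y * self.margin(x))
        if loss == 0.0:
            return
        sx = [s * xi for s, xi in zip(self.sigma, x)]   # Sigma x (diagonal)
        v = sum(xi * si for xi, si in zip(x, sx))       # confidence x^T Sigma x
        beta = 1.0 / (v + self.r)
        alpha = loss * beta
        # Step size is scaled per feature by the current covariance.
        self.w = [wi + alpha * y * si for wi, si in zip(self.w, sx)]
        # Shrink the variance of the features that were just observed.
        self.sigma = [s - beta * si * si for s, si in zip(self.sigma, sx)]

    def step(self, x, oracle):
        """Predict first; query the label and update only if the rule fires."""
        pred = 1 if self.margin(x) >= 0 else -1
        if self.should_query(x):
            self.queries += 1
            self.update(x, oracle(x))
        return pred


def weighted_sum(tp, fn, tn, fp, eta_p=0.5):
    """Weighted sum of sensitivity and specificity, with eta_n = 1 - eta_p."""
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return eta_p * sensitivity + (1.0 - eta_p) * specificity
```

A caller would feed instances one at a time through `step`, passing a function that returns the true label only when asked; `queries` then reports how much of the label budget was spent. For class-imbalanced streams, `weighted_sum` with `eta_p` above 0.5 rewards correct detection of the rare positive class more than plain accuracy does, which is the kind of measure the cost-sensitive extension is described as optimizing.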
author2 School of Computer Science and Engineering
format Article
author Hao, Shuji
Lu, Jing
Zhao, Peilin
Zhang, Chi
Hoi, Steven C. H.
Miao, Chunyan
author_sort Hao, Shuji
title Second-order online active learning and its applications
publishDate 2020
url https://hdl.handle.net/10356/140784
_version_ 1681058846121918464