Soft confidence-weighted learning
Main Authors: | , , |
---|---|
Format: | text |
Language: | English |
Published: | Institutional Knowledge at Singapore Management University, 2016 |
Online Access: | https://ink.library.smu.edu.sg/sis_research/3418 https://ink.library.smu.edu.sg/context/sis_research/article/4419/viewcontent/SoftConfidenceWeightedLearning_2016.pdf |
Institution: | Singapore Management University |
Summary: | Online learning plays an important role in many big data mining problems because of its high efficiency and scalability. In the literature, many online learning algorithms using gradient information have been applied to solve online classification problems. Recently, more effective second-order algorithms have been proposed, where the correlation between the features is utilized to improve the learning efficiency. Among them, Confidence-Weighted (CW) learning algorithms are very effective, which assume that the classification model is drawn from a Gaussian distribution, which enables the model to be effectively updated with the second-order information of the data stream. Despite being studied actively, these CW algorithms cannot handle nonseparable datasets and noisy datasets very well. In this article, we propose a family of Soft Confidence-Weighted (SCW) learning algorithms for both binary classification and multiclass classification tasks, which is the first family of online classification algorithms that enjoys four salient properties simultaneously: (1) large margin training, (2) confidence weighting, (3) capability to handle nonseparable data, and (4) adaptive margin. Our experimental results show that the proposed SCW algorithms significantly outperform the original CW algorithm. When comparing with a variety of state-of-the-art algorithms (including AROW, NAROW, and NHERD), we found that SCW in general achieves better or at least comparable predictive performance, but enjoys considerably better efficiency advantage (i.e., using a smaller number of updates and lower time cost). To facilitate future research, we release all the datasets and source code to the public at http://libol.stevenhoi.org/. |
---|
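
The summary describes a model drawn from a Gaussian distribution over weight vectors and updated with second-order information. A minimal Python sketch of one binary variant (usually called SCW-I) may make that concrete. The closed-form step sizes below are not stated in this record; they follow the SCW-I update as it is commonly given, so treat them as an assumption to check against the paper or the released source code. The class name `SCW1`, the parameters `eta` and `C`, the helper `variance`, and the toy data stream are hypothetical and chosen only for illustration.

```python
import math
from statistics import NormalDist


class SCW1:
    """Hypothetical minimal binary SCW-I learner (full covariance, pure Python)."""

    def __init__(self, dim, eta=0.9, C=1.0):
        # phi is the inverse standard-normal CDF at confidence level eta (eta > 0.5).
        self.phi = NormalDist().inv_cdf(eta)
        self.psi = 1.0 + self.phi ** 2 / 2.0
        self.zeta = 1.0 + self.phi ** 2
        self.C = C  # cap on the step size (assumed role: tolerate noisy, nonseparable data)
        self.mu = [0.0] * dim  # mean of the Gaussian over weight vectors
        # Covariance of the Gaussian, initialised to the identity matrix.
        self.sigma = [[float(i == j) for j in range(dim)] for i in range(dim)]

    def _sigma_x(self, x):
        # Matrix-vector product Sigma x.
        return [sum(s * xj for s, xj in zip(row, x)) for row in self.sigma]

    def variance(self, x):
        """Uncertainty term x^T Sigma x for input x."""
        return sum(xi * si for xi, si in zip(x, self._sigma_x(x)))

    def predict(self, x):
        return 1 if sum(m * xi for m, xi in zip(self.mu, x)) >= 0 else -1

    def update(self, x, y):
        """Process one example with label y in {-1, +1}; return True if the model changed."""
        sx = self._sigma_x(x)
        v = sum(xi * si for xi, si in zip(x, sx))
        m = y * sum(mi * xi for mi, xi in zip(self.mu, x))
        # Adaptive margin: the required margin grows with the uncertainty sqrt(v).
        if self.phi * math.sqrt(v) - m <= 0:
            return False
        phi, psi, zeta = self.phi, self.psi, self.zeta
        alpha = (-m * psi + math.sqrt(m * m * phi ** 4 / 4.0 + v * phi * phi * zeta)) / (v * zeta)
        alpha = min(self.C, max(0.0, alpha))
        u = 0.25 * (-alpha * v * phi + math.sqrt(alpha ** 2 * v ** 2 * phi ** 2 + 4.0 * v)) ** 2
        beta = alpha * phi / (math.sqrt(u) + v * alpha * phi)
        # Move the mean along Sigma x and shrink the covariance in that direction.
        self.mu = [mi + alpha * y * si for mi, si in zip(self.mu, sx)]
        for i in range(len(sx)):
            for j in range(len(sx)):
                self.sigma[i][j] -= beta * sx[i] * sx[j]
        return True


# Toy stream (made-up data): the label is the sign of the first feature.
stream = [([1.0, 0.5], 1), ([-1.0, 0.3], -1), ([2.0, -1.0], 1), ([-1.5, -0.2], -1)]
model = SCW1(dim=2)
updates = sum(model.update(x, y) for x, y in stream)
print("updates:", updates, "mean weights:", model.mu)
```

In this sketch the model changes only when the margin falls below `phi * sqrt(x^T Sigma x)`, which is one way to read the summary's "adaptive margin" and "smaller number of updates" points. The full covariance matrix costs quadratic time per update, so a diagonal approximation would be the usual choice for high-dimensional data.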