Online learning for search and classification

Bibliographic Details
Main Author: Nguyen, Thanh Tam
Other Authors: Chang Kuiyu
Format: Theses and Dissertations
Language: English
Published: 2014
Online Access: https://hdl.handle.net/10356/55287
Institution: Nanyang Technological University
Description
Summary: Online learning is a widely used tool in machine learning and data mining. In contrast to batch learning, an online learner receives a sequence of training instances and processes only a few of them at a time. By the nature of online learning, each training instance may be processed only once. Online learning algorithms can therefore handle streaming data as well as big data that exceeds memory or disk capacity. Moreover, in document classification, online linear learning has been shown to be much more efficient than non-linear learning in terms of training and testing time. As a result, online linear learning has recently become an active research topic. This thesis proposes a research framework that addresses search and classification problems using online linear learning approaches. Specifically, we propose online learning classification algorithms that work on multiple-view datasets and an online learning-to-rank algorithm that improves the accuracy of a search engine. The main research contributions are as follows. (i) Feature selection: we investigate a number of new supervised term weighting methods to improve the performance of text classification; (ii) Online classification: we propose several online learning algorithms that can be used for topic classification; (iii) Two-view online learning: we propose a two-view online learning algorithm that works on two-view datasets; (iv) Online learning-to-rank: for search engines, we propose an online learning-to-rank algorithm that learns a scoring function to re-rank search results.
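
Illustration: the single-pass, one-instance-at-a-time setting described in the summary can be sketched with a classic perceptron. This is not one of the algorithms proposed in the thesis; the perceptron is used here only as a familiar example of online linear learning, and the function names, the toy stream, and the label convention (+1 / -1) are assumptions made for this sketch.

```python
from typing import Iterable, Iterator, List, Tuple

Instance = Tuple[List[float], int]  # (feature vector, label in {-1, +1})


def dot(w: List[float], x: List[float]) -> float:
    """Inner product of a weight vector and a feature vector."""
    return sum(wi * xi for wi, xi in zip(w, x))


def train_online_perceptron(stream: Iterable[Instance], dim: int) -> List[float]:
    """Make one pass over the stream, updating the weights only on mistakes.

    Each instance is seen once and then discarded, so the full dataset
    never has to fit in memory.
    """
    w = [0.0] * dim
    for x, y in stream:
        if y * dot(w, x) <= 0:  # misclassified (or zero margin)
            w = [wi + y * xi for wi, xi in zip(w, x)]
    return w


def predict(w: List[float], x: List[float]) -> int:
    """Return +1 or -1 according to the sign of the linear score."""
    return 1 if dot(w, x) > 0 else -1


def toy_stream() -> Iterator[Instance]:
    """Hypothetical two-feature stream; a generator stands in for streaming data."""
    yield [1.0, 0.0], 1
    yield [0.0, 1.0], -1
    yield [1.0, 0.2], 1
    yield [0.1, 1.0], -1


weights = train_online_perceptron(toy_stream(), dim=2)
print(weights, predict(weights, [2.0, 0.5]), predict(weights, [0.0, 3.0]))
```

Because the learner keeps only the weight vector between instances, memory use depends on the number of features rather than the number of training instances, which is the property the summary points to for streaming and very large data.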