Handling ambiguity via input-output kernel learning

Bibliographic Details
Main Authors: Xu, Xinxing; Tsang, Ivor Wai-Hung; Xu, Dong
Other Authors: School of Computer Engineering
Format: Conference or Workshop Item
Language: English
Published: 2013
Online Access: https://hdl.handle.net/10356/99740
http://hdl.handle.net/10220/13014
Institution: Nanyang Technological University
Description
Summary: Data ambiguities exist in many data mining and machine learning applications such as text categorization and image retrieval. For instance, it is generally beneficial to utilize the ambiguous unlabeled documents to learn a more robust classifier for text categorization under the semi-supervised learning setting. To handle general data ambiguities, we present a unified kernel learning framework named Input-Output Kernel Learning (IOKL). Based on our framework, we further propose a novel soft margin group sparse Multiple Kernel Learning (MKL) formulation by introducing a group kernel slack variable to each group of base input-output kernels. Moreover, an efficient block-wise coordinate descent algorithm with an analytical solution for the kernel combination coefficients is developed to solve the proposed formulation. We conduct comprehensive experiments on benchmark datasets for both semi-supervised learning and multiple instance learning tasks, and also apply our IOKL framework to a computer vision application called text-based image retrieval on the NUS-WIDE dataset. Promising results demonstrate the effectiveness of our proposed IOKL framework.
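
Illustrative note: the summary mentions block-wise coordinate descent with an analytical solution for the kernel combination coefficients, but this record does not give the IOKL formulation itself. The sketch below is therefore not the authors' algorithm. It shows the same general pattern on a simpler, well-known problem: simplex-constrained MKL with a squared loss, where the inner step is a kernel ridge solve and the coefficient step has the closed form d_m = ||w_m|| / sum_j ||w_j||. The function names, the RBF base kernels, the toy data, and the parameters `lam` and `n_iter` are all assumptions made for illustration.

```python
import numpy as np


def rbf_kernel(X, gamma):
    """Gaussian (RBF) kernel matrix for the rows of X (hypothetical helper)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))


def mkl_block_coordinate_descent(kernels, y, lam=1.0, n_iter=20):
    """Alternate between a kernel ridge solve and a closed-form weight update.

    Generic simplex-constrained MKL with squared loss; NOT the IOKL
    formulation, which is not specified in this record.
    """
    n = len(y)
    d = np.full(len(kernels), 1.0 / len(kernels))  # start from uniform weights
    for _ in range(n_iter):
        # Block 1: fix d, solve kernel ridge regression on the combined kernel.
        K = sum(dm * Km for dm, Km in zip(d, kernels))
        alpha = np.linalg.solve(K + lam * np.eye(n), y)
        # Block 2: fix alpha, update d analytically.
        # ||w_m|| = d_m * sqrt(alpha^T K_m alpha) for each base kernel.
        norms = np.array(
            [dm * np.sqrt(max(alpha @ Km @ alpha, 0.0)) for dm, Km in zip(d, kernels)]
        )
        total = norms.sum()
        if total <= 0.0:
            break  # degenerate case (e.g. all-zero targets); keep current d
        d = norms / total
    # Recompute alpha so it matches the final weights.
    K = sum(dm * Km for dm, Km in zip(d, kernels))
    alpha = np.linalg.solve(K + lam * np.eye(n), y)
    return d, alpha


# Toy usage: three RBF base kernels on random 2-D points (made-up data).
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))
y = np.sign(X[:, 0])
kernels = [rbf_kernel(X, g) for g in (0.1, 1.0, 10.0)]
d, alpha = mkl_block_coordinate_descent(kernels, y)
print("kernel weights:", d)
```

The weights stay nonnegative and sum to one by construction, and the update tends to shrink the weights of uninformative base kernels toward zero. A soft margin group sparse formulation such as the one described in the summary would replace both blocks with its own solver and update rule.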