Confidence-aware graph regularization with heterogeneous pairwise features

Conventional classification methods tend to focus on features of individual objects, while missing out on potentially valuable pairwise features that capture the relationships between objects. Although recent developments on graph regularization exploit this aspect, existing works generally assume o...

Full description

Saved in:
Bibliographic Details
Main Authors: FANG, Yuan, HSU, Bo-June Paul, CHANG, Kevin Chen-Chuan
Format: text
Language:English
Published: Institutional Knowledge at Singapore Management University 2012
Subjects:
Online Access:https://ink.library.smu.edu.sg/sis_research/4061
https://ink.library.smu.edu.sg/context/sis_research/article/5064/viewcontent/ConfGraphReg2012.pdf
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Singapore Management University
Language: English
Description
Summary:Conventional classification methods tend to focus on features of individual objects, while missing out on potentially valuable pairwise features that capture the relationships between objects. Although recent developments on graph regularization exploit this aspect, existing works generally assume only a single kind of pairwise feature, which is often insufficient. We observe that multiple, heterogeneous pairwise features can often complement each other and are generally more robust in modeling the relationships between objects. Furthermore, as some objects are easier to classify than others, objects with higher initial classification confidence should be weighed more towards classifying related but more ambiguous objects, an observation missing from previous graph regularization techniques. In this paper, we propose a Dirichlet-based regularization framework that supports the combination of heterogeneous pairwise features with confidence-aware prediction using limited labeled training data. Next, we showcase a few applications of our framework in information retrieval, focusing on the problem of query intent classification. Finally, we demonstrate through a series of experiments the advantages of our framework on a large-scale real-world dataset.