Learning Bregman Distance Functions and its Application for Semi-Supervised Clustering

Learning distance functions with side information plays a key role in many machine learning and data mining applications. Conventional approaches often assume a Mahalanobis distance function. These approaches are limited in two aspects: (i) they are computationally expensive (even infeasible) for hi...

Bibliographic Details
Main Authors: WU, Lei, JIN, Rong, HOI, Steven C. H., ZHU, Jianke, YU, Nenghai
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2009
Subjects: Computer Sciences; Databases and Information Systems
Online Access: https://ink.library.smu.edu.sg/sis_research/2368
https://ink.library.smu.edu.sg/context/sis_research/article/3368/viewcontent/NIPS09_Bregman_CR_jin.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-3368
record_format dspace
spelling sg-smu-ink.sis_research-3368 2016-01-13T05:13:43Z Learning Bregman Distance Functions and its Application for Semi-Supervised Clustering WU, Lei JIN, Rong HOI, Steven C. H. ZHU, Jianke YU, Nenghai 2009-12-01T08:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/2368 https://ink.library.smu.edu.sg/context/sis_research/article/3368/viewcontent/NIPS09_Bregman_CR_jin.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Computer Sciences Databases and Information Systems
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Computer Sciences
Databases and Information Systems
description Learning distance functions with side information plays a key role in many machine learning and data mining applications. Conventional approaches often assume a Mahalanobis distance function. These approaches are limited in two respects: (i) they are computationally expensive (even infeasible) for high-dimensional data because the size of the metric grows with the square of the dimensionality; (ii) they assume a fixed metric for the entire input space and are therefore unable to handle heterogeneous data. In this paper, we propose a novel scheme that learns nonlinear Bregman distance functions from side information using a nonparametric approach similar to support vector machines. The proposed scheme avoids the assumption of a fixed metric by implicitly deriving a local distance from the Hessian matrix of the convex function that generates the Bregman distance function. We also present an efficient learning algorithm for the proposed scheme. Extensive experiments with semi-supervised clustering show that the proposed technique (i) outperforms state-of-the-art approaches to distance function learning, and (ii) is computationally efficient for high-dimensional data.
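note The abstract refers to a Bregman distance generated by a convex function. As a rough illustration only, the sketch below evaluates the standard definition d_phi(x, y) = phi(x) - phi(y) - <grad phi(y), x - y> for two fixed, hand-picked convex functions. It does not reproduce the learning algorithm of the paper, which learns the convex function from side information; all function names here are hypothetical and chosen for this example.

```python
import math

def bregman(phi, grad_phi, x, y):
    """Standard Bregman distance: phi(x) - phi(y) - <grad phi(y), x - y>."""
    g = grad_phi(y)
    inner = sum(gi * (xi - yi) for gi, xi, yi in zip(g, x, y))
    return phi(x) - phi(y) - inner

# Assumed example 1: phi(x) = 0.5 * ||x||^2.
# Its Hessian is the identity everywhere, so the induced distance is
# half the squared Euclidean distance (a fixed, global metric).
def half_sq_norm(x):
    return 0.5 * sum(v * v for v in x)

def half_sq_norm_grad(x):
    return list(x)

# Assumed example 2: phi(x) = sum_i x_i * log(x_i) for positive x.
# Its Hessian is diag(1 / x_i), which changes with position, so the
# induced distance (generalized KL divergence) behaves like a local metric.
def neg_entropy(x):
    return sum(v * math.log(v) for v in x)

def neg_entropy_grad(x):
    return [math.log(v) + 1.0 for v in x]

if __name__ == "__main__":
    a, b = [1.0, 2.0], [3.0, 5.0]
    print(bregman(half_sq_norm, half_sq_norm_grad, a, b))
    p, q = [0.2, 0.8], [0.5, 0.5]
    print(bregman(neg_entropy, neg_entropy_grad, p, q))
    print(bregman(neg_entropy, neg_entropy_grad, q, p))
```

The second example is not symmetric in its arguments, which is typical of Bregman distances and is one way they differ from a Mahalanobis distance with a single fixed matrix.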
format text
author WU, Lei
JIN, Rong
HOI, Steven C. H.
ZHU, Jianke
YU, Nenghai
title Learning Bregman Distance Functions and its Application for Semi-Supervised Clustering
publisher Institutional Knowledge at Singapore Management University
publishDate 2009
url https://ink.library.smu.edu.sg/sis_research/2368
https://ink.library.smu.edu.sg/context/sis_research/article/3368/viewcontent/NIPS09_Bregman_CR_jin.pdf
_version_ 1770572114135351296