Semi-supervised hierarchical clustering for personalized web image organization

Existing efforts on web image organization usually transform the task into surrounding text clustering. However, current text clustering algorithms do not address the problems of insufficient statistical information for image representation and noisy tags, which greatly decrease the clustering perfor...

Bibliographic Details
Main Authors: Meng, Lei; Tan, Ah-Hwee
Other Authors: School of Computer Engineering
Format: Conference or Workshop Item
Language: English
Published: 2013
Subjects: DRNTU::Engineering::Computer science and engineering
Online Access: https://hdl.handle.net/10356/97882
http://hdl.handle.net/10220/12415
Institution: Nanyang Technological University
id sg-ntu-dr.10356-97882
record_format dspace
spelling sg-ntu-dr.10356-97882 2020-05-28T07:19:04Z
Semi-supervised hierarchical clustering for personalized web image organization
Meng, Lei
Tan, Ah-Hwee
School of Computer Engineering
International Joint Conference on Neural Networks (2012 : Brisbane, Australia)
DRNTU::Engineering::Computer science and engineering
Existing efforts on web image organization usually transform the task into surrounding text clustering. However, current text clustering algorithms do not address the problems of insufficient statistical information for image representation and noisy tags, which greatly decrease clustering performance while increasing computational cost. In this paper, we propose a two-step semi-supervised hierarchical clustering algorithm, Personalized Hierarchical Theme-based Clustering (PHTC), for web image organization. In the first step, the Probabilistic Fusion ART (PF-ART) is proposed for grouping semantically similar images and simultaneously learning the probabilistic distribution of tag occurrence for mining the key tags/topics of clusters. In this way, the side effect of noisy tags can be largely eliminated. Moreover, PF-ART can incorporate user preference for semi-supervised learning and give users direct control over the clustering results. In the second step, a novel agglomerative merging strategy based on Cluster Semantic Relevance, a measure proposed for the semantic similarity between clusters, is employed to associate the clusters by generating a semantic hierarchy. Unlike existing hierarchical clustering algorithms, the proposed merging strategy produces a multi-branch tree structure that is more systematic and clearer than the traditional binary tree structure. Extensive experiments on two real-world web image data sets, namely NUS-WIDE and Flickr, demonstrate the effectiveness of our algorithm for large web image data sets.
2013-07-29T03:01:48Z 2019-12-06T19:47:38Z 2012
Conference Paper
Meng, L., & Tan, A.-H. (2012). Semi-supervised hierarchical clustering for personalized web image organization. The 2012 International Joint Conference on Neural Networks (IJCNN).
https://hdl.handle.net/10356/97882
http://hdl.handle.net/10220/12415
10.1109/IJCNN.2012.6252397
en
© 2012 IEEE.
institution Nanyang Technological University
building NTU Library
country Singapore
collection DR-NTU
language English
topic DRNTU::Engineering::Computer science and engineering
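
The two steps named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' PF-ART or their Cluster Semantic Relevance measure, neither of which is defined in this record. As labeled assumptions, the tag distribution is taken to be the fraction of a cluster's images carrying each tag, cosine similarity stands in for the semantic relevance measure, and the function names (tag_distribution, key_tags, cosine, merge_level), the sample tags, and the thresholds are all hypothetical.

```python
from collections import Counter
from math import sqrt


def tag_distribution(images):
    """Fraction of images in a non-empty cluster that carry each tag."""
    counts = Counter(tag for tags in images for tag in set(tags))
    return {tag: n / len(images) for tag, n in counts.items()}


def key_tags(dist, k=2):
    """The k most probable tags, ties broken alphabetically."""
    ranked = sorted(dist.items(), key=lambda kv: (-kv[1], kv[0]))
    return [tag for tag, _ in ranked[:k]]


def cosine(p, q):
    """Cosine similarity of two sparse tag distributions.

    Assumption: a stand-in for the paper's Cluster Semantic Relevance.
    """
    dot = sum(p[t] * q[t] for t in p.keys() & q.keys())
    norm = sqrt(sum(v * v for v in p.values())) * sqrt(sum(v * v for v in q.values()))
    return dot / norm if norm else 0.0


def merge_level(dists, threshold):
    """Group cluster indices linked by similarity >= threshold.

    Each group becomes one parent node, so a parent can have more than
    two children (a multi-branch tree rather than a binary one).
    """
    parent = list(range(len(dists)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(dists)):
        for j in range(i + 1, len(dists)):
            if cosine(dists[i], dists[j]) >= threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(len(dists)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical clusters: each image is represented by its list of tags.
clusters = [
    [["dog", "pet", "grass"], ["dog", "pet"], ["dog", "park"]],
    [["cat", "pet"], ["cat", "pet", "sofa"]],
    [["puppy", "dog", "pet"], ["dog", "pet"]],
    [["beach", "sea"], ["sea", "sunset"]],
]
dists = [tag_distribution(c) for c in clusters]

print(key_tags(dists[0]))       # most probable tags of the first cluster
print(merge_level(dists, 0.3))  # loose threshold: the three pet clusters share a parent
print(merge_level(dists, 0.5))  # stricter threshold: only the two dog clusters merge
```

Ranking tags by their within-cluster frequency is one simple way to let rare, noisy tags such as "grass" or "sofa" drop out of a cluster's key tags, and grouping every sufficiently similar cluster in a single pass is what allows a parent node to have more than two children.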