Semi-supervised hierarchical clustering for personalized web image organization

Existing efforts on web image organization usually transform the task into surrounding text clustering. However, current text clustering algorithms do not address the problems of insufficient statistical information for image representation and noisy tags, which greatly decrease the clustering perfor...

Bibliographic Details
Main Authors: MENG, Lei; TAN, Ah-hwee
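Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2012
Subjects: Databases and Information Systems; Graphics and Human Computer Interfaces
Online Access: https://ink.library.smu.edu.sg/sis_research/6887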
https://ink.library.smu.edu.sg/context/sis_research/article/7890/viewcontent/Semi_supervisedHierarchicalClusteringforPersonalizedWebImageOrganization.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-7890
record_format dspace
spelling sg-smu-ink.sis_research-7890; 2022-02-07T11:02:03Z; Semi-supervised hierarchical clustering for personalized web image organization; MENG, Lei; TAN, Ah-hwee; 2012-06-01T07:00:00Z; text; application/pdf; https://ink.library.smu.edu.sg/sis_research/6887; info:doi/10.1109/IJCNN.2012.6252397; https://ink.library.smu.edu.sg/context/sis_research/article/7890/viewcontent/Semi_supervisedHierarchicalClusteringforPersonalizedWebImageOrganization.pdf; http://creativecommons.org/licenses/by-nc-nd/4.0/; Research Collection School Of Computing and Information Systems; eng; Institutional Knowledge at Singapore Management University; Databases and Information Systems; Graphics and Human Computer Interfaces
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Databases and Information Systems
Graphics and Human Computer Interfaces
description Existing efforts on web image organization usually transform the task into clustering of the surrounding text. However, current text clustering algorithms do not address two problems, insufficient statistical information for image representation and noisy tags, which greatly decrease clustering performance while increasing computational cost. In this paper, we propose a two-step semi-supervised hierarchical clustering algorithm, Personalized Hierarchical Theme-based Clustering (PHTC), for web image organization. In the first step, Probabilistic Fusion ART (PF-ART) is proposed for grouping semantically similar images while simultaneously learning the probabilistic distribution of tag occurrence in order to mine the key tags/topics of clusters. In this way, the side effect of noisy tags can be largely eliminated. Moreover, PF-ART can incorporate user preferences for semi-supervised learning and gives users direct control over the clustering results. In the second step, a novel agglomerative merging strategy based on Cluster Semantic Relevance, a measure proposed for the semantic similarity between clusters, is employed to associate the clusters by generating a semantic hierarchy. Unlike existing hierarchical clustering algorithms, the proposed merging strategy can produce a multi-branch tree structure, which is more systematic and clearer than the traditional binary tree structure. Extensive experiments on two real-world web image data sets, namely NUS-WIDE and Flickr, demonstrate the effectiveness of our algorithm for large web image data sets.
format text
author MENG, Lei
TAN, Ah-hwee
title Semi-supervised hierarchical clustering for personalized web image organization
publisher Institutional Knowledge at Singapore Management University
publishDate 2012
url https://ink.library.smu.edu.sg/sis_research/6887
https://ink.library.smu.edu.sg/context/sis_research/article/7890/viewcontent/Semi_supervisedHierarchicalClusteringforPersonalizedWebImageOrganization.pdf
_version_ 1770576113757585408
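
Illustrative note: the description field above outlines two steps, learning how often each tag occurs within a cluster so that key tags can be separated from noisy ones, and then merging similar clusters into a tree in which a parent may have more than two children. The Python sketch below is a toy illustration of those two ideas only. It is not PF-ART and not the paper's Cluster Semantic Relevance measure: plain tag frequencies stand in for the learned distribution, Jaccard overlap of key tags stands in for the relevance measure, and a single greedy pass stands in for the agglomerative merging. All function names, the `min_prob` and `threshold` parameters, and the sample data are assumptions made for this example.

```python
from collections import Counter


def tag_probabilities(cluster):
    """Fraction of images in the cluster that carry each tag.

    `cluster` is a list of images, each image a list of tag strings.
    """
    counts = Counter(tag for tags in cluster for tag in set(tags))
    return {tag: c / len(cluster) for tag, c in counts.items()}


def key_tags(cluster, min_prob=0.6):
    """Tags present in at least `min_prob` of the images (rare tags are treated as noise)."""
    return {t for t, p in tag_probabilities(cluster).items() if p >= min_prob}


def jaccard(a, b):
    """Set overlap in [0, 1]; a stand-in for a cluster similarity measure."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def merge_multibranch(clusters, threshold=0.3):
    """Build one level of a multi-branch hierarchy in a single greedy pass.

    Each cluster joins the first parent whose accumulated key tags are
    similar enough; otherwise it starts a new parent. A parent can
    therefore collect any number of children, not just two.
    """
    parents = []
    for index, cluster in enumerate(clusters):
        tags = key_tags(cluster)
        for parent in parents:
            if jaccard(parent["tags"], tags) >= threshold:
                parent["children"].append(index)
                parent["tags"] |= tags
                break
        else:
            parents.append({"tags": set(tags), "children": [index]})
    return parents


# Hypothetical sample data: three small clusters of tagged images.
clusters = [
    [["dog", "pet", "grass"], ["dog", "pet"], ["dog", "noise"]],
    [["cat", "pet"], ["cat", "pet", "sofa"]],
    [["beach", "sea"], ["sea", "beach", "sun"]],
]
hierarchy = merge_multibranch(clusters)
```

With this sample data the first two clusters share the key tag "pet" and fall under one parent, while the beach cluster forms its own branch. Raising `threshold` makes parents harder to join and yields a flatter, wider tree.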