Personalized web image organization
Main Authors: | , , |
---|---|
Format: | text |
Language: | English |
Published: | Institutional Knowledge at Singapore Management University, 2019 |
Subjects: | |
Online Access: | https://ink.library.smu.edu.sg/sis_research/9811 https://ink.library.smu.edu.sg/context/sis_research/article/10811/viewcontent/454069_1_En_Print.indd.pdf |
Institution: | Singapore Management University |
Summary: | Because of the semantic gap, i.e. the visual content of an image may not represent its semantics well, existing work on web image organization usually recasts the task as clustering the surrounding text. However, because the surrounding text is usually short and its words usually appear only once, existing text clustering algorithms can hardly use statistical information for image representation, and learning from noisy tags may degrade their performance while raising computational cost. This chapter applies the Probabilistic ART with user preference architecture, introduced in Sects. 3.5 and 3.4, to personalized web image organization. The fused algorithm, named Probabilistic Fusion ART (PF-ART), groups images of similar semantics together and simultaneously mines the key tags/topics of individual clusters. Moreover, it performs semi-supervised learning on user-provided image taggings, giving users direct control over the generated clusters. An agglomerative merging strategy then organizes the clusters into a hierarchy with a multi-branch tree structure, rather than the binary tree generated by traditional hierarchical clustering algorithms. The entire two-step algorithm, for tag-based web image organization, is called Personalized Hierarchical Theme-based Clustering (PHTC). Two large-scale real-world web image collections, the NUS-WIDE and Flickr datasets, are used to evaluate PHTC and compare it with existing algorithms in terms of clustering performance and time cost. |
---|