Semi-supervised heterogeneous fusion for multimedia data co-clustering

Co-clustering is a commonly used technique for tapping the rich meta-information of multimedia web documents, including category, annotation, and description, for associative discovery. However, most co-clustering methods proposed for heterogeneous data do not consider the representation problem of...

Bibliographic Details
Main Authors:	MENG, Lei; TAN, Ah-hwee; XU, Dong
Format:	text
Language:	English
Published:	Institutional Knowledge at Singapore Management University 2013
Subjects:	Semi-supervised learning; heterogeneous data co-clustering; multimedia data mining; Databases and Information Systems; Data Storage Systems
Online Access:	https://ink.library.smu.edu.sg/sis_research/5231 https://ink.library.smu.edu.sg/context/sis_research/article/6234/viewcontent/Semi_Supervised_Heterogeneous_Fusion___TKDE_2014.pdf
Institution:	Singapore Management University

id	sg-smu-ink.sis_research-6234
record_format	dspace
date	2013-03-01T08:00:00Z
doi	10.1109/TKDE.2013.47
license	http://creativecommons.org/licenses/by-nc-nd/4.0/
series	Research Collection School Of Computing and Information Systems
building	SMU Libraries
continent	Asia
country	Singapore
content_provider	SMU Libraries
collection	InK@SMU
description	Co-clustering is a commonly used technique for tapping the rich meta-information of multimedia web documents, including category, annotation, and description, for associative discovery. However, most co-clustering methods proposed for heterogeneous data do not consider the representation problem of short and noisy text, and their performance is limited by the empirical weighting of the multi-modal features. In this paper, we propose a generalized form of Heterogeneous Fusion Adaptive Resonance Theory, called GHF-ART, for co-clustering of large-scale web multimedia documents. By extending the two-channel Heterogeneous Fusion ART (HF-ART) to multiple channels, GHF-ART is designed to handle multimedia data with an arbitrarily rich level of meta-information. To handle short and noisy text, GHF-ART does not learn directly from the textual features. Instead, it identifies key tags by learning the probabilistic distribution of tag occurrences. More importantly, GHF-ART incorporates an adaptive method for effective fusion of multi-modal features, which weights the features of multiple data sources by incrementally measuring the importance of feature modalities through the intra-cluster scatters. Extensive experiments on two web image data sets and one text document set have shown that GHF-ART achieves significantly better clustering performance and is much faster than many existing state-of-the-art algorithms.

Similar Items