Task-generic semantic convolutional neural network for web text-aided image classification

In this work, we explore how to use external and auxiliary web text to improve image classification. The keystone of web text-aided image classification is the representation learning for these two modalities of data. Over the past decade, convolutional neural networks (CNNs), as the core representati...

Bibliographic Details
Main Authors:	Wang, Dongzhe, Mao, Kezhi
Other Authors:	School of Electrical and Electronic Engineering
Format:	Article
Language:	English
Published:	2021
Subjects:	Engineering::Electrical and electronic engineering Semantic Convolutional Neural Network Image Recognition
Online Access:	https://hdl.handle.net/10356/151327
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-151327
record_format	dspace
spelling	sg-ntu-dr.10356-151327 2021-06-22T05:07:59Z Journal Article Wang, D. & Mao, K. (2019). Task-generic semantic convolutional neural network for web text-aided image classification. Neurocomputing, 329, 103-115. https://dx.doi.org/10.1016/j.neucom.2018.09.042 0925-2312 0000-0002-1467-6023 https://hdl.handle.net/10356/151327 10.1016/j.neucom.2018.09.042 2-s2.0-85055729191 en Neurocomputing © 2018 Elsevier B.V. All rights reserved.
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	Engineering::Electrical and electronic engineering Semantic Convolutional Neural Network Image Recognition
spellingShingle	Engineering::Electrical and electronic engineering Semantic Convolutional Neural Network Image Recognition Wang, Dongzhe Mao, Kezhi Task-generic semantic convolutional neural network for web text-aided image classification
description	In this work, we explore how to use external and auxiliary web text to improve image classification. The keystone of web text-aided image classification is the representation learning for these two modalities of data. Over the past decade, convolutional neural networks (CNNs), as the core representation method for images, have become a commodity in the computer vision community. Likewise, the long reign of word vectors has had the same wide-ranging impact on representation learning in NLP. Based on pre-trained word vectors, we propose a novel semantic CNN (s-CNN) model for high-level text representation learning using task-generic semantic filters. However, to achieve better applicability and generalization across universal tasks, the s-CNN model inevitably introduces surplus semantic filters. Moreover, the surplus filters may lead to semantic overlaps and feature redundancy. To address this issue, we develop the so-called s-CNN Clustered (s-CNNC) models, which use filter clusters instead of individual filters. Interacting with the image CNN models, the s-CNNC models can further boost image classification under a multi-modal framework (mm-CNN). In addition, we propose to use the external text information selectively in the mm-CNN network to alleviate the noise problem inherent in web text. We validate the effectiveness of the proposed models on six benchmark datasets, and the results show that our approaches achieve remarkable improvements.
author2	School of Electrical and Electronic Engineering
author_facet	School of Electrical and Electronic Engineering Wang, Dongzhe Mao, Kezhi
format	Article
author	Wang, Dongzhe Mao, Kezhi
author_sort	Wang, Dongzhe
title	Task-generic semantic convolutional neural network for web text-aided image classification
title_sort	task-generic semantic convolutional neural network for web text-aided image classification
publishDate	2021
url	https://hdl.handle.net/10356/151327
_version_	1703971228236120064
