KnowleNet: knowledge fusion network for multimodal sarcasm detection

Sarcasm is a form of communication often used to express contempt or ridicule, where the speaker conveys a message opposite to their true meaning, typically intending to mock or belittle a specific target. Sarcasm detection has gained great attention in the field of natural language processing due t...

Bibliographic Details
Main Authors:	Yue, Tan, Mao, Rui, Wang, Heng, Hu, Zonghai, Cambria, Erik
Other Authors:	School of Computer Science and Engineering
Format:	Article
Language:	English
Published:	2023
Subjects:	Engineering::Computer science and engineering Sarcasm Detection Multimodal Learning
Online Access:	https://hdl.handle.net/10356/171194
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-171194
record_format	dspace
spelling	sg-ntu-dr.10356-1711942023-10-17T04:24:11Z KnowleNet: knowledge fusion network for multimodal sarcasm detection Yue, Tan Mao, Rui Wang, Heng Hu, Zonghai Cambria, Erik School of Computer Science and Engineering Engineering::Computer science and engineering Sarcasm Detection Multimodal Learning Sarcasm is a form of communication often used to express contempt or ridicule, where the speaker conveys a message opposite to their true meaning, typically intending to mock or belittle a specific target. Sarcasm detection has gained great attention in the field of natural language processing due to the fact that sarcasm is widespread on social media and difficult to detect for machines. While early efforts in sarcasm detection solely relied on textual data, the abundance of multimodal data on social media is also non-negligible. Recent research has focused on multimodal sarcasm detection, where attention mechanisms and graph neural networks were commonly used to identify relevant information in both image and text data. However, these methods may overlook the importance of prior knowledge and cross-modal semantic contrast, which are crucial factors for human sarcasm detection. In this paper, we propose a novel model named KnowleNet that leverages the ConceptNet knowledge base to incorporate prior knowledge and determine image–text relatedness through sample-level and word-level cross-modal semantic similarity detection. Contrastive learning is also introduced to improve the spatial distribution of sarcastic (positive) and non-sarcastic (negative) samples. The proposed model achieves state-of-the-art performance on publicly available benchmark datasets. The work described in this paper is supported by the BUPT innovation and entrepreneurship support program (2022-YC-S002) and the China Scholarship Council (CSC) under Grant 202206470036. 2023-10-17T04:24:11Z 2023-10-17T04:24:11Z 2023 Journal Article Yue, T., Mao, R., Wang, H., Hu, Z. & Cambria, E. (2023). KnowleNet: knowledge fusion network for multimodal sarcasm detection. Information Fusion, 100, 101921-. https://dx.doi.org/10.1016/j.inffus.2023.101921 1566-2535 https://hdl.handle.net/10356/171194 10.1016/j.inffus.2023.101921 2-s2.0-85165537676 100 101921 en Information Fusion © 2023 Elsevier B.V. All rights reserved.
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	Engineering::Computer science and engineering Sarcasm Detection Multimodal Learning
spellingShingle	Engineering::Computer science and engineering Sarcasm Detection Multimodal Learning Yue, Tan Mao, Rui Wang, Heng Hu, Zonghai Cambria, Erik KnowleNet: knowledge fusion network for multimodal sarcasm detection
description	Sarcasm is a form of communication often used to express contempt or ridicule, where the speaker conveys a message opposite to their true meaning, typically intending to mock or belittle a specific target. Sarcasm detection has gained great attention in the field of natural language processing due to the fact that sarcasm is widespread on social media and difficult to detect for machines. While early efforts in sarcasm detection solely relied on textual data, the abundance of multimodal data on social media is also non-negligible. Recent research has focused on multimodal sarcasm detection, where attention mechanisms and graph neural networks were commonly used to identify relevant information in both image and text data. However, these methods may overlook the importance of prior knowledge and cross-modal semantic contrast, which are crucial factors for human sarcasm detection. In this paper, we propose a novel model named KnowleNet that leverages the ConceptNet knowledge base to incorporate prior knowledge and determine image–text relatedness through sample-level and word-level cross-modal semantic similarity detection. Contrastive learning is also introduced to improve the spatial distribution of sarcastic (positive) and non-sarcastic (negative) samples. The proposed model achieves state-of-the-art performance on publicly available benchmark datasets.
author2	School of Computer Science and Engineering
author_facet	School of Computer Science and Engineering Yue, Tan Mao, Rui Wang, Heng Hu, Zonghai Cambria, Erik
format	Article
author	Yue, Tan Mao, Rui Wang, Heng Hu, Zonghai Cambria, Erik
author_sort	Yue, Tan
title	KnowleNet: knowledge fusion network for multimodal sarcasm detection
title_short	KnowleNet: knowledge fusion network for multimodal sarcasm detection
title_full	KnowleNet: knowledge fusion network for multimodal sarcasm detection
title_fullStr	KnowleNet: knowledge fusion network for multimodal sarcasm detection
title_full_unstemmed	KnowleNet: knowledge fusion network for multimodal sarcasm detection
title_sort	knowlenet: knowledge fusion network for multimodal sarcasm detection
publishDate	2023
url	https://hdl.handle.net/10356/171194
_version_	1781793877504557056

Similar Items