Fusing pairwise modalities for emotion recognition in conversations
Multimodal fusion has the potential to significantly enhance model performance in the domain of Emotion Recognition in Conversations (ERC) by efficiently integrating information from diverse modalities. However, existing methods face challenges because they directly integrate information from different modalities, making it difficult to assess the individual impact of each modality during training and to capture nuanced fusion. To address these challenges, we propose a novel framework named Fusing Pairwise Modalities for ERC. In this method, a pairwise fusion technique is incorporated into multimodal fusion, enabling each modality to contribute unique information and thereby facilitating a more comprehensive understanding of the emotional context. Additionally, a designed density loss is applied to characterise fused feature density, with a specific focus on mitigating redundancy in pairwise fusion. The density loss penalises feature density during training, contributing to a more efficient and effective fusion process. To validate the proposed framework, we conduct comprehensive experiments on two benchmark datasets, IEMOCAP and MELD. The results demonstrate the superior performance of our approach compared to state-of-the-art methods, indicating its effectiveness in addressing challenges related to multimodal fusion in the context of ERC.
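The abstract outlines two ideas: fusing modalities in pairs rather than all at once, and penalising the "density" of the fused features during training. The snippet below is a purely illustrative sketch, not the authors' published implementation: it shows one plausible way to wire pairwise fusion of pre-extracted text, audio and visual utterance features together with a hypothetical density penalty. All module names, feature dimensions, and the exact form of the penalty are assumptions.

```python
# Illustrative sketch only -- NOT the paper's implementation.
# Assumes pre-extracted utterance features for three modalities.
import torch
import torch.nn as nn

class PairwiseFusionERC(nn.Module):
    """Fuses modalities in pairs (text-audio, text-visual, audio-visual)
    before classification; structure and sizes are assumptions."""
    def __init__(self, dim_t=100, dim_a=100, dim_v=100, hidden=128, n_classes=6):
        super().__init__()
        # One small fusion MLP per modality pair.
        self.fuse_ta = nn.Sequential(nn.Linear(dim_t + dim_a, hidden), nn.ReLU())
        self.fuse_tv = nn.Sequential(nn.Linear(dim_t + dim_v, hidden), nn.ReLU())
        self.fuse_av = nn.Sequential(nn.Linear(dim_a + dim_v, hidden), nn.ReLU())
        self.classifier = nn.Linear(3 * hidden, n_classes)

    def forward(self, t, a, v):
        # Each pair contributes its own fused representation.
        h_ta = self.fuse_ta(torch.cat([t, a], dim=-1))
        h_tv = self.fuse_tv(torch.cat([t, v], dim=-1))
        h_av = self.fuse_av(torch.cat([a, v], dim=-1))
        fused = torch.cat([h_ta, h_tv, h_av], dim=-1)
        return self.classifier(fused), fused

def density_penalty(fused, weight=0.01):
    # Hypothetical "density" regulariser: a simple L1 penalty that
    # discourages dense, redundant fused features. The paper's actual
    # density loss may be defined differently.
    return weight * fused.abs().mean()

# Minimal usage example with random utterance features (batch of 4).
model = PairwiseFusionERC()
t, a, v = torch.randn(4, 100), torch.randn(4, 100), torch.randn(4, 100)
labels = torch.randint(0, 6, (4,))
logits, fused = model(t, a, v)
loss = nn.functional.cross_entropy(logits, labels) + density_penalty(fused)
loss.backward()
```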
Main Authors: | Fan, Chunxiao; Lin, Jie; Mao, Rui; Cambria, Erik |
---|---|
Other Authors: | School of Computer Science and Engineering |
Format: | Article |
Language: | English |
Published: | 2024 |
Subjects: | Computer and Information Science; Multimodal; Feature fusion |
Online Access: | https://hdl.handle.net/10356/175811 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-175811 |
---|---|
record_format |
dspace |
type |
Journal Article |
date_accessioned |
2024-05-07T02:09:08Z |
journal |
Information Fusion |
citation |
Fan, C., Lin, J., Mao, R. & Cambria, E. (2024). Fusing pairwise modalities for emotion recognition in conversations. Information Fusion, 106, 102306. https://dx.doi.org/10.1016/j.inffus.2024.102306 |
issn |
1566-2535 |
doi |
10.1016/j.inffus.2024.102306 |
scopus_id |
2-s2.0-85185399986 |
funding |
This work is supported in part by the National Key Research and Development Program of China (No. 2022YFC3803200), the Natural Science Foundation of China (No. 61802105), the University Synergy Innovation Program of Anhui Province, China (No. GXXT-2021-005 and GXXT-2022-033), and the Fundamental Research Funds for the Central Universities, China (No. JZ2022HGTB0250 and PA2023IISL0096). |
rights |
© 2024 Elsevier B.V. All rights reserved. |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
Computer and Information Science; Multimodal; Feature fusion |
author2 |
School of Computer Science and Engineering |
format |
Article |
author |
Fan, Chunxiao; Lin, Jie; Mao, Rui; Cambria, Erik |
author_sort |
Fan, Chunxiao |
title |
Fusing pairwise modalities for emotion recognition in conversations |
publishDate |
2024 |
url |
https://hdl.handle.net/10356/175811 |