M3SA: Multimodal Sentiment Analysis based on multi-scale feature extraction and multi-task learning

Sentiment analysis plays an indispensable part in human-computer interaction. Multimodal sentiment analysis can overcome the shortcomings of unimodal sentiment analysis by fusing multimodal data. However, how to extract improved feature representations and how to execute effective modality fusion a...

Bibliographic Details
Main Authors: LIN, Changkai, CHENG, Hongju, RAO, Qiang, YANG, Yang
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2024
Subjects:
Online Access: https://ink.library.smu.edu.sg/sis_research/8755
https://ink.library.smu.edu.sg/context/sis_research/article/9758/viewcontent/2024_M3SA_Multimodalpav.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-9758
record_format dspace
spelling sg-smu-ink.sis_research-9758 2024-05-03T06:48:56Z
2024-02-01T08:00:00Z text application/pdf
info:doi/10.1109/TASLP.2024.3361374
http://creativecommons.org/licenses/by-nc-nd/4.0/
Research Collection School Of Computing and Information Systems
eng
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Multimodal sentiment analysis
multi-scale feature extraction
multi-task learning
multimodal data fusion
Graphics and Human Computer Interfaces
Numerical Analysis and Scientific Computing
description Sentiment analysis plays an indispensable part in human-computer interaction. Multimodal sentiment analysis can overcome the shortcomings of unimodal sentiment analysis by fusing multimodal data. However, how to extract improved feature representations and how to execute effective modality fusion are two crucial problems in multimodal sentiment analysis. Traditional work uses simple sub-models for feature extraction, ignores features of different scales, and fuses different modalities of data equally, which makes it easier to incorporate extraneous information and degrades analysis accuracy. In this paper, we propose a Multimodal Sentiment Analysis model based on Multi-scale feature extraction and Multi-task learning (M3SA). First, we propose a multi-scale feature extraction method that models the outputs of different hidden layers with channel attention. Second, we propose a multimodal fusion strategy based on the key modality, which uses the attention mechanism to raise the proportion of the key modality and mines the relationship between the key modality and the other modalities. Finally, we train the proposed model with a multi-task learning approach so that it learns better feature representations. Experimental results on two publicly available multimodal sentiment analysis datasets demonstrate that the proposed method is effective and that the proposed model outperforms the baselines.
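The description names its first two components only at a high level, so the following is a minimal Python sketch of how they could look, not the authors' implementation. It assumes a squeeze-and-excitation style of channel attention in which each hidden layer is treated as one channel, and a dot-product attention for the fusion step. The function names, the weight matrices `w1` and `w2`, and the `key_bias` parameter are all hypothetical and do not come from the record.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(scores):
    # Subtract the maximum for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def layer_channel_attention(layer_outputs, w1, w2):
    """Re-weight hidden-layer outputs, treating each layer as one channel.

    layer_outputs: list of L feature vectors of equal length, one per layer.
    w1: R x L matrix (squeeze -> bottleneck); w2: L x R matrix (-> gates).
    Both matrices are assumed parameters, normally learned during training.
    Returns (gates, fused); fused is the gate-weighted sum of the layers.
    """
    # Squeeze: one scalar summary per layer (mean over the feature dimension).
    squeezed = [sum(h) / len(h) for h in layer_outputs]
    # Excitation: bottleneck with ReLU, then one sigmoid gate per layer.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w2]
    # Scale every layer by its gate and sum into one multi-scale feature.
    dim = len(layer_outputs[0])
    fused = [sum(g * h[i] for g, h in zip(gates, layer_outputs)) for i in range(dim)]
    return gates, fused


def key_modality_fusion(key, others, key_bias=1.0):
    """Attention-weighted fusion that favours a designated key modality.

    Scores are scaled dot products between the key vector and each modality
    (including the key itself); key_bias is a hypothetical additive boost on
    the key modality's own score. Returns (weights, fused).
    """
    scale = math.sqrt(len(key))
    vectors = [key] + list(others)
    scores = [sum(a * b for a, b in zip(key, v)) / scale for v in vectors]
    scores[0] += key_bias
    weights = softmax(scores)
    fused = [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(len(key))]
    return weights, fused


if __name__ == "__main__":
    layers = [[1.0, 3.0], [2.0, 2.0], [0.0, 0.0]]
    w1 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
    w2 = [[0.0, 0.0], [1.0, -1.0], [-1.0, 0.0]]
    print(layer_channel_attention(layers, w1, w2))
    print(key_modality_fusion([1.0, 0.0], [[0.0, 1.0]]))
```

The third component, multi-task training, is not sketched here because the description does not say which auxiliary tasks or loss weights are used.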
format text
author LIN, Changkai
CHENG, Hongju
RAO, Qiang
YANG, Yang
title M3SA: Multimodal Sentiment Analysis based on multi-scale feature extraction and multi-task learning
publisher Institutional Knowledge at Singapore Management University
publishDate 2024
url https://ink.library.smu.edu.sg/sis_research/8755
https://ink.library.smu.edu.sg/context/sis_research/article/9758/viewcontent/2024_M3SA_Multimodalpav.pdf