Towards explainable harmful meme detection through multimodal debate between Large Language Models

The age of social media is flooded with Internet memes, necessitating a clear grasp and effective identification of harmful ones. This task presents a significant challenge due to the implicit meaning embedded in memes, which is not explicitly conveyed through the surface text and image. However, ex...

Bibliographic Details
Main Authors:	LIN, Hongzhan; LUO, Ziyang; GAO, Wei; MA, Jing; WANG, Bo; YANG, Ruichao
Format:	text
Language:	English
Published:	Institutional Knowledge at Singapore Management University 2024
Subjects:	harmful meme detection; explainability; multimodal debate; LLMs; Artificial Intelligence and Robotics; Numerical Analysis and Scientific Computing; Social Media
Online Access:	https://ink.library.smu.edu.sg/sis_research/9324 https://ink.library.smu.edu.sg/context/sis_research/article/10324/viewcontent/3589334.3645381_pvoa_cc_by.pdf
Institution:	Singapore Management University

id	sg-smu-ink.sis_research-10324
record_format	dspace
spelling	sg-smu-ink.sis_research-10324 2025-01-02T03:14:21Z 2024-05-01T07:00:00Z text application/pdf info:doi/10.1145/3589334.3645381 http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng
institution	Singapore Management University
building	SMU Libraries
continent	Asia
country	Singapore
content_provider	SMU Libraries
collection	InK@SMU
language	English
topic	harmful meme detection; explainability; multimodal debate; LLMs; Artificial Intelligence and Robotics; Numerical Analysis and Scientific Computing; Social Media
description	The age of social media is flooded with Internet memes, necessitating a clear grasp and effective identification of harmful ones. This task presents a significant challenge due to the implicit meaning embedded in memes, which is not explicitly conveyed through the surface text and image. However, existing harmful meme detection methods do not present readable explanations that unveil such implicit meaning to support their detection decisions. In this paper, we propose an explainable approach to detect harmful memes, achieved through reasoning over conflicting rationales from both harmless and harmful positions. Specifically, inspired by the powerful capacity of Large Language Models (LLMs) on text generation and reasoning, we first elicit multimodal debate between LLMs to generate the explanations derived from the contradictory arguments. Then we propose to fine-tune a small language model as the debate judge for harmfulness inference, to facilitate multimodal fusion between the harmfulness rationales and the intrinsic multimodal information within memes. In this way, our model is empowered to perform dialectical reasoning over intricate and implicit harm-indicative patterns, utilizing multimodal explanations originating from both harmless and harmful arguments. Extensive experiments on three public meme datasets demonstrate that our harmful meme detection approach achieves much better performance than state-of-the-art methods and exhibits a superior capacity for explaining the meme harmfulness of the model predictions.
format	text
author	LIN, Hongzhan; LUO, Ziyang; GAO, Wei; MA, Jing; WANG, Bo; YANG, Ruichao
author_sort	LIN, Hongzhan
title	Towards explainable harmful meme detection through multimodal debate between Large Language Models
publisher	Institutional Knowledge at Singapore Management University
publishDate	2024
url	https://ink.library.smu.edu.sg/sis_research/9324 https://ink.library.smu.edu.sg/context/sis_research/article/10324/viewcontent/3589334.3645381_pvoa_cc_by.pdf
_version_	1821237235993804800

Similar Items