Self-adaptive fine-grained multi-modal data augmentation for semi-supervised multi-modal coreference resolution

Coreference resolution, an essential task in natural language processing, is particularly challenging in multi-modal scenarios where data comes in various forms and modalities. Despite advancements, limitations persist due to scarce labeled data and under-leveraged unlabeled data. We address these issues with a self-adaptive fine-grained multi-modal data augmentation framework for semi-supervised multi-modal coreference resolution (MCR), focusing on enriching the training data derived from labeled datasets and exploiting the potential of unlabeled data. For the former, we first leverage text coreference resolution datasets and diffusion models to perform fine-grained text-to-image generation with aligned text entities and image bounding boxes. We then introduce a self-adaptive selection strategy that curates the augmented data so as to increase the diversity and volume of the training set without compromising its quality. For the latter, we design a self-adaptive threshold strategy that dynamically adjusts the confidence threshold based on the model's learning status and performance, enabling effective use of the valuable information in unlabeled data. Additionally, we incorporate a distance smoothing term that smooths the distances between positive and negative samples, enhancing the discriminative power of the model's feature representations and mitigating noise and uncertainty in the unlabeled data. Experiments on the widely used CIN dataset show that our framework outperforms state-of-the-art baselines by at least 9.57% in MUC F1 score and 4.92% in CoNLL F1 score. Against weakly supervised baselines, our framework achieves a 22.24% improvement in MUC F1 score. These results, supported by in-depth analyses, underscore the effectiveness and potential of our approach for advancing MCR tasks.
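The self-adaptive threshold strategy described in the abstract can be pictured as a confidence gate that tracks the model's own learning status. Below is a minimal illustrative sketch assuming an exponential-moving-average update; the class name, decay value, and update rule are assumptions for illustration, not the authors' implementation.

# Minimal sketch of a self-adaptive confidence threshold for pseudo-label
# selection in semi-supervised learning. All names and the EMA update rule
# are illustrative assumptions, not the paper's method.
from dataclasses import dataclass
from typing import List


@dataclass
class AdaptiveThreshold:
    tau: float = 0.5          # current confidence threshold
    ema_decay: float = 0.99   # how slowly tau tracks the model's confidence

    def update(self, batch_confidences: List[float]) -> None:
        # Move tau toward the model's mean confidence on the current batch:
        # early in training the model is unsure, so tau stays low and more
        # unlabeled samples contribute; as confidence grows, tau rises and
        # only reliable pseudo-labels are kept.
        if batch_confidences:
            batch_mean = sum(batch_confidences) / len(batch_confidences)
            self.tau = self.ema_decay * self.tau + (1.0 - self.ema_decay) * batch_mean

    def accept(self, confidence: float) -> bool:
        # Keep a pseudo-label only if its confidence clears the threshold.
        return confidence >= self.tau


if __name__ == "__main__":
    thr = AdaptiveThreshold()
    thr.update([0.42, 0.61, 0.58])   # model still uncertain: tau barely moves
    print(thr.tau, thr.accept(0.7))  # 0.7 clears the (still low) threshold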

Saved in:
Bibliographic Details
Main Authors: ZHENG, Li, CHEN, Boyu, FEI, Hao, LI, Fei, WU, Shengqiong, LIAO, Lizi, JI, Donghong
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2024
Subjects: Coreference Resolution; Multi-modal; Semi-supervised Learning; Artificial Intelligence and Robotics; Computer Sciences
Online Access:https://ink.library.smu.edu.sg/sis_research/9694
https://ink.library.smu.edu.sg/context/sis_research/article/10694/viewcontent/Self_Adaptive_Fine_grain.pdf
Institution: Singapore Management University
Language: English
id sg-smu-ink.sis_research-10694
record_format dspace
publishDate 2024-10-01
format text (application/pdf)
doi 10.1145/3664647.3680966
license http://creativecommons.org/licenses/by-nc-nd/4.0/
collection Research Collection School Of Computing and Information Systems
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Coreference Resolution
Multi-modal
Semi-supervised Learning
Artificial Intelligence and Robotics
Computer Sciences