Virtual prompt pre-training for prototype-based few-shot relation extraction

Prompt tuning with pre-trained language models (PLMs) has exhibited outstanding performance by reducing the gap between pre-training tasks and various downstream applications, but it requires additional labor for label word mapping and prompt template engineering. In a label-intensive research domain such as few-shot relation extraction (RE), manually defining label word mappings is particularly challenging, because the number of relation label classes, often with complex relation names, can be extremely large. Moreover, manual prompt development in natural language is subjective to the individual. To tackle these issues, we propose a virtual prompt pre-training method that projects the virtual prompt into a latent space and then fuses it with the PLM parameters. The pre-training is entity- and relation-aware for RE, comprising the tasks of masked entity prediction, entity typing, distantly supervised RE, and contrastive prompt pre-training. The proposed pre-training provides a robust initialization for prompt encoding while maintaining interaction with the PLM. Furthermore, the virtual prompt avoids the labor and subjectivity issues of label word mapping and prompt template engineering. Our prompt-based prototype network delivers a novel learning paradigm that models entities and relations via the probability distributions and Euclidean distances of the predictions of query instances and prototypes. The results indicate that our model yields an average accuracy gain of 4.21% over strong RE baselines on two few-shot datasets, and, within our framework, our pre-trained model outperforms the strongest RE-related PLM by 6.52%.
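The record contains only the abstract, not the authors' code. As a rough illustration of the two ideas the abstract names, the minimal PyTorch sketch below prepends learnable (virtual) prompt vectors to token embeddings and scores query instances by negative Euclidean distance to class prototypes. All names (VirtualPromptEncoder, prototype_logits) and hyperparameters are illustrative assumptions, not taken from the paper.

# Minimal sketch (not the authors' released code): a learnable virtual prompt
# prepended to the encoder input, plus a prototype-based classifier that scores
# queries by negative squared Euclidean distance to class prototypes.
import torch
import torch.nn as nn
import torch.nn.functional as F

class VirtualPromptEncoder(nn.Module):
    """Toy encoder: virtual prompt vectors are prepended to token embeddings,
    then mean-pooled into a single relation representation."""
    def __init__(self, vocab_size=30522, embed_dim=128, prompt_len=4):
        super().__init__()
        self.tok = nn.Embedding(vocab_size, embed_dim)
        # Learnable virtual prompt: continuous vectors, no natural-language template.
        self.prompt = nn.Parameter(torch.randn(prompt_len, embed_dim) * 0.02)

    def forward(self, input_ids):                      # (B, T)
        x = self.tok(input_ids)                        # (B, T, D)
        p = self.prompt.unsqueeze(0).expand(x.size(0), -1, -1)
        x = torch.cat([p, x], dim=1)                   # prepend the virtual prompt
        return x.mean(dim=1)                           # (B, D) pooled representation

def prototype_logits(support, support_labels, query, n_way):
    """Class prototypes are the mean of support embeddings per class;
    queries are scored by negative squared Euclidean distance."""
    protos = torch.stack([support[support_labels == c].mean(dim=0)
                          for c in range(n_way)])      # (N, D)
    d = torch.cdist(query, protos, p=2) ** 2           # (Q, N)
    return -d                                          # softmax over -distance gives class probabilities

# Usage on random data: a 5-way 1-shot episode with 3 queries per class.
if __name__ == "__main__":
    enc = VirtualPromptEncoder()
    n_way, q_per_class, seq_len = 5, 3, 16
    support_ids = torch.randint(0, 30522, (n_way, seq_len))
    query_ids = torch.randint(0, 30522, (n_way * q_per_class, seq_len))
    support_labels = torch.arange(n_way)
    s_emb, q_emb = enc(support_ids), enc(query_ids)
    logits = prototype_logits(s_emb, support_labels, q_emb, n_way)
    probs = F.softmax(logits, dim=-1)                  # probability distribution over relations
    print(probs.shape)                                 # torch.Size([15, 5])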

Bibliographic Details
Main Authors: He, Kai, Huang, Yucheng, Mao, Rui, Gong, Tieliang, Li, Chen, Cambria, Erik
Other Authors: School of Computer Science and Engineering
Format: Article
Language: English
Published: 2023
Subjects: Engineering::Computer science and engineering; Few-shot Learning; Information Extraction
Online Access:https://hdl.handle.net/10356/170494
Institution: Nanyang Technological University
Published in: Expert Systems with Applications, 213 (Part A), 118927, 2023
ISSN: 0957-4174
DOI: 10.1016/j.eswa.2022.118927
Scopus ID: 2-s2.0-85139597157
Citation: He, K., Huang, Y., Mao, R., Gong, T., Li, C. & Cambria, E. (2023). Virtual prompt pre-training for prototype-based few-shot relation extraction. Expert Systems With Applications, 213(Part A), 118927. https://dx.doi.org/10.1016/j.eswa.2022.118927
Funding: This research is supported by the Agency for Science, Technology and Research (A*STAR) under its AME Programmatic Funding Scheme (Project #A18A2b0046). This work is also supported by the Key Research and Development Program of Ningxia Hui Nationality Autonomous Region (2022BEG02025); the Key Research and Development Program of Shaanxi Province (2021GXLH-Z-095); the Innovative Research Group of the National Natural Science Foundation of China (61721002); and the innovation team from the Ministry of Education (IRT_17R86).
Rights: © 2022 Elsevier Ltd. All rights reserved.
Collection: DR-NTU (NTU Library)