Converse attention knowledge transfer for low-resource named entity recognition


Bibliographic Details
Main Authors: Lyu, Shengfei, Sun, Linghao, Yi, Huixiong, Liu, Yong, Chen, Huanhuan, Miao, Chunyan
Other Authors: School of Computer Science and Engineering
Format: Article
Language: English
Published: 2024
Subjects:
Online Access: https://hdl.handle.net/10356/181468
Institution: Nanyang Technological University
Description
Summary: In recent years, great success has been achieved in many natural language processing (NLP) tasks, e.g., named entity recognition (NER), especially in the high-resource language English, thanks in part to the considerable amount of labeled resources: more labeled data yields better word representations. However, most low-resource languages lack such an abundance of labeled data, so their word representations, and consequently their NER performance, are poor. In this paper, we propose the converse attention network (CAN), which augments word representations in a low-resource language with knowledge transferred from a high-resource language, thereby improving NER performance in the low-resource language. CAN first translates sentences in the low-resource language into high-resource English using an attention-based translation module. During translation, CAN obtains attention matrices that align the word-representation spaces of the high-resource and low-resource languages. CAN then uses these attention matrices to augment the word representations learned in the low-resource space with those learned in the high-resource space. Experiments on four low-resource NER datasets show that CAN achieves consistent and significant performance improvements, demonstrating its effectiveness.
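
The abstract does not spell out the fusion step, but the augmentation it describes can be sketched in code. The following is a minimal PyTorch sketch under stated assumptions, not the authors' implementation: the function name converse_attention_augment, the row renormalization of the transposed attention matrix, and the concatenation-based fusion are all illustrative choices; the abstract only says that attention matrices from the translation module are used to combine the two representation spaces.

import torch

def converse_attention_augment(src_repr, tgt_repr, attn):
    # src_repr: (src_len, d) low-resource token representations
    # tgt_repr: (tgt_len, d) English token representations from the
    #           attention-based translation module
    # attn:     (tgt_len, src_len) decoder attention; attn[i, j] is the
    #           weight on source token j when generating target token i
    # Transpose so each *source* token attends back over the English
    # tokens (the "converse" direction), then renormalize rows, since
    # columns of the original attention matrix need not sum to 1.
    converse = attn.t()                                    # (src_len, tgt_len)
    converse = converse / converse.sum(dim=-1, keepdim=True).clamp_min(1e-9)
    borrowed = converse @ tgt_repr                         # (src_len, d)
    # Concatenate each token's own vector with its English-space mix,
    # giving the downstream NER tagger access to both spaces.
    return torch.cat([src_repr, borrowed], dim=-1)         # (src_len, 2 * d)

# Toy usage with random tensors:
src = torch.randn(7, 256)                        # 7 low-resource tokens
tgt = torch.randn(9, 256)                        # 9 English tokens
attn = torch.softmax(torch.randn(9, 7), dim=-1)  # rows sum to 1 over source
augmented = converse_attention_augment(src, tgt, attn)    # shape (7, 512)

Concatenation is used here so the tagger can weigh the two spaces itself; an additive or gated fusion would be an equally plausible reading of the abstract.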