Effective type label-based synergistic representation learning for biomedical event trigger detection
Background: Detecting event triggers in biomedical texts, which contain domain knowledge and context-dependent terms, is more challenging than in general-domain texts. Most state-of-the-art models rely mainly on external resources such as linguistic tools and knowledge bases to improve system perfor...
Main Authors: | Hao, Anran; Yuan, Haohan; Hui, Siu Cheung; Su, Jian |
---|---|
Other Authors: | School of Computer Science and Engineering |
Format: | Article |
Language: | English |
Published: | 2024 |
Subjects: | Computer and Information Science; Biomedical event trigger detection; Representation learning |
Online Access: | https://hdl.handle.net/10356/180450 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-180450 |
---|---|
record_format |
dspace |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
Computer and Information Science; Biomedical event trigger detection; Representation learning |
description |
Background: Detecting event triggers in biomedical texts, which contain domain knowledge and context-dependent terms, is more challenging than in general-domain texts. Most state-of-the-art models rely mainly on external resources such as linguistic tools and knowledge bases to improve system performance. However, they lack effective mechanisms to obtain semantic clues from label specification and sentence context. Given its success in image classification, label representation learning is a promising approach to enhancing biomedical event trigger detection models by leveraging the rich semantics of pre-defined event type labels.

Results: In this paper, we propose the Biomedical Label-based Synergistic representation Learning (BioLSL) model, which effectively utilizes event type labels by learning their correlation with trigger words and enriches the representation contextually. The BioLSL model consists of three modules. Firstly, the Domain-specific Joint Encoding module employs a transformer-based, domain-specific pre-trained architecture to jointly encode input sentences and pre-defined event type labels. Secondly, the Label-based Synergistic Representation Learning module learns the semantic relationships between input texts and event type labels, and generates a Label-Trigger Aware Representation (LTAR) and a Label-Context Aware Representation (LCAR) for enhanced semantic representations. Finally, the Trigger Classification module makes structured predictions, where each label is predicted with respect to its neighbours. We conduct experiments on three benchmark BioNLP datasets, namely MLEE, GE09, and GE11, to evaluate our proposed BioLSL model. Results show that BioLSL achieves state-of-the-art performance, outperforming the baseline models.

Conclusions: The proposed BioLSL model demonstrates good performance for biomedical event trigger detection without using any external resources. This suggests that label representation learning and context-aware enhancement are promising directions for improving the task. The key enhancement is that BioLSL effectively learns to construct semantic linkages between the event mentions and type labels, which provide the latent information of label-trigger and label-context relationships in biomedical texts. Moreover, additional experiments on BioLSL show that it performs exceptionally well with limited training data in data-scarce scenarios. |
author2 |
School of Computer Science and Engineering |
author_facet |
School of Computer Science and Engineering; Hao, Anran; Yuan, Haohan; Hui, Siu Cheung; Su, Jian |
format |
Article |
author |
Hao, Anran; Yuan, Haohan; Hui, Siu Cheung; Su, Jian |
author_sort |
Hao, Anran |
title |
Effective type label-based synergistic representation learning for biomedical event trigger detection |
publishDate |
2024 |
url |
https://hdl.handle.net/10356/180450 |
_version_ |
1814047451100741632 |
spelling |
sg-ntu-dr.10356-180450
2024-10-11T15:36:40Z
Effective type label-based synergistic representation learning for biomedical event trigger detection
Hao, Anran; Yuan, Haohan; Hui, Siu Cheung; Su, Jian
School of Computer Science and Engineering
Institute for Infocomm Research, A*STAR
Computer and Information Science; Biomedical event trigger detection; Representation learning
Agency for Science, Technology and Research (A*STAR)
Published version
This research is supported by the Agency for Science, Technology, and Research (A*STAR), Singapore.
2024-10-08T00:40:48Z
2024-10-08T00:40:48Z
2024
Journal Article
Hao, A., Yuan, H., Hui, S. C. & Su, J. (2024). Effective type label-based synergistic representation learning for biomedical event trigger detection. BMC Bioinformatics, 25(1), 251-. https://dx.doi.org/10.1186/s12859-024-05851-1
1471-2105
https://hdl.handle.net/10356/180450
10.1186/s12859-024-05851-1
25
2-s2.0-85200257970
1
25
251
en
BMC Bioinformatics
© 2024 The Author(s). Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
application/pdf |