Effective type label-based synergistic representation learning for biomedical event trigger detection

Background: Detecting event triggers in biomedical texts, which contain domain knowledge and context-dependent terms, is more challenging than in general-domain texts. Most state-of-the-art models rely mainly on external resources such as linguistic tools and knowledge bases to improve system perfor...

Bibliographic Details
Main Authors: Hao, Anran, Yuan, Haohan, Hui, Siu Cheung, Su, Jian
Other Authors: School of Computer Science and Engineering
Format: Article
Language: English
Published: 2024
Subjects: Computer and Information Science; Biomedical event trigger detection; Representation learning
Online Access: https://hdl.handle.net/10356/180450
Institution: Nanyang Technological University
Language: English
id sg-ntu-dr.10356-180450
record_format dspace
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Computer and Information Science
Biomedical event trigger detection
Representation learning
spellingShingle Computer and Information Science
Biomedical event trigger detection
Representation learning
Hao, Anran
Yuan, Haohan
Hui, Siu Cheung
Su, Jian
Effective type label-based synergistic representation learning for biomedical event trigger detection
description Background: Detecting event triggers in biomedical texts, which contain domain knowledge and context-dependent terms, is more challenging than in general-domain texts. Most state-of-the-art models rely mainly on external resources such as linguistic tools and knowledge bases to improve system performance. However, they lack effective mechanisms to obtain semantic clues from label specification and sentence context. Given its success in image classification, label representation learning is a promising approach to enhancing biomedical event trigger detection models by leveraging the rich semantics of pre-defined event type labels.
Results: In this paper, we propose the Biomedical Label-based Synergistic representation Learning (BioLSL) model, which effectively utilizes event type labels by learning their correlation with trigger words and enriches the representation contextually. The BioLSL model consists of three modules. First, the Domain-specific Joint Encoding module employs a transformer-based, domain-specific pre-trained architecture to jointly encode input sentences and pre-defined event type labels. Second, the Label-based Synergistic Representation Learning module learns the semantic relationships between input texts and event type labels, and generates a Label-Trigger Aware Representation (LTAR) and a Label-Context Aware Representation (LCAR) for enhanced semantic representations. Finally, the Trigger Classification module makes structured predictions, where each label is predicted with respect to its neighbours. We conduct experiments on three benchmark BioNLP datasets, namely MLEE, GE09, and GE11, to evaluate our proposed BioLSL model. Results show that BioLSL achieves state-of-the-art performance, outperforming the baseline models.
Conclusions: The proposed BioLSL model demonstrates good performance for biomedical event trigger detection without using any external resources. This suggests that label representation learning and context-aware enhancement are promising directions for improving the task. The key enhancement is that BioLSL effectively learns to construct semantic linkages between event mentions and type labels, which provide the latent information of label-trigger and label-context relationships in biomedical texts. Moreover, additional experiments show that BioLSL performs exceptionally well with limited training data in data-scarce scenarios.
author2 School of Computer Science and Engineering
author_facet School of Computer Science and Engineering
Hao, Anran
Yuan, Haohan
Hui, Siu Cheung
Su, Jian
format Article
author Hao, Anran
Yuan, Haohan
Hui, Siu Cheung
Su, Jian
author_sort Hao, Anran
title Effective type label-based synergistic representation learning for biomedical event trigger detection
publishDate 2024
url https://hdl.handle.net/10356/180450
_version_ 1814047451100741632
spelling sg-ntu-dr.10356-180450 2024-10-11T15:36:40Z
Effective type label-based synergistic representation learning for biomedical event trigger detection
Hao, Anran; Yuan, Haohan; Hui, Siu Cheung; Su, Jian
School of Computer Science and Engineering; Institute for Infocomm Research, A*STAR
Computer and Information Science; Biomedical event trigger detection; Representation learning
Agency for Science, Technology and Research (A*STAR)
Published version
This research is supported by the Agency for Science, Technology, and Research (A*STAR), Singapore.
2024-10-08T00:40:48Z 2024-10-08T00:40:48Z 2024
Journal Article
Hao, A., Yuan, H., Hui, S. C. & Su, J. (2024). Effective type label-based synergistic representation learning for biomedical event trigger detection. BMC Bioinformatics, 25(1), 251-. https://dx.doi.org/10.1186/s12859-024-05851-1
1471-2105
https://hdl.handle.net/10356/180450
10.1186/s12859-024-05851-1
25 2-s2.0-85200257970 1 25 251 en
BMC Bioinformatics
© 2024 The Author(s). Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
application/pdf
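Illustrative note: the abstract describes learning semantic relationships between input tokens and event type labels. The sketch below shows one generic way such a relationship could be computed, namely scaled dot-product attention from each token over the label vectors, with the attention-weighted label summary appended to the token vector. It is not the BioLSL implementation: the function name `label_aware_tokens`, the toy 2-dimensional vectors, and the concatenation step are all assumptions made for illustration, and the toy vectors merely stand in for encoder outputs.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def label_aware_tokens(token_vecs, label_vecs):
    """Hypothetical helper: each token attends over all label vectors.

    Returns (enriched, weights) where enriched[i] is token i concatenated
    with its attention-weighted label summary, and weights[i] is the
    token's attention distribution over the labels.
    """
    dim = len(label_vecs[0])
    scale = math.sqrt(dim)
    enriched, weights_per_token = [], []
    for tok in token_vecs:
        weights = softmax([dot(tok, lab) / scale for lab in label_vecs])
        summary = [
            sum(w * lab[i] for w, lab in zip(weights, label_vecs))
            for i in range(dim)
        ]
        enriched.append(list(tok) + summary)
        weights_per_token.append(weights)
    return enriched, weights_per_token


# Toy vectors (assumed values, not real encoder outputs).
tokens = [[1.0, 0.0], [0.0, 1.0]]
labels = [[2.0, 0.0], [0.0, 2.0]]
enriched, weights = label_aware_tokens(tokens, labels)
```

In this toy setup each token puts more attention weight on the label vector it is aligned with, which is the kind of token-label linkage the abstract refers to; a real system would use learned projections and contextual embeddings instead.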