Relation Extraction in Biomedical Texts Based on Multi-Head Attention Model With Syntactic Dependency Feature Modeling Study

Background: With the rapid expansion of biomedical literature, biomedical information extraction has attracted increasing attention from researchers. In particular, relation extraction between 2 entities is a long-term research topic. Objective: This study aimed to perform 2 multiclass relation ext...

Bibliographic Details
Main Authors: Yongbin, Li; Stephanie, Chua
Format: Article
Language: English
Published: JMIR Publications 2022
Subjects: QA75 Electronic computers. Computer science
Online Access: http://ir.unimas.my/id/eprint/40275/3/Relation%20Extraction%20-%20Copy.pdf
http://ir.unimas.my/id/eprint/40275/
https://medinform.jmir.org/2022/10/e41136
Institution: Universiti Malaysia Sarawak
id my.unimas.ir.40275
record_format eprints
spelling my.unimas.ir.40275 2022-10-27T06:23:10Z http://ir.unimas.my/id/eprint/40275/ QA75 Electronic computers. Computer science
JMIR Publications 2022-10-20 Article PeerReviewed text en http://ir.unimas.my/id/eprint/40275/3/Relation%20Extraction%20-%20Copy.pdf
Yongbin, Li and Stephanie, Chua (2022) Relation Extraction in Biomedical Texts Based on Multi-Head Attention Model With Syntactic Dependency Feature Modeling Study. JMIR Medical Informatics, 10 (10). ISSN 2291-9694 https://medinform.jmir.org/2022/10/e41136 doi: 10.2196/41136
institution Universiti Malaysia Sarawak
building Centre for Academic Information Services (CAIS)
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Malaysia Sarawak
content_source UNIMAS Institutional Repository
url_provider http://ir.unimas.my/
language English
topic QA75 Electronic computers. Computer science
description Background: With the rapid expansion of biomedical literature, biomedical information extraction has attracted increasing attention from researchers. In particular, relation extraction between 2 entities is a long-term research topic.
Objective: This study aimed to perform 2 multiclass relation extraction tasks of Biomedical Natural Language Processing Workshop 2019 Open Shared Tasks: relation extraction of Bacteria-Biotope (BB-rel) task and binary relation extraction of plant seed development (SeeDev-binary) task. In essence, these 2 tasks are aimed at extracting the relation between annotated entity pairs from biomedical texts, which is a challenging problem.
Methods: Traditional research methods adopted feature- or kernel-based methods and achieved good performance. For these tasks, we propose a deep learning model based on a combination of several distributed features, such as domain-specific word embedding, part-of-speech embedding, entity-type embedding, distance embedding, and position embedding. The multi-head attention mechanism is used to extract the global semantic features of an entire sentence. Meanwhile, we introduced a dependency-type feature and the shortest dependency path connecting 2 candidate entities in the syntactic dependency graph to enrich the feature representation.
Results: Experiments show that our proposed model has excellent performance in biomedical relation extraction, achieving F1 scores of 65.56% and 38.04% on the test sets of the BB-rel and SeeDev-binary tasks. Especially in the SeeDev-binary task, the F1 score of our model is superior to that of other existing models and achieves state-of-the-art performance.
Conclusions: We demonstrated that the multi-head attention mechanism can learn relevant syntactic and semantic features in different representation subspaces and different positions to extract comprehensive feature representation. Moreover, syntactic dependency features can improve the performance of the model by learning dependency relation between the entities in biomedical texts.
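The Methods summary mentions the shortest dependency path connecting 2 candidate entities in the syntactic dependency graph, together with a dependency-type feature. The record does not say how the authors computed either, so the following is only a minimal sketch of one common approach: treat the dependency parse as an undirected graph and run a breadth-first search between the two entity tokens. The function name, the edge format, the example sentence, and its parse are all hypothetical and chosen for illustration; a real pipeline would take the edges from a dependency parser and would index tokens by position rather than by surface string.

```python
from collections import deque


def shortest_dependency_path(edges, source, target):
    """Breadth-first search over an undirected view of a dependency parse.

    edges: iterable of (head, dependent, dep_type) triples (assumed format).
    Returns (tokens, dep_types) along the shortest path from source to
    target, or None when the two tokens are not connected.
    """
    # Build an undirected adjacency list that remembers each edge's label.
    graph = {}
    for head, dependent, dep_type in edges:
        graph.setdefault(head, []).append((dependent, dep_type))
        graph.setdefault(dependent, []).append((head, dep_type))

    # previous[node] = (parent, dep_type of the edge used to reach node)
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            break
        for neighbour, dep_type in graph.get(node, []):
            if neighbour not in previous:
                previous[neighbour] = (node, dep_type)
                queue.append(neighbour)

    if target not in previous:
        return None

    # Walk back from target to source, then reverse both sequences.
    tokens, dep_types = [target], []
    node = target
    while previous[node] is not None:
        parent, dep_type = previous[node]
        tokens.append(parent)
        dep_types.append(dep_type)
        node = parent
    tokens.reverse()
    dep_types.reverse()
    return tokens, dep_types


# Hypothetical parse of "Bacillus subtilis was isolated from soil".
# Tokens are used as node names only because they are unique here.
EDGES = [
    ("isolated", "subtilis", "nsubj:pass"),
    ("subtilis", "Bacillus", "compound"),
    ("isolated", "was", "aux:pass"),
    ("isolated", "soil", "obl"),
    ("soil", "from", "case"),
]

path_tokens, path_dep_types = shortest_dependency_path(EDGES, "subtilis", "soil")
print(path_tokens)     # tokens on the path between the two entities
print(path_dep_types)  # dependency types along that path
```

In a model of the kind described, the tokens on the path and the dependency types along it would each be mapped to embeddings and combined with the other features; that step is outside the scope of this sketch.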
format Article
author Yongbin, Li
Stephanie, Chua
author_facet Yongbin, Li
Stephanie, Chua
author_sort Yongbin, Li
title Relation Extraction in Biomedical Texts Based on Multi-Head Attention Model With Syntactic Dependency Feature Modeling Study
publisher JMIR Publications
publishDate 2022
url http://ir.unimas.my/id/eprint/40275/3/Relation%20Extraction%20-%20Copy.pdf
http://ir.unimas.my/id/eprint/40275/
https://medinform.jmir.org/2022/10/e41136
_version_ 1748184474915438592