BIOMEDICAL EVENTS EXTRACTION USING MULTI-LABEL CLASSIFICATION AND PRE-TRAINED BERT

Biomedical event extraction is a combined task of named-entity recognition (NER) and relation extraction (RE) applied to biomedical texts to obtain a list of events in biomedical texts. At present, the best biomedical event extraction research uses sequence labeling techniques with the joint meth...

Bibliographic Details
Main Author:	Mulya, Dimmas
Format:	Theses
Language:	Indonesia
Online Access:	https://digilib.itb.ac.id/gdl/view/73358
Institution:	Institut Teknologi Bandung

id	id-itb.:73358
spelling	id-itb.:73358 2023-06-19T18:56:29Z BIOMEDICAL EVENTS EXTRACTION USING MULTI-LABEL CLASSIFICATION AND PRE-TRAINED BERT Mulya, Dimmas Indonesia Theses biomedical event extraction, pipeline method, sequence labelling, BERT, multi-label classification. INSTITUT TEKNOLOGI BANDUNG https://digilib.itb.ac.id/gdl/view/73358 text
institution	Institut Teknologi Bandung
building	Institut Teknologi Bandung Library
continent	Asia
country	Indonesia
content_provider	Institut Teknologi Bandung
collection	Digital ITB
language	Indonesia
description	Biomedical event extraction combines named-entity recognition (NER) and relation extraction (RE) to obtain a list of events from biomedical texts. At present, the best-performing biomedical event extraction research uses sequence labeling with a joint method, a softmax decoder for event trigger identification, and a BioBERT v1.1 encoder. This model has several drawbacks: the joint method executes each task independently, event triggers that carry multiple labels receive no special handling, and the BioBERT v1.1 encoder still uses a vocabulary drawn from non-biomedical domains. This thesis modifies the model to address these drawbacks. The joint method is replaced with a pipeline so that information is passed forward between tasks; in the event trigger identification task, the softmax decoder is replaced with a sigmoid decoder to handle multi-label triggers; the encoder is replaced with a BERT encoder trained with a biomedical domain-specific vocabulary; and an event masking system is applied in the transitions between pipeline modules to avoid overfitting to particular word patterns. The experiments compare the modified model architectures with the original architecture from previous research using the F1-score as the evaluation metric. Of these modifications, only the encoder built with a biomedical domain-specific vocabulary improved the performance of the event extraction model. Changing the joint method to a pipeline and replacing the softmax decoder with a sigmoid decoder in the event trigger identification task did not improve performance. The best model used the joint method, a softmax decoder, and the SciBERT encoder, with an F1-score of 64.50.
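The difference between the softmax and sigmoid decoders described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the label names, the logit values, and the 0.5 threshold are assumptions chosen for illustration, and the functions `decode_softmax` and `decode_sigmoid` are hypothetical helpers. A softmax decoder picks exactly one label per token, while a sigmoid decoder scores each label independently and can therefore return several labels for one trigger word.

```python
import math

# Hypothetical event-type labels for one token; not taken from the thesis.
LABELS = ["Gene_expression", "Positive_regulation", "Binding"]


def softmax(logits):
    """Turn logits into a probability distribution that sums to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(z):
    """Map a single logit to an independent probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def decode_softmax(logits, labels):
    """Single-label decoding: return only the highest-scoring label."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=lambda i: probs[i])
    return [labels[best]]


def decode_sigmoid(logits, labels, threshold=0.5):
    """Multi-label decoding: return every label whose probability passes the threshold."""
    return [label for label, z in zip(labels, logits) if sigmoid(z) >= threshold]


# Assumed logits for a trigger word that expresses two event types at once.
token_logits = [2.0, 1.5, -3.0]

single = decode_softmax(token_logits, LABELS)
multi = decode_sigmoid(token_logits, LABELS)
print(single)
print(multi)
```

With these assumed logits, the softmax decoder keeps only the top label, whereas the sigmoid decoder keeps both labels whose independent probability exceeds the threshold. An empty list from `decode_sigmoid` would correspond to a token that is not a trigger.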
format	Theses
author	Mulya, Dimmas
title	BIOMEDICAL EVENTS EXTRACTION USING MULTI-LABEL CLASSIFICATION AND PRE-TRAINED BERT
url	https://digilib.itb.ac.id/gdl/view/73358

Similar Items