Extended distributed prototypical for biomedical named entity recognition

Biomedical Named Entity Recognition (Bio-NER) is an essential step in biomedical information extraction and biomedical text mining. Although a great deal of research has gone into the design of rule-based and supervised tools for general NER, Bio-NER remains a challenge and an area of active r...

Bibliographic Details
Main Authors: Maan Tareq Abd, Masnizah Mohd
Format: Article
Language: English
Published: Penerbit Universiti Kebangsaan Malaysia 2017
Online Access: http://journalarticle.ukm.my/11849/1/18684-64021-1-PB.pdf
http://journalarticle.ukm.my/11849/
http://ejournals.ukm.my/apjitm/issue/view/1050
Institution: Universiti Kebangsaan Malaysia
id my-ukm.journal.11849
record_format eprints
spelling my-ukm.journal.11849 2018-07-10T00:19:23Z http://journalarticle.ukm.my/11849/ Extended distributed prototypical for biomedical named entity recognition. Maan Tareq Abd, Masnizah Mohd. Penerbit Universiti Kebangsaan Malaysia 2017-12 Article PeerReviewed application/pdf en http://journalarticle.ukm.my/11849/1/18684-64021-1-PB.pdf Maan Tareq Abd and Masnizah Mohd (2017) Extended distributed prototypical for biomedical named entity recognition. Asia-Pacific Journal of Information Technology and Multimedia, 6 (2). pp. 1-11. ISSN 2289-2192 http://ejournals.ukm.my/apjitm/issue/view/1050
institution Universiti Kebangsaan Malaysia
building Perpustakaan Tun Sri Lanang Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Kebangsaan Malaysia
content_source UKM Journal Article Repository
url_provider http://journalarticle.ukm.my/
language English
description Biomedical Named Entity Recognition (Bio-NER) is an essential step in biomedical information extraction and biomedical text mining. Although a great deal of research has gone into the design of rule-based and supervised tools for general NER, Bio-NER remains a challenge and an area of active research, as there is still a large gap of 10 F-score points between general newswire NER and Bio-NER. The complex structures of biomedical entities pose a major challenge for their recognition. To address this, this paper explores different word representations with a Support Vector Machine (SVM) to deal with the complex structures of biomedical named entities. First, it identifies and evaluates a set of morphological and contextual features with the SVM learning method for Bio-NER. It also presents an extended distributed representation word embedding technique (EDRWE) for Bio-NER. These models are evaluated on a widely used standard Bio-NER dataset, the GENIA corpus. Experimental results show that the EDRWE technique improves the overall performance of Bio-NER and outperforms all other representation methods. Analysis of the results shows that the new EDRWE is satisfactory and effective for Bio-NER, especially when only a small data set is available.
format Article
author Maan Tareq Abd,
Masnizah Mohd,
title Extended distributed prototypical for biomedical named entity recognition
publisher Penerbit Universiti Kebangsaan Malaysia
publishDate 2017
url http://journalarticle.ukm.my/11849/1/18684-64021-1-PB.pdf
http://journalarticle.ukm.my/11849/
http://ejournals.ukm.my/apjitm/issue/view/1050