Constructing Biological Knowledge Base using Named Entities Recognition and Term Collocation

Over the last few decades, the publishing of biological literature has dramatically increased due to technological developments. Thus, a crucial process is to extract information from this large number of writings by utilizing a biological named entity recognition (NER) approach to automatically label correspon...

Bibliographic Details
Main Authors: Supattanawaree Thipcharoen, Watshara Shoombuatong, Samerkae Somhom, Rattasit Sukahut, Jeerayut Chaijaruwanich
Language: English
Published: Science Faculty of Chiang Mai University 2019
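Subjects: Biological information extraction; Biological Named Entity Recognition; Conditional Random Fields; Poisson Collocations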
Online Access: http://it.science.cmu.ac.th/ejournal/dl.php?journal_id=6824
http://cmuir.cmu.ac.th/jspui/handle/6653943832/66125
Institution: Chiang Mai University
id th-cmuir.6653943832-66125
record_format dspace
spelling th-cmuir.6653943832-66125; 2019-08-21T09:18:22Z; 2016; Chiang Mai Journal of Science 43, 3 (Apr 2016), 661-671; 0125-2526; Eng
institution Chiang Mai University
building Chiang Mai University Library
country Thailand
collection CMU Intellectual Repository
language English
topic Biological information extraction
Biological Named Entity Recognition
Conditional Random Fields
Poisson Collocations
description Over the last few decades, the volume of published biological literature has increased dramatically owing to technological developments. Extracting information from this large body of text is therefore crucial, and a biological named entity recognition (NER) approach can automatically label the corresponding biological terms. An effective method is needed to identify biological named entities and to automatically build a specific knowledge base from biological literature. Herein, we investigated biological information extraction for establishing specific knowledge in two steps: 1) we proposed an NER method based on the efficient conditional random fields (CRFs) model, called NER-CRF, and evaluated it on the benchmark dataset JNLPBA2004. The proposed NER method achieved 90.42% recall, 97.74% precision, and 94.30% F-measure, compared with 75.99% recall, 69.42% precision, and 72.55% F-measure for the existing method; 2) we applied the Poisson approach to construct an interpretable biological knowledge network that gives a good understanding of the global collocation properties of biological terms in the literature. Our findings yield collocations of biological terms that provide some insight into the specific biological literature.
author Supattanawaree Thipcharoen
Watshara Shoombuatong
Samerkae Somhom
Rattasit Sukahut
Jeerayut Chaijaruwanich
author_sort Supattanawaree Thipcharoen
title Constructing Biological Knowledge Base using Named Entities Recognition and Term Collocation
title_sort constructing biological knowledge base using named entities recognition and term collocation
publisher Science Faculty of Chiang Mai University
publishDate 2019
url http://it.science.cmu.ac.th/ejournal/dl.php?journal_id=6824
http://cmuir.cmu.ac.th/jspui/handle/6653943832/66125
_version_ 1681426396763652096