Constructing biological knowledge base using named entities recognition and term collocation
© 2016, Chiang Mai Journal of Science. All rights reserved. Over the last few decades, the publishing of biological literature has dramatically increased due to technological developments. Thus, a crucial process is to extract information from this large number of writings by utilizing a biological...
Main Authors: | Supattanawaree Thipcharoen, Watshara Shoombuatong, Samerkae Somhom, Rattasit Sukhahuta, Jeerayut Chaijaruwanich |
---|---|
Format: | Journal |
Published: | 2018 |
Subjects: | Biochemistry, Genetics and Molecular Biology; Chemistry; Materials Science; Mathematics; Physics and Astronomy |
Online Access: | https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=84961817052&origin=inward http://cmuir.cmu.ac.th/jspui/handle/6653943832/55292 |
Institution: | Chiang Mai University |
id |
th-cmuir.6653943832-55292 |
---|---|
record_format |
dspace |
spelling |
th-cmuir.6653943832-55292; 2018-09-05T03:14:20Z; Constructing biological knowledge base using named entities recognition and term collocation; Supattanawaree Thipcharoen; Watshara Shoombuatong; Samerkae Somhom; Rattasit Sukhahuta; Jeerayut Chaijaruwanich; Biochemistry, Genetics and Molecular Biology; Chemistry; Materials Science; Mathematics; Physics and Astronomy; 2018-09-05T02:54:06Z; 2018-09-05T02:54:06Z; 2016-01-01; Journal; 01252526; 2-s2.0-84961817052; https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=84961817052&origin=inward http://cmuir.cmu.ac.th/jspui/handle/6653943832/55292 |
institution |
Chiang Mai University |
building |
Chiang Mai University Library |
country |
Thailand |
collection |
CMU Intellectual Repository |
topic |
Biochemistry, Genetics and Molecular Biology; Chemistry; Materials Science; Mathematics; Physics and Astronomy |
description |
© 2016, Chiang Mai Journal of Science. All rights reserved. Over the last few decades, the volume of published biological literature has increased dramatically owing to technological developments. Extracting information from this large body of text is therefore crucial, and a biological named entity recognition (NER) approach can be used to label the corresponding biological terms automatically. An effective method is needed both to identify biological named entities and to build a specific knowledge base from the biological literature automatically. Herein, we investigated biological information extraction for establishing such a knowledge base in two steps: 1) proposing an NER method based on the efficient conditional random fields (CRFs) model, called NER-CRF, and evaluating it on the benchmark data set (JNLPBA2004). The proposed NER method achieved 90.42% recall, 97.74% precision, and 94.30% F-measure, compared with 75.99% recall, 69.42% precision, and 72.55% F-measure for the existing method; 2) applying the Poisson approach to construct an interpretable biological knowledge network that conveys the global collocation properties of biological terms in the literature. The resulting collocations of biological terms provide some insight into the specific biological literature. |
format |
Journal |
author |
Supattanawaree Thipcharoen; Watshara Shoombuatong; Samerkae Somhom; Rattasit Sukhahuta; Jeerayut Chaijaruwanich |
author_sort |
Supattanawaree Thipcharoen |
title |
Constructing biological knowledge base using named entities recognition and term collocation |
title_sort |
constructing biological knowledge base using named entities recognition and term collocation |
publishDate |
2018 |
url |
https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=84961817052&origin=inward http://cmuir.cmu.ac.th/jspui/handle/6653943832/55292 |
_version_ |
1681424478681169920 |
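
The description field mentions two quantitative ideas: the F-measure used to score the NER method, and a Poisson approach to term collocation. The record gives no implementation details, so the following is only a minimal Python sketch under stated assumptions. It assumes the F-measure is the usual harmonic mean of precision and recall, and that collocation is scored by the Poisson tail probability of the observed co-occurrence count, with the expected rate taken from an independence assumption. The function names (`f_measure`, `poisson_tail`, `collocation_pvalues`) and the toy sentences are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter
from itertools import combinations


def f_measure(precision, recall):
    """Harmonic mean of precision and recall (both given as fractions in [0, 1])."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def poisson_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam), computed as 1 minus the CDF at k - 1."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam)      # P(X = 0)
    cdf = term
    for i in range(1, k):      # accumulate P(X = 1) .. P(X = k - 1)
        term *= lam / i
        cdf += term
    return max(0.0, 1.0 - cdf)


def collocation_pvalues(sentences):
    """Score each co-occurring term pair by its Poisson tail probability.

    `sentences` is a list of lists of entity strings (assumed to be NER output).
    Assumption: under independence, the expected number of sentences that
    contain both terms is count(a) * count(b) / number_of_sentences.
    """
    n = len(sentences)
    term_counts = Counter()
    pair_counts = Counter()
    for terms in sentences:
        unique = sorted(set(terms))          # count each term once per sentence
        term_counts.update(unique)
        pair_counts.update(combinations(unique, 2))
    pvalues = {}
    for (a, b), observed in pair_counts.items():
        expected = term_counts[a] * term_counts[b] / n
        pvalues[(a, b)] = poisson_tail(observed, expected)
    return pvalues


# Hypothetical toy input: entities found in four sentences.
toy_sentences = [
    ["IL-2", "NF-kappa B"],
    ["IL-2", "NF-kappa B"],
    ["IL-2"],
    ["T cell"],
]
toy_pvalues = collocation_pvalues(toy_sentences)
baseline_f = f_measure(precision=0.6942, recall=0.7599)
```

In a sketch like this, pairs with a small tail probability co-occur more often than independence would predict and could become edges of a knowledge network. Reported F-measures may be aggregated across entity classes, so they need not equal the harmonic mean of the aggregate precision and recall.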