Constructing biological knowledge base using named entities recognition and term collocation

© 2016, Chiang Mai Journal of Science. All rights reserved. Over the last few decades, the volume of published biological literature has increased dramatically due to technological developments. Thus, it is crucial to extract information from this large body of writing by utilizing a biological...

Bibliographic Details
Main Authors: Supattanawaree Thipcharoen, Watshara Shoombuatong, Samerkae Somhom, Rattasit Sukhahuta, Jeerayut Chaijaruwanich
Other Authors: Chiang Mai University
Format: Article
Published: 2018
Subjects: Biochemistry, Genetics and Molecular Biology; Chemistry; Materials Science; Mathematics
Online Access: https://repository.li.mahidol.ac.th/handle/123456789/43179
Institution: Mahidol University
id th-mahidol.43179
record_format dspace
spelling th-mahidol.43179 2019-03-14T15:04:15Z
2018-12-11T02:23:27Z 2019-03-14T08:04:15Z 2016-01-01
Chiang Mai Journal of Science. Vol.43, No.3 (2016), 660-670
01252526
2-s2.0-84961817052
SCOPUS https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=84961817052&origin=inward
institution Mahidol University
building Mahidol University Library
continent Asia
country Thailand
content_provider Mahidol University Library
collection Mahidol University Institutional Repository
topic Biochemistry, Genetics and Molecular Biology
Chemistry
Materials Science
Mathematics
description © 2016, Chiang Mai Journal of Science. All rights reserved. Over the last few decades, the volume of published biological literature has increased dramatically due to technological developments. Thus, it is crucial to extract information from this large body of writing by utilizing a biological named entity recognition (NER) approach to automatically label the corresponding biological terms. An effective method is needed to identify biological named entities and automatically establish a specific knowledge base from the biological literature. Herein, we investigate biological information extraction for establishing specific knowledge in two steps: 1) proposing an NER method based on the efficient conditional random fields (CRFs) model, called NER-CRF, and evaluating it on the JNLPBA2004 benchmark data. The proposed NER method achieved 90.42% recall, 97.74% precision, and 94.30% F-measure, compared with 75.99% recall, 69.42% precision, and 72.55% F-measure for the existing method; 2) applying the Poisson approach to construct an interpretable biological knowledge network that helps in understanding the global collocation properties of biological terms in the literature. Our findings provide collocations of biological terms that offer some insight into the specific biological literature.
author2 Chiang Mai University
format Article
author Supattanawaree Thipcharoen
Watshara Shoombuatong
Samerkae Somhom
Rattasit Sukhahuta
Jeerayut Chaijaruwanich
title Constructing biological knowledge base using named entities recognition and term collocation
publishDate 2018
url https://repository.li.mahidol.ac.th/handle/123456789/43179