Khmer POS Tagging Using Conditional Random Fields

© 2018, Springer Nature Singapore Pte Ltd. A transformation-based approach with a hybrid of rule-based and tri-gram methods has already been introduced for Khmer part-of-speech (POS) tagging. To explore this topic further, we present an alternative approach to Khmer POS tagging usin...

Bibliographic Details
Main Authors: Sokunsatya Sangvat, Charnyote Pluempitiwiriyawej
Other Authors: Mahidol University
Format: Conference or Workshop Item
Published: 2019
Subjects: Computer Science, Mathematics
Online Access: https://repository.li.mahidol.ac.th/handle/123456789/45669
Institution: Mahidol University
id th-mahidol.45669
record_format dspace
spelling th-mahidol.45669; 2019-08-23T18:31:01Z; Khmer POS Tagging Using Conditional Random Fields; Sokunsatya Sangvat; Charnyote Pluempitiwiriyawej; Mahidol University; Computer Science; Mathematics; 2019-08-23T10:58:35Z; 2018-01-01; Conference Paper; Communications in Computer and Information Science. Vol.781, (2018), 169-178; DOI 10.1007/978-981-10-8438-6_14; ISSN 18650929; Scopus 2-s2.0-85044073164; https://repository.li.mahidol.ac.th/handle/123456789/45669; Mahidol University; SCOPUS; https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85044073164&origin=inward
institution Mahidol University
building Mahidol University Library
continent Asia
country Thailand
content_provider Mahidol University Library
collection Mahidol University Institutional Repository
topic Computer Science
Mathematics
description © 2018, Springer Nature Singapore Pte Ltd. A transformation-based approach with a hybrid of rule-based and tri-gram methods has already been introduced for Khmer part-of-speech (POS) tagging. To explore this topic further, we present an alternative approach to Khmer POS tagging using Conditional Random Fields (CRFs). Since the features greatly affect tagging accuracy, we investigate five groups of features and use them with the CRF model. First, we study different kinds of contextual information and use them as our baseline model. We then analyze the characteristics of Khmer and derive three additional groups of language-related features: morphemes, word shapes, and named entities. We also explore the use of a lexicon as features to further improve the accuracy of our tagger. Our proposed approach has been evaluated on a corpus of 41,058 words and 27 POS tags. The comparative study shows that our proposed approach achieves accuracy competitive with other Khmer POS tagging approaches.