Using Stanford part-of-speech tagger for the morphologically-rich Filipino Language
This research focuses on the implementation of a Maximum Entropy-based Part-of-Speech (POS) tagger for Filipino. It uses the Stanford POS tagger, a trainable POS tagger that has been trained on English, Chinese, Arabic, and other languages and has produced some of the highest results in each language....
Main Authors: | Go, Matthew Phillip V.; Nocon, Nicco Louis S. |
---|---|
Format: | text |
Published: | Animo Repository 2019 |
Subjects: | Filipino language—Parts of speech; South and Southeast Asian Languages and Societies |
Online Access: | https://animorepository.dlsu.edu.ph/faculty_research/484 |
Institution: | De La Salle University |
id |
oai:animorepository.dlsu.edu.ph:faculty_research-1483 |
---|---|
record_format |
eprints |
spelling |
oai:animorepository.dlsu.edu.ph:faculty_research-1483; 2022-07-07T02:47:27Z; Using Stanford part-of-speech tagger for the morphologically-rich Filipino Language; Go, Matthew Phillip V.; Nocon, Nicco Louis S.; This research focuses on the implementation of a Maximum Entropy-based Part-of-Speech (POS) tagger for Filipino. It uses the Stanford POS tagger, a trainable POS tagger that has been trained on English, Chinese, Arabic, and other languages and has produced some of the highest results in each language. The tagger was trained for Filipino on a 406k-token corpus, taking into account distinctive Filipino linguistic phenomena such as rich morphology and intra-sentential code-switching. The resulting Filipino POS tagger achieved 96.15% tagging accuracy, which is currently the highest accuracy among existing POS taggers for Filipino, by a large margin. Copyright © 2017 Matthew Phillip Go and Nicco Nocon; 2019-01-01T08:00:00Z; text; https://animorepository.dlsu.edu.ph/faculty_research/484; Faculty Research Work; Animo Repository; Filipino language—Parts of speech; South and Southeast Asian Languages and Societies |
institution |
De La Salle University |
building |
De La Salle University Library |
continent |
Asia |
country |
Philippines |
content_provider |
De La Salle University Library |
collection |
DLSU Institutional Repository |
topic |
Filipino language—Parts of speech; South and Southeast Asian Languages and Societies |
description |
This research focuses on the implementation of a Maximum Entropy-based Part-of-Speech (POS) tagger for Filipino. It uses the Stanford POS tagger, a trainable POS tagger that has been trained on English, Chinese, Arabic, and other languages and has produced some of the highest results in each language. The tagger was trained for Filipino on a 406k-token corpus, taking into account distinctive Filipino linguistic phenomena such as rich morphology and intra-sentential code-switching. The resulting Filipino POS tagger achieved 96.15% tagging accuracy, which is currently the highest accuracy among existing POS taggers for Filipino, by a large margin. Copyright © 2017 Matthew Phillip Go and Nicco Nocon |
format |
text |
author |
Go, Matthew Phillip V.; Nocon, Nicco Louis S. |
author_facet |
Go, Matthew Phillip V.; Nocon, Nicco Louis S. |
author_sort |
Go, Matthew Phillip V. |
title |
Using Stanford part-of-speech tagger for the morphologically-rich Filipino Language |
title_sort |
using stanford part-of-speech tagger for the morphologically-rich filipino language |
publisher |
Animo Repository |
publishDate |
2019 |
url |
https://animorepository.dlsu.edu.ph/faculty_research/484 |
_version_ |
1738854790845169664 |