Using Stanford part-of-speech tagger for the morphologically-rich Filipino Language

This research focuses on the implementation of a Maximum Entropy-based Part-of-Speech (POS) tagger for Filipino. It uses the Stanford POS tagger, a trainable POS tagger that has been trained on English, Chinese, Arabic, and other languages and that produces among the highest results in each language....

Bibliographic Details
Main Authors: Go, Matthew Phillip V., Nocon, Nicco Louis S.
Format: text
Published: Animo Repository 2019
Subjects: Filipino language—Parts of speech; South and Southeast Asian Languages and Societies
Online Access: https://animorepository.dlsu.edu.ph/faculty_research/484
Institution: De La Salle University
id oai:animorepository.dlsu.edu.ph:faculty_research-1483
record_format eprints
spelling oai:animorepository.dlsu.edu.ph:faculty_research-1483 2022-07-07T02:47:27Z Using Stanford part-of-speech tagger for the morphologically-rich Filipino Language Go, Matthew Phillip V. Nocon, Nicco Louis S. This research focuses on the implementation of a Maximum Entropy-based Part-of-Speech (POS) tagger for Filipino. It uses the Stanford POS tagger, a trainable POS tagger that has been trained on English, Chinese, Arabic, and other languages and that produces among the highest results in each language. The tagger was trained for Filipino on a 406k-token corpus, taking into account linguistic phenomena characteristic of Filipino, such as rich morphology and intra-sentential code-switching. The resulting Filipino POS tagger achieved 96.15% tagging accuracy, currently the highest among existing POS taggers for Filipino by a large margin. Copyright © 2017 Matthew Phillip Go and Nicco Nocon 2019-01-01T08:00:00Z text https://animorepository.dlsu.edu.ph/faculty_research/484 Faculty Research Work Animo Repository Filipino language—Parts of speech South and Southeast Asian Languages and Societies
institution De La Salle University
building De La Salle University Library
continent Asia
country Philippines
content_provider De La Salle University Library
collection DLSU Institutional Repository
topic Filipino language—Parts of speech
South and Southeast Asian Languages and Societies
spellingShingle Filipino language—Parts of speech
South and Southeast Asian Languages and Societies
Go, Matthew Phillip V.
Nocon, Nicco Louis S.
Using Stanford part-of-speech tagger for the morphologically-rich Filipino Language
description This research focuses on the implementation of a Maximum Entropy-based Part-of-Speech (POS) tagger for Filipino. It uses the Stanford POS tagger, a trainable POS tagger that has been trained on English, Chinese, Arabic, and other languages and that produces among the highest results in each language. The tagger was trained for Filipino on a 406k-token corpus, taking into account linguistic phenomena characteristic of Filipino, such as rich morphology and intra-sentential code-switching. The resulting Filipino POS tagger achieved 96.15% tagging accuracy, currently the highest among existing POS taggers for Filipino by a large margin. Copyright © 2017 Matthew Phillip Go and Nicco Nocon
format text
author Go, Matthew Phillip V.
Nocon, Nicco Louis S.
author_facet Go, Matthew Phillip V.
Nocon, Nicco Louis S.
author_sort Go, Matthew Phillip V.
title Using Stanford part-of-speech tagger for the morphologically-rich Filipino Language
title_short Using Stanford part-of-speech tagger for the morphologically-rich Filipino Language
title_full Using Stanford part-of-speech tagger for the morphologically-rich Filipino Language
title_fullStr Using Stanford part-of-speech tagger for the morphologically-rich Filipino Language
title_full_unstemmed Using Stanford part-of-speech tagger for the morphologically-rich Filipino Language
title_sort using stanford part-of-speech tagger for the morphologically-rich filipino language
publisher Animo Repository
publishDate 2019
url https://animorepository.dlsu.edu.ph/faculty_research/484
_version_ 1738854790845169664