A stemming algorithm for Tagalog words

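Tag-SA, a Tagalog Stemming Algorithm, was developed to handle all forms of Tagalog words. It can be used for morphological analysis to derive root words, and it can also be applied to information retrieval (IR) to conflate different word forms to a common canonical form. It uses the principle of iterative affix removal and is context-sensitive. The system was tested and evaluated by error counting on 6,382 word variants drawn from three sources (duplicates included). The resulting understemming error of less than 15% and overstemming error of less than 0.005% indicate that Tag-SA performs well.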

Bibliographic Details
Main Author: Bonus, Don Erick J.
Format: text
Language: English
Published: Animo Repository 2003
Subjects: Computer algorithms; Tagalog language; Word processing
Online Access: https://animorepository.dlsu.edu.ph/etd_masteral/3111
Institution: De La Salle University
id oai:animorepository.dlsu.edu.ph:etd_masteral-9949
record_format eprints
spelling oai:animorepository.dlsu.edu.ph:etd_masteral-9949 2023-05-24T00:25:44Z Master's Theses
institution De La Salle University
building De La Salle University Library
continent Asia
country Philippines
content_provider De La Salle University Library
collection DLSU Institutional Repository
language English
topic Computer algorithms
Tagalog language
Word processing
description Tag-SA, a Tagalog Stemming Algorithm, was developed to handle all forms of Tagalog words. It can be used for morphological analysis to derive root words, and it can also be applied to information retrieval (IR) to conflate different word forms to a common canonical form. It uses the principle of iterative affix removal and is context-sensitive. The system was tested and evaluated by error counting on 6,382 word variants drawn from three sources (duplicates included). The resulting understemming error of less than 15% and overstemming error of less than 0.005% indicate that Tag-SA performs well.
format text
author Bonus, Don Erick J.
title A stemming algorithm for Tagalog words
publisher Animo Repository
publishDate 2003
url https://animorepository.dlsu.edu.ph/etd_masteral/3111
_version_ 1767197074356436992
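
The description names two techniques without detailing them: iterative, context-sensitive affix removal, and evaluation by counting understemming and overstemming errors. The Python sketch below illustrates the general idea only. It is not Tag-SA's rule set: the affix lists, the minimum stem length, the context conditions, and the names strip_one_affix, stem, and count_errors are all assumptions made for this example, and the per-word error count is one simple way to tally the two error types, not necessarily the procedure used in the thesis.

```python
# Illustrative sketch only: these affix lists and rules are assumptions,
# not the actual rules of Tag-SA.
VOWELS = "aeiou"
MIN_STEM = 3  # assumed minimum length of what remains after stripping
PREFIXES = ("nag", "mag", "pag", "ma", "na")  # longest first
INFIXES = ("um", "in")
SUFFIXES = ("han", "hin", "an", "in")  # h-forms first


def strip_one_affix(word):
    """Remove at most one affix; return the word unchanged if no rule applies."""
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) - len(prefix) >= MIN_STEM:
            return word[len(prefix):]
    for infix in INFIXES:
        end = 1 + len(infix)
        # Context check: the infix must follow an initial consonant and
        # be followed by a vowel (s-um-ulat, k-in-ain, but not tinda).
        if (len(word) - len(infix) >= MIN_STEM
                and word[0] not in VOWELS
                and word[1:end] == infix
                and word[end] in VOWELS):
            return word[0] + word[end:]
    for suffix in SUFFIXES:
        if not word.endswith(suffix):
            continue
        rest = word[:-len(suffix)]
        if len(rest) < MIN_STEM:
            continue
        # Context check: -han/-hin are only stripped after a vowel.
        if suffix[0] == "h" and rest[-1] not in VOWELS:
            continue
        return rest
    return word


def stem(word):
    """Strip affixes repeatedly until no rule fires (iterative removal)."""
    word = word.lower()
    while True:
        shorter = strip_one_affix(word)
        if shorter == word:
            return word
        word = shorter


def count_errors(gold):
    """Count (understemmed, overstemmed) words against known roots.

    A stem longer than the root still carries an affix (understemming);
    any other mismatch is counted as overstemming.
    """
    under = over = 0
    for word, root in gold.items():
        result = stem(word)
        if result == root:
            continue
        if len(result) > len(root):
            under += 1
        else:
            over += 1
    return under, over
```

With these toy rules, stem("sinulatan") takes two passes (infix -in-, then suffix -an) to reach "sulat". The sketch leaves "susulat" untouched because it has no reduplication rule, which count_errors records as an understemming error, and it wrongly strips "ma" from "mahal", which is recorded as an overstemming error.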