A stemming algorithm for Tagalog words

Bibliographic Details
Main Author:	Bonus, Don Erick J.
Format:	text
Language:	English
Published:	Animo Repository 2003
Subjects:	Computer algorithms; Tagalog language; Word processing
Online Access:	https://animorepository.dlsu.edu.ph/etd_masteral/3111
Institution:	De La Salle University

Description
Summary:	Tag-SA, a Tagalog Stemming Algorithm, was developed for all forms of Tagalog words. It can be used for morphological analysis to derive root words, and it can also be applied to information retrieval (IR) to conflate different word forms to a common canonical form. It uses the principle of iterative affix removal and is context-sensitive. The system was tested and evaluated by error counting, using 6,382 word variants derived from three sources (duplicates included). The resulting understemming error of less than 15% and overstemming error of less than 0.005% indicate that Tag-SA performs well.
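
As a rough illustration of what "iterative affix removal" can look like in practice, the sketch below strips one affix per pass and repeats until nothing more can be removed. It is not the Tag-SA rule set: the affix lists, the function names `strip_once` and `stem`, and the `MIN_STEM` length check (a crude stand-in for the context-sensitive conditions the summary mentions) are all assumptions made for this example.

```python
# Illustrative affix lists only; NOT the rules used by Tag-SA.
PREFIXES = ("nag", "mag", "pag", "ma", "na", "um")
INFIXES = ("um", "in")
SUFFIXES = ("han", "hin", "an", "in")
VOWELS = "aeiou"
MIN_STEM = 4  # assumed guard: never strip if fewer letters would remain


def strip_once(word):
    """Remove at most one affix; return the word unchanged if none applies."""
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) - len(prefix) >= MIN_STEM:
            return word[len(prefix):]
    # Infix directly after an initial consonant, e.g. s-um-ulat -> sulat.
    for infix in INFIXES:
        if (word[:1] not in VOWELS and word[1:].startswith(infix)
                and len(word) - len(infix) >= MIN_STEM):
            return word[0] + word[1 + len(infix):]
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM:
            return word[:-len(suffix)]
    return word


def stem(word):
    """Apply strip_once repeatedly until the word stops changing."""
    word = word.lower()
    while True:
        shorter = strip_once(word)
        if shorter == word:
            return word
        word = shorter


for form in ("sumulat", "nagsulat", "sulatan", "basahin", "mahal"):
    print(form, "->", stem(form))
```

Without the length guard, a word such as "mahal" would lose its initial "ma" and be overstemmed, which hints at why a real stemmer needs context-sensitive conditions rather than plain string matching.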

Similar Items