A Tagalog morphological analyzer using example-based approach

Example-based morphological analyzer (MA) approaches learn a language's morphology from a set of examples. Research in this area has been developed to address the time-consuming and costly development of rule-based MAs. But most research in this area centers on concatenative morphology, and little work has been done...

Bibliographic Details
Main Author: See, Solomon Lim
Format: text
Language: English
Published: Animo Repository 2006
Subjects: Grammar, Comparative and general--Morphology
Online Access: https://animorepository.dlsu.edu.ph/etd_masteral/3395
Institution: De La Salle University
id oai:animorepository.dlsu.edu.ph:etd_masteral-10233
record_format eprints
institution De La Salle University
building De La Salle University Library
continent Asia
country Philippines
content_provider De La Salle University Library
collection DLSU Institutional Repository
language English
topic Grammar, Comparative and general--Morphology
description Example-based morphological analyzer (MA) approaches learn a language's morphology from a set of examples. Research in this area has been developed to address the time-consuming and costly development of rule-based MAs. But most research in this area centers on concatenative morphology, and little work has been done on non-concatenative morphology because of its complexity. Tagalog is an example of a language that exhibits non-concatenative morphology. Some example-based MAs that can handle such morphologies incorrectly model the morphological phenomena of infixation and reduplication. An example-based MA that learns string rewrite rules from a word pair was developed to handle the different morphological phenomena in Tagalog, namely prefixation, infixation, suffixation, circumfixation, internal vowel changes, and partial and whole-word reduplication, as well as its morphotactic rules. The model was evaluated against a Filipino lexicon because the language is composed mainly of Tagalog words, adapts Tagalog morphology, and is commonly used in the Philippines. The model was tested using ten-fold cross-validation with 40,272 word pairs. The developed model performs better on words exhibiting infixation and reduplication and has an accuracy of 90% for both derivational and inflectional morphology, up from 88% using the original model. The analysis time, on the other hand, increased from 11 minutes using the original model to 35 minutes using the developed model. The developed model can be used to discover affixes and their associated morphological categories for other languages that exhibit the same morphological phenomena. The current limitation of the model is that it is unable to properly model agglutination; a solution that considers syllabication and phonology for word alignment is recommended to further improve its performance.
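To make the idea of learning a rewrite rule from a word pair more concrete, here is a minimal sketch in Python. It is not the thesis's algorithm: the `Rule` type, the `learn_rule` function, and the rule labels are hypothetical names chosen for illustration, and the alignment is plain string matching. It guesses one affixation or reduplication pattern per (root, derived) pair and does not attempt internal vowel changes or morphotactic rules.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    """Hypothetical container for one learned rewrite rule."""
    kind: str      # e.g. "prefix", "infix", "suffix"
    material: str  # the affix or the copied segment


def learn_rule(root: str, derived: str) -> Rule:
    """Guess a single rewrite rule that maps root to derived (illustrative only)."""
    if derived == root:
        return Rule("identity", "")
    # Whole-word reduplication, e.g. araw -> araw-araw.
    if derived.replace("-", "") == root + root:
        return Rule("full_reduplication", root)
    # The root survives intact at the end: prefixation or partial reduplication.
    if derived.endswith(root):
        prefix = derived[: len(derived) - len(root)]
        # Added material that copies the start of the root is a reduplicant.
        if root.startswith(prefix):
            return Rule("partial_reduplication", prefix)
        return Rule("prefix", prefix)
    # The root survives intact at the start: suffixation.
    if derived.startswith(root):
        return Rule("suffix", derived[len(root):])
    # Infixation: the root is split once and material is inserted at the split.
    if len(derived) > len(root):
        for i in range(1, len(root)):
            head, tail = root[:i], root[i:]
            if derived.startswith(head) and derived.endswith(tail):
                return Rule("infix", derived[i: len(derived) - len(tail)])
    # Circumfixation: the intact root is surrounded by material on both sides.
    idx = derived.find(root)
    if idx > 0:
        return Rule("circumfix", derived[:idx] + "+" + derived[idx + len(root):])
    return Rule("unknown", "")


if __name__ == "__main__":
    pairs = [
        ("sulat", "sumulat"), ("basa", "magbasa"), ("sulat", "sulatin"),
        ("sulat", "susulat"), ("araw", "araw-araw"), ("ganda", "kagandahan"),
    ]
    for root, derived in pairs:
        print(root, "->", derived, learn_rule(root, derived))
```

Under these assumptions, the pair sulat/sumulat yields an infix rule with material "um", and sulat/susulat yields a partial reduplication of "su". A real analyzer would also need to generalize rules across many pairs and resolve competing analyses, which this sketch leaves out.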
format text
author See, Solomon Lim
title A Tagalog morphological analyzer using example-based approach
publisher Animo Repository
publishDate 2006
url https://animorepository.dlsu.edu.ph/etd_masteral/3395
_version_ 1775631135014912000