A hybrid approach in analyzing Filipino morphology

This paper presents a hybrid approach to Filipino morphological analysis that combines root word extraction with rule-based grammatical information extraction. In the hope that the machine can learn and understand the different Filipino morphological phenomena, this work introduced and compared mult...

Bibliographic Details
Main Author:	Yambao, Arian N.
Format:	text
Language:	English
Published:	Animo Repository 2021
Subjects:	Morphology Extraction (Linguistics) Computer Sciences
Online Access:	https://animorepository.dlsu.edu.ph/etdm_comsci/1 https://animorepository.dlsu.edu.ph/cgi/viewcontent.cgi?article=1005&context=etdm_comsci
Institution:	De La Salle University

id	oai:animorepository.dlsu.edu.ph:etdm_comsci-1005
record_format	eprints
spelling	oai:animorepository.dlsu.edu.ph:etdm_comsci-1005 2021-09-08T03:42:23Z A hybrid approach in analyzing Filipino morphology Yambao, Arian N. 2021-01-01T08:00:00Z text application/pdf https://animorepository.dlsu.edu.ph/etdm_comsci/1 https://animorepository.dlsu.edu.ph/cgi/viewcontent.cgi?article=1005&context=etdm_comsci Computer Science Master's Theses English Animo Repository Morphology Extraction (Linguistics) Computer Sciences Morphology
institution	De La Salle University
building	De La Salle University Library
continent	Asia
country	Philippines
content_provider	De La Salle University Library
collection	DLSU Institutional Repository
language	English
topic	Morphology Extraction (Linguistics) Computer Sciences Morphology
spellingShingle	Morphology Extraction (Linguistics) Computer Sciences Morphology Yambao, Arian N. A hybrid approach in analyzing Filipino morphology
description	This paper presents a hybrid approach to Filipino morphological analysis that combines root word extraction with rule-based grammatical information extraction. In the hope that the machine can learn and understand the different Filipino morphological phenomena, this work introduces and compares multiple neural network models and their variants, namely Feedforward Neural Networks (FFNN), Recurrent Neural Networks (RNN), and BERT, for extracting the root word of any given Tagalog word. The performance of each model was measured with accuracy and Levenshtein distance. The BERT model performed best on both the default test dataset (87.45%) and the UD-Treebanks dataset (79.11%). It was further compared with the MAGTagalog morphological analyser, which reached only 51.26% on the UD-Treebanks dataset. However, the BERT model had the largest memory requirement and the longest training time, making it slightly inefficient for rapid development. Further problems with the models' performance involve suffixation, particularly suffixes ending in 'g' and 'ng'. These suffixes are also not part of the official Tagalog morphological rules. This can be solved in future iterations of this work. All the proposed solutions performed very well in identifying the different Filipino morphological phenomena, and this work can be used in other NLP tasks through its API design. Keywords: Morphological Analysis
format	text
author	Yambao, Arian N.
author_facet	Yambao, Arian N.
author_sort	Yambao, Arian N.
title	A hybrid approach in analyzing Filipino morphology
title_short	A hybrid approach in analyzing Filipino morphology
title_full	A hybrid approach in analyzing Filipino morphology
title_fullStr	A hybrid approach in analyzing Filipino morphology
title_full_unstemmed	A hybrid approach in analyzing Filipino morphology
title_sort	hybrid approach in analyzing filipino morphology
publisher	Animo Repository
publishDate	2021
url	https://animorepository.dlsu.edu.ph/etdm_comsci/1 https://animorepository.dlsu.edu.ph/cgi/viewcontent.cgi?article=1005&context=etdm_comsci
_version_	1710755611526823936
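
The description above names accuracy and Levenshtein distance as the metrics used to score root word extraction. The following is a minimal sketch of how such scores could be computed, not the thesis's own evaluation code. The function names `levenshtein` and `evaluate` and the three Tagalog word pairs are assumptions chosen for illustration only.

```python
from typing import Sequence, Tuple


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insert, delete, substitute) via dynamic programming."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def evaluate(predicted: Sequence[str], gold: Sequence[str]) -> Tuple[float, float]:
    """Return (exact-match accuracy, mean Levenshtein distance)."""
    if len(predicted) != len(gold) or not gold:
        raise ValueError("predicted and gold must be non-empty and equal in length")
    pairs = list(zip(predicted, gold))
    accuracy = sum(p == g for p, g in pairs) / len(pairs)
    mean_distance = sum(levenshtein(p, g) for p, g in pairs) / len(pairs)
    return accuracy, mean_distance


# Hypothetical example: gold roots for "kumain", "bumili", "tumakbo",
# with one illustrative model error in the last prediction.
gold_roots = ["kain", "bili", "takbo"]
predicted_roots = ["kain", "bili", "takb"]
accuracy, mean_distance = evaluate(predicted_roots, gold_roots)
print(f"accuracy={accuracy:.2f}, mean Levenshtein distance={mean_distance:.2f}")
```

Exact-match accuracy treats a near miss such as "takb" as a full error, while the mean edit distance shows how far off the prediction was, which is presumably why the two are reported together.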

Similar Items