NormAPI: An API for normalizing Filipino shortcut texts

As the number of Internet and mobile phone users grows, texting and chatting have become popular means of communication. The extensive use of cellphones and the Internet has led to the creation of a new language in which words are transformed and shortened using various styles. Shor...

Bibliographic Details
Main Authors: Cuevas, Justin Gems G., Magat, Enrico Darwin S., Nocon, Nicco Louis S., Suministrado, Peter Gabriel D.
Format: text
Language: English
Published: Animo Repository 2014
Online Access: https://animorepository.dlsu.edu.ph/etd_bachelors/11826
Institution: De La Salle University
id oai:animorepository.dlsu.edu.ph:etd_bachelors-12471
record_format eprints
spelling oai:animorepository.dlsu.edu.ph:etd_bachelors-12471 2022-07-07T02:51:32Z 2014-01-01T08:00:00Z text https://animorepository.dlsu.edu.ph/etd_bachelors/11826 Bachelor's Theses English Animo Repository
institution De La Salle University
building De La Salle University Library
continent Asia
country Philippines
content_provider De La Salle University Library
collection DLSU Institutional Repository
language English
description As the number of Internet and mobile phone users grows, texting and chatting have become popular means of communication. The extensive use of cellphones and the Internet has led to the creation of a new language in which words are transformed and shortened using various styles. Shortcut texting is used all over the world, and in recent years numerous researchers have created normalization systems for different languages that transform shortcut texts back into their original forms. This research designed techniques and developed NormAPI, a system that normalizes Filipino shortcut texts. Focused on the modern Filipino language, which includes code-switching, the system primarily contributes to Natural Language Processing (NLP) research as a preprocessing system that corrects informalities in shortcut texts before they are handed off for complete data processing. It offers four normalization variations, namely Dictionary Substitution Approach (DSA), Statistical Machine Translation (SMT), SMT after DSA, and SMT before DSA, with BLEU scores of 0.68384, 0.79650, 0.75634, and 0.80750, respectively. Additionally, users can set the dictionary, generate language models, obtain BLEU scores, and more, according to their preferences.
format text
author Cuevas, Justin Gems G.
Magat, Enrico Darwin S.
Nocon, Nicco Louis S.
Suministrado, Peter Gabriel D.
author_sort Cuevas, Justin Gems G.
title NormAPI: An API for normalizing Filipino shortcut texts
title_sort normapi: an api for normalizing filipino shortcut texts
publisher Animo Repository
publishDate 2014
url https://animorepository.dlsu.edu.ph/etd_bachelors/11826
_version_ 1738854802142527488