A spell checker for a low-resourced and morphologically rich language

Spell checking plays an important role in improving the quality of documents. Various efforts have been made to advance spell checkers for other languages, such as English, which has an almost perfected spell checking system (e.g. Microsoft Word). However, few efforts were made to develop an...

Main Author: Octaviano, Manolito V., Jr.
Format: text
Language: English
Published: Animo Repository 2017
Subjects: Machine theory; Filipino language--Orthography and spelling--Computer programs
Online Access: https://animorepository.dlsu.edu.ph/etd_masteral/5603
Institution: De La Salle University
id oai:animorepository.dlsu.edu.ph:etd_masteral-12441
record_format eprints
spelling oai:animorepository.dlsu.edu.ph:etd_masteral-12441 2024-09-09T00:41:17Z A spell checker for a low-resourced and morphologically rich language Octaviano, Manolito V., Jr. 2017-01-01T08:00:00Z text https://animorepository.dlsu.edu.ph/etd_masteral/5603 Master's Theses English Animo Repository Machine theory Filipino language--Orthography and spelling--Computer programs
institution De La Salle University
building De La Salle University Library
continent Asia
country Philippines
content_provider De La Salle University Library
collection DLSU Institutional Repository
language English
topic Machine theory
Filipino language--Orthography and spelling--Computer programs
description Spell checking plays an important role in improving the quality of documents. Various efforts have been made to advance spell checkers for other languages, such as English, which has an almost perfected spell checking system (e.g. Microsoft Word). However, few efforts have been made to develop an efficient Filipino spell checker. One major challenge of existing Filipino spell checkers, being dictionary based, is the lack of a complete dictionary that captures all inflected forms (e.g. isinasama `including', isasama `will be included', and isinama `included', with the base form sama `include'), borrowings (e.g. magtex `to text' and nagtex `texted'), and code-switched forms (e.g. mag-text `to text', magte-text `will be texting', and nag-text `texted', with the base form `text') of a word. Another is that existing systems cannot handle code-switching, so valid words are marked as incorrect. In this research, a combined automaton and N-gram approach to spell checking is proposed for Filipino, a morphologically rich and low-resourced language. Furthermore, the research introduces a modification of the Metaphone algorithm, adapted to the language, for ranking the system's suggestions based on phonetics. The system achieves 81% recall, 53.64% precision, 64.53% F-measure, and 87.78% suggestion adequacy on 100 sentences taken from exercise documents of Filipino students.
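The reported figures can be cross-checked against each other, since an F-measure is normally the harmonic mean of precision and recall. The following is a minimal sketch, assuming the balanced F1 form (the abstract does not say which F-measure variant was used); the function and variable names are illustrative and do not come from the thesis.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures reported in the abstract, expressed as fractions.
reported_precision = 0.5364
reported_recall = 0.81
reported_f = 0.6453

# Assumption: the thesis used F1; other beta weightings would give other values.
computed_f = f_measure(reported_precision, reported_recall)
difference = abs(computed_f - reported_f)

print(f"computed F1 = {computed_f:.4f}, reported F-measure = {reported_f:.4f}")
print(f"absolute difference = {difference:.4f}")
```

Under this assumption the computed value agrees with the reported F-measure to within rounding, which is consistent with the recall having been rounded to a whole percentage.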
format text
author Octaviano, Manolito V., Jr.
title A spell checker for a low-resourced and morphologically rich language
title_sort spell checker for a low-resourced and morphologically rich language
publisher Animo Repository
publishDate 2017
url https://animorepository.dlsu.edu.ph/etd_masteral/5603
_version_ 1811611523257204736