A spell checker for a low-resourced and morphologically rich language
Spell checking plays an important role in improving the quality of documents. Various efforts have been made to advance spell checkers for other languages, such as English, which has nearly perfected spell checking systems (e.g. Microsoft Word). However, few efforts were made to develop an...
Main Author: | Octaviano, Manolito V., Jr. |
---|---|
Format: | text |
Language: | English |
Published: | Animo Repository 2017 |
Subjects: | Machine theory; Filipino language--Orthography and spelling--Computer programs |
Online Access: | https://animorepository.dlsu.edu.ph/etd_masteral/5603 |
Institution: | De La Salle University |
id |
oai:animorepository.dlsu.edu.ph:etd_masteral-12441 |
---|---|
record_format |
eprints |
spelling |
oai:animorepository.dlsu.edu.ph:etd_masteral-12441 2024-09-09T00:41:17Z A spell checker for a low-resourced and morphologically rich language Octaviano, Manolito V., Jr. Spell checking plays an important role in improving the quality of documents. Various efforts have been made to advance spell checkers for other languages, such as English, which has nearly perfected spell checking systems (e.g. Microsoft Word). However, few efforts have been made to develop an efficient Filipino spell checker. One major challenge of existing Filipino spell checkers, being dictionary based, is the lack of a complete dictionary that captures all inflected forms (e.g. isinasama `including', isasama `will be included', and isinama `included', with the base form sama `include'), borrowings (e.g. magtex `to text' and nagtex `texted'), and code-switched forms (e.g. mag-text `to text', magte-text `will be texting', and nag-text `texted', with the base form `text') of a word. Another is that existing systems cannot handle code-switching, so valid words are marked as incorrect. In this research, a combined automaton and N-gram approach to spell checking is proposed for Filipino, a morphologically rich and low-resourced language. Furthermore, the research introduces a modification of the Metaphone algorithm, adapted to the language, for ranking the system's suggestions based on phonetics. The system achieves 81% recall, 53.64% precision, 64.53% F-measure, and 87.78% suggestion adequacy on 100 sentences taken from exercise documents of Filipino students. 2017-01-01T08:00:00Z text https://animorepository.dlsu.edu.ph/etd_masteral/5603 Master's Theses English Animo Repository Machine theory Filipino language--Orthography and spelling--Computer programs |
institution |
De La Salle University |
building |
De La Salle University Library |
continent |
Asia |
country |
Philippines |
content_provider |
De La Salle University Library |
collection |
DLSU Institutional Repository |
language |
English |
topic |
Machine theory; Filipino language--Orthography and spelling--Computer programs |
spellingShingle |
Machine theory Filipino language--Orthography and spelling--Computer programs Octaviano, Manolito V., Jr. A spell checker for a low-resourced and morphologically rich language |
description |
Spell checking plays an important role in improving the quality of documents. Various efforts have been made to advance spell checkers for other languages, such as English, which has nearly perfected spell checking systems (e.g. Microsoft Word). However, few efforts have been made to develop an efficient Filipino spell checker. One major challenge of existing Filipino spell checkers, being dictionary based, is the lack of a complete dictionary that captures all inflected forms (e.g. isinasama `including', isasama `will be included', and isinama `included', with the base form sama `include'), borrowings (e.g. magtex `to text' and nagtex `texted'), and code-switched forms (e.g. mag-text `to text', magte-text `will be texting', and nag-text `texted', with the base form `text') of a word. Another is that existing systems cannot handle code-switching, so valid words are marked as incorrect. In this research, a combined automaton and N-gram approach to spell checking is proposed for Filipino, a morphologically rich and low-resourced language. Furthermore, the research introduces a modification of the Metaphone algorithm, adapted to the language, for ranking the system's suggestions based on phonetics. The system achieves 81% recall, 53.64% precision, 64.53% F-measure, and 87.78% suggestion adequacy on 100 sentences taken from exercise documents of Filipino students. |
format |
text |
author |
Octaviano, Manolito V., Jr. |
author_facet |
Octaviano, Manolito V., Jr. |
author_sort |
Octaviano, Manolito V., Jr. |
title |
A spell checker for a low-resourced and morphologically rich language |
title_short |
A spell checker for a low-resourced and morphologically rich language |
title_full |
A spell checker for a low-resourced and morphologically rich language |
title_fullStr |
A spell checker for a low-resourced and morphologically rich language |
title_full_unstemmed |
A spell checker for a low-resourced and morphologically rich language |
title_sort |
spell checker for a low-resourced and morphologically rich language |
publisher |
Animo Repository |
publishDate |
2017 |
url |
https://animorepository.dlsu.edu.ph/etd_masteral/5603 |
_version_ |
1811611523257204736 |
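
The description in this record reports 81% recall, 53.64% precision, and a 64.53% F-measure. As a reading aid, the sketch below shows how those three figures relate, assuming the F-measure is the usual balanced F1 score (the harmonic mean of precision and recall). The record does not state which F-measure variant the thesis uses, so that is an assumption, and the function and variable names are illustrative only.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall.

    Assumption: the thesis reports F1; the record does not say so explicitly.
    """
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both scores are zero
    return 2 * precision * recall / (precision + recall)


# Figures quoted in the record's description, as fractions.
reported_precision = 0.5364
reported_recall = 0.81
reported_f_measure = 0.6453

f1 = f_measure(reported_precision, reported_recall)
print(f"F1 from reported precision and recall: {f1:.2%}")
print(f"Difference from reported F-measure: {abs(f1 - reported_f_measure):.4f}")
```

Under this assumption the computed value comes to about 64.54%, which agrees with the reported 64.53% to within rounding of the published precision and recall.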