A spell checker for a low-resourced and morphologically rich language

Spell checking improves document quality by identifying misspelled words. Spell checkers for other languages have advanced considerably; English, for example, has nearly perfected spell checking systems (e.g. Microsoft Wor...

Bibliographic Details
Main Authors: Octaviano, Manolito; Borra, Allan
Format: text
Published: Animo Repository 2017
Subjects: Filipino language--Orthography and spelling--Data processing; Spelling errors; Computer Sciences
Online Access: https://animorepository.dlsu.edu.ph/faculty_research/3634
https://animorepository.dlsu.edu.ph/context/faculty_research/article/4636/type/native/viewcontent/TENCON.2017.8228160
Institution: De La Salle University
id oai:animorepository.dlsu.edu.ph:faculty_research-4636
record_format eprints
spelling oai:animorepository.dlsu.edu.ph:faculty_research-4636 2021-09-20T07:10:00Z A spell checker for a low-resourced and morphologically rich language Octaviano, Manolito Borra, Allan 2017-12-19T08:00:00Z text text/html info:doi/10.1109/TENCON.2017.8228160 Faculty Research Work Animo Repository
institution De La Salle University
building De La Salle University Library
continent Asia
country Philippines
content_provider De La Salle University Library
collection DLSU Institutional Repository
topic Filipino language--Orthography and spelling--Data processing
Spelling errors
Computer Sciences
description Spell checking improves document quality by identifying misspelled words. Spell checkers for other languages have advanced considerably; English, for example, has nearly perfected spell checking systems (e.g. Microsoft Word). However, few efforts have been made to develop an efficient Filipino spell checker. One major challenge for existing Filipino spell checkers, which are dictionary-based, is the lack of a complete dictionary that captures all inflected forms of a word (e.g. isinasama 'including', isasama 'will be included', and isinama 'included', with the base form sama 'include'), borrowings (e.g. magtex 'to text' and nagtex 'texted'), and code-switching (e.g. magtext 'to text' and nag-text 'texted', with the base form 'text'). In addition, existing systems cannot handle code-switching, so valid words are marked as erroneous. In this research, a spell checker is designed for Filipino, a low-resourced and morphologically rich language. It detects and corrects typographical errors and introduces a modified version of the Metaphone algorithm for ranking candidate suggestions. The system achieves 81% recall, 53.64% precision, 64.53% F-measure, and 87.78% suggestion adequacy on 100 sentences taken from exercise documents of Filipino students. © 2017 IEEE.
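The record does not describe how the modified Metaphone ranking works, so the following is only an illustrative sketch of the general idea of ranking dictionary candidates by sound before spelling distance. Everything in it is an assumption made for illustration: the toy `phonetic_key` function is not Metaphone and not the authors' modification, the letter-merging table is invented, and the five-word `LEXICON` simply reuses example words from the description.

```python
# Illustrative sketch only: a toy phonetic key, NOT Metaphone and NOT the
# authors' modified algorithm. All names and rules here are hypothetical.

# Assumed merges of similar-sounding letters (invented for this sketch).
SOUND_ALIKE = str.maketrans({"c": "k", "q": "k", "z": "s", "x": "s", "f": "p", "v": "b"})


def phonetic_key(word: str) -> str:
    """Lowercase, merge sound-alike letters, collapse doubled letters,
    then drop every vowel except a leading one."""
    letters = "".join(ch for ch in word.lower() if ch.isalpha()).translate(SOUND_ALIKE)
    collapsed = []
    for ch in letters:
        if not collapsed or collapsed[-1] != ch:
            collapsed.append(ch)
    if not collapsed:
        return ""
    return collapsed[0] + "".join(ch for ch in collapsed[1:] if ch not in "aeiou")


def edit_distance(a: str, b: str) -> int:
    """Standard Levenshtein distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = curr
    return prev[-1]


def rank_candidates(word: str, lexicon: list[str], max_distance: int = 2) -> list[str]:
    """Keep entries within max_distance edits, then sort so that entries
    sharing the word's phonetic key come first, ties broken by distance."""
    target = phonetic_key(word)
    scored = []
    for entry in lexicon:
        distance = edit_distance(word.lower(), entry.lower())
        if distance <= max_distance:
            scored.append((phonetic_key(entry) != target, distance, entry))
    return [entry for _, _, entry in sorted(scored)]


# Tiny hypothetical lexicon built from words quoted in the description.
LEXICON = ["isinasama", "isasama", "isinama", "sama", "magtext"]

if __name__ == "__main__":
    print(rank_candidates("isinasma", LEXICON))
```

In this sketch, two candidates at the same edit distance from a misspelling are separated by whether they share its phonetic key, which is the kind of tie-breaking a phonetic ranking step can provide; a real system would also need the morphological and code-switching handling that the description identifies as the main challenge.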
format text
author Octaviano, Manolito
Borra, Allan
title A spell checker for a low-resourced and morphologically rich language
publisher Animo Repository
publishDate 2017
url https://animorepository.dlsu.edu.ph/faculty_research/3634
https://animorepository.dlsu.edu.ph/context/faculty_research/article/4636/type/native/viewcontent/TENCON.2017.8228160