A corpus based-Filipino grammar checker using hybrid N-gram rules from grammatically-correct terms

This study examines a corpus-based approach to detecting grammatical errors and suggesting corrections for the Filipino language. Prior to this study, the approach had not been applied to the target language, although it had shown high potential for error detection and c...

Bibliographic Details
Main Author:	Go, Matthew Phillip
Format:	text
Language:	English
Published:	Animo Repository 2016
Subjects:	Filipino language > Grammar; Filipino language; Filipino language > Study and teaching
Online Access:	https://animorepository.dlsu.edu.ph/etd_masteral/5335
Institution:	De La Salle University

id	oai:animorepository.dlsu.edu.ph:etd_masteral-12173
record_format	eprints
spelling	oai:animorepository.dlsu.edu.ph:etd_masteral-12173 2024-10-30T05:46:13Z A corpus based-Filipino grammar checker using hybrid N-gram rules from grammatically-correct terms Go, Matthew Phillip This study examines a corpus-based approach to detecting grammatical errors and suggesting corrections for the Filipino language. Prior to this study, the approach had not been applied to the target language, although it had shown high potential for error detection and correction in other languages. Currently, Filipino grammar checker systems are limited and are mostly rule-based. One major concern with existing systems of this type is that they can only detect errors that were defined by the system, which results in a very limited set of error types. The proposed approach, being corpus-based, learns grammar rules from a grammatically correct, tagged corpus; these rules are then used to detect errors and provide suggestions. The grammar rules, which are hybrid n-grams, are composed of words, part-of-speech tags, and lemmas. Input sentences are compared against these grammar rules using a weighted Levenshtein edit distance algorithm to identify whether there is an error. Using this approach, the following correction types can be suggested: insertion, deletion, substitution, merging, and unmerging. The approach also covers a broad range of error types, such as incorrect affixes, misspellings, wrong word usage, missing words, unnecessary words, incorrectly merged words, and incorrectly unmerged words. The developed system scored 64.11% in producing correct suggestions for 248 test phrases containing spelling/grammar errors and 70.95% accuracy in flagging error-free words in a corpus of 1,284 error-free words, using only a small training corpus of 7,384 complex sentences.
2016-01-01T08:00:00Z text https://animorepository.dlsu.edu.ph/etd_masteral/5335 Master's Theses English Animo Repository Filipino language--Grammar Filipino language Filipino language--Study and teaching
institution	De La Salle University
building	De La Salle University Library
continent	Asia
country	Philippines
content_provider	De La Salle University Library
collection	DLSU Institutional Repository
language	English
topic	Filipino language--Grammar; Filipino language; Filipino language--Study and teaching
spellingShingle	Filipino language--Grammar Filipino language Filipino language--Study and teaching Go, Matthew Phillip A corpus based-Filipino grammar checker using hybrid N-gram rules from grammatically-correct terms
description	This study examines a corpus-based approach to detecting grammatical errors and suggesting corrections for the Filipino language. Prior to this study, the approach had not been applied to the target language, although it had shown high potential for error detection and correction in other languages. Currently, Filipino grammar checker systems are limited and are mostly rule-based. One major concern with existing systems of this type is that they can only detect errors that were defined by the system, which results in a very limited set of error types. The proposed approach, being corpus-based, learns grammar rules from a grammatically correct, tagged corpus; these rules are then used to detect errors and provide suggestions. The grammar rules, which are hybrid n-grams, are composed of words, part-of-speech tags, and lemmas. Input sentences are compared against these grammar rules using a weighted Levenshtein edit distance algorithm to identify whether there is an error. Using this approach, the following correction types can be suggested: insertion, deletion, substitution, merging, and unmerging. The approach also covers a broad range of error types, such as incorrect affixes, misspellings, wrong word usage, missing words, unnecessary words, incorrectly merged words, and incorrectly unmerged words. The developed system scored 64.11% in producing correct suggestions for 248 test phrases containing spelling/grammar errors and 70.95% accuracy in flagging error-free words in a corpus of 1,284 error-free words, using only a small training corpus of 7,384 complex sentences.
format	text
author	Go, Matthew Phillip
author_facet	Go, Matthew Phillip
author_sort	Go, Matthew Phillip
title	A corpus based-Filipino grammar checker using hybrid N-gram rules from grammatically-correct terms
title_short	A corpus based-Filipino grammar checker using hybrid N-gram rules from grammatically-correct terms
title_full	A corpus based-Filipino grammar checker using hybrid N-gram rules from grammatically-correct terms
title_fullStr	A corpus based-Filipino grammar checker using hybrid N-gram rules from grammatically-correct terms
title_full_unstemmed	A corpus based-Filipino grammar checker using hybrid N-gram rules from grammatically-correct terms
title_sort	corpus based-filipino grammar checker using hybrid n-gram rules from grammatically-correct terms
publisher	Animo Repository
publishDate	2016
url	https://animorepository.dlsu.edu.ph/etd_masteral/5335
_version_	1814781380387667968
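
The description field in this record says that input sentences are compared against hybrid n-gram rules (words, part-of-speech tags, and lemmas) using a weighted Levenshtein edit distance. The sketch below illustrates that general idea at the token level only. It is not the thesis's implementation: the `Token` class, the rule representation, the POS tag names, and the three edit weights are all hypothetical, and merging and unmerging are not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    word: str
    pos: str    # part-of-speech tag (tag names here are made up)
    lemma: str


# Hypothetical edit weights; the record does not state the values used.
INSERT_COST = 1.0      # a rule slot has no counterpart in the input (missing word)
DELETE_COST = 1.0      # an input token has no counterpart in the rule (unnecessary word)
SUBSTITUTE_COST = 0.8  # an input token does not satisfy the rule slot (wrong word)


def slot_matches(slot, token):
    """A slot is (kind, value) where kind is 'word', 'pos', or 'lemma'."""
    kind, value = slot
    return getattr(token, kind) == value


def weighted_edit_distance(tokens, rule):
    """Token-level weighted Levenshtein distance between an input and one rule."""
    rows, cols = len(tokens) + 1, len(rule) + 1
    dist = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dist[i][0] = dist[i - 1][0] + DELETE_COST
    for j in range(1, cols):
        dist[0][j] = dist[0][j - 1] + INSERT_COST
    for i in range(1, rows):
        for j in range(1, cols):
            sub = 0.0 if slot_matches(rule[j - 1], tokens[i - 1]) else SUBSTITUTE_COST
            dist[i][j] = min(
                dist[i - 1][j] + DELETE_COST,   # drop the input token
                dist[i][j - 1] + INSERT_COST,   # insert the missing slot
                dist[i - 1][j - 1] + sub,       # match or substitute
            )
    return dist[-1][-1]


# A hybrid rule mixing literal words and POS tags (illustrative only).
rule = [("word", "ang"), ("pos", "NN"), ("word", "ay"), ("pos", "JJ")]

ang = Token("ang", "DT", "ang")
bata = Token("bata", "NN", "bata")
ay = Token("ay", "LM", "ay")
at = Token("at", "CC", "at")
mabait = Token("mabait", "JJ", "bait")

correct = [ang, bata, ay, mabait]
missing_word = [ang, bata, mabait]
wrong_word = [ang, bata, at, mabait]
```

Under these assumed weights, an input that satisfies every slot has distance 0, while a non-zero distance signals a possible error, and the edit that produced it (insertion, deletion, or substitution) indicates the kind of correction to suggest.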

Similar Items