Pattern matching refinements to dictionary-based code-switching point detection
This study presents the development and evaluation of pattern matching refinements (PMRs) to automatic code switching point (CSP) detection. With all PMRs, evaluation showed an accuracy of 94.51%. This is an improvement to reported accuracy rates of dictionary-based approaches, which are in the rang...
Main Authors: | Oco, Nathaniel; Roxas, Rachel Edita |
---|---|
Format: | text |
Published: | Animo Repository, 2012 |
Subjects: | Computational linguistics; Code switching (Linguistics); Computer Sciences |
Online Access: | https://animorepository.dlsu.edu.ph/faculty_research/588 |
Institution: | De La Salle University |
id |
oai:animorepository.dlsu.edu.ph:faculty_research-1587 |
---|---|
record_format |
eprints |
institution |
De La Salle University |
building |
De La Salle University Library |
continent |
Asia |
country |
Philippines |
content_provider |
De La Salle University Library |
collection |
DLSU Institutional Repository |
topic |
Computational linguistics; Code switching (Linguistics); Computer Sciences |
description |
This study presents the development and evaluation of pattern matching refinements (PMRs) for automatic code-switching point (CSP) detection. With all PMRs applied, the evaluation showed an accuracy of 94.51%. This improves on the reported accuracy of dictionary-based approaches, which ranges from 75.22% to 76.26% (Yeong and Tan, 2010). In our experiments, a 100-sentence Tagalog-English corpus was used as the test bed. Analyses showed that the dictionary-based approach using part-of-speech checking yielded an accuracy of only 79.76%, and that two linguistic phenomena, (1) intra-word code-switching and (2) common words, caused the low accuracy. The devised PMRs, namely (1) common word exclusion, (2) common word identification, and (3) common n-gram pruning, address these phenomena and improve accuracy. The work can be extended using audio files and machine learning with larger language resources. © 2012 The PACLIC. |
format |
text |
author |
Oco, Nathaniel; Roxas, Rachel Edita |
author_sort |
Oco, Nathaniel |
title |
Pattern matching refinements to dictionary-based code-switching point detection |
publisher |
Animo Repository |
publishDate |
2012 |
url |
https://animorepository.dlsu.edu.ph/faculty_research/588 |
_version_ |
1722366356432617472 |
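
The abstract describes dictionary-based CSP detection refined by handling common words. The following is a minimal sketch of that general idea, not the authors' implementation: it assumes that "common words" are tokens found in both dictionaries and that excluding them means skipping them when comparing neighboring language tags. The word lists, the tag names `tl` and `en`, and the functions `tag_tokens` and `switch_points` are all hypothetical and exist only for illustration.

```python
from typing import List, Optional

# Hypothetical toy dictionaries; a real system would load full word lists.
TAGALOG = {"pumunta", "kami", "sa", "at", "bumili", "ng"}
ENGLISH = {"went", "to", "the", "mall", "at", "bought", "shoes"}

# Assumption: a "common word" is one that appears in both dictionaries,
# e.g. "at" (Tagalog for "and", and an English preposition).
COMMON = TAGALOG & ENGLISH


def tag_tokens(tokens: List[str]) -> List[Optional[str]]:
    """Assign a language tag per token; None marks ambiguous or unknown."""
    tags: List[Optional[str]] = []
    for token in tokens:
        word = token.lower()
        if word in COMMON:
            tags.append(None)   # excluded: cannot decide by lookup alone
        elif word in TAGALOG:
            tags.append("tl")
        elif word in ENGLISH:
            tags.append("en")
        else:
            tags.append(None)   # out-of-dictionary token
    return tags


def switch_points(tokens: List[str]) -> List[int]:
    """Return indices of tokens whose language differs from the last tagged token."""
    points: List[int] = []
    previous: Optional[str] = None
    for index, tag in enumerate(tag_tokens(tokens)):
        if tag is None:
            continue            # skip common and unknown words
        if previous is not None and tag != previous:
            points.append(index)
        previous = tag
    return points


sentence = "Pumunta kami sa mall at bumili ng shoes".split()
print(switch_points(sentence))  # [3, 5, 7]
```

In this sketch the ambiguous token "at" never triggers a switch by itself; the switch back to Tagalog is reported at "bumili" instead. Intra-word code-switching, the other phenomenon named in the abstract, is not handled here.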