Pattern matching refinements to dictionary-based code-switching point detection


Bibliographic Details
Main Authors: Oco, Nathaniel; Roxas, Rachel Edita
Format: text
Published: Animo Repository 2012
Online Access: https://animorepository.dlsu.edu.ph/faculty_research/588
Description
Summary: This study presents the development and evaluation of pattern matching refinements (PMRs) for automatic code-switching point (CSP) detection. With all PMRs applied, evaluation showed an accuracy of 94.51%, an improvement over the reported accuracy of dictionary-based approaches, which ranges from 75.22% to 76.26% (Yeong and Tan, 2010). A 100-sentence Tagalog-English corpus served as the test bed. Analysis showed that the dictionary-based approach with part-of-speech checking yielded an accuracy of only 79.76%, and that two linguistic phenomena caused the low accuracy: (1) intra-word code-switching and (2) common words. The devised PMRs, namely (1) common word exclusion, (2) common word identification, and (3) common n-gram pruning, address these phenomena and improve accuracy. The work can be extended using audio files and machine learning with larger language resources. © 2012 The PACLIC.
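
To illustrate why common words can lower the accuracy of a dictionary-based detector, the following Python sketch compares a naive dictionary lookup with a variant that skips words found in both dictionaries. It is not the authors' implementation. The toy word lists, the function names `tag` and `switch_points`, the Tagalog-first lookup order, and the reading of "common word exclusion" as skipping ambiguous words are all assumptions made for illustration; the record does not describe how the PMRs are actually implemented.

```python
# Toy lexicons (hypothetical; a real system would load full dictionaries).
TAGALOG = {"ako", "sa", "ang", "mamaya", "at", "may", "ate"}
ENGLISH = {"i", "will", "meet", "you", "at", "the", "mall", "may", "ate"}

# Words listed in both dictionaries, e.g. "at" (Tagalog "and", English "at").
COMMON = TAGALOG & ENGLISH


def tag(token):
    """Naive dictionary lookup: try Tagalog first, then English (assumed order)."""
    word = token.lower()
    if word in TAGALOG:
        return "TL"
    if word in ENGLISH:
        return "EN"
    return None  # word found in neither dictionary


def switch_points(tokens, exclude_common=False):
    """Return the indices of tokens whose language differs from the
    previously tagged token. With exclude_common=True, words present in
    both dictionaries are skipped so they cannot trigger a switch."""
    points = []
    previous = None
    for index, token in enumerate(tokens):
        if exclude_common and token.lower() in COMMON:
            continue  # ambiguous word: ignore it when looking for switches
        language = tag(token)
        if language is None:
            continue  # unknown word: no evidence either way
        if previous is not None and language != previous:
            points.append(index)
        previous = language
    return points


tokens = "I will meet you at the mall mamaya".split()
print(switch_points(tokens))                       # naive lookup
print(switch_points(tokens, exclude_common=True))  # common words skipped
```

In this toy sentence the naive lookup tags the English preposition "at" as Tagalog and reports switch points at indices 4, 5, and 7, whereas skipping common words leaves only the switch at index 7 ("mamaya"). Simply skipping such words would miss cases where a common word really is Tagalog, which is presumably why the study pairs exclusion with common word identification.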