Development, implementation and testing of language identification system for seven Philippine languages
Main Authors: | , |
---|---|
Format: | text |
Published: | Animo Repository, 2015 |
Subjects: | |
Online Access: | https://animorepository.dlsu.edu.ph/faculty_research/3346 |
Institution: | De La Salle University |
Summary: | Three language identification (LID) approaches, namely acoustic, phonotactic, and prosodic, are explored for Philippine languages. Gaussian Mixture Models (GMMs) are used for the acoustic and prosodic approaches. The acoustic features used were Mel Frequency Cepstral Coefficients (MFCC), Perceptual Linear Prediction (PLP), Shifted Delta Cepstra (SDC), and Linear Prediction Cepstral Coefficients (LPCC). Pitch, rhythm, and energy are used as prosodic features. Phone Recognition followed by Language Modelling (PRLM) and Parallel Phone Recognition followed by Language Modelling (PPRLM) are used for the phonotactic approach. After establishing that the acoustic approach using a 32nd-order PLP GMM-EM achieved the best performance among the combinations of approach and feature, three LID systems were built: 7-language LID, pair-wise LID, and hierarchical LID, with average accuracies of 48.07%, 72.64%, and 53.99%, respectively. Among the pair-wise LID systems, the highest accuracy is 92.23% for Tagalog and Hiligaynon and the lowest is 52.21% for Bicolano and Tausug. In the hierarchical LID system, the accuracy for Tagalog, Cebuano, Bicolano, and Hiligaynon reached 80.56%, 80.26%, 78.26%, and 60.87%, respectively. The LID systems that were designed, implemented, and tested are best suited for language verification or for language identification systems with a small number of closely related target languages, such as Philippine languages. © 2015, Science and Technology Information Institute. All rights reserved. |
---|
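
The acoustic approach in the summary (one GMM per language, trained with EM, with the test utterance assigned to the best-scoring model) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the record: the features are synthetic 1-D values rather than 32nd-order PLP vectors, the language labels `lang_a` and `lang_b` are hypothetical, and the helper names (`fit_gmm`, `avg_loglik`, `identify`, `synth`) are invented for this example.

```python
import math
import random

def log_gauss(x, mean, var):
    """Log density of a 1-D Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def logsumexp(values):
    """Numerically stable log(sum(exp(v)))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def fit_gmm(frames, n_components=2, n_iter=30, seed=0):
    """Fit a 1-D GMM with EM; returns a list of (weight, mean, variance)."""
    rng = random.Random(seed)
    mu = sum(frames) / len(frames)
    var0 = sum((x - mu) ** 2 for x in frames) / len(frames)
    # Initialise means from random frames, with the global variance.
    params = [(1.0 / n_components, m, var0)
              for m in rng.sample(frames, n_components)]
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each frame.
        resp = []
        for x in frames:
            logs = [math.log(w) + log_gauss(x, m, v) for w, m, v in params]
            norm = logsumexp(logs)
            resp.append([math.exp(l - norm) for l in logs])
        # M-step: re-estimate weights, means, and variances.
        new_params = []
        for k in range(n_components):
            nk = sum(r[k] for r in resp) + 1e-10  # guard against empty components
            mean = sum(r[k] * x for r, x in zip(resp, frames)) / nk
            var = sum(r[k] * (x - mean) ** 2 for r, x in zip(resp, frames)) / nk
            new_params.append((nk / len(frames), mean, max(var, 1e-3)))
        params = new_params
    return params

def avg_loglik(frames, params):
    """Average per-frame log-likelihood of an utterance under one GMM."""
    total = 0.0
    for x in frames:
        total += logsumexp([math.log(w) + log_gauss(x, m, v) for w, m, v in params])
    return total / len(frames)

def identify(frames, models):
    """Return the language whose GMM scores the utterance highest."""
    return max(models, key=lambda lang: avg_loglik(frames, models[lang]))

def synth(rng, centers, n):
    """Synthetic stand-in for acoustic features (assumption, not real speech)."""
    return [rng.gauss(rng.choice(centers), 0.5) for _ in range(n)]

rng = random.Random(42)
models = {
    "lang_a": fit_gmm(synth(rng, [-2.0, 0.0], 400)),  # hypothetical language A
    "lang_b": fit_gmm(synth(rng, [3.0, 5.0], 400)),   # hypothetical language B
}
test_a = synth(rng, [-2.0, 0.0], 100)
test_b = synth(rng, [3.0, 5.0], 100)
print(identify(test_a, models), identify(test_b, models))
```

A pair-wise system of the kind described would keep only two entries in `models`; a 7-language system would hold seven. Real features are multi-dimensional, so a practical version would use vector-valued Gaussians (for example diagonal covariances) in place of the scalar ones used here.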