Development, implementation and testing of language identification system for seven Philippine languages

Three Language Identification (LID) approaches, namely acoustic, phonotactic, and prosodic, are explored for Philippine languages. Gaussian Mixture Models (GMM) are used for the acoustic and prosodic approaches. The acoustic features used were Mel Frequency Cepstral Coefficients (MFCC), Percept...

Bibliographic Details
Main Authors:	Laguna, Ann Franchesca B., Guevara, Rowena Cristina L.
Format:	text
Published:	Animo Repository 2015
Subjects:	Computational linguistics Automatic speech recognition Computer Sciences
Online Access:	https://animorepository.dlsu.edu.ph/faculty_research/3346
Institution:	De La Salle University

id	oai:animorepository.dlsu.edu.ph:faculty_research-4348
record_format	eprints
spelling	oai:animorepository.dlsu.edu.ph:faculty_research-4348 2021-09-06T02:56:27Z Development, implementation and testing of language identification system for seven Philippine languages Laguna, Ann Franchesca B. Guevara, Rowena Cristina L. Three Language Identification (LID) approaches, namely acoustic, phonotactic, and prosodic, are explored for Philippine languages. Gaussian Mixture Models (GMM) are used for the acoustic and prosodic approaches. The acoustic features used were Mel Frequency Cepstral Coefficients (MFCC), Perceptual Linear Prediction (PLP), Shifted Delta Cepstra (SDC), and Linear Prediction Cepstral Coefficients (LPCC). Pitch, rhythm, and energy are used as prosodic features. Phone Recognition followed by Language Modelling (PRLM) and Parallel Phone Recognition followed by Language Modelling (PPRLM) are used for the phonotactic approach. After establishing that the acoustic approach using a 32nd-order PLP GMM-EM achieved the best performance among the combinations of approach and feature, three LID systems were built: 7-language LID, pair-wise LID, and hierarchical LID, with average accuracies of 48.07%, 72.64%, and 53.99%, respectively. Among the pair-wise LID systems, the highest accuracy is 92.23% for Tagalog and Hiligaynon, and the lowest is 52.21% for Bicolano and Tausug. In the hierarchical LID system, the accuracy for Tagalog, Cebuano, Bicolano, and Hiligaynon reached 80.56%, 80.26%, 78.26%, and 60.87%, respectively. The LID systems that were designed, implemented, and tested are best suited for language verification or for language identification systems with a small number of closely related target languages, such as Philippine languages. © 2015, Science and Technology Information Institute. All rights reserved. 2015-06-01T07:00:00Z text https://animorepository.dlsu.edu.ph/faculty_research/3346 Faculty Research Work Animo Repository Computational linguistics Automatic speech recognition Computer Sciences
institution	De La Salle University
building	De La Salle University Library
continent	Asia
country	Philippines
content_provider	De La Salle University Library
collection	DLSU Institutional Repository
topic	Computational linguistics Automatic speech recognition Computer Sciences
spellingShingle	Computational linguistics Automatic speech recognition Computer Sciences Laguna, Ann Franchesca B. Guevara, Rowena Cristina L. Development, implementation and testing of language identification system for seven Philippine languages
description	Three Language Identification (LID) approaches, namely acoustic, phonotactic, and prosodic, are explored for Philippine languages. Gaussian Mixture Models (GMM) are used for the acoustic and prosodic approaches. The acoustic features used were Mel Frequency Cepstral Coefficients (MFCC), Perceptual Linear Prediction (PLP), Shifted Delta Cepstra (SDC), and Linear Prediction Cepstral Coefficients (LPCC). Pitch, rhythm, and energy are used as prosodic features. Phone Recognition followed by Language Modelling (PRLM) and Parallel Phone Recognition followed by Language Modelling (PPRLM) are used for the phonotactic approach. After establishing that the acoustic approach using a 32nd-order PLP GMM-EM achieved the best performance among the combinations of approach and feature, three LID systems were built: 7-language LID, pair-wise LID, and hierarchical LID, with average accuracies of 48.07%, 72.64%, and 53.99%, respectively. Among the pair-wise LID systems, the highest accuracy is 92.23% for Tagalog and Hiligaynon, and the lowest is 52.21% for Bicolano and Tausug. In the hierarchical LID system, the accuracy for Tagalog, Cebuano, Bicolano, and Hiligaynon reached 80.56%, 80.26%, 78.26%, and 60.87%, respectively. The LID systems that were designed, implemented, and tested are best suited for language verification or for language identification systems with a small number of closely related target languages, such as Philippine languages. © 2015, Science and Technology Information Institute. All rights reserved.
format	text
author	Laguna, Ann Franchesca B. Guevara, Rowena Cristina L.
author_facet	Laguna, Ann Franchesca B. Guevara, Rowena Cristina L.
author_sort	Laguna, Ann Franchesca B.
title	Development, implementation and testing of language identification system for seven Philippine languages
title_short	Development, implementation and testing of language identification system for seven Philippine languages
title_full	Development, implementation and testing of language identification system for seven Philippine languages
title_fullStr	Development, implementation and testing of language identification system for seven Philippine languages
title_full_unstemmed	Development, implementation and testing of language identification system for seven Philippine languages
title_sort	development, implementation and testing of language identification system for seven philippine languages
publisher	Animo Repository
publishDate	2015
url	https://animorepository.dlsu.edu.ph/faculty_research/3346
_version_	1767195886563098624
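
The acoustic approach summarized in the description field (one statistical model per language, with the decision going to the best-scoring model) can be illustrated with a minimal sketch. This is not the authors' implementation: a single diagonal-covariance Gaussian per language stands in for a full GMM trained with EM, the feature vectors are synthetic random numbers rather than PLP coefficients, and all function names (`fit_diag_gaussian`, `avg_log_likelihood`, `identify`, `synthetic_frames`) are hypothetical. The `candidates` argument shows how the same scoring could be restricted to two languages, roughly in the spirit of the pair-wise setup.

```python
import math
import random


def fit_diag_gaussian(frames):
    """Estimate per-dimension mean and variance from a list of feature vectors.

    Assumption: one Gaussian per language, a simplified stand-in for a GMM.
    """
    n = len(frames)
    dim = len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    # Floor the variance so the log-likelihood stays finite.
    var = [max(sum((f[d] - mean[d]) ** 2 for f in frames) / n, 1e-6)
           for d in range(dim)]
    return mean, var


def avg_log_likelihood(frames, model):
    """Average per-frame log-likelihood under a diagonal Gaussian."""
    mean, var = model
    total = 0.0
    for f in frames:
        for x, m, v in zip(f, mean, var):
            total += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
    return total / len(frames)


def identify(frames, models, candidates=None):
    """Return the language whose model scores the utterance highest.

    If `candidates` is given, only those languages compete (pair-wise style).
    """
    names = candidates if candidates is not None else list(models)
    return max(names, key=lambda name: avg_log_likelihood(frames, models[name]))


def synthetic_frames(center, n, rng, dim=4):
    """Generate toy feature frames; real systems would extract PLP/MFCC."""
    return [[rng.gauss(center, 1.0) for _ in range(dim)] for _ in range(n)]


rng = random.Random(0)
# Toy class centers chosen only for illustration; they carry no linguistic meaning.
centers = {"Tagalog": 0.0, "Cebuano": 3.0, "Hiligaynon": -3.0}
models = {lang: fit_diag_gaussian(synthetic_frames(c, 200, rng))
          for lang, c in centers.items()}

test_utterance = synthetic_frames(3.0, 50, rng)  # drawn from the "Cebuano" center
closed_set_guess = identify(test_utterance, models)
pairwise_guess = identify(test_utterance, models,
                          candidates=["Tagalog", "Hiligaynon"])
```

In the pair-wise call the true source of the toy utterance is excluded, so the function simply returns the closer of the two remaining models. The sketch makes no attempt to reproduce the accuracies reported in the record.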
