Filipino text-to-speech system: Tagapagsalita, 2
One of the main types of speech processing technologies today is Text-To-Speech (TTS) synthesis. This technology converts normal language text into speech. Many studies have been conducted to develop TTS systems for various languages. In this Filipino TTS, there are 327 diphones extracted from sets...
Main Authors: | Jimenez, Jerick T.; Juliano, Faye S.; Silva, Elrick Jan P. |
---|---|
Format: | text |
Language: | English |
Published: | Animo Repository, 2009 |
Subjects: | Filipino language Technological innovations |
Online Access: | https://animorepository.dlsu.edu.ph/etd_bachelors/7657 |
Institution: | De La Salle University |
id |
oai:animorepository.dlsu.edu.ph:etd_bachelors-8302 |
---|---|
record_format |
eprints |
spelling |
oai:animorepository.dlsu.edu.ph:etd_bachelors-8302 2021-07-30T04:17:42Z Filipino text-to-speech system: Tagapagsalita, 2 Jimenez, Jerick T. Juliano, Faye S. Silva, Elrick Jan P. 2009-01-01T08:00:00Z text https://animorepository.dlsu.edu.ph/etd_bachelors/7657 Bachelor's Theses English Animo Repository Filipino language Technological innovations |
institution |
De La Salle University |
building |
De La Salle University Library |
continent |
Asia |
country |
Philippines |
content_provider |
De La Salle University Library |
collection |
DLSU Institutional Repository |
language |
English |
topic |
Filipino language Technological innovations |
description |
Text-To-Speech (TTS) synthesis, which converts normal language text into speech, is one of the main types of speech processing technology today, and many studies have been conducted to develop TTS systems for various languages. For this Filipino TTS system, 327 diphones were extracted from sets of Filipino words, of which 234 were found valid. The diphones undergo pre-processing and are compressed using Linear Predictive Coding (LPC). Through inverse LPC, the diphones can be reproduced from the coefficients and excitations stored in the codebook.
After the diphones are synthesized, their pitch, volume and duration are manipulated by a scaling factor that depends on the accent mark assigned to each diphone. Once the accent is applied, the diphone is concatenated with the other diphones by means of the Overlap-Add (OLA) method to form the output signal of the system.
Twenty-five respondents were asked to evaluate the system on listening ease, syllabication, stress, articulation, and speed, on a scale from one (lowest) to five (highest). Averaged over all uttered speech, the respondents' scores were 4.453 for listening ease, 4.42 for syllabication, 3.83 for stress, 4.06 for articulation and 3.51 for speed. The linguist's average scores were 3.86 for listening ease, 3.36 for syllabication, 2.3 for stress, 3 for articulation and 3.51 for speed. The respondents also took an accent mark test: they listened to 15 Filipino words and identified each word they heard from the choices given on the survey sheet. In identifying the Filipino heteronyms, the respondents achieved an average score of 11.21 out of 15, while the linguist scored 13 out of 15. |
format |
text |
author |
Jimenez, Jerick T. Juliano, Faye S. Silva, Elrick Jan P. |
author_sort |
Jimenez, Jerick T. |
title |
Filipino text-to-speech system: Tagapagsalita, 2 |
publisher |
Animo Repository |
publishDate |
2009 |
url |
https://animorepository.dlsu.edu.ph/etd_bachelors/7657 |
_version_ |
1712576759538384896 |
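
The LPC compression, inverse-LPC reconstruction, and overlap-add concatenation summarized in the description can be illustrated with a minimal pure-Python sketch. It is not the thesis's implementation: the function names, the predictor order, the autocorrelation and Levinson-Durbin method, the linear crossfade, and the test signal are all assumptions chosen for illustration, and the codebook and the accent-dependent scaling of pitch, volume and duration are left out.

```python
import math


def autocorr(x, max_lag):
    """Autocorrelation r[0..max_lag] of a finite signal."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations.

    Returns a[0..order] with a[0] = 1, where the prediction residual is
    e[n] = sum_k a[k] * x[n - k].
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:  # silent or perfectly predicted input: stop early
            break
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a


def lpc_residual(x, a):
    """Analysis filter: excitation e[n] = x[n] + sum_{k>=1} a[k] * x[n-k]."""
    p = len(a) - 1
    return [sum(a[k] * x[n - k] for k in range(min(p, n) + 1)) for n in range(len(x))]


def lpc_synthesize(e, a):
    """Inverse (all-pole) filter: rebuild x[n] from excitation and coefficients."""
    p = len(a) - 1
    x = []
    for n in range(len(e)):
        x.append(e[n] - sum(a[k] * x[n - k] for k in range(1, min(p, n) + 1)))
    return x


def overlap_add(left, right, overlap):
    """Join two segments, crossfading linearly over `overlap` samples."""
    if overlap == 0:
        return left + right
    mixed = []
    for i in range(overlap):
        w = (i + 1) / (overlap + 1)  # weight of the incoming segment
        mixed.append((1.0 - w) * left[len(left) - overlap + i] + w * right[i])
    return left[:-overlap] + mixed + right[overlap:]


if __name__ == "__main__":
    # Hypothetical stand-in for a recorded diphone.
    diphone = [math.sin(0.3 * n) + 0.5 * math.sin(1.1 * n) for n in range(200)]
    coeffs = levinson_durbin(autocorr(diphone, 4), 4)
    excitation = lpc_residual(diphone, coeffs)
    rebuilt = lpc_synthesize(excitation, coeffs)
    joined = overlap_add(rebuilt, rebuilt, 20)
    print(len(joined))
```

In this sketch the coefficients and the excitation together stand in for what the description calls the codebook entry of a diphone. A real system would quantize both and would typically analyze short frames rather than a whole diphone at once.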