MULTI SPEAKER SPEECH SYNTHESIS SYSTEM FOR INDONESIAN LANGUAGE

Generally, text-to-speech models produce the voice of only a single speaker. The most straightforward way to produce another speaker’s voice is to build a standalone synthesis model for each desired speaker. But such an approach needs a large amount of training data and computational resourc...

Bibliographic Details
Main Author:	Jerremy Budiman, Marvin
Format:	Theses
Language:	Indonesia
Online Access:	https://digilib.itb.ac.id/gdl/view/70713
Institution:	Institut Teknologi Bandung

id	id-itb.:70713
spelling	id-itb.:70713 2023-01-19T13:28:44Z MULTI SPEAKER SPEECH SYNTHESIS SYSTEM FOR INDONESIAN LANGUAGE Jerremy Budiman, Marvin Indonesia Theses speech synthesis, multi speaker, Indonesian language. INSTITUT TEKNOLOGI BANDUNG https://digilib.itb.ac.id/gdl/view/70713 Generally, text-to-speech models produce the voice of only a single speaker. The most straightforward way to produce another speaker’s voice is to build a standalone synthesis model for each desired speaker. But such an approach needs a large amount of training data and computational resources. To overcome this problem, several architectures have succeeded in producing synthesized speech from various speakers efficiently in terms of data and computation. One of these architectures is Deep Voice 3. In this work, a multi-speaker speech synthesis system is built for the Indonesian language. The system uses the Deep Voice 3 architecture, with several additional components for preprocessing and post-processing. Some of the components are implemented specifically for the Indonesian language. The system is built using a multi-speaker dataset consisting of speech data from 145 Indonesian speakers. The system is evaluated subjectively to assess the naturalness, similarity to the original speaker, and intelligibility of the produced speech. The results show that the system has a MOS (mean opinion score) of 3.39 for speech naturalness and 3.11 for speech similarity. In assessing speech intelligibility using SUS (semantically unpredictable sentences), the test gives 73.88% sentence accuracy and 93.48% word accuracy. text
institution	Institut Teknologi Bandung
building	Institut Teknologi Bandung Library
continent	Asia
country	Indonesia
content_provider	Institut Teknologi Bandung
collection	Digital ITB
language	Indonesia
description	Generally, text-to-speech models produce the voice of only a single speaker. The most straightforward way to produce another speaker’s voice is to build a standalone synthesis model for each desired speaker. But such an approach needs a large amount of training data and computational resources. To overcome this problem, several architectures have succeeded in producing synthesized speech from various speakers efficiently in terms of data and computation. One of these architectures is Deep Voice 3. In this work, a multi-speaker speech synthesis system is built for the Indonesian language. The system uses the Deep Voice 3 architecture, with several additional components for preprocessing and post-processing. Some of the components are implemented specifically for the Indonesian language. The system is built using a multi-speaker dataset consisting of speech data from 145 Indonesian speakers. The system is evaluated subjectively to assess the naturalness, similarity to the original speaker, and intelligibility of the produced speech. The results show that the system has a MOS (mean opinion score) of 3.39 for speech naturalness and 3.11 for speech similarity. In assessing speech intelligibility using SUS (semantically unpredictable sentences), the test gives 73.88% sentence accuracy and 93.48% word accuracy.
format	Theses
author	Jerremy Budiman, Marvin
spellingShingle	Jerremy Budiman, Marvin MULTI SPEAKER SPEECH SYNTHESIS SYSTEM FOR INDONESIAN LANGUAGE
author_facet	Jerremy Budiman, Marvin
author_sort	Jerremy Budiman, Marvin
title	MULTI SPEAKER SPEECH SYNTHESIS SYSTEM FOR INDONESIAN LANGUAGE
title_short	MULTI SPEAKER SPEECH SYNTHESIS SYSTEM FOR INDONESIAN LANGUAGE
title_full	MULTI SPEAKER SPEECH SYNTHESIS SYSTEM FOR INDONESIAN LANGUAGE
title_fullStr	MULTI SPEAKER SPEECH SYNTHESIS SYSTEM FOR INDONESIAN LANGUAGE
title_full_unstemmed	MULTI SPEAKER SPEECH SYNTHESIS SYSTEM FOR INDONESIAN LANGUAGE
title_sort	multi speaker speech synthesis system for indonesian language
url	https://digilib.itb.ac.id/gdl/view/70713
_version_	1823650745316016128
