MULTI SPEAKER SPEECH SYNTHESIS SYSTEM FOR INDONESIAN LANGUAGE

Text-to-speech models generally produce the voice of only a single speaker. The most straightforward way to produce another speaker’s voice is to build a standalone synthesis model for each desired speaker. However, such an approach requires a large amount of training data and computational resourc...

Bibliographic Details
Main Author: Jerremy Budiman, Marvin
Format: Theses
Language: Indonesian
Online Access: https://digilib.itb.ac.id/gdl/view/70713
Institution: Institut Teknologi Bandung
id id-itb.:70713
spelling id-itb.:70713 2023-01-19T13:28:44Z MULTI SPEAKER SPEECH SYNTHESIS SYSTEM FOR INDONESIAN LANGUAGE Jerremy Budiman, Marvin Indonesia Theses speech synthesis, multi speaker, Indonesian language. INSTITUT TEKNOLOGI BANDUNG https://digilib.itb.ac.id/gdl/view/70713 text
institution Institut Teknologi Bandung
building Institut Teknologi Bandung Library
continent Asia
country Indonesia
content_provider Institut Teknologi Bandung
collection Digital ITB
language Indonesia
description Text-to-speech models generally produce the voice of only a single speaker. The most straightforward way to produce another speaker’s voice is to build a standalone synthesis model for each desired speaker. However, such an approach requires a large amount of training data and computational resources. To overcome this problem, several architectures have succeeded in producing synthesized speech from various speakers efficiently in terms of data and computation. One of these architectures is Deep Voice 3. In this work, a multi-speaker speech synthesis system is built for the Indonesian language. The system uses the Deep Voice 3 architecture, with several additional components for preprocessing and post-processing. Some of these components are implemented specifically for the Indonesian language. The system is built using a multi-speaker dataset consisting of speech data from 145 Indonesian speakers. The system is evaluated subjectively to assess the naturalness, similarity to the original speaker, and intelligibility of the produced speech. The results show that the system achieves a MOS (mean opinion score) of 3.39 for speech naturalness and 3.11 for speech similarity. In the speech intelligibility assessment using SUS (semantically unpredictable sentences), the test gives 73.88% sentence accuracy and 93.48% word accuracy.
format Theses
author Jerremy Budiman, Marvin
title MULTI SPEAKER SPEECH SYNTHESIS SYSTEM FOR INDONESIAN LANGUAGE
url https://digilib.itb.ac.id/gdl/view/70713
_version_ 1822006387988758528