DEVELOPMENT OF TEXT-TO-SPEECH SYSTEM FOR AN INDONESIAN SMART SPEAKER

Smart speakers are generally operated in English, even though English proficiency among Indonesians is generally low. A smart speaker has three components: Automatic Speech Recognition (ASR), Natural Language Processing (NLP), and Text-to-Speech (TTS). End-to-End (E2...

Bibliographic Details
Main Author:	David Partogi, Ignatius
Format:	Final Project
Language:	Indonesian
Online Access:	https://digilib.itb.ac.id/gdl/view/72116
Institution:	Institut Teknologi Bandung

id	id-itb.:72116
spelling	id-itb.:72116; 2023-03-06T03:53:10Z; DEVELOPMENT OF TEXT-TO-SPEECH SYSTEM FOR AN INDONESIAN SMART SPEAKER; David Partogi, Ignatius; Indonesian; Final Project; keywords: smart speaker system, TTS system, Tacotron 2, Parallel WaveGAN, MOS, SUS; INSTITUT TEKNOLOGI BANDUNG; https://digilib.itb.ac.id/gdl/view/72116
institution	Institut Teknologi Bandung
building	Institut Teknologi Bandung Library
continent	Asia
country	Indonesia
content_provider	Institut Teknologi Bandung
collection	Digital ITB
language	Indonesian
description	Smart speakers are generally operated in English, even though English proficiency among Indonesians is generally low. A smart speaker has three components: Automatic Speech Recognition (ASR), Natural Language Processing (NLP), and Text-to-Speech (TTS). An End-to-End (E2E) TTS system processes text directly and generates audio from it. It has two parts: a spectrogram generator and a vocoder. The TTS system in this research was built with Tacotron 2, regarded as state of the art in TTS, as the spectrogram generator and Parallel WaveGAN as the vocoder. The dataset consists of 3000 pairs of audio clips and their text transcriptions, sourced from an audiobook of Indonesian language school and college books, with a total duration of 9 hours, 22 minutes, and 30 seconds. Mean Opinion Score (MOS) testing of the system yielded a score of 3.24 ± 0.29, while Semantically Unpredictable Sentence (SUS) testing yielded an accuracy of (91.82 ± 7.63)%.
format	Final Project
author	David Partogi, Ignatius
title	DEVELOPMENT OF TEXT-TO-SPEECH SYSTEM FOR AN INDONESIAN SMART SPEAKER
url	https://digilib.itb.ac.id/gdl/view/72116
_version_	1822006769387307008
