Deep learning for speech synthesis

Speech is the most natural way for humans to communicate, and it carries the majority of the information transmitted in daily communication. Speech synthesis plays an important role in voice interaction. Although speech synthesis has been developed for more than half a century and has undergone sever...

Bibliographic Details
Main Author: Duan, Yue
Other Authors: Tan Yap Peng
Format: Thesis-Master by Coursework
Language: English
Published: Nanyang Technological University 2022
Subjects: Engineering::Electrical and electronic engineering; Engineering::Computer science and engineering
Online Access: https://hdl.handle.net/10356/159591
Institution: Nanyang Technological University
id sg-ntu-dr.10356-159591
record_format dspace
spelling sg-ntu-dr.10356-159591 2022-06-28T01:12:31Z Deep learning for speech synthesis Duan, Yue Tan Yap Peng School of Electrical and Electronic Engineering EYPTan@ntu.edu.sg Engineering::Electrical and electronic engineering Engineering::Computer science and engineering Master of Science (Computer Control and Automation) 2022-06-28T01:12:30Z 2022-06-28T01:12:30Z 2022 Thesis-Master by Coursework Duan, Y. (2022). Deep learning for speech synthesis. Master's thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/159591 en application/pdf Nanyang Technological University
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Engineering::Electrical and electronic engineering
Engineering::Computer science and engineering
description Speech is the most natural way for humans to communicate, and it carries the majority of the information transmitted in daily communication. Speech synthesis plays an important role in voice interaction. Although speech synthesis has been developed for more than half a century and has undergone several evolutions, how to make synthesised speech more natural has always been a hot research topic. The latest advances in deep learning have shown impressive results in speech synthesis. This dissertation begins with an introduction to the development of speech synthesis and then examines in detail the applications of two deep learning methods in the field. This is followed by an analysis of the fundamental theories and supporting technologies used in this dissertation. Finally, a multispeaker text-to-speech system composed of three neural-network-based building blocks is implemented, which can perform speech synthesis with the voices of different speakers. Using a synthesizer trained on multiple corpora in different languages, the proposed system is able to perform a variety of monolingual and even mixed-language speech synthesis tasks.
author2 Tan Yap Peng
author_facet Tan Yap Peng
Duan, Yue
format Thesis-Master by Coursework
author Duan, Yue
author_sort Duan, Yue
title Deep learning for speech synthesis
title_sort deep learning for speech synthesis
publisher Nanyang Technological University
publishDate 2022
url https://hdl.handle.net/10356/159591
_version_ 1738844823290380288