Deep learning for speech synthesis

Speech is the most natural way for humans to communicate, and it carries most of the information exchanged in daily communication. Speech synthesis plays an important role in voice interaction. Although speech synthesis has been developed for more than half a century and has undergone sever...

Bibliographic Details
Main Author:	Duan, Yue
Other Authors:	Tan Yap Peng
Format:	Thesis-Master by Coursework
Language:	English
Published:	Nanyang Technological University 2022
Subjects:	Engineering::Electrical and electronic engineering; Engineering::Computer science and engineering
Online Access:	https://hdl.handle.net/10356/159591
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-159591
record_format	dspace
spelling	sg-ntu-dr.10356-1595912022-06-28T01:12:31Z Deep learning for speech synthesis Duan, Yue Tan Yap Peng School of Electrical and Electronic Engineering EYPTan@ntu.edu.sg Engineering::Electrical and electronic engineering Engineering::Computer science and engineering Master of Science (Computer Control and Automation) 2022-06-28T01:12:30Z 2022-06-28T01:12:30Z 2022 Thesis-Master by Coursework Duan, Y. (2022). Deep learning for speech synthesis. Master's thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/159591 https://hdl.handle.net/10356/159591 en application/pdf Nanyang Technological University
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	Engineering::Electrical and electronic engineering Engineering::Computer science and engineering
spellingShingle	Engineering::Electrical and electronic engineering Engineering::Computer science and engineering Duan, Yue Deep learning for speech synthesis
description	Speech is the most natural way for humans to communicate, and it carries most of the information exchanged in daily communication. Speech synthesis plays an important role in voice interaction. Although speech synthesis has been under development for more than half a century and has undergone several evolutions, making synthesised speech sound more natural has remained an active research topic. Recent advances in deep learning have shown impressive results in speech synthesis. This dissertation begins with an introduction to the development of speech synthesis and then examines in detail the application of two deep learning methods in the field. This is followed by an analysis of the principles behind the fundamental theories and supporting technologies used in this dissertation. Finally, a multispeaker text-to-speech system composed of three neural-network-based building blocks is implemented, which can perform speech synthesis with the voices of different speakers. Using a synthesizer trained on multiple corpora in different languages, the proposed system can perform a variety of monolingual and even mixed-language speech synthesis tasks.
author2	Tan Yap Peng
author_facet	Tan Yap Peng Duan, Yue
format	Thesis-Master by Coursework
author	Duan, Yue
author_sort	Duan, Yue
title	Deep learning for speech synthesis
title_short	Deep learning for speech synthesis
title_full	Deep learning for speech synthesis
title_fullStr	Deep learning for speech synthesis
title_full_unstemmed	Deep learning for speech synthesis
title_sort	deep learning for speech synthesis
publisher	Nanyang Technological University
publishDate	2022
url	https://hdl.handle.net/10356/159591
_version_	1738844823290380288

Similar Items