Deep learning for speech synthesis
Speech is the most natural way for humans to communicate, and it carries most of the information exchanged in daily communication. Speech synthesis plays an important role in voice interaction. Although speech synthesis has been developed for more than half a century and has undergone sever...
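Main Author: | Duan, Yue |
---|---|
Other Authors: | Tan Yap Peng |
Format: | Thesis-Master by Coursework |
Language: | English |
Published: | Nanyang Technological University, 2022 |
Subjects: | Engineering::Electrical and electronic engineering; Engineering::Computer science and engineering |
Online Access: | https://hdl.handle.net/10356/159591 |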
Institution: | Nanyang Technological University |
Language: | English |
id | sg-ntu-dr.10356-159591 |
---|---|
record_format | dspace |
spelling | sg-ntu-dr.10356-159591; 2022-06-28T01:12:31Z; Deep learning for speech synthesis; Duan, Yue; Tan Yap Peng; School of Electrical and Electronic Engineering; EYPTan@ntu.edu.sg; Engineering::Electrical and electronic engineering; Engineering::Computer science and engineering; Master of Science (Computer Control and Automation); 2022-06-28T01:12:30Z; 2022-06-28T01:12:30Z; 2022; Thesis-Master by Coursework; Duan, Y. (2022). Deep learning for speech synthesis. Master's thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/159591; https://hdl.handle.net/10356/159591; en; application/pdf; Nanyang Technological University |
institution | Nanyang Technological University |
building | NTU Library |
continent | Asia |
country | Singapore |
content_provider | NTU Library |
collection | DR-NTU |
language | English |
topic | Engineering::Electrical and electronic engineering; Engineering::Computer science and engineering |
spellingShingle | Engineering::Electrical and electronic engineering; Engineering::Computer science and engineering; Duan, Yue; Deep learning for speech synthesis |
description | Speech is the most natural way for humans to communicate, and it carries most of the information exchanged in daily communication. Speech synthesis plays an important role in voice interaction. Although speech synthesis has been developed for more than half a century and has undergone several evolutions, making synthesised speech sound more natural remains an active research topic. Recent advances in deep learning have shown impressive results in speech synthesis. This dissertation begins with an introduction to the development of speech synthesis and then examines in detail the application of two deep learning methods to the field. This is followed by an analysis of the principles behind the fundamental theories and supporting technologies used in this dissertation. Finally, a multispeaker text-to-speech system composed of three neural-network-based building blocks is implemented, which can synthesise speech in the voices of different speakers. Using a synthesiser trained on multiple corpora in different languages, the proposed system can perform a variety of monolingual and even mixed-language speech synthesis tasks. |
author2 | Tan Yap Peng |
author_facet | Tan Yap Peng; Duan, Yue |
format | Thesis-Master by Coursework |
author | Duan, Yue |
author_sort | Duan, Yue |
title | Deep learning for speech synthesis |
title_short | Deep learning for speech synthesis |
title_full | Deep learning for speech synthesis |
title_fullStr | Deep learning for speech synthesis |
title_full_unstemmed | Deep learning for speech synthesis |
title_sort | deep learning for speech synthesis |
publisher | Nanyang Technological University |
publishDate | 2022 |
url | https://hdl.handle.net/10356/159591 |
_version_ | 1738844823290380288 |