Mandarin speech synthesis

Synthetic speech has developed steadily over the past few decades. The objective of this project is to develop a Chinese text-to-speech system using concatenative synthesis. Concatenative synthesis involves playing back pre-recorded samples of natural speech, such as words or syllables. As Mandari...

Bibliographic Details
Main Author:	Teoh, Xueli
Other Authors:	Foo Say Wei
Format:	Final Year Project
Language:	English
Published:	2015
Subjects:	DRNTU::Engineering::Electrical and electronic engineering
Online Access:	http://hdl.handle.net/10356/62164
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-62164
record_format	dspace
spelling	sg-ntu-dr.10356-62164 2023-07-07T15:53:06Z Mandarin speech synthesis Teoh, Xueli Foo Say Wei School of Electrical and Electronic Engineering DRNTU::Engineering::Electrical and electronic engineering Synthetic speech has developed steadily over the past few decades. The objective of this project is to develop a Chinese text-to-speech system using concatenative synthesis. Concatenative synthesis involves playing back pre-recorded samples of natural speech, such as words or syllables. As Mandarin Chinese is a tonal language in which each character is pronounced as one syllable, the syllable is chosen as the basic synthesis unit in this project. A Chinese speech database containing more than 1300 Chinese syllables was pre-recorded and used for the concatenative speech synthesis. The Time-Domain Pitch-Synchronous Overlap-Add (TD-PSOLA) technique was adopted so that pre-recorded speech samples can be concatenated smoothly, with good control over pitch and duration. A natural spoken sentence was pre-recorded and compared with the concatenated synthetic speech. To make the concatenated synthetic speech sound as close as possible to the natural speech, TD-PSOLA is used to change the duration of each character in the concatenated sentence so that it is almost the same as that of the corresponding character in the natural utterance. The pitch of each word in the concatenated sentence is also modified using TD-PSOLA so that it fits the pitch contour of the corresponding word in the natural sentence. In the modification process, precise pitch detection is crucial. A pitch detection method based on autocorrelation is developed to obtain the pitch contour of each monosyllabic speech unit accurately. Bachelor of Engineering 2015-02-12T02:04:43Z 2015-02-12T02:04:43Z 2008 2008 Final Year Project (FYP) http://hdl.handle.net/10356/62164 en Nanyang Technological University 95 p. application/pdf
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	DRNTU::Engineering::Electrical and electronic engineering
spellingShingle	DRNTU::Engineering::Electrical and electronic engineering Teoh, Xueli Mandarin speech synthesis
description	Synthetic speech has developed steadily over the past few decades. The objective of this project is to develop a Chinese text-to-speech system using concatenative synthesis. Concatenative synthesis involves playing back pre-recorded samples of natural speech, such as words or syllables. As Mandarin Chinese is a tonal language in which each character is pronounced as one syllable, the syllable is chosen as the basic synthesis unit in this project. A Chinese speech database containing more than 1300 Chinese syllables was pre-recorded and used for the concatenative speech synthesis. The Time-Domain Pitch-Synchronous Overlap-Add (TD-PSOLA) technique was adopted so that pre-recorded speech samples can be concatenated smoothly, with good control over pitch and duration. A natural spoken sentence was pre-recorded and compared with the concatenated synthetic speech. To make the concatenated synthetic speech sound as close as possible to the natural speech, TD-PSOLA is used to change the duration of each character in the concatenated sentence so that it is almost the same as that of the corresponding character in the natural utterance. The pitch of each word in the concatenated sentence is also modified using TD-PSOLA so that it fits the pitch contour of the corresponding word in the natural sentence. In the modification process, precise pitch detection is crucial. A pitch detection method based on autocorrelation is developed to obtain the pitch contour of each monosyllabic speech unit accurately.
author2	Foo Say Wei
author_facet	Foo Say Wei Teoh, Xueli
format	Final Year Project
author	Teoh, Xueli
author_sort	Teoh, Xueli
title	Mandarin speech synthesis
title_short	Mandarin speech synthesis
title_full	Mandarin speech synthesis
title_fullStr	Mandarin speech synthesis
title_full_unstemmed	Mandarin speech synthesis
title_sort	mandarin speech synthesis
publishDate	2015
url	http://hdl.handle.net/10356/62164
_version_	1772825350120669184
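
Illustrative note: the description in this record names autocorrelation as the basis for pitch detection but gives no implementation details. The following is a minimal sketch of the general technique only, not the author's method. The function name `estimate_pitch_autocorr`, the 75-500 Hz search range, and the single-frame design are all assumptions made for illustration.

```python
import math


def estimate_pitch_autocorr(frame, sample_rate, f_min=75.0, f_max=500.0):
    """Estimate the fundamental frequency (Hz) of one voiced frame.

    Hypothetical helper: picks the lag with the largest autocorrelation
    inside an assumed pitch range. Returns None for silent or short frames.
    """
    n = len(frame)
    if n == 0:
        return None

    # Remove the DC offset so a constant bias does not inflate every lag.
    mean = sum(frame) / n
    x = [s - mean for s in frame]

    energy = sum(s * s for s in x)
    lag_min = max(1, int(sample_rate / f_max))      # shortest period searched
    lag_max = min(n - 1, int(sample_rate / f_min))  # longest period searched
    if energy == 0.0 or lag_max <= lag_min:
        return None

    best_lag, best_value = None, 0.0
    for lag in range(lag_min, lag_max + 1):
        # Unnormalised autocorrelation at this lag.
        r = sum(x[i] * x[i + lag] for i in range(n - lag))
        if r > best_value:
            best_lag, best_value = lag, r

    if best_lag is None:
        return None
    return sample_rate / best_lag


# Usage: a synthetic 200 Hz tone sampled at 8 kHz, 50 ms long.
rate = 8000
tone = [math.sin(2 * math.pi * 200 * i / rate) for i in range(400)]
f0 = estimate_pitch_autocorr(tone, rate)
```

Running this over successive short frames of a syllable would give a rough pitch contour. A practical detector would also need a voiced/unvoiced decision and some protection against octave errors, neither of which is attempted here.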

Similar Items