Mandarin speech synthesis

Bibliographic Details
Main Author: Teoh, Xueli
Other Authors: Foo Say Wei
Format: Final Year Project
Language: English
Published: 2015
Online Access: http://hdl.handle.net/10356/62164
Institution: Nanyang Technological University
Description
Summary: Synthetic speech has been developed steadily over the past few decades. The objective of this project is to develop a Chinese text-to-speech system using concatenative synthesis. Concatenative synthesis plays back pre-recorded samples of natural speech, such as words or syllables. As Mandarin Chinese is a tonal language in which each character is pronounced as a single syllable, the syllable is chosen as the basic synthesis unit in this project. A Chinese speech database containing more than 1300 syllables was pre-recorded and used for the concatenative synthesis. The Time-Domain Pitch-Synchronous Overlap-Add (TD-PSOLA) technique was adopted because it allows the pre-recorded speech samples to be concatenated smoothly and gives good control over pitch and duration. A natural spoken sentence was pre-recorded for comparison with the concatenated synthetic speech. To make the synthetic speech sound as close as possible to the natural speech, TD-PSOLA is used to change the duration of each character in the concatenated sentence so that it nearly matches the duration of the same character in the natural utterance. The pitch of each word in the concatenated sentence is also modified using TD-PSOLA so that it fits the pitch contour of the corresponding word in the natural sentence. Precise pitch detection is crucial in this modification process. An autocorrelation-based pitch detector was developed to obtain an accurate pitch contour for each monosyllabic speech unit.
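The summary does not give the details of the project's pitch detector, so the following is only a minimal sketch of the general autocorrelation idea, not the author's implementation. It estimates the fundamental frequency of a single voiced frame by finding the lag at which the frame best matches a shifted copy of itself. The function name `autocorr_pitch`, the 60 to 400 Hz search range, and the frame length used in the example are all assumptions chosen for illustration.

```python
import math


def autocorr_pitch(frame, sample_rate, f_min=60.0, f_max=400.0):
    """Estimate the fundamental frequency (Hz) of one frame of samples.

    Returns None if the frame is silent or too short for the search range.
    The default f_min/f_max bounds are illustrative assumptions.
    """
    n = len(frame)
    if n == 0:
        return None

    # Remove the DC offset so it does not dominate the correlation.
    mean = sum(frame) / n
    x = [s - mean for s in frame]

    # Autocorrelation at lag 0 equals the frame energy; used for normalisation.
    energy = sum(s * s for s in x)

    # Convert the frequency search range into a lag range (in samples).
    lag_min = int(sample_rate / f_max)
    lag_max = min(int(sample_rate / f_min), n - 1)
    if energy == 0.0 or lag_min < 1 or lag_max <= lag_min:
        return None

    best_lag, best_value = None, 0.0
    for lag in range(lag_min, lag_max + 1):
        # Correlate the frame with a copy of itself shifted by `lag` samples.
        r = sum(x[i] * x[i + lag] for i in range(n - lag)) / energy
        if r > best_value:
            best_lag, best_value = lag, r

    if best_lag is None:
        return None
    # The lag with the strongest correlation is taken as the pitch period.
    return sample_rate / best_lag


# Example: a 40 ms frame of a synthetic 200 Hz tone sampled at 16 kHz.
sample_rate = 16000
frame = [math.sin(2 * math.pi * 200 * i / sample_rate) for i in range(640)]
print(autocorr_pitch(frame, sample_rate))
```

Applying such an estimator to successive frames of a syllable would yield a pitch contour. A practical detector would typically also add a voiced/unvoiced decision and some smoothing across frames, which this sketch omits.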