Mandarin speech synthesis

Synthetic speech has been developed steadily over the past few decades. The objective of this project is to develop a Chinese Text-to-Speech system using concatenative synthesis. Concatenative synthesis involves playing back pre-recorded samples of natural speech, such as words or syllables. As Mandari...

Bibliographic Details
Main Author: Teoh, Xueli
Other Authors: Foo Say Wei
Format: Final Year Project
Language: English
Published: 2015
Subjects: DRNTU::Engineering::Electrical and electronic engineering
Online Access: http://hdl.handle.net/10356/62164
Institution: Nanyang Technological University
id sg-ntu-dr.10356-62164
record_format dspace
spelling sg-ntu-dr.10356-62164 2023-07-07T15:53:06Z Mandarin speech synthesis Teoh, Xueli Foo Say Wei School of Electrical and Electronic Engineering DRNTU::Engineering::Electrical and electronic engineering Synthetic speech has been developed steadily over the past few decades. The objective of this project is to develop a Chinese Text-to-Speech system using concatenative synthesis. Concatenative synthesis involves playing back pre-recorded samples of natural speech, such as words or syllables. As Mandarin Chinese is a tonal language in which each character is pronounced as one syllable, the syllable is chosen as the basic synthesis unit in this project. A Chinese speech database containing more than 1300 Chinese syllables was pre-recorded and used for the concatenative speech synthesis. The Time Domain Pitch Synchronous Overlap Add (TD-PSOLA) technique was adopted so that pre-recorded speech samples can be concatenated smoothly and their pitch and duration can be controlled well. A natural speech sentence was pre-recorded and compared with the concatenated synthetic speech. To make the concatenated synthetic speech sound as close as possible to the natural speech, TD-PSOLA is used to change the duration of each character in the concatenated sentence so that it is almost the same as that of the corresponding character in the natural utterance. The pitch of each word in the concatenated sentence is also modified using TD-PSOLA so that it fits the pitch contour of the corresponding word in the natural speech sentence. In the modification process, precise pitch detection is crucial. A pitch detection method based on autocorrelation is developed to obtain the pitch contour of each monosyllabic speech unit accurately. Bachelor of Engineering 2015-02-12T02:04:43Z 2015-02-12T02:04:43Z 2008 2008 Final Year Project (FYP) http://hdl.handle.net/10356/62164 en Nanyang Technological University 95 p. application/pdf
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic DRNTU::Engineering::Electrical and electronic engineering
spellingShingle DRNTU::Engineering::Electrical and electronic engineering
Teoh, Xueli
Mandarin speech synthesis
description Synthetic speech has been developed steadily over the past few decades. The objective of this project is to develop a Chinese Text-to-Speech system using concatenative synthesis. Concatenative synthesis involves playing back pre-recorded samples of natural speech, such as words or syllables. As Mandarin Chinese is a tonal language in which each character is pronounced as one syllable, the syllable is chosen as the basic synthesis unit in this project. A Chinese speech database containing more than 1300 Chinese syllables was pre-recorded and used for the concatenative speech synthesis. The Time Domain Pitch Synchronous Overlap Add (TD-PSOLA) technique was adopted so that pre-recorded speech samples can be concatenated smoothly and their pitch and duration can be controlled well. A natural speech sentence was pre-recorded and compared with the concatenated synthetic speech. To make the concatenated synthetic speech sound as close as possible to the natural speech, TD-PSOLA is used to change the duration of each character in the concatenated sentence so that it is almost the same as that of the corresponding character in the natural utterance. The pitch of each word in the concatenated sentence is also modified using TD-PSOLA so that it fits the pitch contour of the corresponding word in the natural speech sentence. In the modification process, precise pitch detection is crucial. A pitch detection method based on autocorrelation is developed to obtain the pitch contour of each monosyllabic speech unit accurately.
author2 Foo Say Wei
author_facet Foo Say Wei
Teoh, Xueli
format Final Year Project
author Teoh, Xueli
author_sort Teoh, Xueli
title Mandarin speech synthesis
title_short Mandarin speech synthesis
title_full Mandarin speech synthesis
title_fullStr Mandarin speech synthesis
title_full_unstemmed Mandarin speech synthesis
title_sort mandarin speech synthesis
publishDate 2015
url http://hdl.handle.net/10356/62164
_version_ 1772825350120669184
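
Illustrative note on the abstract: the description names autocorrelation as the basis of the project's pitch detection but gives no details. The sketch below is a generic, minimal autocorrelation pitch estimator for a single speech frame, not the author's implementation. The function name autocorr_pitch, the 80 to 400 Hz search range, and the voicing threshold of 0.3 are all assumptions chosen for illustration.

```python
import math


def autocorr_pitch(frame, sample_rate, f_min=80.0, f_max=400.0,
                   voicing_threshold=0.3):
    """Estimate the fundamental frequency (Hz) of one frame by autocorrelation.

    Returns None when the frame is empty, silent, or looks unvoiced.
    All default parameter values are illustrative assumptions.
    """
    n = len(frame)
    if n == 0:
        return None

    # Remove the DC offset so a constant bias does not look like correlation.
    mean = sum(frame) / n
    x = [s - mean for s in frame]

    # Autocorrelation at lag 0 equals the frame energy; use it to normalise.
    energy = sum(s * s for s in x)
    if energy == 0.0:
        return None

    # Search only lags that correspond to plausible pitch periods.
    lag_min = max(1, int(sample_rate / f_max))
    lag_max = min(int(sample_rate / f_min), n - 1)

    best_lag, best_r = 0, 0.0
    for lag in range(lag_min, lag_max + 1):
        # Normalised autocorrelation of the frame with itself shifted by lag.
        r = sum(x[i] * x[i + lag] for i in range(n - lag)) / energy
        if r > best_r:
            best_lag, best_r = lag, r

    # A weak peak suggests an unvoiced or noisy frame.
    if best_lag == 0 or best_r < voicing_threshold:
        return None
    return sample_rate / best_lag


if __name__ == "__main__":
    # Synthetic 200 Hz tone, 40 ms long, sampled at 8 kHz (test data only).
    rate = 8000
    tone = [math.sin(2 * math.pi * 200.0 * i / rate) for i in range(320)]
    print("estimated pitch (Hz):", autocorr_pitch(tone, rate))
```

Running the estimator on successive overlapping frames of a syllable would give a pitch contour of the kind the abstract describes. Because the sum shrinks as the lag grows, the normalised value favours the shortest matching period, which helps against picking a multiple of the true period.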