Time domain modification of synthesis speech

Bibliographic Details
Main Author:	Long, Hai
Other Authors:	Foo Say Wei
Format:	Final Year Project
Language:	English
Published:	2015
Subjects:	DRNTU::Engineering::Electrical and electronic engineering
Online Access:	http://hdl.handle.net/10356/62155
Description
Summary:	Concatenative synthesis is becoming increasingly popular because of its high naturalness and ease of implementation. In concatenative synthesis, prerecorded samples are modified to synthesize the desired speech. In the modification process, precise pitch detection and modification are very important and greatly affect the quality of the synthesized speech. AMDF is a time-domain pitch detection method with high accuracy and low computational complexity. It calculates a difference signal between the waveform and its time-delayed copy at various time delays. Pitch is extracted from the difference signal by seeking the first minimum. A spectral-flattening preprocessing step, center clipping, can increase the reliability of AMDF. The accuracy can be further enhanced by applying a probabilistic error correction after the rough estimate from AMDF. TD-PSOLA is a popular time-domain pitch and duration modification method. It decomposes the signal into a series of short-time signals and modifies the short-time waveforms according to the desired pitch and time-scale factors. Finally, the synthesized speech is obtained by applying an overlap-add method. Instead of applying a constant pitch scale factor, a pitch scale function is used to change the tone in Mandarin Chinese. The pitch scale function is derived from the four lexical tone models of Mandarin Chinese and determined experimentally.
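
The pitch detection steps named in the summary (center clipping, then AMDF, then a search for the first minimum) can be illustrated with a minimal Python sketch. This is not the project's implementation: the function names, the clipping ratio of 0.3, the 60 to 400 Hz search range, and the `depth` threshold used to decide what counts as a minimum are all assumptions chosen for illustration, and the probabilistic error correction step is omitted.

```python
import math


def center_clip(frame, ratio=0.3):
    """Zero samples below ratio * peak and shift the rest toward zero.

    The ratio of 0.3 is an assumed value, not taken from the project.
    """
    threshold = ratio * max(abs(s) for s in frame)
    clipped = []
    for s in frame:
        if s > threshold:
            clipped.append(s - threshold)
        elif s < -threshold:
            clipped.append(s + threshold)
        else:
            clipped.append(0.0)
    return clipped


def amdf(frame, max_lag):
    """Mean absolute difference between the frame and its delayed copy,
    evaluated for every lag from 0 to max_lag (in samples)."""
    n = len(frame)
    values = []
    for lag in range(max_lag + 1):
        total = sum(abs(frame[i] - frame[i + lag]) for i in range(n - lag))
        values.append(total / (n - lag))
    return values


def estimate_pitch(frame, sample_rate, f_min=60.0, f_max=400.0, depth=0.4):
    """Return a pitch estimate in Hz, or None if no clear minimum is found.

    The frame must be longer than sample_rate / f_min + 1 samples.
    f_min, f_max and depth are hypothetical tuning parameters.
    """
    min_lag = int(sample_rate / f_max)
    max_lag = int(sample_rate / f_min)
    d = amdf(center_clip(frame), max_lag + 1)
    # Only accept minima well below the largest AMDF value in the search range.
    ceiling = depth * max(d[min_lag:max_lag + 1])
    for lag in range(min_lag, max_lag + 1):
        if d[lag] < ceiling and d[lag] <= d[lag - 1] and d[lag] <= d[lag + 1]:
            return sample_rate / lag
    return None


if __name__ == "__main__":
    # Synthetic test tone: 100 Hz sine, 8 kHz sampling, 50 ms frame.
    rate = 8000
    tone = [math.sin(2 * math.pi * 100 * i / rate) for i in range(400)]
    print(estimate_pitch(tone, rate))
```

For a clean periodic frame, the first deep minimum is expected at a lag of one pitch period, so the estimate is the sampling rate divided by that lag. Real speech is noisier, which is presumably why the summary pairs AMDF with center clipping and a later error correction pass.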