Voice conversion by speech synthesis

Bibliographic details
Main author:	Lee, Ming Hui.
Other authors:	Wan Chunru
Format:	Final Year Project
Language:	English
Published:	2009
Subjects:	DRNTU::Engineering::Electrical and electronic engineering::Electronic systems::Signal processing
Online access:	http://hdl.handle.net/10356/16707
Institution:	Nanyang Technological University

Description
Summary:	A speech signal carries two kinds of information: (i) the message the speaker wants to convey to the listener and (ii) the characteristics of the speaker. This project focuses on the analysis and manipulation of the speaker characteristics embedded in the speech signal for voice conversion. Voice conversion transforms the speaker characteristics in speech uttered by one speaker (the source speaker) to generate speech with the voice characteristics of a desired speaker (the target speaker). Voice characteristics lie at the linguistic, suprasegmental and segmental levels. Speaker characteristics at the linguistic and suprasegmental levels are learned features, so they are difficult to derive from data and to model. Speaker characteristics at the segmental level, by contrast, can be attributed to the speech production mechanism and are reflected in the source and system characteristics of the physical system. The model patterned after human speech production is known as the source-filter model, and the two variants examined are linear prediction (LP) and formant synthesis. Research has shown that the synthesis quality of the LP synthesizer is superior to that of the formant synthesizer, and because linear prediction is the most basic method, it serves as an appropriate baseline for beginners in speech processing. It therefore forms the central idea of this project. Because the author had little prior knowledge of speech signal processing, and because speech is a specialized kind of data, it was necessary to understand the acoustic features and properties of speech data before moving on to speech analysis and synthesis. Routines and functions with graphical user interface support are implemented in Matlab so that the user can step through program execution with ease. The programs closely reference and build on existing toolboxes.
Finally, the performance of the system in converting speech from one voice to another is summarized, tabulated and discussed, and its drawbacks and shortcomings are identified and examined. Methods for evaluating the transformations produced by the voice conversion system are studied, and subjective testing is the method used to evaluate the results obtained in this project. The report concludes with a brief look at speech-to-speech translation, an application in which voice conversion has served as an invaluable tool.
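
The source-filter idea behind linear prediction can be illustrated with a minimal sketch. It is not the project's code, which the summary says is written in Matlab on top of existing toolboxes. It uses plain Python so that it is self-contained, and it assumes the textbook autocorrelation method with the Levinson-Durbin recursion. All function names, the order-2 test filter and the white-noise excitation are hypothetical choices made for illustration. The sketch estimates the filter (system) coefficients from a signal, inverse-filters the signal to obtain the excitation (source), and resynthesizes the signal from the two parts.

```python
import random


def autocorrelation(x, max_lag):
    """Return the autocorrelation values r[0..max_lag] of the sequence x."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LP normal equations by the Levinson-Durbin recursion.

    Returns (a, err) where the predictor is
    s_hat[n] = sum(a[k] * s[n - 1 - k] for k in range(order))
    and err is the final prediction error energy.
    """
    a = [0.0] * order
    err = r[0]
    for i in range(order):
        acc = r[i + 1] - sum(a[k] * r[i - k] for k in range(i))
        k_i = acc / err                      # reflection coefficient
        new_a = a[:]
        new_a[i] = k_i
        for k in range(i):
            new_a[k] = a[k] - k_i * a[i - 1 - k]
        a = new_a
        err *= 1.0 - k_i * k_i
    return a, err


def lp_residual(x, a):
    """Inverse-filter x with the predictor a to estimate the excitation."""
    p = len(a)
    return [x[n] - sum(a[k] * x[n - 1 - k] for k in range(min(p, n)))
            for n in range(len(x))]


def lp_synthesize(excitation, a):
    """Drive the all-pole filter defined by a with the given excitation."""
    out = []
    for n, e in enumerate(excitation):
        past = sum(a[k] * out[n - 1 - k] for k in range(min(len(a), n)))
        out.append(e + past)
    return out


# Hypothetical test signal: white noise passed through a known stable
# order-2 all-pole filter, standing in for one frame of speech.
random.seed(0)
true_a = [1.3, -0.6]
noise = [random.gauss(0.0, 1.0) for _ in range(4000)]
signal = lp_synthesize(noise, true_a)

# Analysis: estimate the filter, then recover the excitation.
r = autocorrelation(signal, 2)
est_a, err = levinson_durbin(r, 2)
residual = lp_residual(signal, est_a)

# Synthesis: excitation plus filter rebuilds the original signal.
rebuilt = lp_synthesize(residual, est_a)
```

In this decomposition, `residual` plays the role of the source and `est_a` the role of the system. A segmental-level conversion would modify one or both before resynthesis; how the project actually does so is not stated in the summary.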
