Mixture of factor analyzers using priors from non-parallel speech for voice conversion

Bibliographic Details
Main Authors: Wu, Zhizheng, Kinnunen, Tomi, Chng, Eng Siong, Li, Haizhou
Other Authors: School of Computer Engineering
Format: Article
Language: English
Published: 2013
Subjects:
Online Access: https://hdl.handle.net/10356/102726
http://hdl.handle.net/10220/16436
Institution: Nanyang Technological University
Description
Summary: A robust voice conversion function relies on a large amount of parallel training data, which is difficult to collect in practice. To tackle the sparse parallel training data problem in voice conversion, this paper describes a mixture of factor analyzers method that integrates prior knowledge from non-parallel speech into the training of the conversion function. Experiments on the CMU ARCTIC corpus show that the proposed method improves the quality and similarity of the converted speech. In both objective and subjective evaluations, the proposed method outperforms the baseline GMM method.