Mixture of factor analyzers using priors from non-parallel speech for voice conversion

A robust voice conversion function relies on a large amount of parallel training data, which is difficult to collect in practice. To tackle the sparse parallel training data problem in voice conversion, this paper describes a mixture of factor analyzers method which integrates prior knowledge from n...

وصف كامل

محفوظ في:
التفاصيل البيبلوغرافية
المؤلفون الرئيسيون: Wu, Zhizheng, Kinnunen, Tomi, Chng, Eng Siong, Li, Haizhou
مؤلفون آخرون: School of Computer Engineering
التنسيق: مقال
اللغة:English
منشور في: 2013
الموضوعات:
الوصول للمادة أونلاين:https://hdl.handle.net/10356/102726
http://hdl.handle.net/10220/16436
الوسوم: إضافة وسم
لا توجد وسوم, كن أول من يضع وسما على هذه التسجيلة!
المؤسسة: Nanyang Technological University
اللغة: English
الوصف
الملخص:A robust voice conversion function relies on a large amount of parallel training data, which is difficult to collect in practice. To tackle the sparse parallel training data problem in voice conversion, this paper describes a mixture of factor analyzers method which integrates prior knowledge from non-parallel speech into the training of conversion function. The experiments on CMU ARCTIC corpus show that the proposed method improves the quality and similarity of converted speech. With both objective and subjective evaluations, we show the proposed method outperforms the baseline GMM method.