Mixture of factor analyzers using priors from non-parallel speech for voice conversion

A robust voice conversion function relies on a large amount of parallel training data, which is difficult to collect in practice. To tackle the sparse parallel training data problem in voice conversion, this paper describes a mixture of factor analyzers method which integrates prior knowledge from n...

Bibliographic Details
Main Authors: Wu, Zhizheng, Kinnunen, Tomi, Chng, Eng Siong, Li, Haizhou
Other Authors: School of Computer Engineering
Format: Article
Language: English
Published: 2013
Subjects:
Online Access: https://hdl.handle.net/10356/102726
http://hdl.handle.net/10220/16436
Institution: Nanyang Technological University
id sg-ntu-dr.10356-102726
record_format dspace
spelling sg-ntu-dr.10356-102726
2020-05-28T07:18:12Z
Mixture of factor analyzers using priors from non-parallel speech for voice conversion
Wu, Zhizheng
Kinnunen, Tomi
Chng, Eng Siong
Li, Haizhou
School of Computer Engineering
Temasek Laboratories
DRNTU::Engineering::Computer science and engineering
A robust voice conversion function relies on a large amount of parallel training data, which is difficult to collect in practice. To tackle the sparse parallel training data problem in voice conversion, this paper describes a mixture of factor analyzers method which integrates prior knowledge from non-parallel speech into the training of the conversion function. Experiments on the CMU ARCTIC corpus show that the proposed method improves the quality and similarity of converted speech. With both objective and subjective evaluations, we show that the proposed method outperforms the baseline GMM method.
2013-10-10T08:29:45Z 2019-12-06T20:59:37Z 2013-10-10T08:29:45Z 2019-12-06T20:59:37Z
2012 2012
Journal Article
Wu, Z., Kinnunen, T., Chng, E. S., & Li, H. (2012). Mixture of factor analyzers using priors from non-parallel speech for voice conversion. IEEE signal processing letters, 19(12), 914-917.
1070-9908
https://hdl.handle.net/10356/102726
http://hdl.handle.net/10220/16436
10.1109/LSP.2012.2225615
en
IEEE signal processing letters
© 2012 IEEE
institution Nanyang Technological University
building NTU Library
country Singapore
collection DR-NTU
language English
topic DRNTU::Engineering::Computer science and engineering
spellingShingle DRNTU::Engineering::Computer science and engineering
Wu, Zhizheng
Kinnunen, Tomi
Chng, Eng Siong
Li, Haizhou
Mixture of factor analyzers using priors from non-parallel speech for voice conversion
description A robust voice conversion function relies on a large amount of parallel training data, which is difficult to collect in practice. To tackle the sparse parallel training data problem in voice conversion, this paper describes a mixture of factor analyzers method which integrates prior knowledge from non-parallel speech into the training of the conversion function. Experiments on the CMU ARCTIC corpus show that the proposed method improves the quality and similarity of converted speech. With both objective and subjective evaluations, we show that the proposed method outperforms the baseline GMM method.
author2 School of Computer Engineering
author_facet School of Computer Engineering
Wu, Zhizheng
Kinnunen, Tomi
Chng, Eng Siong
Li, Haizhou
format Article
author Wu, Zhizheng
Kinnunen, Tomi
Chng, Eng Siong
Li, Haizhou
author_sort Wu, Zhizheng
title Mixture of factor analyzers using priors from non-parallel speech for voice conversion
title_short Mixture of factor analyzers using priors from non-parallel speech for voice conversion
title_full Mixture of factor analyzers using priors from non-parallel speech for voice conversion
title_fullStr Mixture of factor analyzers using priors from non-parallel speech for voice conversion
title_full_unstemmed Mixture of factor analyzers using priors from non-parallel speech for voice conversion
title_sort mixture of factor analyzers using priors from non-parallel speech for voice conversion
publishDate 2013
url https://hdl.handle.net/10356/102726
http://hdl.handle.net/10220/16436
_version_ 1681057254843875328