Mixture of factor analyzers using priors from non-parallel speech for voice conversion
A robust voice conversion function relies on a large amount of parallel training data, which is difficult to collect in practice. To tackle the sparse parallel training data problem in voice conversion, this paper describes a mixture of factor analyzers method which integrates prior knowledge from n...
Main Authors: | Wu, Zhizheng; Kinnunen, Tomi; Chng, Eng Siong; Li, Haizhou |
---|---|
Other Authors: | School of Computer Engineering |
Format: | Article |
Language: | English |
Published: | 2013 |
Subjects: | DRNTU::Engineering::Computer science and engineering |
Online Access: | https://hdl.handle.net/10356/102726 http://hdl.handle.net/10220/16436 |
Institution: | Nanyang Technological University |
id | sg-ntu-dr.10356-102726 |
---|---|
record_format | dspace |
spelling | sg-ntu-dr.10356-1027262020-05-28T07:18:12Z Mixture of factor analyzers using priors from non-parallel speech for voice conversion Wu, Zhizheng Kinnunen, Tomi Chng, Eng Siong Li, Haizhou School of Computer Engineering Temasek Laboratories DRNTU::Engineering::Computer science and engineering A robust voice conversion function relies on a large amount of parallel training data, which is difficult to collect in practice. To tackle the sparse parallel training data problem in voice conversion, this paper describes a mixture of factor analyzers method which integrates prior knowledge from non-parallel speech into the training of conversion function. The experiments on CMU ARCTIC corpus show that the proposed method improves the quality and similarity of converted speech. With both objective and subjective evaluations, we show the proposed method outperforms the baseline GMM method. 2013-10-10T08:29:45Z 2019-12-06T20:59:37Z 2013-10-10T08:29:45Z 2019-12-06T20:59:37Z 2012 2012 Journal Article Wu, Z., Kinnunen, T., Chng, E. S., & Li, H. (2012). Mixture of factor analyzers using priors from non-parallel speech for voice conversion. IEEE signal processing letters, 19(12), 914-917. 1070-9908 https://hdl.handle.net/10356/102726 http://hdl.handle.net/10220/16436 10.1109/LSP.2012.2225615 en IEEE signal processing letters © 2012 IEEE |
institution | Nanyang Technological University |
building | NTU Library |
country | Singapore |
collection | DR-NTU |
language | English |
topic | DRNTU::Engineering::Computer science and engineering |
spellingShingle | DRNTU::Engineering::Computer science and engineering Wu, Zhizheng Kinnunen, Tomi Chng, Eng Siong Li, Haizhou Mixture of factor analyzers using priors from non-parallel speech for voice conversion |
description | A robust voice conversion function relies on a large amount of parallel training data, which is difficult to collect in practice. To tackle the sparse parallel training data problem in voice conversion, this paper describes a mixture of factor analyzers method which integrates prior knowledge from non-parallel speech into the training of conversion function. The experiments on CMU ARCTIC corpus show that the proposed method improves the quality and similarity of converted speech. With both objective and subjective evaluations, we show the proposed method outperforms the baseline GMM method. |
author2 | School of Computer Engineering |
author_facet | School of Computer Engineering Wu, Zhizheng Kinnunen, Tomi Chng, Eng Siong Li, Haizhou |
format | Article |
author | Wu, Zhizheng Kinnunen, Tomi Chng, Eng Siong Li, Haizhou |
author_sort | Wu, Zhizheng |
title | Mixture of factor analyzers using priors from non-parallel speech for voice conversion |
title_short | Mixture of factor analyzers using priors from non-parallel speech for voice conversion |
title_full | Mixture of factor analyzers using priors from non-parallel speech for voice conversion |
title_fullStr | Mixture of factor analyzers using priors from non-parallel speech for voice conversion |
title_full_unstemmed | Mixture of factor analyzers using priors from non-parallel speech for voice conversion |
title_sort | mixture of factor analyzers using priors from non-parallel speech for voice conversion |
publishDate | 2013 |
url | https://hdl.handle.net/10356/102726 http://hdl.handle.net/10220/16436 |
_version_ | 1681057254843875328 |