An exemplar-based approach to frequency warping for voice conversion
The task of voice conversion is to modify a source speaker’s voice to sound like that of a target speaker. A conversion method is considered successful when the produced speech sounds natural and similar to the target speaker. This paper presents a new voice conversion framework in which we combine...
Main Authors: | Tian, Xiaohai; Lee, Siu Wa; Wu, Zhizheng; Chng, Eng Siong; Li, Haizhou |
---|---|
Other Authors: | School of Computer Science and Engineering |
Format: | Article |
Language: | English |
Published: | 2018 |
Subjects: | Exemplar; DRNTU::Engineering::Computer science and engineering; Voice Conversion |
Online Access: | https://hdl.handle.net/10356/89630 http://hdl.handle.net/10220/47103 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-89630 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-896302020-03-07T11:48:51Z An exemplar-based approach to frequency warping for voice conversion Tian, Xiaohai Lee, Siu Wa Wu, Zhizheng Chng, Eng Siong Li, Haizhou School of Computer Science and Engineering NTU-UBC Research Centre of Excellence in Active Living for the Elderly Exemplar DRNTU::Engineering::Computer science and engineering Voice Conversion The voice conversion’s task is to modify a source speaker’s voice to sound like that of a target speaker. A conversion method is considered successful when the produced speech sounds natural and similar to the target speaker. This paper presents a new voice conversion framework in which we combine frequency warping and exemplar-based method for voice conversion. Our method maintains high-resolution details during conversion by directly applying frequency warping on the high-resolution spectrum to represent the target. The warping function is generated by a sparse interpolation from a dictionary of exemplar warping functions. As the generated warping function is dependent only on a very small set of exemplars, we do away with the statistical averaging effects inherited from Gaussian mixture models (GMM). To compensate for the conversion error, we also apply residual exemplars into the conversion process. Both objective and subjective evaluations on the VOICES database validated the effectiveness of the proposed voice conversion framework. We observed a significant improvement in speech quality over the state-of-the-art parametric methods. ASTAR (Agency for Sci., Tech. and Research, S’pore) Accepted version 2018-12-19T07:15:34Z 2019-12-06T17:29:53Z 2018-12-19T07:15:34Z 2019-12-06T17:29:53Z 2017 2017 Journal Article Tian, X., Lee, S. W., Wu, Z., Chng, E. S., & Li, H. (2017). An exemplar-based approach to frequency warping for voice conversion. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 25(10), 1863-1876. 
doi:10.1109/TASLP.2017.2723721 2329-9290 https://hdl.handle.net/10356/89630 http://hdl.handle.net/10220/47103 10.1109/TASLP.2017.2723721 208282 en IEEE/ACM Transactions on Audio, Speech, and Language Processing © 2017 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The published version is available at: [http://dx.doi.org/10.1109/TASLP.2017.2723721]. 14 p. application/pdf |
institution |
Nanyang Technological University |
building |
NTU Library |
country |
Singapore |
collection |
DR-NTU |
language |
English |
topic |
Exemplar DRNTU::Engineering::Computer science and engineering Voice Conversion |
spellingShingle |
Exemplar DRNTU::Engineering::Computer science and engineering Voice Conversion Tian, Xiaohai Lee, Siu Wa Wu, Zhizheng Chng, Eng Siong Li, Haizhou An exemplar-based approach to frequency warping for voice conversion |
description |
The task of voice conversion is to modify a source speaker’s voice to sound like that of a target speaker. A conversion method is considered successful when the produced speech sounds
natural and similar to the target speaker. This paper presents a new voice conversion framework that combines frequency warping with an exemplar-based method. Our
method maintains high-resolution details during conversion by applying frequency warping directly to the high-resolution spectrum to represent the target. The warping function is generated
by sparse interpolation from a dictionary of exemplar warping functions. Because the generated warping function depends on only a very small set of exemplars, we do away with the statistical
averaging effects inherited from Gaussian mixture models (GMMs). To compensate for the conversion error, we also incorporate residual exemplars into the conversion process. Both objective
and subjective evaluations on the VOICES database validated the effectiveness of the proposed voice conversion framework. We observed a significant improvement in speech quality over
state-of-the-art parametric methods. |
author2 |
School of Computer Science and Engineering |
author_facet |
School of Computer Science and Engineering Tian, Xiaohai Lee, Siu Wa Wu, Zhizheng Chng, Eng Siong Li, Haizhou |
format |
Article |
author |
Tian, Xiaohai Lee, Siu Wa Wu, Zhizheng Chng, Eng Siong Li, Haizhou |
author_sort |
Tian, Xiaohai |
title |
An exemplar-based approach to frequency warping for voice conversion |
title_short |
An exemplar-based approach to frequency warping for voice conversion |
title_full |
An exemplar-based approach to frequency warping for voice conversion |
title_fullStr |
An exemplar-based approach to frequency warping for voice conversion |
title_full_unstemmed |
An exemplar-based approach to frequency warping for voice conversion |
title_sort |
exemplar-based approach to frequency warping for voice conversion |
publishDate |
2018 |
url |
https://hdl.handle.net/10356/89630 http://hdl.handle.net/10220/47103 |
_version_ |
1681035645197221888 |
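
The description in this record outlines the core idea: a warping function is obtained by sparse interpolation over a dictionary of exemplar warping functions and then applied directly to the high-resolution spectrum. The following is only a minimal sketch of that idea, not the authors' implementation. The function names, the toy dictionaries, the multiplicative-update solver, and the `sparsity` value are all assumptions made for illustration, and the residual-exemplar compensation step mentioned in the abstract is omitted.

```python
import numpy as np


def estimate_activations(A, x, sparsity=0.01, n_iter=300, eps=1e-12):
    """Non-negative weights h such that A @ h approximates x.

    Uses multiplicative updates for a Euclidean cost with an L1 penalty.
    This solver is an illustrative assumption, not necessarily the paper's.
    """
    h = np.full(A.shape[1], 1.0 / A.shape[1])
    for _ in range(n_iter):
        h *= (A.T @ x) / (A.T @ (A @ h) + sparsity + eps)
    return h


def interpolate_warping(W, h, eps=1e-12):
    """Weighted average of the exemplar warping functions (columns of W)."""
    return W @ (h / (h.sum() + eps))


def apply_warping(spectrum, warp):
    """Resample the spectrum at the fractional source bins given by warp."""
    bins = np.arange(spectrum.shape[0])
    return np.interp(warp, bins, spectrum)


# Toy data (hypothetical): 5 paired exemplars over 64 frequency bins.
rng = np.random.default_rng(0)
n_bins, n_exemplars = 64, 5
bins = np.arange(n_bins, dtype=float)

# Source spectral exemplars, strictly positive.
A = rng.random((n_bins, n_exemplars)) + 0.1

# Exemplar warping functions: simple linear stretches, clipped to valid bins.
alphas = np.linspace(0.8, 1.2, n_exemplars)
W = np.clip(bins[:, None] * alphas[None, :], 0, n_bins - 1)

x = A[:, 2].copy()                  # source frame to convert
h = estimate_activations(A, x)      # sparse, non-negative activations
warp = interpolate_warping(W, h)    # frame-specific warping function
converted = apply_warping(x, warp)  # warped high-resolution spectrum
```

Because the warping function is a weighted average of only the activated exemplars, the spectrum is resampled rather than replaced by an averaged model output, which is the property the abstract credits for preserving spectral detail.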