An exemplar-based approach to frequency warping for voice conversion

The task of voice conversion is to modify a source speaker’s voice to sound like that of a target speaker. A conversion method is considered successful when the produced speech sounds natural and similar to the target speaker. This paper presents a new voice conversion framework in which we combine...

Bibliographic Details
Main Authors: Tian, Xiaohai, Lee, Siu Wa, Wu, Zhizheng, Chng, Eng Siong, Li, Haizhou
Other Authors: School of Computer Science and Engineering
Format: Article
Language: English
Published: 2018
Subjects: Exemplar; Voice Conversion; DRNTU::Engineering::Computer science and engineering
Online Access: https://hdl.handle.net/10356/89630
http://hdl.handle.net/10220/47103
Institution: Nanyang Technological University
id sg-ntu-dr.10356-89630
record_format dspace
Record ID: sg-ntu-dr.10356-89630
Record timestamp: 2020-03-07T11:48:51Z
Affiliations: School of Computer Science and Engineering; NTU-UBC Research Centre of Excellence in Active Living for the Elderly
Funding: ASTAR (Agency for Sci., Tech. and Research, S’pore)
Version: Accepted version
Record dates: 2018-12-19T07:15:34Z; 2019-12-06T17:29:53Z
Year: 2017
Type: Journal Article
Citation: Tian, X., Lee, S. W., Wu, Z., Chng, E. S., & Li, H. (2017). An exemplar-based approach to frequency warping for voice conversion. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 25(10), 1863-1876. doi:10.1109/TASLP.2017.2723721
ISSN: 2329-9290
Handle: https://hdl.handle.net/10356/89630
http://hdl.handle.net/10220/47103
DOI: 10.1109/TASLP.2017.2723721
Identifier: 208282
Language code: en
Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Rights: © 2017 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The published version is available at: http://dx.doi.org/10.1109/TASLP.2017.2723721.
Extent: 14 p.
File format: application/pdf
institution Nanyang Technological University
building NTU Library
country Singapore
collection DR-NTU
language English
topic Exemplar
DRNTU::Engineering::Computer science and engineering
Voice Conversion
description The task of voice conversion is to modify a source speaker’s voice to sound like that of a target speaker. A conversion method is considered successful when the produced speech sounds natural and similar to the target speaker. This paper presents a new voice conversion framework that combines frequency warping with an exemplar-based method. The method maintains high-resolution details during conversion by applying frequency warping directly to the high-resolution spectrum to represent the target. The warping function is generated by sparse interpolation from a dictionary of exemplar warping functions. Because the generated warping function depends only on a very small set of exemplars, the method avoids the statistical averaging effects inherent in Gaussian mixture models (GMM). To compensate for the conversion error, residual exemplars are also incorporated into the conversion process. Both objective and subjective evaluations on the VOICES database validated the effectiveness of the proposed voice conversion framework. We observed a significant improvement in speech quality over the state-of-the-art parametric methods.
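The core idea in the description, a warping function built as a sparse combination of exemplar warping functions and then applied to a spectrum, can be illustrated with a small Python sketch. This is not the authors' implementation: the function names `interpolate_warping` and `warp_spectrum` are hypothetical, the sparse activations are assumed to be given (the abstract does not say how they are estimated), and the weight normalization and linear interpolation between bins are simplifying assumptions. Residual compensation is omitted.

```python
import numpy as np


def interpolate_warping(exemplar_warps, activations):
    """Combine exemplar warping functions (one per row) with sparse weights.

    Assumption: activations are non-negative and are normalized to sum to 1,
    so the result is a weighted average of the exemplar warping functions.
    """
    weights = np.asarray(activations, dtype=float)
    total = weights.sum()
    if total <= 0:
        raise ValueError("at least one activation must be positive")
    return (weights / total) @ np.asarray(exemplar_warps, dtype=float)


def warp_spectrum(spectrum, warp):
    """Resample a spectrum: output bin k reads the source at fractional bin warp[k]."""
    spectrum = np.asarray(spectrum, dtype=float)
    bins = np.arange(len(spectrum))
    return np.interp(warp, bins, spectrum)


# Toy dictionary of three exemplar warping functions over 8 frequency bins.
num_bins = 8
identity = np.arange(num_bins, dtype=float)
exemplar_warps = np.stack([
    identity,                               # no warping
    identity * 0.8,                         # compress toward low frequencies
    np.clip(identity * 1.2, 0, num_bins - 1),  # stretch, clipped to valid bins
])

# Sparse activations: only two of the three exemplars are active.
activations = np.array([0.0, 0.75, 0.25])

warp = interpolate_warping(exemplar_warps, activations)
source_spectrum = 2.0 * identity            # a simple ramp, easy to check by hand
converted = warp_spectrum(source_spectrum, warp)
```

Because only a few exemplars carry non-zero weight, the interpolated warping function stays close to observed warping functions rather than being averaged over many components, which is the motivation the description gives for avoiding GMM-style smoothing.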