An automatic voice conversion evaluation strategy based on perceptual background noise distortion and speaker similarity
Voice conversion aims to modify one speaker's voice characteristics so that the speech sounds as if it were spoken by another speaker, without changing the linguistic content. This task has attracted considerable attention, and various approaches have been proposed over the past two decades. The evaluation of voice conversio...
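Main Authors: | Huang, Dong-Yan; Xie, Lei; Zhang, Shaofei; Lee, Yvonne Siu Wa; Wu, Jie; Ming, Huaiping; Tian, Xiaohai; Ding, Chuang; Li, Mei; Nguyen, Quy Hy; Dong, Minghui; Chng, Eng Siong; Li, Haizhou |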
---|---|
Other Authors: | School of Computer Science and Engineering |
Format: | Conference or Workshop Item |
Language: | English |
Published: | 2019 |
Subjects: | Engineering::Computer science and engineering; Voice Conversion; Objective Measures |
Online Access: | https://hdl.handle.net/10356/89623 http://hdl.handle.net/10220/49691 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-89623 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-896232020-03-07T11:48:46Z An automatic voice conversion evaluation strategy based on perceptual background noise distortion and speaker similarity Huang, Dong-Yan Xie, Lei Zhang, Shaofei Lee, Yvonne Siu Wa Wu, Jie Ming, Huaiping Tian, Xiaohai Ding, Chuang Li, Mei Nguyen, Quy Hy Dong, Minghui Chng, Eng Siong Li, Haizhou School of Computer Science and Engineering 9th ISCA Speech Synthesis Workshop Engineering::Computer science and engineering Voice Conversion Objective Measures Voice conversion aims to modify the characteristics of one speaker to make it sound like spoken by another speaker without changing the language content. This task has attracted considerable attention and various approaches have been proposed since two decades ago. The evaluation of voice conversion approaches, usually through time-intensive subject listening tests, requires a huge amount of human labor. This paper proposes an automatic voice conversion evaluation strategy based on perceptual background noise distortion and speaker similarity. Experimental results show that our automatic evaluation results match the subjective listening results quite well. We further use our strategy to select best converted samples from multiple voice conversion systems and our submission achieves promising results in the voice conversion challenge (VCC2016). Published version 2019-08-20T04:39:20Z 2019-12-06T17:29:47Z 2019-08-20T04:39:20Z 2019-12-06T17:29:47Z 2016-09-01 2016 Conference Paper Huang, D.-Y., Xie, L., Lee, Y. S. W., Wu, J., Ming, H., Tian, X., … Li, H. (2016). An automatic voice conversion evaluation strategy based on perceptual background noise distortion and speaker similarity. 9th ISCA Speech Synthesis Workshop. doi:10.21437/SSW.2016-8 https://hdl.handle.net/10356/89623 http://hdl.handle.net/10220/49691 10.21437/SSW.2016-8 200458 en © 2016 International Speech Communication Association (ISCA). All rights reserved. 
This paper was published in 9th ISCA Speech Synthesis Workshop and is made available with permission of International Speech Communication Association (ISCA). 8 p. application/pdf |
institution |
Nanyang Technological University |
building |
NTU Library |
country |
Singapore |
collection |
DR-NTU |
language |
English |
topic |
Engineering::Computer science and engineering Voice Conversion Objective Measures |
spellingShingle |
Engineering::Computer science and engineering Voice Conversion Objective Measures Huang, Dong-Yan Xie, Lei Zhang, Shaofei Lee, Yvonne Siu Wa Wu, Jie Ming, Huaiping Tian, Xiaohai Ding, Chuang Li, Mei Nguyen, Quy Hy Dong, Minghui Chng, Eng Siong Li, Haizhou An automatic voice conversion evaluation strategy based on perceptual background noise distortion and speaker similarity |
description |
Voice conversion aims to modify one speaker's voice characteristics so that the speech sounds as if it were spoken by another speaker, without changing the linguistic content. This task has attracted considerable attention, and various approaches have been proposed over the past two decades. The evaluation of voice conversion approaches, usually through time-intensive subjective listening tests, requires a huge amount of human labor. This paper proposes an automatic voice conversion evaluation strategy based on perceptual background noise distortion and speaker similarity. Experimental results show that our automatic evaluation results match the subjective listening results quite well. We further use our strategy to select the best converted samples from multiple voice conversion systems, and our submission achieves promising results in the Voice Conversion Challenge (VCC2016). |
author2 |
School of Computer Science and Engineering |
author_facet |
School of Computer Science and Engineering Huang, Dong-Yan Xie, Lei Zhang, Shaofei Lee, Yvonne Siu Wa Wu, Jie Ming, Huaiping Tian, Xiaohai Ding, Chuang Li, Mei Nguyen, Quy Hy Dong, Minghui Chng, Eng Siong Li, Haizhou |
format |
Conference or Workshop Item |
author |
Huang, Dong-Yan Xie, Lei Zhang, Shaofei Lee, Yvonne Siu Wa Wu, Jie Ming, Huaiping Tian, Xiaohai Ding, Chuang Li, Mei Nguyen, Quy Hy Dong, Minghui Chng, Eng Siong Li, Haizhou |
author_sort |
Huang, Dong-Yan |
title |
An automatic voice conversion evaluation strategy based on perceptual background noise distortion and speaker similarity |
title_sort |
automatic voice conversion evaluation strategy based on perceptual background noise distortion and speaker similarity |
publishDate |
2019 |
url |
https://hdl.handle.net/10356/89623 http://hdl.handle.net/10220/49691 |
_version_ |
1681048436025065472 |