Multi-task learning for end-to-end noise-robust bandwidth extension

Bandwidth extension aims to reconstruct wideband speech signals from narrowband inputs to improve perceptual quality. Prior studies mostly perform bandwidth extension under the assumption that the narrowband signals are clean without noise. The use of such extension techniques is greatly limited in...

Bibliographic Details
Main Authors:	Hou, Nana; Xu, Chenglin; Zhou, Joey Tianyi; Chng, Eng Siong; Li, Haizhou
Other Authors:	School of Computer Science and Engineering
Format:	Conference or Workshop Item
Language:	English
Published:	2020
Subjects:	Engineering::Computer science and engineering Speech Enhancement Noise-robust Bandwidth Extension
Online Access:	https://hdl.handle.net/10356/144855
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-144855
record_format	dspace
spelling	sg-ntu-dr.10356-144855 2020-12-05T20:10:27Z Multi-task learning for end-to-end noise-robust bandwidth extension Hou, Nana Xu, Chenglin Zhou, Joey Tianyi Chng, Eng Siong Li, Haizhou School of Computer Science and Engineering Interspeech 2020 Air Traffic Management Research Institute Engineering::Computer science and engineering Speech Enhancement Noise-robust Bandwidth Extension Bandwidth extension aims to reconstruct wideband speech signals from narrowband inputs to improve perceptual quality. Prior studies mostly perform bandwidth extension under the assumption that the narrowband signals are clean without noise. The use of such extension techniques is greatly limited in practice when signals are corrupted by noise. To alleviate such problem, we propose an end-to-end time-domain framework for noise-robust bandwidth extension, that jointly optimizes a mask-based speech enhancement and an ideal bandwidth extension module with multi-task learning. The proposed framework avoids decomposing the signals into magnitude and phase spectra, therefore, requires no phase estimation. Experimental results show that the proposed method achieves 14.3% and 15.8% relative improvements over the best baseline in terms of perceptual evaluation of speech quality (PESQ) and log-spectral distortion (LSD), respectively. Furthermore, our method is 3 times more compact than the best baseline in terms of the number of parameters. National Research Foundation (NRF) Published version This work was supported by Air Traffic Management Research Institute of Nanyang Technological University, HumanRobot Interaction Phase 1 (Grant No. 192 25 00054), National Research Foundation (NRF) Singapore under the National Robotics Programme; AI Speech Lab (Award No. AISG100E-2018-006), NRF Singapore under the AI Singapore Programme; Human Robot Collaborative AI for AME (Grant No. A18A2b0046), NRF Singapore; Neuromorphic Computing Programme (Grant No. A1687b0033), RIE2020 Advanced Manufacturing and Engineering Programmatic Grant. The work by H. Li is also partly supported by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy (University Allowance, EXC 2077, University of Bremen, Germany). 2020-11-30T08:15:09Z 2020-11-30T08:15:09Z 2020 Conference Paper Hou, N., Xu, C., Zhou, J. T., Chng, E. S., & Li, H. (2020). Multi-task learning for end-to-end noise-robust bandwidth extension. Interspeech 2020, 4069-4073. https://hdl.handle.net/10356/144855 4069 4073 en © 2020 International Speech Communication Association (ISCA). All rights reserved. This paper was published in Interspeech 2020 and is made available with permission of International Speech Communication Association (ISCA). application/pdf
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	Engineering::Computer science and engineering Speech Enhancement Noise-robust Bandwidth Extension
spellingShingle	Engineering::Computer science and engineering Speech Enhancement Noise-robust Bandwidth Extension Hou, Nana Xu, Chenglin Zhou, Joey Tianyi Chng, Eng Siong Li, Haizhou Multi-task learning for end-to-end noise-robust bandwidth extension
description	Bandwidth extension aims to reconstruct wideband speech signals from narrowband inputs to improve perceptual quality. Prior studies mostly perform bandwidth extension under the assumption that the narrowband signals are clean without noise. The use of such extension techniques is greatly limited in practice when signals are corrupted by noise. To alleviate such problem, we propose an end-to-end time-domain framework for noise-robust bandwidth extension, that jointly optimizes a mask-based speech enhancement and an ideal bandwidth extension module with multi-task learning. The proposed framework avoids decomposing the signals into magnitude and phase spectra, therefore, requires no phase estimation. Experimental results show that the proposed method achieves 14.3% and 15.8% relative improvements over the best baseline in terms of perceptual evaluation of speech quality (PESQ) and log-spectral distortion (LSD), respectively. Furthermore, our method is 3 times more compact than the best baseline in terms of the number of parameters.
author2	School of Computer Science and Engineering
author_facet	School of Computer Science and Engineering Hou, Nana Xu, Chenglin Zhou, Joey Tianyi Chng, Eng Siong Li, Haizhou
format	Conference or Workshop Item
author	Hou, Nana Xu, Chenglin Zhou, Joey Tianyi Chng, Eng Siong Li, Haizhou
author_sort	Hou, Nana
title	Multi-task learning for end-to-end noise-robust bandwidth extension
title_sort	multi-task learning for end-to-end noise-robust bandwidth extension
publishDate	2020
url	https://hdl.handle.net/10356/144855
_version_	1688665520307437568
