Domain adversarial training for speech enhancement

The performance of deep learning approaches to speech enhancement degrades significantly in the face of a mismatch between training and testing conditions. In this paper, we propose a domain adversarial training technique for unsupervised domain transfer that 1) overcomes domain mismatch and 2) provides a solution...

Bibliographic Details
Main Authors:	Hou, Nana; Xu, Chenglin; Chng, Eng Siong; Li, Haizhou
Other Authors:	School of Computer Science and Engineering
Format:	Conference or Workshop Item
Language:	English
Published:	2020
Subjects:	Engineering::Computer science and engineering; Domain Adversarial Training; Speech Enhancement
Online Access:	https://hdl.handle.net/10356/144786
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-144786
record_format	dspace
spelling	sg-ntu-dr.10356-1447862020-11-28T20:10:37Z Domain adversarial training for speech enhancement Hou, Nana Xu, Chenglin Chng, Eng Siong Li, Haizhou School of Computer Science and Engineering 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) Air Traffic Management Research Institute Temasek Laboratories @ NTU Engineering::Computer science and engineering Domain Adversarial Training Speech Enhancement The performance of deep learning approaches to speech enhancement degrades significantly in face of mismatch between training and testing. In this paper, we propose a domain adversarial training technique for unsupervised domain transfer, that 1) overcomes domain mismatch, and 2) provides a solution to the scenario where we only have noisy speech data, and we don’t have clean-noisy parallel data in the new domain. Specifically, our method includes two parts that are jointly trained, 1) an enhancement net to map noisy speech to clean speech by indirectly estimating a mask with a spectrum approximation loss, and 2) a domain predictor to distinguish between domains. As the proposed approach is able to adapt to a new domain only with noisy speech data in target domain, we call it an unsupervised learning technique. Experiments suggest that our approach delivers voice quality comparable with other supervised learning techniques that require clean-noisy parallel data. Accepted version This research is supported by Temasek Laboratories@NTU, Nanyang Technological University, Singapore. 2020-11-24T06:30:28Z 2020-11-24T06:30:28Z 2019 Conference Paper Hou, N., Xu, C., Chng, E. S., & Li, H. (2019). Domain adversarial training for speech enhancement. Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 667-672. doi:10.1109/APSIPAASC47483.2019.9023218 https://hdl.handle.net/10356/144786 10.1109/APSIPAASC47483.2019.9023218 667 672 en © 2019 IEEE. 
Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The published version is available at: https://doi.org/10.1109/APSIPAASC47483.2019.9023218 application/pdf
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	Engineering::Computer science and engineering Domain Adversarial Training Speech Enhancement
spellingShingle	Engineering::Computer science and engineering Domain Adversarial Training Speech Enhancement Hou, Nana Xu, Chenglin Chng, Eng Siong Li, Haizhou Domain adversarial training for speech enhancement
description	The performance of deep learning approaches to speech enhancement degrades significantly in the face of a mismatch between training and testing conditions. In this paper, we propose a domain adversarial training technique for unsupervised domain transfer that 1) overcomes domain mismatch and 2) handles the scenario where only noisy speech data, and no clean-noisy parallel data, is available in the new domain. Specifically, our method comprises two jointly trained parts: 1) an enhancement net that maps noisy speech to clean speech by indirectly estimating a mask with a spectrum approximation loss, and 2) a domain predictor that distinguishes between domains. Because the proposed approach adapts to a new domain using only noisy speech data from the target domain, we call it an unsupervised learning technique. Experiments suggest that our approach delivers voice quality comparable to that of supervised learning techniques that require clean-noisy parallel data.
author2	School of Computer Science and Engineering
author_facet	School of Computer Science and Engineering Hou, Nana Xu, Chenglin Chng, Eng Siong Li, Haizhou
format	Conference or Workshop Item
author	Hou, Nana Xu, Chenglin Chng, Eng Siong Li, Haizhou
author_sort	Hou, Nana
title	Domain adversarial training for speech enhancement
title_short	Domain adversarial training for speech enhancement
title_full	Domain adversarial training for speech enhancement
title_fullStr	Domain adversarial training for speech enhancement
title_full_unstemmed	Domain adversarial training for speech enhancement
title_sort	domain adversarial training for speech enhancement
publishDate	2020
url	https://hdl.handle.net/10356/144786
_version_	1688665609598926848
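
Illustrative sketch (not part of the catalogued record): the description above names two jointly trained parts, a mask-based enhancement net and a domain predictor. The record does not say how the adversarial objective is implemented. The toy example below assumes a gradient reversal scheme, which is one common way to realise domain adversarial training, and reduces every network to a single scalar weight. The function name `dat_gradients`, the parameter names `w`, `a`, `v`, and the weight `lam` are hypothetical and chosen only for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dat_gradients(params, noisy, clean, domain, lam):
    """Gradients for one spectral value under a toy adversarial objective.

    params: dict of scalars: w (shared encoder), a (mask head), v (domain head)
    noisy:  noisy magnitude
    clean:  clean magnitude, or None for unlabeled target-domain data
    domain: 0 for the source domain, 1 for the target domain
    lam:    weight of the reversed domain gradient (an assumed hyperparameter)
    """
    w, a, v = params["w"], params["a"], params["v"]
    h = w * noisy           # shared feature
    mask = sigmoid(a * h)   # estimated mask in (0, 1)
    p = sigmoid(v * h)      # predicted probability of the target domain

    # Spectrum approximation loss (mask * noisy - clean)^2.
    # It is only available when a clean reference exists (source domain).
    if clean is not None:
        err = mask * noisy - clean
        d_pre = 2.0 * err * noisy * mask * (1.0 - mask)  # dL/d(a*h)
        g_a = d_pre * h
        g_h_enh = d_pre * a
    else:
        g_a = 0.0
        g_h_enh = 0.0

    # Binary cross-entropy domain loss: dL/d(v*h) = p - domain.
    d_dom = p - domain
    g_v = d_dom * h         # the domain head descends its own loss
    g_h_dom = d_dom * v

    # Gradient reversal: the encoder receives the domain gradient negated,
    # so it is pushed towards features the domain head cannot separate.
    g_w = (g_h_enh - lam * g_h_dom) * noisy
    return {"w": g_w, "a": g_a, "v": g_v}
```

In this sketch a target-domain sample (`clean=None`, `domain=1`) contributes nothing to the mask head and reaches the shared weight only through the reversed domain gradient, which mirrors the record's point that adaptation uses noisy target data alone. A real system would replace the scalars with neural networks and an automatic differentiation framework.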
