A joint-loss approach for speech enhancement via single-channel neural network and MVDR beamformer

Recent developments in noise reduction involve the use of neural beamforming. While some success has been achieved, these algorithms rely solely on the gain of the beamformer to enhance noisy signals. We propose a two-stage framework in which the first-stage neural network aims to provide a good estimate of the signal and the noise to the second-stage beamformer. We also introduce an objective function that reduces the distortion of the speech component in each stage. This objective function improves the accuracy of the second-stage beamformer by enhancing the first-stage output and, in the second stage, improves the training of the network by propagating the gradient through the beamforming operation. A parameter is introduced to control the trade-off between optimizing these two stages. Simulation results on the CHiME-3 dataset at low SNR show that the proposed algorithm is able to exploit the enhancement gains from both the neural network and the beamformer, improving over baseline algorithms in terms of speech distortion, quality, and intelligibility.


Bibliographic Details
Main Authors: Tan, Zhi-Wei, Nguyen, Anh Hai Trieu, Tran, Linh T. T., Khong, Andy Wai Hoong
Other Authors: School of Electrical and Electronic Engineering
Format: Conference or Workshop Item
Language: English
Published: 2021
Subjects:
Online Access:https://hdl.handle.net/10356/146260
Institution: Nanyang Technological University
id sg-ntu-dr.10356-146260
record_format dspace
spelling sg-ntu-dr.10356-1462602021-02-04T06:18:41Z A joint-loss approach for speech enhancement via single-channel neural network and MVDR beamformer Tan, Zhi-Wei Nguyen, Anh Hai Trieu Tran, Linh T. T. Khong, Andy Wai Hoong School of Electrical and Electronic Engineering 2020 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) ST Engineering-NTU Corporate Lab Engineering Neural Beamforming Complex Spectral Mapping Recent developments in noise reduction involve the use of neural beamforming. While some success has been achieved, these algorithms rely solely on the gain of the beamformer to enhance noisy signals. We propose a two-stage framework in which the first-stage neural network aims to provide a good estimate of the signal and the noise to the second-stage beamformer. We also introduce an objective function that reduces the distortion of the speech component in each stage. This objective function improves the accuracy of the second-stage beamformer by enhancing the first-stage output and, in the second stage, improves the training of the network by propagating the gradient through the beamforming operation. A parameter is introduced to control the trade-off between optimizing these two stages. Simulation results on the CHiME-3 dataset at low SNR show that the proposed algorithm is able to exploit the enhancement gains from both the neural network and the beamformer, improving over baseline algorithms in terms of speech distortion, quality, and intelligibility. National Research Foundation (NRF) Accepted version This work was supported within the STE-NTU Corporate Lab with funding support from ST Engineering and the National Research Foundation (NRF) Singapore under the Corp Lab@University Scheme (Ref. MRP14) at Nanyang Technological University, Singapore. 2021-02-04T06:18:41Z 2021-02-04T06:18:41Z 2020 Conference Paper Tan, Z.-W., Nguyen, A. H. T., Tran, L. T. T., & Khong, A. W. H. (2020).
A joint-loss approach for speech enhancement via single-channel neural network and MVDR beamformer. Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 841-849. https://hdl.handle.net/10356/146260 841 849 en MRP14 © 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. application/pdf
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Engineering
Neural Beamforming
Complex Spectral Mapping
spellingShingle Engineering
Neural Beamforming
Complex Spectral Mapping
Tan, Zhi-Wei
Nguyen, Anh Hai Trieu
Tran, Linh T. T.
Khong, Andy Wai Hoong
A joint-loss approach for speech enhancement via single-channel neural network and MVDR beamformer
description Recent developments in noise reduction involve the use of neural beamforming. While some success has been achieved, these algorithms rely solely on the gain of the beamformer to enhance noisy signals. We propose a two-stage framework in which the first-stage neural network aims to provide a good estimate of the signal and the noise to the second-stage beamformer. We also introduce an objective function that reduces the distortion of the speech component in each stage. This objective function improves the accuracy of the second-stage beamformer by enhancing the first-stage output and, in the second stage, improves the training of the network by propagating the gradient through the beamforming operation. A parameter is introduced to control the trade-off between optimizing these two stages. Simulation results on the CHiME-3 dataset at low SNR show that the proposed algorithm is able to exploit the enhancement gains from both the neural network and the beamformer, improving over baseline algorithms in terms of speech distortion, quality, and intelligibility.
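The description above pairs a neural estimator with a second-stage MVDR beamformer and a trade-off parameter between the two stage losses. As a minimal sketch only (not the authors' implementation; the classical per-frequency-bin MVDR solution and the weight `alpha` are assumptions based on the abstract), the two ingredients can be written as:

```python
import numpy as np

def mvdr_weights(noise_cov, steering):
    """Classical MVDR weights for one frequency bin:
    w = R_n^{-1} d / (d^H R_n^{-1} d),
    where R_n is the (Hermitian) noise covariance estimated in the
    first stage and d is the steering vector toward the target."""
    rn_inv_d = np.linalg.solve(noise_cov, steering)   # R_n^{-1} d
    return rn_inv_d / (steering.conj() @ rn_inv_d)    # normalize by d^H R_n^{-1} d

def joint_loss(loss_stage1, loss_stage2, alpha):
    """Weighted sum controlling the trade-off between optimizing the
    first-stage network output and the second-stage beamformer output."""
    return alpha * loss_stage1 + (1.0 - alpha) * loss_stage2

# Toy check on a 3-microphone example: MVDR is distortionless in the
# look direction, i.e. w^H d = 1.
M = 3
rng = np.random.default_rng(0)
A = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))
Rn = A @ A.conj().T + np.eye(M)        # Hermitian positive-definite noise covariance
d = np.ones(M, dtype=complex)          # steering vector (broadside array)
w = mvdr_weights(Rn, d)
print(np.isclose(w.conj() @ d, 1.0))   # distortionless constraint holds
```

In the paper's framework the gradient of `loss_stage2` would additionally be propagated through the beamforming operation back into the network; here the loss is shown only as the scalar combination the trade-off parameter controls.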
author2 School of Electrical and Electronic Engineering
author_facet School of Electrical and Electronic Engineering
Tan, Zhi-Wei
Nguyen, Anh Hai Trieu
Tran, Linh T. T.
Khong, Andy Wai Hoong
format Conference or Workshop Item
author Tan, Zhi-Wei
Nguyen, Anh Hai Trieu
Tran, Linh T. T.
Khong, Andy Wai Hoong
author_sort Tan, Zhi-Wei
title A joint-loss approach for speech enhancement via single-channel neural network and MVDR beamformer
title_short A joint-loss approach for speech enhancement via single-channel neural network and MVDR beamformer
title_full A joint-loss approach for speech enhancement via single-channel neural network and MVDR beamformer
title_fullStr A joint-loss approach for speech enhancement via single-channel neural network and MVDR beamformer
title_full_unstemmed A joint-loss approach for speech enhancement via single-channel neural network and MVDR beamformer
title_sort joint-loss approach for speech enhancement via single-channel neural network and mvdr beamformer
publishDate 2021
url https://hdl.handle.net/10356/146260
_version_ 1692012967984365568