Deep learning speech enhancement in satellite radio communication

Bibliographic Details
Main Author: Low, Yuki Yu Jun
Other Authors: Arokiaswami Alphones
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2022
Online Access: https://hdl.handle.net/10356/157874
Institution: Nanyang Technological University
Description
Summary: Clarity and intelligibility are critical aspects of speech. Deep learning models for speech enhancement use different algorithms to improve speech quality significantly before the signal reaches the listener. Machine learning knowledge is crucial in generating models to predict the outcome of the speech enhancement model. In this report, we study the sources of noise in audio when air pilot controllers communicate, and the methods used for speech enhancement, mainly Wave-U-Net and a hybrid Recurrent Neural Network (RNN)-based model. Wave-U-Net, a modification of U-Net, is a multi-scale neural network that provides end-to-end audio source separation. It repeatedly resamples feature maps to calculate and integrate features at different time scales [1]. The RNN-based model combines deep learning with the basics of audio signal processing. Our experiment shows that the proposed Wave-U-Net method improves audio quality consistently when compared with the hybrid RNN-based model, as measured by the PESQ (Perceptual Evaluation of Speech Quality) metric, a test methodology that automatically assesses speech quality.
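
The repeated resampling of feature maps mentioned in the summary can be illustrated with a minimal sketch. It is not the report's implementation: the function names are hypothetical, and the learned convolutions of a real Wave-U-Net are replaced here by a plain average (an assumption made only to keep the example self-contained). It shows the general idea of halving the time resolution by decimation, restoring it by linear interpolation, and merging each coarse scale with the finer one through a skip connection.

```python
def decimate(x):
    """Halve the time resolution by keeping every second sample."""
    return x[::2]


def upsample_linear(x):
    """Restore resolution by inserting the midpoint between neighbouring samples."""
    out = []
    for a, b in zip(x, x[1:]):
        out.append(a)
        out.append((a + b) / 2)
    out.append(x[-1])
    return out  # length is 2 * len(x) - 1


def multiscale_features(signal, levels):
    """Return the signal at successively coarser time scales (finest first)."""
    scales = [list(signal)]
    for _ in range(levels):
        scales.append(decimate(scales[-1]))
    return scales


def merge_scales(scales):
    """Walk back up from the coarsest scale, merging with each finer scale."""
    current = scales[-1]
    for skip in reversed(scales[:-1]):
        up = upsample_linear(current)
        n = min(len(up), len(skip))  # crop to the common length
        # Assumption: a real network would concatenate channels and apply a
        # learned convolution; an average stands in for that step here.
        current = [(u + s) / 2 for u, s in zip(up[:n], skip[:n])]
    return current


if __name__ == "__main__":
    ramp = [0, 1, 2, 3, 4, 5, 6, 7, 8]
    scales = multiscale_features(ramp, levels=2)
    print([len(s) for s in scales])
    print(merge_scales(scales))
```

A linear ramp is used as input because decimation followed by linear interpolation reproduces it exactly, which makes the round trip through the coarser scales easy to check by hand.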