Deep learning speech enhancement in satellite radio communication
Main Author: | |
---|---|
Other Authors: | |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2022 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/157874 |
Institution: | Nanyang Technological University |
Summary: | Clarity and intelligibility are critical aspects of speech. Deep learning models for speech
enhancement use different algorithms to improve speech quality significantly before the signal
reaches the listener. Machine learning knowledge is crucial in generating models to predict the
outcome of the speech enhancement model. In this report, we study the sources of noise in
audio when air pilot controllers communicate and the methods used for speech enhancement,
mainly Wave-U-Net and a hybrid Recurrent Neural Network (RNN)-based model. Wave-U-Net, a
modification of U-Net, is a multi-scale neural network that provides end-to-end audio source
separation. It repeatedly resamples feature maps to compute and combine features at different
time scales [1]. The RNN-based model combines deep learning with the basics of audio signal
processing. Our experiment shows that, compared with the hybrid RNN-based model, the proposed
Wave-U-Net method consistently improves audio quality as measured by PESQ, a test methodology
that automatically assesses speech quality. |
---|