An efficient dilated convolutional neural network for UAV noise reduction at low input SNR

Acoustic applications on a multi-rotor unmanned aerial vehicle (UAV) have been hindered by its low input signal-to-noise ratio (SNR). Such low SNR condition poses prominent challenges for beamforming algorithms, statistical methods, and existing mask-based deep learning algorithms. We propose the sm...

Full description

Saved in:
Bibliographic Details
Main Authors: Tan, Zhi-Wei, Nguyen, Anh Hai Trieu, Khong, Andy Wai Hoong
Other Authors: School of Electrical and Electronic Engineering
Format: Conference or Workshop Item
Language:English
Published: 2020
Subjects:
Online Access:https://hdl.handle.net/10356/141755
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Nanyang Technological University
Language: English
Description
Summary:Acoustic applications on a multi-rotor unmanned aerial vehicle (UAV) have been hindered by its low input signal-to-noise ratio (SNR). Such low SNR condition poses prominent challenges for beamforming algorithms, statistical methods, and existing mask-based deep learning algorithms. We propose the small model on low SNR (SMoLnet), a compact convolutional neural network (CNN) to suppress UAV noise in noisy speech signals recorded off a microphone array mounted on the UAV. The proposed SMoLnet employs a large analysis window to achieve high spectral resolution since the loud UAV noise exhibits a narrow-band harmonic pattern. In the proposed SMoLnet model, exponentially-increasing dilated convolution layers were adopted to capture the global relationship across the frequency dimension. Furthermore, we performed direct spectral mapping between noisy and clean complex spectrogram to cater to the low SNR scenario. Simulation results show that the proposed SMoLnet outperforms existing dilation-based models in terms of speech quality and objective speech intelligibility metrics for UAV noise reduction. In addition, the proposed SMoLnet requires fewer parameters and achieves lower latency than the compared models.