DEEP DENOISING BABBLE NOISE USING A COMBINATION OF CNN AND RNN

Babble noise, or chattering noise, is a type of audio noise that is difficult to handle. This study addresses it with a deep learning model that combines CNN-based and RNN-based architectures. This combination was chosen because...

Bibliographic Details
Main Author:	Nurul Hukmi, Imam
Format:	Final Project
Language:	Indonesian
Online Access:	https://digilib.itb.ac.id/gdl/view/76866
Institution:	Institut Teknologi Bandung

id	id-itb.:76866
timestamp	2023-08-19T09:44:17Z
topic	denoising, deep learning, CNN, RNN, combination of CNN and RNN
institution	Institut Teknologi Bandung
building	Institut Teknologi Bandung Library
continent	Asia
country	Indonesia
content_provider	Institut Teknologi Bandung
collection	Digital ITB
language	Indonesian
description	Babble noise, or chattering noise, is a type of audio noise that is difficult to handle. This study addresses it with a deep learning model that combines CNN-based and RNN-based architectures. The combination was chosen for the complementary strengths of each type: CNN-based architectures handle spatial data, generalize better, and run inference quickly, while RNN-based architectures specialize in ordered (sequential) data. The model is trained to identify the noise contained in a noisy signal represented as a spectrogram. To assess its performance, the combined model is compared with a CNN-based model and an RNN-based model. For the combined CNN-RNN model, which is the main model, experiments were carried out to find the best architecture configuration. The evaluation metric is PESQ (Perceptual Evaluation of Speech Quality), which is designed to resemble how humans judge signal quality and assesses the difference between the noisy signal and the clean reference signal. The experimental results show that the best configuration for the combined model uses a U-Net CNN architecture, the PReLU activation function, and GRU as the RNN layer type. After training, the CNN-RNN, CNN, and RNN models achieved PESQ values of 2.23, 1.95, and 1.58, respectively. Follow-up experiments show that the combined model produced better output than the other two models on data with SNR levels within its training range.
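
The abstract reports results on data whose SNR levels fall within the training range, but it does not say how the noisy inputs were prepared. As an illustration only, the sketch below shows one common way to mix a clean signal with a noise signal at a chosen SNR in decibels. The function names (`mix_at_snr`, `measure_snr_db`) and the synthetic signals are hypothetical and are not taken from the thesis.

```python
import math
import random


def power(signal):
    """Mean squared amplitude of a sequence of samples."""
    return sum(s * s for s in signal) / len(signal)


def mix_at_snr(clean, noise, snr_db):
    """Scale `noise` so the clean-to-noise power ratio equals `snr_db`, then add it.

    Illustrative assumption: both inputs are equal-length lists of float samples.
    """
    if len(clean) != len(noise):
        raise ValueError("clean and noise must have the same length")
    noise_power = power(noise)
    if noise_power == 0:
        raise ValueError("noise signal has zero power")
    # SNR_dB = 10 * log10(P_clean / P_noise)  =>  P_noise = P_clean / 10^(SNR_dB / 10)
    target_noise_power = power(clean) / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / noise_power)
    return [c + gain * n for c, n in zip(clean, noise)]


def measure_snr_db(clean, noisy):
    """SNR in dB of `noisy`, treating `noisy - clean` as the noise component."""
    residual = [y - c for c, y in zip(clean, noisy)]
    return 10 * math.log10(power(clean) / power(residual))


# Synthetic stand-ins: a 440 Hz tone as "speech", Gaussian samples as "babble".
rng = random.Random(0)
clean = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]
noisy = mix_at_snr(clean, noise, snr_db=5.0)
```

Measuring the result with `measure_snr_db(clean, noisy)` should recover the requested level up to floating-point rounding, which makes it a convenient sanity check when building test sets inside or outside a given SNR range.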

Similar Items