Acoustic event detection with binarized neural network

Implementing deep learning for Acoustic Event Detection (AED) on embedded systems is challenging due to constraints on memory, computational resources, and power dissipation. Various solutions have been proposed to overcome these limitations; one of the most recent is the Binarized Neural Network (BNN), which has been shown to achieve approximately 32x memory savings and a 58x reduction in computational cost. XNOR-Net is a type of BNN that applies the XNOR operation to binarized inputs and weights, producing all intermediate outputs in binary form. In this project, an XNOR-Net model is constructed and trained for the AED task using urban sound (UrbanSound8K) and bird sound (Xeno-Canto) datasets. Prior to training, the datasets were pre-processed through audio segmentation to produce 1-second sound files. Each audio file is converted from the time domain to a Mel-spectrogram in the frequency domain, and thresholding is applied to convert each spectrogram into a binary image. The images are then resized to 32x32 pixels before being used for training. A performance comparison between BinaryNet and XNOR-Net in terms of the number of hidden layers was carried out, and an XNOR-Net structure with one binary convolutional layer was selected and constructed. The block structure and hyperparameters of the XNOR-Net were analyzed and optimized to achieve a training accuracy of 96.06% and a validation accuracy of 94.08%.
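The core idea the abstract attributes to XNOR-Net — replacing floating-point multiply-accumulates with bitwise XNOR and a popcount — can be sketched as follows. This is a minimal illustration of the general technique, not code from the thesis; all function names here are hypothetical. Values are binarized to +1/-1 by sign and packed into integer bit masks (bit set = +1).

```python
# Sketch of an XNOR-Net style dot product: binarize to +/-1, pack into
# bit masks, then count sign agreements with XNOR + popcount instead of
# performing floating-point multiply-accumulates.

def binarize(values):
    """Pack a list of real numbers into a bit mask via sign binarization."""
    mask = 0
    for i, v in enumerate(values):
        if v >= 0:          # sign(v) = +1 -> set bit i
            mask |= 1 << i
    return mask

def xnor_dot(a_mask, b_mask, n):
    """Dot product of two +/-1 vectors of length n, given as bit masks.

    XNOR marks the positions where the signs agree; each agreement
    contributes +1 and each disagreement -1, so dot = 2*matches - n.
    """
    xnor = ~(a_mask ^ b_mask) & ((1 << n) - 1)  # keep only the n valid bits
    matches = bin(xnor).count("1")              # popcount
    return 2 * matches - n

# Signs of a: +,-,+,-  and b: +,+,-,- agree at indices 0 and 3,
# so the +/-1 dot product is (+1) + (-1) + (-1) + (+1) = 0.
a = [0.5, -1.2, 3.0, -0.1]
b = [1.0, 2.0, -0.5, -4.0]
print(xnor_dot(binarize(a), binarize(b), 4))  # -> 0
```

In a real BNN, many such bits are packed per machine word, which is where the memory and compute savings quoted in the abstract come from.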
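The pre-processing the abstract describes — thresholding a spectrogram into a binary image and resizing it to 32x32 — can be sketched in a few lines. This is an assumption-laden illustration, not the thesis pipeline: the input is a plain 2-D list standing in for a real Mel-spectrogram, the threshold defaults to the global mean (the thesis does not state its threshold), and the resize is simple nearest-neighbour sampling.

```python
# Sketch: binarize a spectrogram-like 2-D array by thresholding, then
# resize it to 32x32 with nearest-neighbour sampling.

def threshold_binarize(spec, thresh=None):
    """Map each cell to 1/0; the default threshold is the global mean
    (an assumption -- the thesis does not specify its threshold)."""
    flat = [v for row in spec for v in row]
    if thresh is None:
        thresh = sum(flat) / len(flat)
    return [[1 if v >= thresh else 0 for v in row] for row in spec]

def resize_nearest(img, out_h=32, out_w=32):
    """Nearest-neighbour resize of a 2-D list to out_h x out_w pixels."""
    in_h, in_w = len(img), len(img[0])
    return [[img[r * in_h // out_h][c * in_w // out_w]
             for c in range(out_w)]
            for r in range(out_h)]

# Toy 4x4 "spectrogram" -> binary image -> 32x32 network input
spec = [[0.1, 0.9, 0.2, 0.8],
        [0.7, 0.3, 0.6, 0.4],
        [0.2, 0.8, 0.1, 0.9],
        [0.9, 0.1, 0.8, 0.2]]
image = resize_nearest(threshold_binarize(spec))
print(len(image), len(image[0]))  # -> 32 32
```

In practice the Mel-spectrogram itself would come from an audio library rather than a hand-written list, but the thresholding and reshaping steps are the ones specific to preparing binary inputs for a BNN.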


Bibliographic Details
Main Author: Wong, Kah Liang
Format: Thesis
Language: English
Published: 2020
Subjects: TK Electrical engineering. Electronics. Nuclear engineering
Online Access:http://eprints.utm.my/id/eprint/93005/1/WongKahLiangMSKE2020.pdf
http://eprints.utm.my/id/eprint/93005/
http://dms.library.utm.my:8080/vital/access/manager/Repository/vital:135894
Institution: Universiti Teknologi Malaysia
Citation: Wong, Kah Liang (2020) Acoustic event detection with binarized neural network. Masters thesis, Universiti Teknologi Malaysia, Faculty of Engineering - School of Electrical Engineering.