Audio intelligent monitoring at the edge (AIME) for polyphonic sound sources

Urban sound monitoring remains an imperative effort to control and mitigate noise pollution, especially in dense urban areas. With advances in artificial intelligence (AI) and edge computing, intelligent machine listening systems for real-time noise monitoring have become a prominent focus of modern sound monitoring. Urban sound is characterized by a multitude of sources, including vehicular traffic, industrial activities, construction work, and human activities. The overlapping nature of these sources often creates complex polyphonic environments, in which multiple sounds occur simultaneously, posing challenges that limit traditional monitoring systems. In this project, we address the challenges of polyphonic urban sound environments with deep learning models for audio tagging and sound event detection. Model development focuses primarily on the SINGA:PURA dataset, a strongly labelled polyphonic urban sound dataset with spatiotemporal context recorded in Singapore. We explore transfer learning and pre-trained audio embeddings together with a Convolutional Recurrent Neural Network (CRNN) architecture to perform sound event detection and audio tagging, leveraging the strong and weak labels of this openly available dataset.
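
As a rough illustration of this kind of model (a minimal sketch only: the layer sizes, pooling scheme, and class count below are assumptions for the example, not the thesis's actual architecture), a CRNN for polyphonic sound event detection in PyTorch might look like:

```python
# Illustrative CRNN for polyphonic sound event detection: CNN blocks extract
# spectro-temporal features from a log-mel spectrogram, a bidirectional GRU
# models temporal context, and a per-frame sigmoid head outputs multi-label
# event activity (several events can be active in the same frame).
import torch
import torch.nn as nn

class CRNN(nn.Module):
    def __init__(self, n_mels: int = 64, n_classes: int = 14):
        # n_classes is dataset-specific; 14 is an assumed placeholder here.
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.BatchNorm2d(32), nn.ReLU(),
            nn.MaxPool2d((1, 4)),   # pool frequency only, keep time resolution
            nn.Conv2d(32, 64, 3, padding=1), nn.BatchNorm2d(64), nn.ReLU(),
            nn.MaxPool2d((1, 4)),
        )
        self.rnn = nn.GRU(64 * (n_mels // 16), 128,
                          bidirectional=True, batch_first=True)
        self.head = nn.Linear(256, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, 1, time, n_mels) log-mel spectrogram
        z = self.cnn(x)                        # (batch, ch, time, n_mels/16)
        z = z.permute(0, 2, 1, 3).flatten(2)   # (batch, time, ch * n_mels/16)
        z, _ = self.rnn(z)
        return torch.sigmoid(self.head(z))     # frame-wise event probabilities

model = CRNN()
frames = model(torch.randn(2, 1, 250, 64))     # e.g. 2 clips of 250 frames
clip_tags = frames.max(dim=1).values           # weak (clip-level) tags via pooling
```

The frame-wise outputs correspond to the strong-label case (event onsets and offsets), while pooling frames to clip level yields the weak-label audio-tagging case; in a transfer-learning setup the CNN front end could be replaced or complemented by a pre-trained audio embedding, as the abstract describes.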

We further enhance the robustness and accuracy of the system with an ensemble of models, combining the predictions of multiple specialized models. Additionally, we explore quantization techniques to improve efficiency and enable the deployment of our sound event detection models in resource-constrained environments.
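
These two ideas can be sketched briefly as well (assuming the hypothetical CRNN class from the sketch above; ensemble_predict is an illustrative helper, not a function from the thesis):

```python
# (1) Average the frame-wise probabilities of several specialized models.
# (2) Apply post-training dynamic quantization so the model runs with int8
#     weights, shrinking it for resource-constrained edge deployment.
import torch

def ensemble_predict(models, x: torch.Tensor) -> torch.Tensor:
    """Average the predicted probabilities of multiple trained models."""
    with torch.no_grad():
        outs = [m.eval()(x) for m in models]   # eval mode: freeze batch-norm stats
    return torch.stack(outs).mean(dim=0)

models = [CRNN(), CRNN(), CRNN()]              # stand-ins for trained specialists
probs = ensemble_predict(models, torch.randn(1, 1, 250, 64))

# Dynamic quantization of the Linear and GRU layers to int8.
quantized = torch.quantization.quantize_dynamic(
    CRNN(), {torch.nn.Linear, torch.nn.GRU}, dtype=torch.qint8
)
```

Dynamic quantization converts the Linear and GRU weights to int8 at load time, which reduces model size and speeds up CPU inference on edge hardware; the thesis may well use a different quantization scheme, so this is only one plausible instance.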

Bibliographic Details
Main Author: Lim, Victor
Other Authors: Gan Woon Seng
School: School of Electrical and Electronic Engineering
Format: Final Year Project (FYP)
Language: English
Published: Nanyang Technological University, 2024
Degree: Bachelor's degree
Subjects: Engineering; Audio; Machine learning; Sound event detection; Urban sound
Online Access:https://hdl.handle.net/10356/176730
Institution: Nanyang Technological University
Citation: Lim, V. (2024). Audio intelligent monitoring at the edge (AIME) for polyphonic sound sources. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/176730