Urban noise tagging and perceptually informed unsupervised clustering

Bibliographic Details
Main Author: Quek, Gordon
Other Authors: Gan Woon Seng
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2024
Online Access: https://hdl.handle.net/10356/176981
Institution: Nanyang Technological University
Description
Summary: Rapid urbanisation has increased noise pollution in urban environments, affecting the well-being and quality of life of city residents. To address this issue, researchers have focused on understanding and categorising urban noise so that its effects can be mitigated. Convolutional Neural Networks (CNNs) have emerged as a popular choice for audio classification tasks because they can learn structural patterns from audio data. Despite these efforts, there is little research on urban audio data from specific locations such as Singapore, whose acoustic features are shaped by local factors. This project aims to enhance the performance of CNNs for audio classification using the SINGA:PURA dataset, which consists of audio recordings from the Singapore environment. The initial machine learning audio classification code is adapted from Kenneth's DCASE Task1B code from 2020 and modified to suit the SINGA:PURA dataset. Experiments are conducted to vary model parameters, data preprocessing techniques, and CNN architectures. Results show that a learning rate of 0.001, a batch size of 60, and 2 to 4 convolutional layers with 160 mel filters yield the best performance among the configurations tested for classifying Singaporean acoustic environments. However, the model reaches a training accuracy of 80% but a testing accuracy of only 46%, which indicates overfitting and the need for further research. Recommendations for future work include addressing dataset imbalance through additional data augmentation techniques, refining data preprocessing methods, exploring a broader range of hyperparameters, and leveraging pre-trained CNN models such as VGG16, ResNet, or InceptionV3. These approaches could provide valuable insights into optimising model performance and improving classification accuracy for urban noise tagging and mitigation efforts.
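
To make the reported settings concrete, the following is a minimal sketch and not the project's actual code. It records the hyperparameters quoted in the summary, shows how 160 mel filter centre frequencies could be spaced, and computes the gap between training and testing accuracy. The class and function names, the 44.1 kHz sample rate, the 0 Hz lower edge, and the HTK-style mel formula are all assumptions made for illustration; the summary does not state which of these the project used.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ReportedConfig:
    """Values quoted in the summary; the field names are hypothetical."""
    learning_rate: float = 0.001
    batch_size: int = 60
    n_mels: int = 160
    min_conv_layers: int = 2
    max_conv_layers: int = 4


def hz_to_mel(hz: float) -> float:
    # HTK-style mel scale (an assumption; other variants exist).
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel: float) -> float:
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_centre_frequencies(n_mels: int, sample_rate: float, fmin: float = 0.0) -> list:
    """Centre frequencies (Hz) of n_mels filters spaced evenly on the mel scale.

    n_mels triangular filters need n_mels + 2 edge points between fmin and
    the Nyquist frequency; the centres are the n_mels interior points.
    """
    fmax = sample_rate / 2.0
    lo, hi = hz_to_mel(fmin), hz_to_mel(fmax)
    step = (hi - lo) / (n_mels + 1)
    return [mel_to_hz(lo + step * i) for i in range(1, n_mels + 1)]


def generalisation_gap(train_acc_pct: int, test_acc_pct: int) -> int:
    """Difference in percentage points between training and testing accuracy."""
    return train_acc_pct - test_acc_pct


cfg = ReportedConfig()
# 44100 Hz is an assumed sample rate, used only for illustration.
centres = mel_centre_frequencies(cfg.n_mels, sample_rate=44100)
gap = generalisation_gap(80, 46)  # accuracies reported in the summary
```

A large gap between training and testing accuracy, as computed by `generalisation_gap`, is the usual symptom of the overfitting that the summary reports.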