Surveillance of sound environment by machine learning

Bibliographic Details
Main Author: Yu, Xiang
Other Authors: Jiang Xudong
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2022
Subjects: Engineering::Electrical and electronic engineering
Online Access: https://hdl.handle.net/10356/163693
Institution: Nanyang Technological University
Language: English
id sg-ntu-dr.10356-163693
record_format dspace
spelling sg-ntu-dr.10356-163693 2023-07-07T19:18:35Z Surveillance of sound environment by machine learning Yu, Xiang Jiang Xudong School of Electrical and Electronic Engineering EXDJiang@ntu.edu.sg Engineering::Electrical and electronic engineering Environmental sound recognition and classification is an important topic in the field of sound event research. A computer can be used to mimic the hearing function of the human ear, recognizing transient sound signals and assigning them corresponding category labels. Environmental sounds carry a great deal of key information; acoustic scene classification and sound event detection are core technologies for computing and analyzing natural acoustic scenes, and they are essential to modern applications such as smart robots, airport noise monitoring, autonomous driving and intelligent public-security surveillance. At present, ambient sound recognition poses many challenges. On the one hand, unlike speech and music, ambient sound has complex and variable frequency-domain features and time-domain structures, especially in scenes with multiple sound events. In the frequency domain, a sound may show distinct peaks in its spectrum, as an impact sound does, or its energy may be spread across the whole spectrum, like wind or noise. In the time domain, a sound can be transient, continuous or intermittent. It is therefore important and challenging to design a sound recognition system that accounts for these varied characteristics of environmental sounds, and making the computer perceive and understand an acoustic scene as the human ear does remains a research hotspot in audio signal processing. On the other hand, open-source datasets of environmental sound events are very limited, so making good use of the limited data to obtain an accurate and effective model is also important. Using the spectrogram, a sound signal can be visualized and quantified through a time-frequency analysis of its magnitude spectrum in a 2D plane; however, spectral amplitudes alone are not sufficient for sound event classification. In this project, a method called the "Regularized 2D complex-log-Fourier transform", first proposed by Professor Jiang Xudong and Professor Ren Jianfeng, was introduced to address this problem by analyzing both the phase spectrum and the magnitude spectrum for sound event classification. Principal Component Analysis (PCA) was applied to remove redundant sound features from the samples, and finally the Mahalanobis Distance (MD) was computed for sound class identification. Bachelor of Engineering (Electrical and Electronic Engineering) 2022-12-14T01:24:38Z 2022-12-14T01:24:38Z 2022 Final Year Project (FYP) Yu, X. (2022). Surveillance of sound environment by machine learning. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/163693 https://hdl.handle.net/10356/163693 en P3012-211 application/pdf Nanyang Technological University
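The abstract states that the method analyzes both the magnitude spectrum and the phase spectrum of a 2D time-frequency representation. The exact "Regularized 2D complex-log-Fourier transform" is not specified in this record, so the following Python snippet is only a minimal, hypothetical sketch of that general idea: it computes a log-magnitude spectrogram with SciPy and takes the phase of its 2D Fourier transform. The function name, sampling rate and STFT parameters are illustrative assumptions, not details from the project.

# Illustrative sketch only: NOT the authors' "Regularized 2D complex-log-Fourier
# transform", whose formulation is not reproduced in this record. It shows how
# log-magnitude and phase information can be extracted from a 2D time-frequency plane.
import numpy as np
from scipy.signal import stft

def magnitude_phase_features(x, fs=16000, eps=1e-10):
    """Return the log-magnitude spectrogram of x and the phase of its 2D FFT."""
    _, _, Z = stft(x, fs=fs, nperseg=512, noverlap=256)  # complex time-frequency plane
    log_mag = np.log(np.abs(Z) + eps)                    # log-magnitude spectrogram
    spec2d = np.fft.fft2(log_mag)                        # 2D Fourier transform of that plane
    phase = np.angle(spec2d)                             # phase spectrum (what amplitude alone misses)
    return log_mag, phase

# Usage on a synthetic one-second tone with additive noise
t = np.linspace(0, 1, 16000, endpoint=False)
x = np.sin(2 * np.pi * 440 * t) + 0.1 * np.random.randn(t.size)
log_mag, phase = magnitude_phase_features(x)
print(log_mag.shape, phase.shape)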
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Engineering::Electrical and electronic engineering
spellingShingle Engineering::Electrical and electronic engineering
Yu, Xiang
Surveillance of sound environment by machine learning
description Environmental sound recognition and classification is an important topic in the field of sound event research. A computer can be used to mimic the hearing function of the human ear, recognizing transient sound signals and assigning them corresponding category labels. Environmental sounds carry a great deal of key information; acoustic scene classification and sound event detection are core technologies for computing and analyzing natural acoustic scenes, and they are essential to modern applications such as smart robots, airport noise monitoring, autonomous driving and intelligent public-security surveillance. At present, ambient sound recognition poses many challenges. On the one hand, unlike speech and music, ambient sound has complex and variable frequency-domain features and time-domain structures, especially in scenes with multiple sound events. In the frequency domain, a sound may show distinct peaks in its spectrum, as an impact sound does, or its energy may be spread across the whole spectrum, like wind or noise. In the time domain, a sound can be transient, continuous or intermittent. It is therefore important and challenging to design a sound recognition system that accounts for these varied characteristics of environmental sounds, and making the computer perceive and understand an acoustic scene as the human ear does remains a research hotspot in audio signal processing. On the other hand, open-source datasets of environmental sound events are very limited, so making good use of the limited data to obtain an accurate and effective model is also important. Using the spectrogram, a sound signal can be visualized and quantified through a time-frequency analysis of its magnitude spectrum in a 2D plane; however, spectral amplitudes alone are not sufficient for sound event classification. In this project, a method called the "Regularized 2D complex-log-Fourier transform", first proposed by Professor Jiang Xudong and Professor Ren Jianfeng, was introduced to address this problem by analyzing both the phase spectrum and the magnitude spectrum for sound event classification. Principal Component Analysis (PCA) was applied to remove redundant sound features from the samples, and finally the Mahalanobis Distance (MD) was computed for sound class identification.
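The description also mentions Principal Component Analysis for discarding redundant features and the Mahalanobis distance for assigning a sound class, without giving implementation details. Below is a minimal sketch, assuming generic feature vectors and scikit-learn's PCA, of how such a reduce-then-classify stage could look; every parameter (number of components, regularization constant) is an assumption rather than a value taken from the project.

# Minimal sketch of PCA reduction followed by nearest-class Mahalanobis-distance
# assignment. The feature vectors, class labels and all parameter values here are
# assumptions for illustration, not details taken from the project.
import numpy as np
from sklearn.decomposition import PCA

def fit_classes(X_train, y_train, n_components=20):
    """Fit PCA, then the per-class mean and inverse covariance in the reduced space."""
    pca = PCA(n_components=n_components).fit(X_train)
    Z = pca.transform(X_train)
    stats = {}
    for c in np.unique(y_train):
        Zc = Z[y_train == c]
        mean = Zc.mean(axis=0)
        cov = np.cov(Zc, rowvar=False) + 1e-6 * np.eye(Z.shape[1])  # small ridge keeps cov invertible
        stats[c] = (mean, np.linalg.inv(cov))
    return pca, stats

def predict(pca, stats, X_test):
    """Assign each test sample to the class with the smallest Mahalanobis distance."""
    Z = pca.transform(X_test)
    labels = []
    for z in Z:
        d2 = {c: float((z - m) @ icov @ (z - m)) for c, (m, icov) in stats.items()}
        labels.append(min(d2, key=d2.get))
    return np.array(labels)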
author2 Jiang Xudong
author_facet Jiang Xudong
Yu, Xiang
format Final Year Project
author Yu, Xiang
author_sort Yu, Xiang
title Surveillance of sound environment by machine learning
title_short Surveillance of sound environment by machine learning
title_full Surveillance of sound environment by machine learning
title_fullStr Surveillance of sound environment by machine learning
title_full_unstemmed Surveillance of sound environment by machine learning
title_sort surveillance of sound environment by machine learning
publisher Nanyang Technological University
publishDate 2022
url https://hdl.handle.net/10356/163693
_version_ 1772828294462308352