Recognizing sound-event by machine learning
Environmental sounds provide important context to events. Environmental sound recognition is made possible by developments in computing and statistics. One chief method of analyzing sound events is via the spectrogram. Multiple feature extraction techniques exist; however, not all of them are suit...
Main Author: | Athaariq Ramadino |
---|---|
Other Authors: | Jiang Xudong |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2020 |
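Subjects: | Engineering::Electrical and electronic engineering::Electronic systems::Signal processing; Engineering::Computer science and engineering::Computing methodologies::Pattern recognition |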
Online Access: | https://hdl.handle.net/10356/136924 |
Institution: | Nanyang Technological University |
id | sg-ntu-dr.10356-136924 |
---|---|
record_format | dspace |
spelling | sg-ntu-dr.10356-136924; 2023-07-07T18:08:18Z; Recognizing sound-event by machine learning; Athaariq Ramadino; Jiang Xudong; School of Electrical and Electronic Engineering; exdjiang@ntu.edu.sg; Engineering::Electrical and electronic engineering::Electronic systems::Signal processing; Engineering::Computer science and engineering::Computing methodologies::Pattern recognition; Bachelor of Engineering (Electrical and Electronic Engineering); 2020-02-05T08:31:16Z; 2020-02-05T08:31:16Z; 2019; Final Year Project (FYP); https://hdl.handle.net/10356/136924; en; A3302-182; application/pdf; Nanyang Technological University |
institution | Nanyang Technological University |
building | NTU Library |
continent | Asia |
country | Singapore |
content_provider | NTU Library |
collection | DR-NTU |
language | English |
topic | Engineering::Electrical and electronic engineering::Electronic systems::Signal processing; Engineering::Computer science and engineering::Computing methodologies::Pattern recognition |
spellingShingle | Engineering::Electrical and electronic engineering::Electronic systems::Signal processing; Engineering::Computer science and engineering::Computing methodologies::Pattern recognition; Athaariq Ramadino; Recognizing sound-event by machine learning |
description | Environmental sounds provide important context to events. Environmental sound recognition is made possible by developments in computing and statistics. One chief method of analyzing sound events is via the spectrogram. Multiple feature extraction techniques exist; however, not all of them are suitable for environmental sound recognition. In this paper, a new technique, termed the “2D complex-log spectrum”, is used. From the spectrogram, a second FFT is taken along the time dimension. The result is then regularized in order to maximize discriminating features. The technique is applied to the RWCP and NTU-SEC databases and compared with other feature extraction techniques, achieving >95% recognition in the best-case scenario. |
author2 | Jiang Xudong |
author_facet | Jiang Xudong; Athaariq Ramadino |
format | Final Year Project |
author | Athaariq Ramadino |
author_sort | Athaariq Ramadino |
title | Recognizing sound-event by machine learning |
title_short | Recognizing sound-event by machine learning |
title_full | Recognizing sound-event by machine learning |
title_fullStr | Recognizing sound-event by machine learning |
title_full_unstemmed | Recognizing sound-event by machine learning |
title_sort | recognizing sound-event by machine learning |
publisher | Nanyang Technological University |
publishDate | 2020 |
url | https://hdl.handle.net/10356/136924 |
_version_ | 1772827185277566976 |
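
The description above outlines a two-stage transform: build a spectrogram, then take a second FFT along the time dimension. A minimal NumPy sketch of that idea is given here for illustration only. This record does not specify the report's actual parameters, its exact "complex-log" definition, or its regularization step, so the function names (`stft_spectrogram`, `two_d_log_spectrum`), the Hann window, the frame length and hop size, and the use of a plain log-magnitude are all assumptions, and the regularization stage is omitted.

```python
import numpy as np


def stft_spectrogram(signal, frame_len=256, hop=128):
    """Complex short-time Fourier transform, shape (n_frames, n_freq_bins).

    frame_len and hop are illustrative values, not taken from the report.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    # First FFT: along the sample axis of each frame (frequency dimension).
    return np.fft.rfft(frames, axis=1)


def two_d_log_spectrum(signal, frame_len=256, hop=128, eps=1e-8):
    """Log-compress the spectrogram, then apply a second FFT over time.

    Assumption: a log-magnitude is used as a stand-in for the report's
    "complex-log" step, whose definition is not given in this record.
    """
    spec = stft_spectrogram(signal, frame_len, hop)
    log_spec = np.log(np.abs(spec) + eps)  # eps avoids log(0)
    # Second FFT: along the frame axis (time dimension), one per frequency bin.
    return np.fft.fft(log_spec, axis=0)


if __name__ == "__main__":
    t = np.arange(8000) / 8000.0           # 1 s at a hypothetical 8 kHz rate
    tone = np.sin(2 * np.pi * 500.0 * t)   # synthetic test tone, not real data
    features = two_d_log_spectrum(tone)
    print(features.shape)
```

In this sketch, row 0 of the output holds the time-averaged log spectrum and higher rows capture how each frequency bin fluctuates over time, which is the kind of temporal structure a second FFT along the time axis exposes.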