DEVELOPMENT OF A MODEL FOR DETECTION AND LOCALIZATION OF OVERLAPPING NON-SPEECH SOUNDS BY ADDING SOUND SEPARATION TECHNIQUES

Bibliographic Details
Main Author: Ranny
Format: Dissertations
Language: Indonesia
Online Access: https://digilib.itb.ac.id/gdl/view/81163
Institution: Institut Teknologi Bandung
Description
Summary: Sound recognition techniques are an established topic in Informatics Engineering research, and their results have been implemented in many applications. Open problems remain, however, and one of them is the recognition of overlapping sounds. Overlapping sounds are a recurring challenge in the literature because they often reduce the performance of a sound recognition system. This study therefore develops a technique to improve recognition accuracy on overlapping sounds. The overlapping sounds are first separated using Nonnegative Matrix Factorization (NMF) or Time-Frequency (T-F) Masking, and the separated sounds are then recognized using machine learning techniques, namely Support Vector Machine (SVM) and Artificial Neural Network (ANN). Implementation and testing use public data that has been augmented to increase the variety of overlapping sound types.

The combination of NMF and SVM was assessed using the SVM parameter 'C', which the study treats as an indicator of the degree of overfitting in the resulting classification model: the larger the 'C' value, the higher the level of overfitting. Across the classification models built, 'C' averaged 4, with a minimum of 0 and a maximum of 20. Recognition accuracy, computed as the percentage of positive instances out of all results on the dataset, averaged 83%. These measurements indicate that the NMF separation technique can be applied ahead of SVM classification. The combination of T-F masking and ANN was evaluated using F-score and error rate for both detection and localization. The average F-score was 71.1% for detection and 81.5% for localization, and the error rate was 0.41 for detection and 12.5 for localization. Testing with augmented overlapping sound data showed that the technique developed in this study improves performance.
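
For readers unfamiliar with the separation step, the following is a minimal sketch of how NMF can split a magnitude spectrogram into components and how soft time-frequency masks can be derived from them. It is not the author's implementation: the function names `nmf` and `soft_masks`, the toy 4-by-50 "spectrogram", the rank of 2, and the choice of multiplicative updates for the Euclidean cost are all assumptions made for illustration, and the dissertation pairs NMF with SVM and T-F masking with ANN rather than combining them as shown here.

```python
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Factor a nonnegative matrix V (freq x time) as W @ H using
    multiplicative updates for the Euclidean cost (illustrative choice)."""
    rng = np.random.default_rng(seed)
    n_freq, n_time = V.shape
    W = rng.random((n_freq, rank)) + eps   # spectral templates
    H = rng.random((rank, n_time)) + eps   # activations over time
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def soft_masks(W, H, eps=1e-9):
    """Build one soft time-frequency mask per NMF component.
    Each mask is that component's share of the reconstruction W @ H."""
    parts = np.stack([np.outer(W[:, k], H[k]) for k in range(W.shape[1])])
    return parts / (parts.sum(axis=0) + eps)

# Toy data (hypothetical): two overlapping sources with different spectra.
rng = np.random.default_rng(1)
true_W = np.array([[1.0, 0.0], [0.8, 0.1], [0.1, 0.9], [0.0, 1.0]])
true_H = rng.random((2, 50))
V = true_W @ true_H                        # mixed magnitude "spectrogram"

W, H = nmf(V, rank=2)
masks = soft_masks(W, H)
separated = masks * V                      # shape: (component, freq, time)

def rel_error(V, W, H):
    """Relative reconstruction error of the factorization."""
    return np.linalg.norm(V - W @ H) / np.linalg.norm(V)
```

In a full pipeline, features computed from each slice of `separated` would be passed to a classifier such as an SVM; that stage is omitted here because the record does not describe the features used.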