DESIGN AND DEVELOPMENT OF PHONE-BASED SCAMMER DETECTION SYSTEM USING RANDOM FOREST CLASSIFIER ON MFCC AND FORMANTS AUDIO FEATURES
Main Author: | Medrina Rahman, Yanisa |
---|---|
Format: | Theses |
Language: | Indonesian |
Online Access: | https://digilib.itb.ac.id/gdl/view/81154 |
Institution: | Institut Teknologi Bandung |
id |
id-itb.:81154 |
---|---|
spelling |
id-itb.:81154; 2024-04-29T08:27:22Z; Medrina Rahman, Yanisa; Theses; keywords: phone scam, Random Forest, speaker classification, audio feature, MFCC, Formants; INSTITUT TEKNOLOGI BANDUNG |
institution |
Institut Teknologi Bandung |
building |
Institut Teknologi Bandung Library |
continent |
Asia |
country |
Indonesia |
content_provider |
Institut Teknologi Bandung |
collection |
Digital ITB |
language |
Indonesian |
description |
The advance of information and communication technology must be accompanied by
improvement in humanity. Technology is abused everywhere, and the victims are
none other than humans. Although security systems keep improving, humans remain
a major vulnerability that is open to social engineering attacks. One form of
social engineering attack is the phone-based scam, which occurs frequently in
Indonesia. Numerous studies have addressed phone scams with various methods,
but gaps remain that call for further research. This research designs and
develops a phone-based scammer detection system that uses a machine learning
classification algorithm to recognize speakers in a phone call conversation.
To determine which classifier to use, an experiment was conducted on four
classifiers: Support Vector Machine (SVM), Gaussian Naive Bayes (GNB), Random
Forest (RF), and Linear Discriminant Analysis (LDA). Of the four, RF achieved
the highest performance in classifying speakers based on Mel-Frequency Cepstral
Coefficient (MFCC) and formant audio features, with 89.45% accuracy, 91.47%
precision, 89.45% recall, and an F1 score of 88.55%. The system designed and
developed in this research runs in real time, with a success rate of 92.73% in
classifying speakers from single-speaker audio recordings and 87.27% on dialog
recordings that contain multiple speakers. The system also distinguished
between existing and new speakers using an SVM on prediction probabilities,
with a 100% success rate. Because it uses audio features to recognize speakers,
the proposed system is not bound to phone numbers, which anyone can easily
change. |
format |
Theses |
author |
Medrina Rahman, Yanisa |
title |
DESIGN AND DEVELOPMENT OF PHONE-BASED SCAMMER DETECTION SYSTEM USING RANDOM FOREST CLASSIFIER ON MFCC AND FORMANTS AUDIO FEATURES |
url |
https://digilib.itb.ac.id/gdl/view/81154 |
_version_ |
1822997161033334784 |
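
The description reports accuracy, precision, recall, and F1 for the speaker classifier, with accuracy and recall both at 89.45%. The record does not say how the per-speaker scores were averaged; identical accuracy and recall are consistent with support-weighted averaging, so the sketch below assumes that scheme. The function name `weighted_metrics` and the labels are hypothetical and are not data from the thesis.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return accuracy and support-weighted precision, recall, and F1.

    Assumption: per-speaker scores are averaged with weights equal to
    each speaker's share of the true labels (support weighting).
    """
    n = len(y_true)
    support = Counter(y_true)      # true samples per speaker
    predicted = Counter(y_pred)    # predictions per speaker
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    accuracy = sum(correct.values()) / n
    precision = recall = f1 = 0.0
    for label, count in support.items():
        # Per-speaker precision, recall, and F1 (0.0 when undefined).
        p = correct[label] / predicted[label] if predicted[label] else 0.0
        r = correct[label] / count
        f = 2 * p * r / (p + r) if p + r else 0.0
        weight = count / n
        precision += weight * p
        recall += weight * r
        f1 += weight * f
    return accuracy, precision, recall, f1


# Hypothetical labels for three speakers (illustration only).
y_true = ["a", "a", "a", "a", "b", "b", "b", "c", "c", "c"]
y_pred = ["a", "a", "a", "b", "b", "b", "b", "c", "c", "a"]
accuracy, precision, recall, f1 = weighted_metrics(y_true, y_pred)
print(f"accuracy={accuracy:.4f} precision={precision:.4f} "
      f"recall={recall:.4f} f1={f1:.4f}")
```

Under support weighting, recall always equals accuracy, because each speaker's recall is multiplied by that speaker's share of the samples and the products sum to the overall fraction of correct predictions. Precision and F1 can still differ from accuracy, as they do in the reported results.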