DESIGN AND DEVELOPMENT OF PHONE-BASED SCAMMER DETECTION SYSTEM USING RANDOM FOREST CLASSIFIER ON MFCC AND FORMANTS AUDIO FEATURES

Bibliographic Details
Main Author: Medrina Rahman, Yanisa
Format: Theses
Language: Indonesian
Online Access: https://digilib.itb.ac.id/gdl/view/81154
Institution: Institut Teknologi Bandung
id id-itb.:81154
timestamp 2024-04-29T08:27:22Z
keywords phone scam, Random Forest, speaker classification, audio feature, MFCC, Formants
building Institut Teknologi Bandung Library
continent Asia
country Indonesia
content_provider Institut Teknologi Bandung
collection Digital ITB
description Advances in information and communication technology must be accompanied by human progress. Technology is abused everywhere, and the victims are none other than humans. Although security systems keep improving, humans remain a major vulnerability, open to social engineering attacks. One form of social engineering attack is the phone-based scam, which occurs frequently in Indonesia. Numerous studies have addressed phone scams with various methods, but gaps remain that call for further research. This research designs and develops a phone-based scammer detection system that uses a machine learning classification algorithm to recognize speakers in a phone conversation. To select the classifier, an experiment compared four candidates: Support Vector Machine (SVM), Gaussian Naive Bayes (GNB), Random Forest (RF), and Linear Discriminant Analysis (LDA). Of the four, RF performed best at classifying speakers from Mel-Frequency Cepstral Coefficient (MFCC) and formant audio features, with 89.45% accuracy, 91.47% precision, 89.45% recall, and an F1 score of 88.55%. The resulting system ran in real time, with a success rate of 92.73% on single-speaker audio recordings and 87.27% on dialog recordings containing multiple speakers. The system also distinguished existing speakers from new ones with a 100% success rate, using an SVM on the prediction probabilities. Because it recognizes speakers from audio features, the proposed system is not bound to phone numbers, which anyone can easily change.
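
The reported recall (89.45%) equals the reported accuracy (89.45%). This is what support-weighted averaging over speaker classes produces, because weighted recall reduces algebraically to accuracy. The record does not state which averaging the thesis used, so weighted averaging is an assumption here. The sketch below computes the four metrics that way in plain Python; the function name `weighted_metrics` and the three-speaker labels are hypothetical and are not data from the thesis.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return accuracy and support-weighted precision, recall and F1.

    Assumption: per-class scores are averaged with weights equal to each
    class's share of the true labels (the record does not confirm this).
    """
    n = len(y_true)
    support = Counter(y_true)  # number of true samples per speaker
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n
    precision = recall = f1 = 0.0
    for label, count in support.items():
        # True positives and total predictions for this speaker.
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)
        p_c = tp / predicted if predicted else 0.0
        r_c = tp / count
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        weight = count / n
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    return accuracy, precision, recall, f1


# Hypothetical labels for three speakers (illustration only).
y_true = ["A", "A", "A", "A", "B", "B", "B", "C", "C", "C"]
y_pred = ["A", "A", "A", "B", "B", "B", "B", "C", "C", "A"]

accuracy, precision, recall, f1 = weighted_metrics(y_true, y_pred)
print(f"accuracy={accuracy:.4f} precision={precision:.4f} "
      f"recall={recall:.4f} f1={f1:.4f}")
```

On these toy labels, accuracy and weighted recall both come to 0.80, weighted precision to 0.825, and weighted F1 to about 0.797. The ordering is the same as in the reported figures: precision above accuracy and recall, with F1 slightly below.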