DESIGN AND DEVELOPMENT OF PHONE-BASED SCAMMER DETECTION SYSTEM USING RANDOM FOREST CLASSIFIER ON MFCC AND FORMANTS AUDIO FEATURES
Main Author: | |
---|---|
Format: | Theses |
Language: | Indonesian |
Online Access: | https://digilib.itb.ac.id/gdl/view/81154 |
Institution: | Institut Teknologi Bandung |
Summary: | Advances in information and communication technology must be accompanied by
improvement in humanity. Technology is abused everywhere, and the victims are
people. Although security systems continue to improve, humans remain a major
vulnerability that is open to social engineering attacks. One form of social
engineering attack is the phone-based scam, which occurs frequently in
Indonesia. Numerous studies have applied various methods to deal with phone
scams, but gaps remain that call for further research. This research designs
and develops a phone-based scammer detection system that uses a machine
learning classification algorithm to recognize speakers in a phone call
conversation. To determine which classifier to use, an experiment was conducted on
four classifiers: Support Vector Machine (SVM), Gaussian Naive Bayes (GNB),
Random Forest (RF), and Linear Discriminant Analysis (LDA). Among the four,
RF achieved the highest performance in classifying speakers
based on Mel-Frequency Cepstral Coefficient (MFCC) and formant audio
features, with 89.45% accuracy, 91.47% precision, 89.45% recall, and an F1 score of 88.55%.
The system designed and developed in this research runs in real
time, with a success rate of 92.73% in classifying speakers from single-speaker
audio recordings and 87.27% on dialog recordings that contain multiple
speakers. The system also distinguished between existing and new
speakers using an SVM on prediction probability, with a 100% success rate. Because it
uses audio features to recognize speakers, the proposed system is not bound
to phone numbers, which anyone can easily change. |
---|
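
The four scores quoted in the summary (accuracy, precision, recall, F1) can be illustrated with a minimal sketch. It assumes the multi-class scores are averaged with per-speaker support weights; the record does not state which averaging was used, although the reported recall equalling the reported accuracy is consistent with that choice. The function name `weighted_metrics` and the speaker labels are hypothetical and are not taken from the thesis.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return accuracy and support-weighted precision, recall, and F1.

    Assumption: support-weighted averaging over speakers; the catalog
    record does not say which averaging the thesis actually used.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    support = Counter(y_true)      # true samples per speaker
    predicted = Counter(y_pred)    # predictions per speaker
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    total = len(y_true)

    precision = recall = f1 = 0.0
    for speaker, n in support.items():
        tp = correct[speaker]
        p = tp / predicted[speaker] if predicted[speaker] else 0.0
        r = tp / n
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = n / total         # share of samples owned by this speaker
        precision += weight * p
        recall += weight * r
        f1 += weight * f

    return {
        "accuracy": sum(correct.values()) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical labels for three enrolled speakers (illustration only).
y_true = ["A", "A", "A", "B", "B", "C"]
y_pred = ["A", "A", "B", "B", "B", "C"]
scores = weighted_metrics(y_true, y_pred)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

Under support weighting, recall always equals accuracy, because each speaker's recall is multiplied by that speaker's share of the samples. Precision and F1 can still differ from accuracy, as they do in the toy labels here.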