MOBILE MONEY FRAUD DETECTION USING K-NEAREST NEIGHBORS

Bibliographic Details
Main Author: Aristea Tantiono, William
Format: Final Project
Language: Indonesia
Online Access: https://digilib.itb.ac.id/gdl/view/38895
Institution: Institut Teknologi Bandung
Description
Summary: In this information era, online and mobile transactions become more common every day. As a result, more people exploit these services to commit fraud. In response, many institutions are looking for ways to detect it. One of the most widely used solutions is a system based on expert judgement, or a rule-based system, which works well in most cases. However, because the nature of fraud changes constantly, such a system may detect fraud late or miss it altogether. Meanwhile, the field of machine learning advances every day. One machine learning algorithm that may be useful for detecting fraud is the K-Nearest Neighbors (KNN) algorithm. KNN works by examining the 'K' nearest neighbors of a data point and determining whether it is fraudulent based on the characteristics of those neighbors. Three parameters of the algorithm are to be optimized: its 'K' value, distance metric, and weights. Besides being intuitive, the algorithm, as this paper shows, was successfully implemented by the author to detect fraud in mobile transactions. The author found that the best-performing configuration uses the following hyperparameters: k=4, the Manhattan distance metric (Minkowski distance with p=1), and uniform weights. With this configuration, the author achieved a specificity of 97.7% and a recall of 90.5% on the test data.
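
The configuration described in the summary (k=4, Minkowski distance with p=1, uniform weights) can be illustrated with a minimal pure-Python sketch. This is not the author's implementation: the function names, the two-feature toy data, and the rule that a 2-2 tie is classified as legitimate are all assumptions made for illustration. It also shows how recall and specificity are computed from predictions.

```python
def minkowski(a, b, p=1):
    """Minkowski distance between two points; p=1 is the Manhattan distance."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def knn_predict(train_X, train_y, query, k=4, p=1):
    """Uniform-weight k-NN vote. Labels: 1 = fraud, 0 = legitimate."""
    # Rank training points by distance to the query and keep the k closest.
    ranked = sorted(zip(train_X, train_y),
                    key=lambda pair: minkowski(pair[0], query, p))
    nearest = ranked[:k]
    fraud_votes = sum(label for _, label in nearest)
    # Assumption: with an even k, a tie is resolved as legitimate (0).
    return 1 if fraud_votes > k / 2 else 0


def recall_and_specificity(y_true, y_pred):
    """Recall = TP / (TP + FN); specificity = TN / (TN + FP).

    Assumes both classes occur in y_true, so neither denominator is zero.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, y in pairs if t == 1 and y == 1)
    fn = sum(1 for t, y in pairs if t == 1 and y == 0)
    tn = sum(1 for t, y in pairs if t == 0 and y == 0)
    fp = sum(1 for t, y in pairs if t == 0 and y == 1)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: two made-up numeric features per transaction.
train_X = [(1, 1), (1, 2), (2, 1), (2, 2), (1.5, 1.5),
           (8, 8), (8, 9), (9, 8), (9, 9), (8.5, 8.5)]
train_y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

test_X = [(1.2, 1.1), (2.1, 1.9), (8.7, 8.8), (9.1, 8.2)]
test_y = [0, 0, 1, 1]

predictions = [knn_predict(train_X, train_y, q) for q in test_X]
recall, specificity = recall_and_specificity(test_y, predictions)
print("recall:", recall, "specificity:", specificity)
```

Recall measures the share of fraudulent transactions that are caught, while specificity measures the share of legitimate transactions that are correctly left alone, which is why the summary reports both figures.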