MACHINE LEARNING: KNN AND CLUSTERING IMPLEMENTATION ON FRAUD DETECTION SYSTEM CASE
Financial technology has developed rapidly and been widely adopted in the Industry 4.0 era. This technology makes financial transactions and activities easier through several forms, including m-banking, Internet banking, and digital payment. The cause of the massive increase adoption of t...
Main Author: | Naufan Muharam, Athur |
---|---|
Format: | Final Project |
Language: | Indonesia |
Online Access: | https://digilib.itb.ac.id/gdl/view/53772 |
Institution: | Institut Teknologi Bandung |
id |
id-itb.:53772 |
---|---|
timestamp |
2021-03-10T09:37:59Z |
topic |
Machine learning, fraud detection system, fraud, nearest neighbors, KNN, KMeans, DBSCAN, OPTICS, semi-supervised learning |
institution |
Institut Teknologi Bandung |
building |
Institut Teknologi Bandung Library |
continent |
Asia |
country |
Indonesia |
content_provider |
Institut Teknologi Bandung |
collection |
Digital ITB |
language |
Indonesia |
description |
Financial technology has developed rapidly and been widely adopted in the Industry 4.0 era. This
technology makes financial transactions and activities easier through several forms, including
m-banking, Internet banking, and digital payment. The sharp increase in adoption can be traced to
several factors, including the widespread penetration of handheld devices, smartphones in
particular. With easier and more seamless access to financial transactions, people transact more
than before, so market transaction volume has grown considerably. This growth brings transaction
security risks, including fraud. This final project focuses on implementing several machine
learning combinations to build a fraud detection system that better prevents financial fraud. The
algorithms used and tested are KNN in combination with clustering (DBSCAN, KMeans, OPTICS). The
implementation follows the CRISP-DM methodology, which includes (i) defining business needs,
(ii) understanding the data, (iii) data preprocessing, (iv) modelling and optimization, and
(v) testing. In the data preprocessing phase, the imbalanced dataset is processed with
undersampling, followed by feature scaling. In the modelling and optimization phase, grid search
with k-fold cross-validation is used for KNN, and the elbow method is used for clustering. Testing
and evaluation use seven metrics: false positive rate, area under the curve, recall, precision,
accuracy, F1 score, and duration. The results show that, on the PaySim test data, the combination
of KNN with KMeans gives the best recall among the combinations tested. KNN with KMeans achieves
the following metrics: FPR 0.74%, area under the curve 96.64%, recall 88.45%, precision 26.46%,
accuracy 99.23%, F1 score 40.73%, and a duration of 17.9 seconds.
|
format |
Final Project |
author |
Naufan Muharam, Athur |
title |
MACHINE LEARNING: KNN AND CLUSTERING IMPLEMENTATION ON FRAUD DETECTION SYSTEM CASE |
url |
https://digilib.itb.ac.id/gdl/view/53772 |
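
The metrics reported in the description are related to one another, and a short calculation can show how they fit together. The sketch below is not taken from the thesis; it only uses the standard definitions of precision, recall, false positive rate, accuracy, and F1 score. The function names (`f1_score`, `implied_prevalence`, `implied_accuracy`) are hypothetical helpers introduced here for illustration, and the inputs are the rounded percentages quoted in the record, so the outputs are approximate.

```python
# Consistency check on the metrics quoted for KNN with KMeans.
# Assumption: standard binary-classification definitions; helper names are
# illustrative and do not come from the thesis.

def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def implied_prevalence(precision: float, recall: float, fpr: float) -> float:
    """Share of positive (fraud) cases implied by precision, recall, and FPR.

    Derived from precision = R*p / (R*p + F*(1 - p)), solved for p,
    where R is recall, F is the false positive rate, p is the prevalence.
    """
    odds = precision * fpr / (recall * (1 - precision))
    return odds / (1 + odds)


def implied_accuracy(recall: float, fpr: float, prevalence: float) -> float:
    """Accuracy = true positives + true negatives, as shares of all cases."""
    return recall * prevalence + (1 - fpr) * (1 - prevalence)


# Values as reported in the record (converted from percentages).
precision, recall, fpr = 0.2646, 0.8845, 0.0074

f1 = f1_score(precision, recall)
prevalence = implied_prevalence(precision, recall, fpr)
accuracy = implied_accuracy(recall, fpr, prevalence)

print(f"F1 from precision and recall: {f1:.2%}")
print(f"Implied fraud share in test data: {prevalence:.2%}")
print(f"Accuracy implied by recall, FPR, prevalence: {accuracy:.2%}")
```

Under these assumptions the calculation reproduces the reported F1 score (40.73%) and accuracy (99.23%), and it suggests that fraud cases make up only a fraction of a percent of the test data. That imbalance is why precision can be low even when accuracy and the false positive rate look very good.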