Classification of phishing websites using machine learning techniques

Phishing detection is an important problem that many researchers have addressed with numerous advanced approaches. Current anti-phishing mechanisms, such as blacklist-based and heuristic-based anti-phishing, suffer from low detection accuracy and high false-alarm rates. There is need for ef...

Bibliographic Details
Main Author: Zamani, Hadi
Format: Thesis
Published: 2013
Subjects: TK Electrical engineering. Electronics Nuclear engineering
Online Access: http://eprints.utm.my/id/eprint/42243/
http://dms.library.utm.my:8080/vital/access/manager/Repository/vital:75244
Institution: Universiti Teknologi Malaysia
id my.utm.42243
record_format eprints
spelling my.utm.42243 2020-08-23T06:04:25Z http://eprints.utm.my/id/eprint/42243/ Classification of phishing websites using machine learning techniques Zamani, Hadi TK Electrical engineering. Electronics Nuclear engineering 2013 Thesis NonPeerReviewed Zamani, Hadi (2013) Classification of phishing websites using machine learning techniques. Masters thesis, Universiti Teknologi Malaysia, Faculty of Computing.
http://dms.library.utm.my:8080/vital/access/manager/Repository/vital:75244
institution Universiti Teknologi Malaysia
building UTM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Teknologi Malaysia
content_source UTM Institutional Repository
url_provider http://eprints.utm.my/
topic TK Electrical engineering. Electronics Nuclear engineering
spellingShingle TK Electrical engineering. Electronics Nuclear engineering
Zamani, Hadi
Classification of phishing websites using machine learning techniques
description Phishing detection is an important problem that many researchers have addressed with numerous advanced approaches. Current anti-phishing mechanisms, such as blacklist-based and heuristic-based anti-phishing, suffer from low detection accuracy and high false-alarm rates. There is a need for an efficient mechanism to protect users from phishing websites. The purpose of this study is to investigate the capability of six machine learning algorithms, namely Multi-Layer Perceptron (MLP), Support Vector Machine (SVM), Random Forest (RF), Decision Tree (DT), Logistic Regression (LR) and Naïve Bayes (NB), to classify phishing and non-phishing websites. The algorithms were trained with two different training groups in the WEKA environment and then tested in terms of accuracy, precision, TP rate, and FP rate on three datasets containing different proportions of phishing and non-phishing instances. The results showed that the Naïve Bayes classifier achieved the best detection accuracy among the classifiers for predicting phishing websites, while Multi-Layer Perceptron gave the worst detection accuracy. The results also showed that Support Vector Machine had the best FP rate among the classifiers. In addition, Random Forest, Decision Tree, and Naïve Bayes classified all phishing websites correctly as phishing; that is, the TP rate was 100% for these classifiers. In conclusion, this project suggests using NB as the best classifier for predicting phishing and non-phishing websites.
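The four evaluation measures named in the description (accuracy, precision, TP rate, FP rate) all follow from the counts of a binary confusion matrix. The sketch below is only an illustration of those standard definitions, not the thesis's WEKA workflow; the function name `binary_metrics`, the label encoding (1 = phishing, 0 = non-phishing) and the sample labels are assumptions made up for this example, not data from the study.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, TP rate and FP rate.

    Assumed encoding (hypothetical): 1 = phishing, 0 = non-phishing.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # phishing flagged as phishing
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # phishing missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # legitimate site flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # legitimate site passed

    def ratio(num: int, den: int) -> float:
        # Return 0.0 instead of dividing by zero when a class is absent.
        return num / den if den else 0.0

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "precision": ratio(tp, tp + fp),
        "tp_rate": ratio(tp, tp + fn),  # also called recall or sensitivity
        "fp_rate": ratio(fp, fp + tn),
    }


# Hypothetical labels: 3 phishing and 5 legitimate sites, one false alarm.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 0, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In this made-up sample the TP rate is 100% even though one legitimate site is misclassified, which is why a perfect TP rate is usually read together with the FP rate and precision.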
format Thesis
author Zamani, Hadi
author_facet Zamani, Hadi
author_sort Zamani, Hadi
title Classification of phishing websites using machine learning techniques
title_sort classification of phishing websites using machine learning techniques
publishDate 2013
url http://eprints.utm.my/id/eprint/42243/
http://dms.library.utm.my:8080/vital/access/manager/Repository/vital:75244
_version_ 1677781071817605120