Identifying the Most Effective Feature Category in Machine Learning-based Phishing Website Detection

Bibliographic Details
Main Authors: Tan, Choon Lin; Chiew, Kang Leng; Nadianatra, Musa; Dayang Hanani, Abang Ibrahim
Format: Article
Language: English
Published: Science Publishing Corporation 2018
Online Access: http://ir.unimas.my/id/eprint/25776/1/Identifying%20the%20Most%20Effective%20Feature%20Category%20in%20Machine%20Learning-based%20Phishing%20Website%20Detection%20%28abstract%29.pdf
http://ir.unimas.my/id/eprint/25776/
https://www.sciencepubco.com/index.php/ijet/article/view/23331
Institution: Universiti Malaysia Sarawak
Description
Summary: This paper proposes an improved approach to categorise phishing features into precise categories. Existing features are surveyed from current phishing detection works and grouped according to the improved categorisation approach. The performance of the resulting feature sets is evaluated using the C4.5 classifier, and the content URL obfuscation category is found to perform best, achieving an accuracy of 95.97%. Additional benchmarking compares the winning feature set against other feature sets used in existing phishing detection techniques. The results suggest that the winning feature set is an effective feature category that has contributed significantly to the performance of existing machine learning-based phishing detection systems.
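For readers unfamiliar with C4.5, the sketch below illustrates gain ratio, the criterion C4.5 uses to choose the attribute to split on. It is an illustration only and does not reproduce the paper's experiment: the record does not describe the implementation, and the feature names (`obfuscated_link`, `has_favicon`) and the eight toy samples are hypothetical.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in Counter(values).values())


def gain_ratio(feature_values, labels):
    """Gain ratio of one categorical feature, the split criterion used by C4.5."""
    total = len(labels)
    # Group the class labels by the feature value they co-occur with.
    partitions = {}
    for value, label in zip(feature_values, labels):
        partitions.setdefault(value, []).append(label)
    # Information gain: entropy before the split minus weighted entropy after it.
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    info_gain = entropy(labels) - remainder
    # Split information penalises features with many distinct values.
    split_info = entropy(feature_values)
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: 1 = phishing, 0 = legitimate.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
# Hypothetical binary features (assumptions, not taken from the paper).
obfuscated_link = [1, 1, 1, 0, 0, 0, 0, 0]
has_favicon = [1, 0, 1, 0, 1, 0, 1, 0]

scores = {
    "obfuscated_link": gain_ratio(obfuscated_link, labels),
    "has_favicon": gain_ratio(has_favicon, labels),
}
for name, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

On this toy data the feature that separates the classes more cleanly receives the higher gain ratio. Comparing whole feature categories by accuracy, as the summary describes, would additionally require training and testing a full decision tree on each category's features.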