Identifying the Most Effective Feature Category in Machine Learning-based Phishing Website Detection

This paper proposes an improved approach to categorise phishing features into precise categories. Existing features are surveyed from the current phishing detection works and grouped according to the improved categorisation approach. The performances of various feature sets are evaluated using the C...

Bibliographic Details
Main Authors: Tan, Choon Lin; Chiew, Kang Leng; Nadianatra, Musa; Dayang Hanani, Abang Ibrahim
Format: Article
Language: English
Published: Science Publishing Corporation 2018
Subjects: T Technology (General)
Online Access: http://ir.unimas.my/id/eprint/25776/1/Identifying%20the%20Most%20Effective%20Feature%20Category%20in%20Machine%20Learning-based%20Phishing%20Website%20Detection%20%28abstract%29.pdf
http://ir.unimas.my/id/eprint/25776/
https://www.sciencepubco.com/index.php/ijet/article/view/23331
Institution: Universiti Malaysia Sarawak
id my.unimas.ir.25776
record_format eprints
spelling my.unimas.ir.25776 2023-03-29T03:11:28Z
Article, PeerReviewed, text, en
Tan, Choon Lin and Chiew, Kang Leng and Nadianatra, Musa and Dayang Hanani, Abang Ibrahim (2018) Identifying the Most Effective Feature Category in Machine Learning-based Phishing Website Detection. International Journal of Engineering & Technology, 7 (4.31). pp. 1-6. ISSN 2227-524X https://www.sciencepubco.com/index.php/ijet/article/view/23331
institution Universiti Malaysia Sarawak
building Centre for Academic Information Services (CAIS)
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Malaysia Sarawak
content_source UNIMAS Institutional Repository
url_provider http://ir.unimas.my/
language English
topic T Technology (General)
description This paper proposes an improved approach to categorise phishing features into precise categories. Existing features are surveyed from the current phishing detection works and grouped according to the improved categorisation approach. The performances of various feature sets are evaluated using the C4.5 classifier, whereby the content URL obfuscation category is found to perform the best, achieving an accuracy of 95.97%. Additional benchmarking is conducted to compare the performance of the winning feature set against other feature sets utilised in existing phishing detection techniques. Results suggest that the winning feature set is indeed an effective feature category which has contributed significantly to the performance of existing machine learning-based phishing detection systems.
format Article
author Tan, Choon Lin
Chiew, Kang Leng
Nadianatra, Musa
Dayang Hanani, Abang Ibrahim
title Identifying the Most Effective Feature Category in Machine Learning-based Phishing Website Detection
publisher Science Publishing Corporation
publishDate 2018
_version_ 1761675229608804352