Phishing Detection With Identity Keywords and Target Domain Name

This thesis describes the research work carried out to address the problem of phishing detection and the weaknesses in existing anti-phishing methods. Phishing works by luring users to counterfeit websites, where highly confidential credentials are requested. To safeguard Internet users against phis...

Full description

Saved in:
Bibliographic Details
Main Author: Tan, Colin Choon Lin
Format: Thesis
Language:English
English
Published: unimas 2015
Subjects:
Online Access:http://ir.unimas.my/id/eprint/21452/1/Phishing%20Detection%20With%20Identity%20Keywords%2024pgs.pdf
http://ir.unimas.my/id/eprint/21452/8/Colin.pdf
http://ir.unimas.my/id/eprint/21452/
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Universiti Malaysia Sarawak
Language: English
English
Description
Summary:This thesis describes the research work carried out to address the problem of phishing detection and the weaknesses in existing anti-phishing methods. Phishing works by luring users to counterfeit websites, where highly confidential credentials are requested. To safeguard Internet users against phishing attacks, a hybrid anti-phishing method consisting of text-based, search engine-based and identity-based methods are proposed, where the differences between the target and actual identities of a webpage are exploited for classification. The proposed method can be divided into three phases. The first phase extracts identity keywords from the textual contents of the website, where a novel weighted URL tokens system based on the N-gram model is proposed. The second phase finds the target domain name by using a search engine, and the target domain name is selected based on identity-relevant features. In the final phase, a 3-tier identity matching system exploits indirect identity relationships to conclude the legitimacy of the query webpage. Experiments were conducted over 10,000 datasets, where true positive rate of 99.68% and true negative rate of 92.52% were achieved. Benchmarking results also suggest that the proposed method achieves comparable overall accuracy with three selected conventional methods. In summary, the proposed method has the key advantage of identifying phishing webpages accurately. This key advantage is highly desirable in anti-phishing applications.