A comparative evaluation of machine learning approaches in SMS spam detection
Spam detection is a significant problem that many researchers have addressed with a variety of strategies. In this study, performance is measured by classification accuracy together with false positives and false negatives. These metrics were evaluated by applying three...
Main Author: | Salehi, Saber |
---|---|
Format: | Thesis |
Language: | English |
Published: | 2011 |
Subjects: | HD Industries. Land use. Labor |
Online Access: | http://eprints.utm.my/id/eprint/32801/5/SaberSalehiMFSKSM2011.pdf http://eprints.utm.my/id/eprint/32801/ |
Institution: | Universiti Teknologi Malaysia |
id |
my.utm.32801 |
---|---|
record_format |
eprints |
spelling |
my.utm.32801 2018-05-27T07:54:55Z http://eprints.utm.my/id/eprint/32801/ A comparative evaluation of machine learning approaches in SMS spam detection Salehi, Saber HD Industries. Land use. Labor 2011-07 Thesis NonPeerReviewed application/pdf en http://eprints.utm.my/id/eprint/32801/5/SaberSalehiMFSKSM2011.pdf Salehi, Saber (2011) A comparative evaluation of machine learning approaches in SMS spam detection. Masters thesis, Universiti Teknologi Malaysia, Faculty of Computer Science and Information System. |
institution |
Universiti Teknologi Malaysia |
building |
UTM Library |
collection |
Institutional Repository |
continent |
Asia |
country |
Malaysia |
content_provider |
Universiti Teknologi Malaysia |
content_source |
UTM Institutional Repository |
url_provider |
http://eprints.utm.my/ |
language |
English |
topic |
HD Industries. Land use. Labor |
description |
Spam detection is a significant problem that many researchers have addressed with a variety of strategies. In this study, performance is measured by classification accuracy together with false positives and false negatives. These metrics were evaluated and compared for three supervised learning algorithms that classify SMS messages by content: a hybrid of the Simple Artificial Immune System (SAIS) and Particle Swarm Optimization (PSO), the Naive Bayes Classifier (NBC), and an Enhanced Genetic Algorithm (EGA). SAIS was hybridized with PSO to optimize its performance for spam filtering. PSO was used with mutation to reinforce the immune system's search for the best class among the exemplars used for classification, and the hybrid of SAIS and PSO improved the results. The proposed EGA searches for the best chromosomes, which are grouped by keywords; the chromosome with the highest fitness value is then selected as the classifier. Simulated annealing (SA) was used with classical mutation and crossover to reinforce the efficiency of the genetic search. The results indicate that the enhanced GA is markedly superior to a classical GA. The algorithms were trained and tested on a set of 4601 SMS messages, of which 1813 were spam and 2788 were non-spam. The proposed EGA technique outperformed both the hybrid of SAIS and PSO and the NBC technique: EGA achieved 99.87% accuracy, while NBC and the hybrid of SAIS and PSO achieved 97.457% and 88.33% accuracy, respectively. |
format |
Thesis |
author |
Salehi, Saber |
title |
A comparative evaluation of machine learning approaches in SMS spam detection |
publishDate |
2011 |
url |
http://eprints.utm.my/id/eprint/32801/5/SaberSalehiMFSKSM2011.pdf http://eprints.utm.my/id/eprint/32801/ |
_version_ |
1643649145020350464 |
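
The description in this record names accuracy, false positives, and false negatives as the performance measures. As a minimal sketch of how these quantities relate, the Python snippet below derives them from confusion-matrix counts. The class sizes (1813 spam, 2788 non-spam) come from the record; the `ConfusionCounts` type and the individual counts in `example` are hypothetical illustrations and are not results reported in the thesis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Confusion-matrix counts for a binary spam classifier (hypothetical helper)."""
    tp: int  # spam messages correctly flagged as spam
    fp: int  # non-spam messages wrongly flagged as spam
    tn: int  # non-spam messages correctly passed
    fn: int  # spam messages wrongly passed as non-spam

    @property
    def total(self) -> int:
        return self.tp + self.fp + self.tn + self.fn

    def accuracy(self) -> float:
        # Share of all messages that were classified correctly.
        return (self.tp + self.tn) / self.total

    def false_positive_rate(self) -> float:
        # Share of non-spam messages that were wrongly flagged.
        return self.fp / (self.fp + self.tn)

    def false_negative_rate(self) -> float:
        # Share of spam messages that slipped through.
        return self.fn / (self.fn + self.tp)


# Class sizes stated in the record's description.
SPAM = 1813
NON_SPAM = 2788

# Accuracy of a trivial classifier that labels every message non-spam.
majority_baseline = NON_SPAM / (SPAM + NON_SPAM)

# Hypothetical counts chosen only to match the class sizes above;
# they are NOT results from the thesis.
example = ConfusionCounts(tp=1800, fp=20, tn=2768, fn=13)

if __name__ == "__main__":
    print(f"majority baseline:   {majority_baseline:.4f}")
    print(f"accuracy:            {example.accuracy():.4f}")
    print(f"false positive rate: {example.false_positive_rate():.4f}")
    print(f"false negative rate: {example.false_negative_rate():.4f}")
```

Because about 61% of the messages are non-spam, the majority baseline gives a floor against which any reported accuracy can be read. Reporting false positive and false negative rates separately matters because the two error types carry different costs: a false positive hides a legitimate message, while a false negative lets spam through.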