A comparative evaluation of machine learning approaches in SMS spam detection

Spam detection is a significant problem that many researchers have addressed with a variety of strategies. In this study, performance is measured by classification accuracy, together with false positives and false negatives. These metrics were evaluated by applying three...

Bibliographic Details
Main Author: Salehi, Saber
Format: Thesis
Language: English
Published: 2011
Subjects: HD Industries. Land use. Labor
Online Access: http://eprints.utm.my/id/eprint/32801/5/SaberSalehiMFSKSM2011.pdf
http://eprints.utm.my/id/eprint/32801/
Institution: Universiti Teknologi Malaysia
id my.utm.32801
record_format eprints
spelling my.utm.32801 2018-05-27T07:54:55Z http://eprints.utm.my/id/eprint/32801/ A comparative evaluation of machine learning approaches in SMS spam detection Salehi, Saber HD Industries. Land use. Labor
2011-07 Thesis NonPeerReviewed application/pdf en http://eprints.utm.my/id/eprint/32801/5/SaberSalehiMFSKSM2011.pdf Salehi, Saber (2011) A comparative evaluation of machine learning approaches in SMS spam detection. Masters thesis, Universiti Teknologi Malaysia, Faculty of Computer Science and Information System.
institution Universiti Teknologi Malaysia
building UTM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Teknologi Malaysia
content_source UTM Institutional Repository
url_provider http://eprints.utm.my/
language English
topic HD Industries. Land use. Labor
spellingShingle HD Industries. Land use. Labor
Salehi, Saber
A comparative evaluation of machine learning approaches in SMS spam detection
description Spam detection is a significant problem that many researchers have addressed with a variety of strategies. In this study, performance is measured by classification accuracy, together with false positives and false negatives. These metrics were evaluated and compared for three supervised learning algorithms that classify SMS messages by their content: a hybrid of the Simple Artificial Immune System (SAIS) and Particle Swarm Optimization (PSO), the Naive Bayes Classifier (NBC), and an Enhanced Genetic Algorithm (EGA). SAIS was hybridized with PSO to optimize its performance for spam filtering. PSO was used with mutation to reinforce the immune system's search for the best class in the exemplar for classification, and results improved with the hybrid of SAIS and PSO. The proposed EGA searches for the best chromosomes, which are grouped by keywords; the chromosome with the highest fitness value is then selected as the classifier. Simulated annealing (SA) was used with classical mutation and crossover to reinforce the efficiency of the genetic search. The results indicate that the enhanced GA is markedly superior to a classical GA. The algorithms were trained and tested on a set of 4601 SMS messages, of which 1813 were spam and 2788 were non-spam. The proposed EGA gave better results than the hybrid of SAIS and PSO and the NBC: the EGA achieved 99.87% accuracy, while the NBC and the hybrid of SAIS and PSO achieved 97.457% and 88.33% accuracy, respectively.
format Thesis
author Salehi, Saber
author_facet Salehi, Saber
author_sort Salehi, Saber
title A comparative evaluation of machine learning approaches in SMS spam detection
title_sort comparative evaluation of machine learning approaches in sms spam detection
publishDate 2011
url http://eprints.utm.my/id/eprint/32801/5/SaberSalehiMFSKSM2011.pdf
http://eprints.utm.my/id/eprint/32801/
_version_ 1643649145020350464
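
The description reports accuracy alongside false positives and false negatives but gives no formulas. The sketch below shows the standard definitions of these measures for a binary spam filter; it is an illustration, not code from the thesis. The class sizes (1813 spam, 2788 non-spam) and the accuracy figures come from the record. The `ConfusionCounts` class, the `implied_errors` helper, and the example split of errors into false positives and false negatives are hypothetical. The error counts also assume that accuracy was computed over all 4601 messages, which the record does not state.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Counts for a binary spam filter; 'positive' means 'flagged as spam'."""
    true_pos: int   # spam correctly flagged
    false_pos: int  # non-spam wrongly flagged as spam
    true_neg: int   # non-spam correctly passed through
    false_neg: int  # spam wrongly passed through

    @property
    def total(self) -> int:
        return self.true_pos + self.false_pos + self.true_neg + self.false_neg

    def accuracy(self) -> float:
        # Share of all messages that were classified correctly.
        return (self.true_pos + self.true_neg) / self.total

    def false_positive_rate(self) -> float:
        # Share of non-spam messages that were wrongly flagged.
        return self.false_pos / (self.false_pos + self.true_neg)

    def false_negative_rate(self) -> float:
        # Share of spam messages that were missed.
        return self.false_neg / (self.false_neg + self.true_pos)


def implied_errors(total_messages: int, accuracy_percent: float) -> int:
    """Approximate misclassification count implied by an accuracy figure.

    Assumes the accuracy was measured over all `total_messages`.
    """
    return round(total_messages * (1 - accuracy_percent / 100))


# Class sizes from the record; the FP/FN split below is hypothetical.
SPAM, NON_SPAM = 1813, 2788
example = ConfusionCounts(
    true_pos=SPAM - 4, false_pos=2, true_neg=NON_SPAM - 2, false_neg=4
)

if __name__ == "__main__":
    print(f"example accuracy: {example.accuracy():.4%}")
    print(f"example false positive rate: {example.false_positive_rate():.4%}")
    print(f"example false negative rate: {example.false_negative_rate():.4%}")
    for name, acc in [("EGA", 99.87), ("NBC", 97.457), ("Hybrid SAIS+PSO", 88.33)]:
        print(f"{name}: about {implied_errors(SPAM + NON_SPAM, acc)} errors")
```

Under the whole-set assumption, the reported accuracies would correspond to roughly 6, 117, and 537 misclassified messages for the EGA, the NBC, and the hybrid of SAIS and PSO, respectively. The record does not say how those errors divide between false positives and false negatives.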