On the effects of de-obfuscation on spam detection accuracy
Spam contributes to approximately two-thirds of the e-mail traffic over the Internet [4] and is fast becoming a major problem for IT users and network administrators. Spam costs billions in lost productivity [13] and results in more problems than mere annoyance of delayed and lost non-spam emai...
Main Authors: | M. E. Rafiq, A. Newaz; Marsono, Muhammad Nadzir; Gebali, Fayez |
---|---|
Format: | Book Section |
Published: | Penerbit UTM, 2007 |
Subjects: | TK Electrical engineering. Electronics Nuclear engineering |
Online Access: | http://eprints.utm.my/id/eprint/13680/ |
Institution: | Universiti Teknologi Malaysia |
id |
my.utm.13680 |
---|---|
record_format |
eprints |
spelling |
my.utm.13680 2017-10-08T01:13:15Z http://eprints.utm.my/id/eprint/13680/ Book Section, peer reviewed. M. E. Rafiq, A. Newaz and Marsono, Muhammad Nadzir and Gebali, Fayez (2007) On the effects of de-obfuscation on spam detection accuracy. In: Advances In Digital Signal Processing Applications. Penerbit UTM, Johor, pp. 159-172. ISBN 978-983-52-0652-8 |
institution |
Universiti Teknologi Malaysia |
building |
UTM Library |
collection |
Institutional Repository |
continent |
Asia |
country |
Malaysia |
content_provider |
Universiti Teknologi Malaysia |
content_source |
UTM Institutional Repository |
url_provider |
http://eprints.utm.my/ |
topic |
TK Electrical engineering. Electronics Nuclear engineering |
description |
Spam accounts for approximately two-thirds of e-mail traffic over the Internet [4] and is fast becoming a major problem for IT users and network administrators. Spam costs billions in lost productivity [13] and causes more problems than the mere annoyance of delayed and lost non-spam e-mails. Naive Bayes classification has been widely used for spam detection, and several variations have been proposed [19], [1], [5]. In e-mail content classification (as in other supervised-learning techniques), the accuracy of spam detection depends on the frequency of spam features observed during training. Spam continuously evolves to circumvent detection systems and is becoming much more sophisticated [6]. Spammers obfuscate well-known spam features in different ways to circumvent spam detection [12]. Obfuscating spam features (even by substituting a character with a visually similar one) reduces the frequency and size of features observed during learning. Hence, if obfuscated spam features can be de-obfuscated before detection, the accuracy of spam detection should increase. This claim is demonstrated in this chapter through experiments on real spam e-mails. |
format |
Book Section |
author |
M. E. Rafiq, A. Newaz; Marsono, Muhammad Nadzir; Gebali, Fayez |
author_sort |
M. E. Rafiq, A. Newaz |
title |
On the effects of de-obfuscation on spam detection accuracy |
publisher |
Penerbit UTM |
publishDate |
2007 |
_version_ |
1643646252625166336 |
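
The description above argues that character substitution scatters one spam feature across many rare variants, lowering the feature frequency a classifier sees during training. The following is a minimal Python sketch of that effect only. It is not the authors' method: the look-alike map `LOOKALIKES`, the helper names, and the sample messages are all hypothetical, and this record does not state which substitution rules or classifier settings the chapter uses.

```python
import re
from collections import Counter

# Hypothetical look-alike map (an assumption for illustration only).
LOOKALIKES = str.maketrans({
    "0": "o", "1": "l", "3": "e", "4": "a", "5": "s", "@": "a", "$": "s",
})


def deobfuscate_token(token):
    """Map visually similar characters back to letters.

    Tokens without any letter (e.g. "2007") are left unchanged so that
    ordinary numbers are not rewritten.
    """
    if not any(ch.isalpha() for ch in token):
        return token
    return token.translate(LOOKALIKES)


def feature_counts(messages, deobfuscate=False):
    """Count token frequencies over a list of messages."""
    counts = Counter()
    for message in messages:
        # Simple tokenizer: runs of letters, digits, '@' and '$'.
        for token in re.findall(r"[a-z0-9@$]+", message.lower()):
            if deobfuscate:
                token = deobfuscate_token(token)
            counts[token] += 1
    return counts


# Hypothetical training messages, each hiding the same two words.
messages = ["fr33 m0ney", "FREE money", "fr3e mon3y"]

raw = feature_counts(messages)
clean = feature_counts(messages, deobfuscate=True)
print(raw["free"], clean["free"])
print(raw["money"], clean["money"])
```

Without de-obfuscation each variant is counted as a separate, rare feature; with it, the variants collapse into one feature with a higher count, which is the quantity a frequency-based classifier such as Naive Bayes estimates its probabilities from. A fixed substitution map like this one will also rewrite some legitimate tokens, so it should be read as an illustration rather than a usable filter.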