Incremental adaptive spam mail filtering using naïve Bayesian classification

Bibliographic Details
Main Authors: Phimphaka Taninpong, Sudsanguan Ngamsuriyaroj
Other Authors: Mahidol University
Format: Conference or Workshop Item
Published: 2018
Online Access: https://repository.li.mahidol.ac.th/handle/123456789/27448
Institution: Mahidol University
Description
Summary: Most content-based spam filters are rule-based or trained off-line, so handling new spam tactics is difficult and prone to high misclassification rates. This paper proposes an incremental adaptive spam mail filter using naïve Bayesian classification, which offers good performance, simplicity and adaptability. We model an incremental scheme that receives a stream of emails and applies a sliding window, training on only the last w emails before testing new incoming messages. Subsequently, the new features of tested messages are added to the existing features so that the model adapts to future incoming emails. The proposed model is tested on two corpora: Trec05p-1 [11] and Trec06p [12]. The parameters are the window size and the number of features; the evaluation metrics are the processing time per message and the ham and spam misclassification rates. The experimental results show that the number of features has little impact, whereas the window size has significant effects on the misclassification rates and the processing time. In addition, the overall accuracy is even better than that obtained from batch off-line training, and the processing time is reduced significantly. © 2009 IEEE.
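
The sliding-window scheme described in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name `SlidingWindowNB`, the whitespace tokenizer, the multinomial model with Laplace smoothing, and the choice to use every token in the window as a feature (with no limit on the number of features) are all assumptions made for illustration only.

```python
import math
from collections import Counter, deque


class SlidingWindowNB:
    """Hypothetical naive Bayes filter trained on only the last w labelled emails."""

    def __init__(self, w):
        self.w = w
        self.window = deque()  # (label, tokens) pairs, oldest first
        self.word_counts = {"ham": Counter(), "spam": Counter()}
        self.doc_counts = Counter()

    @staticmethod
    def _tokens(text):
        # Assumed tokenizer: lowercase, split on whitespace.
        return text.lower().split()

    def learn(self, text, label):
        """Add one labelled email and evict the oldest once the window is full."""
        tokens = self._tokens(text)
        self.window.append((label, tokens))
        self.word_counts[label].update(tokens)
        self.doc_counts[label] += 1
        if len(self.window) > self.w:
            old_label, old_tokens = self.window.popleft()
            counts = self.word_counts[old_label]
            for token in old_tokens:
                counts[token] -= 1
                if counts[token] == 0:
                    del counts[token]  # keep the vocabulary limited to the window
            self.doc_counts[old_label] -= 1

    def classify(self, text):
        """Return 'ham' or 'spam' using Laplace-smoothed log probabilities."""
        vocab = set(self.word_counts["ham"]) | set(self.word_counts["spam"])
        total_docs = sum(self.doc_counts.values())
        tokens = self._tokens(text)
        best_label, best_score = None, -math.inf
        for label in ("ham", "spam"):
            score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            total_words = sum(self.word_counts[label].values())
            for token in tokens:
                count = self.word_counts[label][token]
                # The extra +1 in the denominator reserves mass for unseen tokens.
                score += math.log((count + 1) / (total_words + len(vocab) + 1))
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

In a streaming setting, each incoming message would first be passed to `classify` and then, once its true label is known, to `learn`, so that tokens from recent messages enter the model while those from messages older than the window drop out.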