Distributed layer-3 e-mail classification for spam control

Bibliographic Details
Main Authors: Marsono, Muhammad N., El-Kharashi, M. Watheq, Gebali, Fayez, Ganti, Sudhakar
Format: Book Section
Published: IEEE Xplore 2007
Online Access: http://eprints.utm.my/id/eprint/17110/
http://dx.doi.org/10.1109/CCECE.2006.277810
Institution: Universiti Teknologi Malaysia
Description
Summary: This paper proposes a distributed layer-3 e-mail classification scheme for spam control. E-mail packets are classified in transit and tagged with an intra-packet spam score that indicates whether the packet belongs to a legitimate or a spam e-mail. During e-mail packet reassembly, the tags for an e-mail are aggregated to give an inter-packet spam score. The naive Bayes inference technique is used to evaluate the performance of the proposed approach against the full e-mail classification approach. Simulation results show that the proposed approach achieves spam precision (and confidence) comparable to that of the full e-mail classification approach. Spam recall increases from 63% to 85% depending on the maximum transmission unit size, approaching the 87% of full e-mail classification. For a 67% spam-to-legitimate ratio, the end servers' workload is reduced by 42% to 57% of the total e-mail traffic across all maximum transmission unit sizes tested. Thus, the proposed approach can complement existing anti-spam systems by pre-processing e-mail packets on upstream nodes. Layer-3 e-mail processing has lower processing complexity than layer-7 processing and is viable for high-throughput hardware-based implementations.
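The two-stage scoring described in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: the record does not say how tags are aggregated, so the sketch assumes each tag is a naive Bayes log-likelihood ratio and that aggregation is a plain sum plus the log prior odds. The class PacketNaiveBayes and the helpers split_into_packets and upstream_filtered_fraction are hypothetical names, and packets are split on word boundaries by token count rather than by MTU bytes. The last helper encodes one possible reading of the workload figures (spam share of traffic multiplied by spam recall), which is an assumption and not something the summary states.

```python
import math
from collections import Counter


class PacketNaiveBayes:
    """Toy multinomial naive Bayes with Laplace smoothing (illustrative only)."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.docs = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Add one training e-mail with label 'spam' or 'ham'."""
        self.counts[label].update(text.lower().split())
        self.docs[label] += 1

    def _token_llr(self, token):
        """log P(token | spam) - log P(token | ham), with add-one smoothing."""
        vocab = len(set(self.counts["spam"]) | set(self.counts["ham"]))

        def prob(label):
            total = sum(self.counts[label].values())
            return (self.counts[label][token] + 1) / (total + vocab)

        return math.log(prob("spam") / prob("ham"))

    def intra_packet_score(self, payload):
        """Per-packet tag: summed token log-likelihood ratios of one payload."""
        return sum(self._token_llr(t) for t in payload.lower().split())

    def inter_packet_score(self, tags):
        """ASSUMED aggregation: log prior odds plus the sum of packet tags."""
        prior = math.log(self.docs["spam"] / self.docs["ham"])
        return prior + sum(tags)


def split_into_packets(text, mtu_tokens):
    """Simplification: split on word boundaries, mtu_tokens words per packet."""
    tokens = text.split()
    return [" ".join(tokens[i:i + mtu_tokens])
            for i in range(0, len(tokens), mtu_tokens)]


def upstream_filtered_fraction(spam_fraction, spam_recall):
    """ASSUMED reading: share of total traffic caught before the end server."""
    return spam_fraction * spam_recall


nb = PacketNaiveBayes()
nb.train("cheap pills buy now", "spam")
nb.train("win money now free offer", "spam")
nb.train("meeting agenda for monday", "ham")
nb.train("please review the attached report", "ham")

message = "buy cheap pills now free offer"
packets = split_into_packets(message, mtu_tokens=2)
tags = [nb.intra_packet_score(p) for p in packets]   # tagged in transit
score = nb.inter_packet_score(tags)                  # aggregated at reassembly
is_spam = score > 0                                  # positive log odds => spam
```

Under these assumptions the summed tags equal the score of the whole message, because naive Bayes treats tokens as independent. With real byte-level fragmentation a token can straddle two packets, which is one plausible reason for recall to depend on the maximum transmission unit size. Under the assumed reading of the workload figures, upstream_filtered_fraction(0.67, 0.63) and upstream_filtered_fraction(0.67, 0.85) give roughly 0.42 and 0.57, in line with the 42% to 57% range quoted in the summary.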