Comparative Analysis of Combinations of Dimension Reduction and Data Mining Techniques for Malware Detection
Main Authors: | , , |
---|---|
Format: | text |
Published: | Archīum Ateneo, 2010 |
Subjects: | |
Online Access: | https://archium.ateneo.edu/discs-faculty-pubs/198 https://archium.ateneo.edu/cgi/viewcontent.cgi?article=1197&context=discs-faculty-pubs |
Institution: | Ateneo De Manila University |
Summary: | Many malware detectors utilize data mining techniques as primary tools for pattern recognition. As the number of new and evolving malware continues to rise, there is an increasing need for faster and more accurate detectors. However, for a given malware detector, detection speed and accuracy are usually inversely related. This study explores several configurations of classification combined with feature selection. An optimization function involving accuracy and processing time is used to evaluate each configuration. A real data set provided by Trend Micro Philippines is used for the study. Among 18 different configurations studied, it is shown that J4.8 without feature selection is best for cases where accuracy is extremely important. On the other hand, when time performance is more crucial, applying a Naïve Bayes classifier on a reduced data set (using Gain Ratio Attribute Evaluation to select the top 35 features only) gives the best results. |
---|
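
The summary mentions ranking features by gain ratio and keeping only the top-scoring ones before classification. As a rough illustration of that idea, the sketch below computes the standard gain ratio (information gain divided by the feature's own entropy) for discrete features and returns the indices of the `k` best. It is not the study's implementation: the function names `entropy`, `gain_ratio`, and `top_k_features` are hypothetical, the toy data is invented, and the study's data set and tooling are not reproduced here.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(values).values())


def gain_ratio(feature, labels):
    """Information gain of `feature` about `labels`, divided by the
    entropy of the feature itself (its split information)."""
    total = len(labels)
    # Group the class labels by the value the feature takes.
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    split_info = entropy(feature)
    if split_info == 0:
        # A constant feature carries no information; avoid dividing by zero.
        return 0.0
    return (entropy(labels) - conditional) / split_info


def top_k_features(rows, labels, k):
    """Return the column indices of the k features with the highest gain ratio."""
    n_features = len(rows[0])
    scores = [gain_ratio([row[j] for row in rows], labels)
              for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)[:k]


# Invented toy data: column 0 predicts the label exactly,
# column 1 is constant, column 2 is unrelated to the label.
rows = [(1, 0, 0), (1, 0, 1), (0, 0, 0), (0, 0, 1)]
labels = [1, 1, 0, 0]
best = top_k_features(rows, labels, k=1)
```

With the reduced column set in hand, a classifier would then be trained on only those columns; the summary reports that keeping 35 features worked best with Naïve Bayes when processing time mattered most.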