Scalable malware clustering through coarse-grained behavior modeling

Anti-malware vendors receive several thousand new malware (malicious software) variants per day. Due to large volume of malware samples, it has become extremely important to group them based on their malicious characteristics. Grouping of malware variants that exhibit similar behavior helps to gener...

Full description

Saved in:
Bibliographic Details
Main Authors: CHANDRAMOHAN, Mahinthan, TAN, Hee Beng Kuan, SHAR, Lwin Khin
Format: text
Language:English
Published: Institutional Knowledge at Singapore Management University 2012
Subjects:
Online Access:https://ink.library.smu.edu.sg/sis_research/4782
https://ink.library.smu.edu.sg/context/sis_research/article/5785/viewcontent/Scalable_Malware_Clustering_through_Coarse_FSE12.pdf
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Singapore Management University
Language: English
Description
Summary:Anti-malware vendors receive several thousand new malware (malicious software) variants per day. Due to large volume of malware samples, it has become extremely important to group them based on their malicious characteristics. Grouping of malware variants that exhibit similar behavior helps to generate malware signatures more efficiently. Unfortunately, exponential growth of new malware variants and huge-dimensional feature space, as used in existing approaches, make the clustering task very challenging and difficult to scale. Furthermore, malware behavior modeling techniques proposed in the literature do not scale well, where malware feature space grows in proportion with the number of samples under examination. In this paper, we propose a scalable malware behavior modeling technique that models the interactions between malware and sensitive system resources in a coarse-grained manner. Coarsegrained behavior modeling enables us to generate malware feature space that does not grow in proportion with the number of samples under examination. A preliminary study shows that our approach generates 289 times less malware features and yet improves the average clustering accuracy by 6.20% in comparison to a state-of-the-art malware clustering technique.