Study of dynamic malware clustering and classification

Bibliographic Details
Main Author: Malhotra, Dipanshu.
Other Authors: Chen Lihui
Format: Final Year Project
Language: English
Published: 2013
Online Access: http://hdl.handle.net/10356/54592
Institution: Nanyang Technological University
Description
Summary: Malware, or malicious software, is one of the major threats on the Internet today, and thousands of new malware samples are introduced every day. Antivirus vendors need to classify these samples as malicious and update their databases with signatures of potentially harmful malware. Machine learning, the study and creation of systems that learn from the data provided to them, can be used for malware classification, but the data must first be embedded into a feature vector space. This project aims to review the literature on malware analysis techniques, create a trivial data representation after text processing, and investigate how a machine learning approach, unsupervised feature learning, can be used to build a system that automatically learns from data and performs feature selection. A cross-validation tool was developed in this project to check the accuracy of the suggested unsupervised feature learning technique. The report also suggests a framework for malware analysis. It concludes with recommendations on malware analysis using unsupervised feature learning techniques and on the further work needed to turn this project into a successful malware analysis tool.