Malware data collection and analysis

Bibliographic Details
Main Author: Low, Song Chuan
Other Authors: Chen Lihui
Format: Final Year Project
Language: English
Published: 2013
Online Access: http://hdl.handle.net/10356/54413
Institution: Nanyang Technological University
Description
Summary: Malicious software, referred to as malware, is one of the main threats on the Internet today. Millions of hosts on the Internet are infected with malware, ranging from classic computer viruses to Internet worms and bot networks. The number of malware samples collected by anti-virus vendors has increased hugely. In this project, malware data collection and analysis tools were reviewed. A malware report collection procedure was successfully automated using a CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) evasion technique when submitting the malware data set to various malware analysis tools. The detailed design and implementation of the CAPTCHA evasion technique for the various malware analysis tools are presented in the report. Simulations of data collection were conducted to demonstrate the success of the implemented technique. After the reports are collected, they must be pre-processed to clean the data, which is important for data representation. Pre-processing removes junk characters such as hash codes and long strings of symbols and numbers, while keeping the rest of the information in each report. In addition to report collection and pre-processing, the dataset must be separated into training and testing sets in order to build a machine learning classifier for malware data analysis.
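
Illustration: the pre-processing and dataset-separation steps named in the summary could be sketched roughly as follows. This is not the project's code. The regular expressions, the 12-character threshold, the 30% test fraction, and the function names `clean_report` and `split_dataset` are all assumptions made for the example; the record does not state the actual rules used.

```python
import random
import re

# Assumed junk patterns; the record does not give the project's real rules.
# Hex tokens with the lengths of MD5 (32), SHA-1 (40), and SHA-256 (64) digests.
HEX_HASH = re.compile(r"\b(?:[0-9a-fA-F]{32}|[0-9a-fA-F]{40}|[0-9a-fA-F]{64})\b")
# Runs of 12 or more characters that are neither letters nor whitespace.
LONG_JUNK = re.compile(r"[^A-Za-z\s]{12,}")


def clean_report(text: str) -> str:
    """Remove hash-like tokens and long symbol/number runs, keep the rest."""
    text = HEX_HASH.sub(" ", text)
    text = LONG_JUNK.sub(" ", text)
    # Collapse the whitespace (including newlines) left behind by the removals.
    return " ".join(text.split())


def split_dataset(reports, test_fraction=0.3, seed=0):
    """Shuffle reproducibly and return (training, testing) lists."""
    shuffled = list(reports)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


if __name__ == "__main__":
    raw = "file d41d8cd98f00b204e9800998ecf8427e created key ====1234567890==== end"
    print(clean_report(raw))
    train, test = split_dataset([f"report_{i}" for i in range(10)])
    print(len(train), len(test))
```

A fixed seed keeps the split reproducible between runs, which makes classifier results easier to compare.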