Malware data collection and analysis
Malicious software, referred to as malware, is one of the main threats on the Internet in the present day. Millions of hosts on the Internet are infected with malware, ranging from classic computer viruses to Internet worms and bot networks. A huge increase in the number of malware samples are colle...
Main Author: | Low, Song Chuan |
---|---|
Other Authors: | Chen Lihui |
Format: | Final Year Project |
Language: | English |
Published: | 2013 |
Subjects: | DRNTU::Engineering::Electrical and electronic engineering |
Online Access: | http://hdl.handle.net/10356/54413 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-54413 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-54413 2023-07-07T17:02:19Z Malware data collection and analysis Low, Song Chuan. Chen Lihui School of Electrical and Electronic Engineering DRNTU::Engineering::Electrical and electronic engineering Bachelor of Engineering 2013-06-20T02:44:45Z 2013-06-20T02:44:45Z 2013 2013 Final Year Project (FYP) http://hdl.handle.net/10356/54413 en Nanyang Technological University 70 p. application/pdf |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
DRNTU::Engineering::Electrical and electronic engineering |
description |
Malicious software, or malware, is one of the main threats on the Internet today. Millions of Internet hosts are infected with malware, ranging from classic computer viruses to Internet worms and bot networks. The number of malware samples collected by anti-virus vendors has increased sharply.
In this project, malware data collection and analysis tools were reviewed. The procedure for collecting malware analysis reports was automated, using a CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) evading technique when submitting the malware data set to various malware analysis tools. The detailed design and implementation of the CAPTCHA-evading technique for these tools are presented in the report. Simulations of data collection were conducted to demonstrate that the implemented technique works.
After the reports are collected, they must be pre-processed to clean the data, which is important for data representation. Pre-processing removes junk content such as hash codes and long strings of symbols and numbers, while keeping the rest of the information in each report.
Besides report collection and pre-processing, the dataset must be separated into training and testing sets in order to build a machine learning classifier for malware data analysis. |
author2 |
Chen Lihui |
author_facet |
Chen Lihui Low, Song Chuan. |
format |
Final Year Project |
author |
Low, Song Chuan. |
author_sort |
Low, Song Chuan. |
title |
Malware data collection and analysis |
publishDate |
2013 |
url |
http://hdl.handle.net/10356/54413 |
_version_ |
1772825777484595200 |
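
The pre-processing and dataset-separation steps mentioned in the description can be illustrated with a minimal Python sketch. This is not the project's implementation: the record does not give the actual cleaning rules or split ratio, so the regular expressions, the 12-character noise threshold, the 20% test fraction, and the function names `clean_report` and `split_dataset` are all assumptions made for illustration.

```python
import random
import re

# Assumed junk patterns; the record does not list the project's real rules.
# Hex digests of 32 to 64 characters (MD5, SHA-1, SHA-256 style hash codes).
HEX_HASH = re.compile(r"\b[0-9a-fA-F]{32,64}\b")
# Runs of 12 or more characters with no ASCII letters and no whitespace,
# i.e. long strings of symbols and numbers.
LONG_NOISE = re.compile(r"[^\sA-Za-z]{12,}")


def clean_report(text: str) -> str:
    """Remove hash codes and long symbol/number runs, keep everything else."""
    text = HEX_HASH.sub(" ", text)
    text = LONG_NOISE.sub(" ", text)
    # Collapse the whitespace left behind by the removals.
    return re.sub(r"\s+", " ", text).strip()


def split_dataset(reports, test_fraction=0.2, seed=0):
    """Shuffle reports reproducibly and return (training, testing) lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(reports)
    random.Random(seed).shuffle(shuffled)  # fixed seed gives a repeatable split
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


if __name__ == "__main__":
    raw = "CreateFile d41d8cd98f00b204e9800998ecf8427e ==========------ done"
    print(clean_report(raw))
    train, test = split_dataset([f"report_{i}" for i in range(10)])
    print(len(train), len(test))
```

A fixed seed keeps the training and testing sets identical across runs, which makes classifier results comparable. A real pipeline would tune the patterns against the report formats actually returned by the analysis tools.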