Malware data collection and analysis

Malicious software, referred to as malware, is one of the main threats on the Internet in the present day. Millions of hosts on the Internet are infected with malware, ranging from classic computer viruses to Internet worms and bot networks. A huge increase in the number of malware samples are colle...

Bibliographic Details
Main Author: Low, Song Chuan.
Other Authors: Chen Lihui
Format: Final Year Project
Language: English
Published: 2013
Subjects: DRNTU::Engineering::Electrical and electronic engineering
Online Access: http://hdl.handle.net/10356/54413
Institution: Nanyang Technological University
id sg-ntu-dr.10356-54413
record_format dspace
spelling sg-ntu-dr.10356-54413 2023-07-07T17:02:19Z Malware data collection and analysis Low, Song Chuan. Chen Lihui School of Electrical and Electronic Engineering DRNTU::Engineering::Electrical and electronic engineering Malicious software, referred to as malware, is one of the main threats on the Internet today. Millions of hosts on the Internet are infected with malware, ranging from classic computer viruses to Internet worms and bot networks. The number of malware samples collected by anti-virus vendors has increased enormously. In this project, malware data collection and analysis tools were reviewed. A malware report collection procedure was successfully automated using a CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) evading technique when submitting the malware data set to various malware analysis tools. The detailed design and implementation of the CAPTCHA-evading technique for the various malware analysis tools are presented in the report. Simulations of data collection were conducted to demonstrate the success of the implemented technique. After the reports are collected, they must be pre-processed to clean the data, which is important for data representation. Pre-processing removes junk characters such as hash codes and long strings of symbols and numbers, while keeping the rest of the information in each report. In addition to report collection and pre-processing, the dataset must be separated into training and testing sets in order to build a machine learning classifier for malware data analysis. Bachelor of Engineering 2013-06-20T02:44:45Z 2013-06-20T02:44:45Z 2013 2013 Final Year Project (FYP) http://hdl.handle.net/10356/54413 en Nanyang Technological University 70 p. application/pdf
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic DRNTU::Engineering::Electrical and electronic engineering
spellingShingle DRNTU::Engineering::Electrical and electronic engineering
Low, Song Chuan.
Malware data collection and analysis
description Malicious software, referred to as malware, is one of the main threats on the Internet today. Millions of hosts on the Internet are infected with malware, ranging from classic computer viruses to Internet worms and bot networks. The number of malware samples collected by anti-virus vendors has increased enormously. In this project, malware data collection and analysis tools were reviewed. A malware report collection procedure was successfully automated using a CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) evading technique when submitting the malware data set to various malware analysis tools. The detailed design and implementation of the CAPTCHA-evading technique for the various malware analysis tools are presented in the report. Simulations of data collection were conducted to demonstrate the success of the implemented technique. After the reports are collected, they must be pre-processed to clean the data, which is important for data representation. Pre-processing removes junk characters such as hash codes and long strings of symbols and numbers, while keeping the rest of the information in each report. In addition to report collection and pre-processing, the dataset must be separated into training and testing sets in order to build a machine learning classifier for malware data analysis.
author2 Chen Lihui
author_facet Chen Lihui
Low, Song Chuan.
format Final Year Project
author Low, Song Chuan.
author_sort Low, Song Chuan.
title Malware data collection and analysis
title_short Malware data collection and analysis
title_full Malware data collection and analysis
title_fullStr Malware data collection and analysis
title_full_unstemmed Malware data collection and analysis
title_sort malware data collection and analysis
publishDate 2013
url http://hdl.handle.net/10356/54413
_version_ 1772825777484595200
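
Illustrative note on the description field: the pre-processing and dataset separation steps it mentions might look roughly like the Python sketch below. This is not the project's implementation. The function names (`clean_report`, `split_dataset`), the regular expressions, the length thresholds (32 hex characters for a hash, 8 characters for a symbol/number run), and the 80/20 split are all assumptions chosen for illustration; the record gives no concrete values.

```python
import random
import re

# Assumed patterns; the record does not specify what counts as "junk".
HASH_RE = re.compile(r"\b[0-9a-fA-F]{32,}\b")   # hex digests (MD5 length or longer)
JUNK_RE = re.compile(r"[^A-Za-z\s]{8,}")        # long runs of symbols and digits
SPACE_RE = re.compile(r"[ \t]+")                # runs of spaces/tabs left by removals


def clean_report(text: str) -> str:
    """Remove hash codes and long symbol/number runs, keep everything else."""
    text = HASH_RE.sub(" ", text)
    text = JUNK_RE.sub(" ", text)
    return SPACE_RE.sub(" ", text).strip()


def split_dataset(items, test_fraction=0.2, seed=0):
    """Shuffle reproducibly and return (training, testing) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


if __name__ == "__main__":
    raw = "File md5: d41d8cd98f00b204e9800998ecf8427e created mutex ####$$$$%%%%1234 ok"
    print(clean_report(raw))
    train, test = split_dataset([f"report_{i}" for i in range(10)])
    print(len(train), len(test))
```

A fixed seed keeps the split reproducible across runs, which matters when classifier results are compared later. A library routine such as scikit-learn's `train_test_split` could replace `split_dataset` if that dependency is acceptable.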