MALWARE CLASSIFICATION USING SUPPORT VECTOR MACHINE ALGORITHM WITH LINEARSVC APPROACH

Malware, or malicious software, is designed to cause damage, steal important information or data, disrupt computer performance, and carry out other criminal acts on computers or devices, harming anyone from individual computer owners to large companies. Malware can infect computers via flash drives, links distributed by email, pirated...

Bibliographic Details
Main Author: Maryam, Zahrina
Format: Theses
Language: Indonesia
Online Access: https://digilib.itb.ac.id/gdl/view/57191
Institution: Institut Teknologi Bandung
id id-itb.:57191
spelling id-itb.:57191; 2021-07-29T08:55:49Z; keywords: Malicious Software, EMBER, Support Vector Machine, LinearSVC
institution Institut Teknologi Bandung
building Institut Teknologi Bandung Library
continent Asia
country Indonesia
content_provider Institut Teknologi Bandung
collection Digital ITB
language Indonesia
description Malware, or malicious software, is designed to cause damage, steal important information or data, disrupt computer performance, and carry out other criminal acts on computers or devices, harming anyone from individual computer owners to large companies. Malware can infect computers via flash drives, links distributed by email, pirated applications, pirated operating systems, advertisements, fake download buttons, and so on. Malware categories, defined by type, method of distribution, and impact, include viruses, trojans, spyware, worms, adware, scareware, and ransomware. The amount of malware continues to grow every day. The National Cybersecurity Operations Center of the National Cyber and Crypto Agency (BSSN) recorded 88,414,296 cyberattacks between January 1, 2020, and April 12, 2020. This volume greatly complicates malware analysis and detection, so a system that can detect malware automatically is needed. One technique that can be used is machine learning (ML). The purpose of this thesis is to build a system that detects malware automatically using machine learning. The classification system uses the Support Vector Machine (SVM) algorithm with the LinearSVC approach and is tested on the EMBER dataset. The first test scenario compares the accuracy of three SVM approaches: SVC, NuSVC, and LinearSVC. The highest accuracy, 84.91%, is obtained with LinearSVC using 14,710 training samples and 10,000 test samples. The second and third scenarios show that the amount of data used and the LinearSVC parameter settings affect the accuracy, precision, recall, and F1-score. Performance improves as more data is used.
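The first scenario in the description (comparing SVC, NuSVC, and LinearSVC on accuracy, then reporting precision, recall, and F1-score) could be sketched roughly as follows with scikit-learn. This is only an illustration under stated assumptions: the data is a synthetic stand-in generated by `make_classification`, not the EMBER dataset, and the function name `compare_svm_variants`, the split ratio, the feature scaling step, and all hyperparameters are hypothetical choices, since the record does not give the thesis's actual settings.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC, LinearSVC, NuSVC


def compare_svm_variants(n_samples=2000, n_features=50, seed=0):
    """Train three scikit-learn SVM classifiers and return their test metrics."""
    # Assumption: synthetic binary data stands in for EMBER feature vectors
    # (label 1 = malicious, label 0 = benign).
    X, y = make_classification(
        n_samples=n_samples,
        n_features=n_features,
        n_informative=10,
        random_state=seed,
    )
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.4, random_state=seed, stratify=y
    )

    # Assumption: illustrative hyperparameters, not the thesis's settings.
    models = {
        "SVC": SVC(kernel="rbf"),
        "NuSVC": NuSVC(nu=0.5),
        "LinearSVC": LinearSVC(C=1.0, max_iter=5000, random_state=seed),
    }

    results = {}
    for name, model in models.items():
        # Standardizing features generally helps SVM solvers converge.
        clf = make_pipeline(StandardScaler(), model)
        clf.fit(X_train, y_train)
        y_pred = clf.predict(X_test)
        results[name] = {
            "accuracy": accuracy_score(y_test, y_pred),
            "precision": precision_score(y_test, y_pred),
            "recall": recall_score(y_test, y_pred),
            "f1": f1_score(y_test, y_pred),
        }
    return results


if __name__ == "__main__":
    for name, metrics in compare_svm_variants().items():
        summary = ", ".join(f"{k}={v:.4f}" for k, v in metrics.items())
        print(f"{name}: {summary}")
```

Changing `n_samples` or the `C` argument of `LinearSVC` and re-running gives a rough analogue of the second and third scenarios, where the amount of data and the LinearSVC parameters are varied. The numbers produced on synthetic data say nothing about the results reported for EMBER.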
format Theses
author Maryam, Zahrina
title MALWARE CLASSIFICATION USING SUPPORT VECTOR MACHINE ALGORITHM WITH LINEARSVC APPROACH
url https://digilib.itb.ac.id/gdl/view/57191
_version_ 1822274817436418048