Deep learning based malware detection using hardware performance counters

Bibliographic Details
Main Author: Quah, Yu Kiat
Other Authors: Zhang Tianwei
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2021
Online Access: https://hdl.handle.net/10356/148121
Institution: Nanyang Technological University
Description
Summary: Past studies have investigated the feasibility of using HPCs (Hardware Performance Counters) as a metric to differentiate between benignware and malware. A major 2018 study titled “Hardware Performance Counters Can Detect Malware: Myth or Fact?” concluded, using statistical models such as Random Forest and Decision Tree, that HPCs are not a suitable metric. Since that study was published, newer deep learning models and techniques have been developed. This paper first attempts to replicate that study, then further investigates the feasibility of using HPCs as a metric with models and techniques it did not cover. LSTM (Long Short-Term Memory), Dense, and Ensemble models were investigated for their ability to use HPC values to differentiate between benignware and malware. These models achieved results of ~80%, ~60%, and ~80%, respectively. Based on these additional experiments, this paper supports the conclusion that HPCs cannot reliably differentiate between benignware and malware, with the caveat that more data and further experiments are needed to further support or contradict that conclusion. The source code used for this paper will also be made available as an accessible base for others to build upon.
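
The summary does not say how the Ensemble model combines its members, so the following is only a minimal sketch of one common approach, soft voting, and not the project's actual method. It assumes each member model (for example an LSTM and a Dense network) has already produced a per-sample probability that the sample is malware. The function names `soft_vote` and `accuracy` and all numeric values are hypothetical and chosen purely for illustration.

```python
from typing import List, Sequence


def soft_vote(prob_sets: Sequence[Sequence[float]], threshold: float = 0.5) -> List[int]:
    """Average each sample's malware probability across models, then threshold.

    prob_sets[m][i] is model m's probability that sample i is malware.
    Returns 1 (malware) or 0 (benignware) for each sample.
    """
    if not prob_sets:
        raise ValueError("need predictions from at least one model")
    n_samples = len(prob_sets[0])
    if any(len(p) != n_samples for p in prob_sets):
        raise ValueError("all models must score the same number of samples")

    labels = []
    for i in range(n_samples):
        mean_p = sum(p[i] for p in prob_sets) / len(prob_sets)
        labels.append(1 if mean_p >= threshold else 0)
    return labels


def accuracy(predicted: Sequence[int], truth: Sequence[int]) -> float:
    """Fraction of samples whose predicted label matches the true label."""
    if len(predicted) != len(truth) or not truth:
        raise ValueError("inputs must be non-empty and the same length")
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)


# Hypothetical probabilities from two member models on four samples.
lstm_probs = [0.9, 0.2, 0.6, 0.4]   # assumed output of an LSTM member
dense_probs = [0.7, 0.4, 0.3, 0.8]  # assumed output of a Dense member
true_labels = [1, 0, 1, 1]          # made-up ground truth, 1 = malware

ensemble_labels = soft_vote([lstm_probs, dense_probs])
print(ensemble_labels, accuracy(ensemble_labels, true_labels))
```

Averaging probabilities lets a confident member outweigh an uncertain one, which is one reason soft voting is often tried before hard majority voting. Whether it helps on HPC data is an empirical question that this sketch does not answer.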