A Machine Learning Classification Approach to Detect TLS-based Malware using Entropy-based Flow Set Features

Transport Layer Security (TLS) based malware is one of the most hazardous malware types, as it relies on encryption to conceal connections. Due to the complexity of TLS traffic decryption, several anomaly-based detection studies have been conducted to detect TLS-based malware using different feature...

Bibliographic Details
Main Authors:	Keshkeh, Kinan, Jantan, Aman, Alieyan, Kamal
Format:	Article
Language:	English
Published:	Universiti Utara Malaysia Press 2022
Subjects:	QA75 Electronic computers. Computer science
Online Access:	https://repo.uum.edu.my/id/eprint/28740/1/JICT%2021%2003%202022%20279-313.pdf https://doi.org/10.32890/jict2022.21.3.1 https://repo.uum.edu.my/id/eprint/28740/ https://e-journal.uum.edu.my/index.php/jict/article/view/14434

id	my.uum.repo.28740
record_format	eprints
spelling	my.uum.repo.28740 2023-02-08T01:33:19Z https://repo.uum.edu.my/id/eprint/28740/ Universiti Utara Malaysia Press 2022 Article PeerReviewed application/pdf en cc4_by Keshkeh, Kinan and Jantan, Aman and Alieyan, Kamal (2022) A Machine Learning Classification Approach to Detect TLS-based Malware using Entropy-based Flow Set Features. Journal of Information and Communication Technology, 21 (03). pp. 279-313. ISSN 2180-3862 https://e-journal.uum.edu.my/index.php/jict/article/view/14434 https://doi.org/10.32890/jict2022.21.3.1
institution	Universiti Utara Malaysia
building	UUM Library
collection	Institutional Repository
continent	Asia
country	Malaysia
content_provider	Universiti Utara Malaysia
content_source	UUM Institutional Repository
url_provider	http://repo.uum.edu.my/
description	Transport Layer Security (TLS) based malware is one of the most hazardous malware types because it relies on encryption to conceal its connections. Because decrypting TLS traffic is complex, several anomaly-based studies have attempted to detect TLS-based malware using different features and machine learning (ML) algorithms. However, most of these studies used flow features with no feature transformation, or relied on inefficient flow feature transformations such as frequency-based periodicity analysis and outliers percentage. This paper introduces TLSMalDetect, a TLS-based malware detection approach that integrates periodicity-independent entropy-based flow set (EFS) features, generated by a flow feature transformation technique, to address the flow feature utilization issues in related research. The effectiveness of the EFS features was evaluated in two ways: (1) by comparing them with the corresponding outliers percentage and flow features using four feature importance methods, and (2) by analyzing classification performance with and without EFS features. In addition, new Transmission Control Protocol features not explored in the literature were incorporated into TLSMalDetect, and their contribution was assessed. The results showed that the EFS features for the number of packets sent and received were superior to the related outliers percentage and flow features and could remarkably increase performance, by up to ~42% in the case of Support Vector Machine accuracy. Furthermore, using the basic features, TLSMalDetect achieved its highest accuracy, 93.69%, with Naïve Bayes (NB) among the ML algorithms applied. In comparison with previous studies, TLSMalDetect’s Random Forest precision of 98.99% and NB recall of 92.91% exceeded the best relevant findings. These comparative results demonstrate that TLSMalDetect detects a larger share of all malicious flows than existing works (higher recall) and that a larger share of its alerts are actual alerts (higher precision).
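The description names entropy-based flow set features but does not give their formula. As a rough illustration only, the sketch below computes the Shannon entropy of one per-flow value (packets sent) across a set of flows. The function name, the grouping of flows into a set, and the sample packet counts are all assumptions made for this example, not the authors' published transformation.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy, in bits, of the empirical distribution of `values`.

    Hypothetical helper: the record does not state how TLSMalDetect
    computes its EFS features, so this is only one plausible reading.
    """
    if not values:
        return 0.0
    counts = Counter(values)
    total = len(values)
    # sum of p * log2(1/p), written as log2(total / c) to avoid a negative zero
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Made-up flow sets: packets sent per flow within one group of flows.
uniform_set = [12, 12, 12, 12, 12, 12, 12, 12]  # every flow looks the same
varied_set = [3, 41, 7, 120, 9, 15, 64, 2]      # every flow differs

print(shannon_entropy(uniform_set))  # 0.0
print(shannon_entropy(varied_set))   # 3.0
```

A set of identical flows gives an entropy of 0 bits, while eight distinct values give the maximum of 3 bits. A measure of this kind summarizes how repetitive a flow set is without estimating a period, which is consistent with the "periodicity-independent" wording in the description.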
_version_	1758580948225490944

Similar Items