A pseudo feedback-based annotated TF-IDF technique for dynamic crypto-ransomware pre-encryption boundary delineation and features extraction

The cryptography employed against user files makes the effect of crypto-ransomware attacks irreversible even after detection and removal. Thus, detecting such attacks early, i.e., during the pre-encryption phase before encryption takes place, is necessary. Existing crypto-ransomware early detection so...

Bibliographic Details
Main Authors: Al-Rimy, Bander Ali Saleh; Maarof, Mohd. Aiziani; Alazab, Mamoun; Alsolami, Fawaz; Mohd. Shaid, Syed Zainudeen; Ghaleb, Fuad A.; Al-Hadhrami, Tawfik; Ali, Abdullah Marish
Format: Article
Language: English
Published: Institute of Electrical and Electronics Engineers Inc. 2020
Subjects: QA75 Electronic computers. Computer science
Online Access: http://eprints.utm.my/id/eprint/90680/1/MohdAizianiMaarof2020_APseudoFeedbackBasedAnnotatedTFIDF.pdf
http://eprints.utm.my/id/eprint/90680/
http://dx.doi.org/10.1109/ACCESS.2020.3012674
Institution: Universiti Teknologi Malaysia
id my.utm.90680
record_format eprints
spelling my.utm.90680 2021-04-30T14:55:36Z http://eprints.utm.my/id/eprint/90680/ QA75 Electronic computers. Computer science. Institute of Electrical and Electronics Engineers Inc. 2020-07. Article, PeerReviewed, application/pdf, en. http://eprints.utm.my/id/eprint/90680/1/MohdAizianiMaarof2020_APseudoFeedbackBasedAnnotatedTFIDF.pdf
Al-Rimy, Bander Ali Saleh and Maarof, Mohd. Aiziani and Alazab, Mamoun and Alsolami, Fawaz and Mohd. Shaid, Syed Zainudeen and Ghaleb, Fuad A. and Al-Hadhrami, Tawfik and Ali, Abdullah Marish (2020) A pseudo feedback-based annotated TF-IDF technique for dynamic crypto-ransomware pre-encryption boundary delineation and features extraction. IEEE Access, 8. pp. 140586-140598. ISSN 2169-3536 http://dx.doi.org/10.1109/ACCESS.2020.3012674 DOI:10.1109/ACCESS.2020.3012674
institution Universiti Teknologi Malaysia
building UTM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Teknologi Malaysia
content_source UTM Institutional Repository
url_provider http://eprints.utm.my/
language English
topic QA75 Electronic computers. Computer science
description The cryptography employed against user files makes the effect of crypto-ransomware attacks irreversible even after detection and removal. Thus, detecting such attacks early, i.e., during the pre-encryption phase before encryption takes place, is necessary. Existing crypto-ransomware early detection solutions use a fixed time-based thresholding approach to determine the pre-encryption phase boundaries. However, fixed time-based thresholding implies that all samples start encrypting at the same time. This assumption does not necessarily hold, because the time at which the main sabotage starts varies among crypto-ransomware families: the obfuscation techniques that the malware employs to change its attack strategy and evade detection produce different attack behaviors. Additionally, the lack of sufficient data in the early phases of the attack impairs the ability of the feature extraction techniques in early detection models to capture the characteristics of the attack, which in turn decreases detection accuracy. Therefore, this paper proposes a Dynamic Pre-encryption Boundary Delineation and Feature Extraction (DPBD-FE) scheme that determines the boundary of the pre-encryption phase, from which the features are extracted and selected more accurately. Unlike the fixed thresholding employed by existing works, DPBD-FE tracks the pre-encryption phase for each instance individually, based on the first occurrence of any cryptography-related API. Then, an annotated Term Frequency-Inverse Document Frequency (aTF-IDF) technique is used to extract the features from runtime data generated during the pre-encryption phase of crypto-ransomware attacks. The aTF-IDF overcomes the challenge of insufficient attack patterns during the early phases of the attack lifecycle. The experimental evaluation shows that DPBD-FE was able to determine the pre-encryption boundaries and extract the features related to this phase more accurately than related works.
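The boundary rule described in the abstract (cut each sample's trace at the first cryptography-related API call, then extract term-weight features from what precedes it) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the set of API names, the example traces, and the helper names pre_encryption_slice and tf_idf are hypothetical, and the weighting shown is plain TF-IDF, not the paper's annotated (aTF-IDF) variant, whose details this record does not give.

```python
import math
from collections import Counter

# Hypothetical list of cryptography-related API names (illustrative only;
# the record does not say which APIs the paper treats as the boundary).
CRYPTO_APIS = {"CryptAcquireContext", "CryptGenKey", "CryptEncrypt"}


def pre_encryption_slice(trace):
    """Return the calls that precede the first cryptography-related API call."""
    for i, api in enumerate(trace):
        if api in CRYPTO_APIS:
            return trace[:i]
    return list(trace)  # no crypto call observed: keep the whole trace


def tf_idf(docs):
    """Plain TF-IDF over lists of API names (not the annotated variant)."""
    n = len(docs)
    # Document frequency: in how many traces each API name appears.
    df = Counter(api for doc in docs for api in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1  # avoid division by zero for empty slices
        vectors.append(
            {api: (c / total) * math.log(n / df[api]) for api, c in counts.items()}
        )
    return vectors


# Made-up API call traces for two samples; each reaches encryption at a
# different point, so a fixed cut-off would not suit both.
traces = [
    ["NtOpenFile", "RegOpenKey", "CryptGenKey", "NtWriteFile"],
    ["NtOpenFile", "NtQueryDirectoryFile", "NtOpenFile", "CryptEncrypt"],
]
slices = [pre_encryption_slice(t) for t in traces]
features = tf_idf(slices)
```

In this sketch each trace gets its own boundary, which is the point of contrast with a fixed time threshold. API names that occur in every slice receive a weight of zero under plain TF-IDF, which hints at why sparse early-phase data is a problem for unmodified term weighting.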
format Article
author Al-Rimy, Bander Ali Saleh
Maarof, Mohd. Aiziani
Alazab, Mamoun
Alsolami, Fawaz
Mohd. Shaid, Syed Zainudeen
Ghaleb, Fuad A.
Al-Hadhrami, Tawfik
Ali, Abdullah Marish
author_sort Al-Rimy, Bander Ali Saleh
title A pseudo feedback-based annotated TF-IDF technique for dynamic crypto-ransomware pre-encryption boundary delineation and features extraction
publisher Institute of Electrical and Electronics Engineers Inc.
publishDate 2020
_version_ 1698696970180231168