Evaluation of feature selection algorithm for android malware detection

This paper synthesizes an evaluation of feature selection algorithm by utilizing Term Frequency Inverse Document Frequency (TF-IDF) as the main algorithm in Android malware detection. The Android features were filtered before detection process using TF-IDF algorithm. However, IDF is unaware to the t...

Full description

Saved in:
Bibliographic Details
Main Authors: Mazlan, Nurul Hidayah, A Hamid, Isredza Rahmi
Format: Article
Published: Science Publishing Corporation 2018
Subjects:
Online Access:http://eprints.uthm.edu.my/5019/
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Universiti Tun Hussein Onn Malaysia
Description
Summary:This paper synthesizes an evaluation of feature selection algorithm by utilizing Term Frequency Inverse Document Frequency (TF-IDF) as the main algorithm in Android malware detection. The Android features were filtered before detection process using TF-IDF algorithm. However, IDF is unaware to the training class labels and give incorrect weight value to some features. Therefore, the proposed approach modified the TF-IDF algorithm, where the algorithm focused on both sample and feature. Proposed algorithm applied considers the feature based on its level of importance. The related best features in the sample are selected using weight and priority ranking process. This increases the effect of important malware features selected in the Android application sample. These experiments are conducted on a sample collected from DREBIN dataset. The comparison between existing TF-IDF algorithm and modified TF-IDF (MTF-IDF) algorithm have been tested in various conditions such as different number of sample, different number of feature and combination of different types of feature. The analysis results show feature selection using MTF-IDF can improve malware detection analysis. MTF-IDF proved either using various kinds of feature or various kinds of dataset size, algorithm still effective for Android malware detection. MTF-IDF algorithm also proved that it could give appropriate scaling for all features in analyzing Android malware detection.