Experimental comparison of features and classifiers for Android malware detection

Android platform has dominated the smart phone market for years now and, consequently, gained a lot of attention from attackers. Malicious apps (malware) pose a serious threat to the security and privacy of Android smart phone users. Available approaches to detect mobile malware based on machine lea...

Bibliographic Details
Main Authors:	SHAR, Lwin Khin, DEMISSIE, Biniam Fisseha, CECCATO, Mariano, MINN, Wei
Format:	text
Language:	English
Published:	Institutional Knowledge at Singapore Management University 2020
Subjects:	Malware detection machine learning deep learning Android Software Engineering
Online Access:	https://ink.library.smu.edu.sg/sis_research/5115 https://ink.library.smu.edu.sg/context/sis_research/article/6118/viewcontent/Experimental_Comparison_of_Features_and_Classifiers_for_Android_Malware_Detection.pdf
Institution:	Singapore Management University

id	sg-smu-ink.sis_research-6118
record_format	dspace
spelling	sg-smu-ink.sis_research-6118 2021-07-01T00:36:01Z Experimental comparison of features and classifiers for Android malware detection SHAR, Lwin Khin DEMISSIE, Biniam Fisseha CECCATO, Mariano MINN, Wei Android platform has dominated the smart phone market for years now and, consequently, gained a lot of attention from attackers. Malicious apps (malware) pose a serious threat to the security and privacy of Android smart phone users. Available approaches to detect mobile malware based on machine learning rely on features extracted with static analysis or dynamic analysis techniques. Different types of machine learning classifiers (such as support vector machine and random forest) and deep learning classifiers (based on deep neural networks) are then trained on extracted features, to produce models that can be used to detect mobile malware. The usually-analyzed features include permissions requested/used, frequency of API calls, use of API calls, and sequence of API calls. The API calls are analyzed at various granularity levels such as method, class, package, and family. In view of the proposals of different types of classifiers and the use of different types of features and different underlying analyses used for feature extraction, there is a need for a comprehensive evaluation on the effectiveness of the current state-of-the-art studies in malware detection on a common benchmark. In this work, we provide a baseline comparison of several conventional machine learning classifiers and deep learning classifiers, without fine-tuning. We also provide the evaluation of different types of features that characterize the use of API calls at class level and the sequence of API calls at method level. Features have been extracted from a common benchmark of 4572 benign samples and 2399 malware samples, using both static analysis and dynamic analysis. Among other interesting findings, we observed that classifiers trained on the use of API calls generally perform better than those trained on the sequence of API calls. Classifiers trained on static analysis-based features perform better than those trained on dynamic analysis-based features. Deep learning classifiers, despite their sophistication, are not necessarily better than conventional classifiers, especially when they are not optimized. However, deep learning classifiers do perform better than conventional classifiers when trained on dynamic analysis-based features. 2020-10-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/5115 info:doi/10.1145/3387905.3388596 https://ink.library.smu.edu.sg/context/sis_research/article/6118/viewcontent/Experimental_Comparison_of_Features_and_Classifiers_for_Android_Malware_Detection.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Malware detection machine learning deep learning Android Software Engineering
institution	Singapore Management University
building	SMU Libraries
continent	Asia
country	Singapore
content_provider	SMU Libraries
collection	InK@SMU
language	English
topic	Malware detection machine learning deep learning Android Software Engineering
spellingShingle	Malware detection machine learning deep learning Android Software Engineering SHAR, Lwin Khin DEMISSIE, Biniam Fisseha CECCATO, Mariano MINN, Wei Experimental comparison of features and classifiers for Android malware detection
description	Android platform has dominated the smart phone market for years now and, consequently, gained a lot of attention from attackers. Malicious apps (malware) pose a serious threat to the security and privacy of Android smart phone users. Available approaches to detect mobile malware based on machine learning rely on features extracted with static analysis or dynamic analysis techniques. Different types of machine learning classifiers (such as support vector machine and random forest) and deep learning classifiers (based on deep neural networks) are then trained on extracted features, to produce models that can be used to detect mobile malware. The usually-analyzed features include permissions requested/used, frequency of API calls, use of API calls, and sequence of API calls. The API calls are analyzed at various granularity levels such as method, class, package, and family. In view of the proposals of different types of classifiers and the use of different types of features and different underlying analyses used for feature extraction, there is a need for a comprehensive evaluation on the effectiveness of the current state-of-the-art studies in malware detection on a common benchmark. In this work, we provide a baseline comparison of several conventional machine learning classifiers and deep learning classifiers, without fine-tuning. We also provide the evaluation of different types of features that characterize the use of API calls at class level and the sequence of API calls at method level. Features have been extracted from a common benchmark of 4572 benign samples and 2399 malware samples, using both static analysis and dynamic analysis. Among other interesting findings, we observed that classifiers trained on the use of API calls generally perform better than those trained on the sequence of API calls. Classifiers trained on static analysis-based features perform better than those trained on dynamic analysis-based features. Deep learning classifiers, despite their sophistication, are not necessarily better than conventional classifiers, especially when they are not optimized. However, deep learning classifiers do perform better than conventional classifiers when trained on dynamic analysis-based features.
format	text
author	SHAR, Lwin Khin DEMISSIE, Biniam Fisseha CECCATO, Mariano MINN, Wei
author_facet	SHAR, Lwin Khin DEMISSIE, Biniam Fisseha CECCATO, Mariano MINN, Wei
author_sort	SHAR, Lwin Khin
title	Experimental comparison of features and classifiers for Android malware detection
title_short	Experimental comparison of features and classifiers for Android malware detection
title_full	Experimental comparison of features and classifiers for Android malware detection
title_fullStr	Experimental comparison of features and classifiers for Android malware detection
title_full_unstemmed	Experimental comparison of features and classifiers for Android malware detection
title_sort	experimental comparison of features and classifiers for android malware detection
publisher	Institutional Knowledge at Singapore Management University
publishDate	2020
url	https://ink.library.smu.edu.sg/sis_research/5115 https://ink.library.smu.edu.sg/context/sis_research/article/6118/viewcontent/Experimental_Comparison_of_Features_and_Classifiers_for_Android_Malware_Detection.pdf
_version_	1770575223748296704


Similar Items