Combining Software Metrics and Text Features for Vulnerable File Prediction
In recent years, to help developers reduce the time and effort required to build highly secure software, a number of prediction models built on different kinds of features have been proposed to identify vulnerable source code files. In this paper, we propose a novel approach VULPREDICTOR to pr...
Main Authors: | ZHANG, Yun; LO, David; XIA, Xin; XU, Bowen; SUN, Jianling; LI, Shanping |
---|---|
Format: | text |
Language: | English |
Published: | Institutional Knowledge at Singapore Management University, 2015 |
Subjects: | |
Online Access: | https://ink.library.smu.edu.sg/sis_research/3097 |
Institution: | Singapore Management University |
id |
sg-smu-ink.sis_research-4097 |
---|---|
record_format |
dspace |
spelling |
sg-smu-ink.sis_research-4097; 2016-02-05T06:30:05Z; 2015-12-11T08:00:00Z; text; https://ink.library.smu.edu.sg/sis_research/3097; info:doi/10.1109/ICECCS.2015.15; Research Collection School Of Computing and Information Systems; eng |
institution |
Singapore Management University |
building |
SMU Libraries |
continent |
Asia |
country |
Singapore |
content_provider |
SMU Libraries |
collection |
InK@SMU |
language |
English |
topic |
Machine Learning; Text Mining; Vulnerable File; Software Engineering |
description |
In recent years, a number of prediction models built on different kinds of features have been proposed to identify vulnerable source code files, helping developers reduce the time and effort required to build highly secure software. In this paper, we propose VULPREDICTOR, a novel approach that predicts vulnerable files by combining software metrics and text mining to build a composite prediction model. VULPREDICTOR first builds 6 underlying classifiers on a training set of vulnerable and non-vulnerable files represented by their software metrics and text features, and then constructs a meta classifier to process the outputs of the 6 underlying classifiers. We evaluate our solution on datasets from three web applications, Drupal, PHPMyAdmin and Moodle, which contain a total of 3,466 files and 223 vulnerabilities. The experimental results show that VULPREDICTOR achieves F1 and EffectivenessRatio@20% scores of up to 0.683 and 75%, respectively. On average across the 3 projects, VULPREDICTOR improves the F1 and EffectivenessRatio@20% scores of the best-performing state-of-the-art approaches, proposed by Walden et al., by 46.53% and 14.93%, respectively. |
format |
text |
author |
ZHANG, Yun; LO, David; XIA, Xin; XU, Bowen; SUN, Jianling; LI, Shanping |
title |
Combining Software Metrics and Text Features for Vulnerable File Prediction |
publisher |
Institutional Knowledge at Singapore Management University |
publishDate |
2015 |
url |
https://ink.library.smu.edu.sg/sis_research/3097 |
_version_ |
1770572808711045120 |
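
The description above outlines a two-level design: underlying classifiers trained on software metrics and text features, and a meta classifier that processes their outputs. The following is a minimal sketch of that general idea (stacking), not the authors' implementation. Everything in it is an assumption made for illustration: it uses two toy base learners instead of six, a nearest-centroid scorer for metrics, a smoothed token log-odds scorer for text, a small logistic-regression meta classifier, and invented training data. All function and variable names are hypothetical.

```python
import math
from collections import Counter


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_metric_learner(rows, labels):
    """Assumed base learner 1: nearest-centroid score on numeric software metrics."""
    centroids = {}
    for cls in (0, 1):
        members = [r for r, y in zip(rows, labels) if y == cls]
        centroids[cls] = [sum(col) / len(members) for col in zip(*members)]

    def score(row):
        d0 = math.dist(row, centroids[0])
        d1 = math.dist(row, centroids[1])
        # Closer to the vulnerable centroid than to the other one -> score near 1.
        return d0 / (d0 + d1) if d0 + d1 > 0 else 0.5

    return score


def train_text_learner(docs, labels):
    """Assumed base learner 2: token log-odds with add-one smoothing."""
    counts = {0: Counter(), 1: Counter()}
    for tokens, y in zip(docs, labels):
        counts[y].update(tokens)
    vocab = set(counts[0]) | set(counts[1])
    totals = {c: sum(counts[c].values()) + len(vocab) for c in (0, 1)}
    positives = sum(labels)
    prior = math.log(positives / (len(labels) - positives))

    def score(tokens):
        log_odds = prior
        for t in tokens:
            if t in vocab:  # tokens never seen in training are ignored
                log_odds += math.log((counts[1][t] + 1) / totals[1])
                log_odds -= math.log((counts[0][t] + 1) / totals[0])
        return sigmoid(log_odds)

    return score


def train_meta_classifier(base_outputs, labels, epochs=2000, lr=0.5):
    """Meta level: logistic regression over the base learners' scores."""
    weights = [0.0] * len(base_outputs[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(base_outputs, labels):
            error = sigmoid(bias + sum(w * v for w, v in zip(weights, x))) - y
            weights = [w - lr * error * v for w, v in zip(weights, x)]
            bias -= lr * error

    def predict(x):
        return sigmoid(bias + sum(w * v for w, v in zip(weights, x))) >= 0.5

    return predict


# Invented toy data: ([lines of code, complexity], tokens, label), 1 = vulnerable.
TRAIN = [
    ([120, 14], ["$_GET", "mysql_query", "echo"], 1),
    ([200, 22], ["$_POST", "eval", "echo"], 1),
    ([30, 2], ["function", "return", "array"], 0),
    ([45, 3], ["class", "return", "const"], 0),
]
metrics = [m for m, _, _ in TRAIN]
tokens = [t for _, t, _ in TRAIN]
labels = [y for _, _, y in TRAIN]

metric_score = train_metric_learner(metrics, labels)
text_score = train_text_learner(tokens, labels)
meta = train_meta_classifier(
    [[metric_score(m), text_score(t)] for m, t in zip(metrics, tokens)], labels
)


def predict_vulnerable(file_metrics, file_tokens):
    """Feed both base scores for one file into the meta classifier."""
    return meta([metric_score(file_metrics), text_score(file_tokens)])


print(predict_vulnerable([150, 18], ["$_GET", "eval"]))
print(predict_vulnerable([25, 1], ["return", "array"]))
```

For brevity the sketch trains the meta classifier on the base learners' scores for the same files they were fitted on. A more careful stacking setup would usually generate those scores from held-out folds so the meta level does not learn from overfitted base outputs.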