EFSPredictor: Predicting configuration bugs with ensemble feature selection

The configuration of a system determines its behavior, and wrong configuration settings can adversely impact the system's availability, performance, and correctness. We refer to these wrong configuration settings as configuration bugs. The importance of configuration bugs has prompted many re...

Bibliographic Details
Main Authors: XU, Bowen; LO, David; XIA, Xin; SUREKA, Ashish; LI, Shanping
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2016
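Subjects: Configuration Bugs; Data Mining; Ensemble Feature Selection; Databases and Information Systems; Data Storage Systems
Online Access: https://ink.library.smu.edu.sg/sis_research/3641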
https://ink.library.smu.edu.sg/context/sis_research/article/4643/viewcontent/9644a206.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-4643
record_format dspace
spelling sg-smu-ink.sis_research-4643
2018-06-14T09:39:50Z
2016-05-01T07:00:00Z
text
application/pdf
https://ink.library.smu.edu.sg/sis_research/3641
info:doi/10.1109/APSEC.2015.38
https://ink.library.smu.edu.sg/context/sis_research/article/4643/viewcontent/9644a206.pdf
http://creativecommons.org/licenses/by-nc-nd/4.0/
Research Collection School Of Computing and Information Systems
eng
Institutional Knowledge at Singapore Management University
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Configuration Bugs
Data Mining
Ensemble Feature Selection
Databases and Information Systems
Data Storage Systems
description The configuration of a system determines its behavior, and wrong configuration settings can adversely impact the system's availability, performance, and correctness. We refer to these wrong configuration settings as configuration bugs. The importance of configuration bugs has prompted many researchers to study them, and past studies can be grouped into three categories: detection, localization, and fixing of configuration bugs. In this work, we focus on the detection of configuration bugs; in particular, we follow the line of work that tries to predict whether a bug report is caused by a wrong configuration setting. Automatic prediction of whether a bug is a configuration bug can help developers reduce debugging effort. We propose a novel approach named EFSPredictor, which applies ensemble feature selection to the natural-language description of a bug report. It uses different feature selection approaches (e.g., ChiSquare, GainRatio, and Relief), which output different ranked lists of textual features. To obtain a set of representative textual features, EFSPredictor first assigns scores to the features according to the ranked list output by each feature selection approach. Next, for each feature, EFSPredictor sums the scores it received across the multiple ranked lists and outputs the top features (e.g., 25% of the total number of features) as the selected features. Finally, EFSPredictor builds a prediction model based on the selected features. We conduct experiments on 5 bug report datasets (i.e., accumulo, activemq, camel, flume, and wicket) containing a total of 3,203 bugs. The experimental results show that, on average across the 5 projects, EFSPredictor achieves an F1-score of 0.57, which improves on the state-of-the-art approach proposed by Xia et al. by 14%.
format text
author XU, Bowen
LO, David
XIA, Xin
SUREKA, Ashish
LI, Shanping
author_sort XU, Bowen
title EFSPredictor: Predicting configuration bugs with ensemble feature selection
publisher Institutional Knowledge at Singapore Management University
publishDate 2016
url https://ink.library.smu.edu.sg/sis_research/3641
https://ink.library.smu.edu.sg/context/sis_research/article/4643/viewcontent/9644a206.pdf
_version_ 1770573369019727872
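
The ensemble feature selection step summarized in the description (score features per ranked list, sum the scores, keep the top share) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say how scores are assigned, so the rank-based scoring below (best rank gets n points, worst gets 1) is an assumption, and the function name ensemble_select, the tie-breaking rule, and the example terms and rankings are hypothetical.

```python
from collections import defaultdict
from math import ceil


def ensemble_select(ranked_lists, top_fraction=0.25):
    """Combine several ranked feature lists and keep the top-scoring features.

    ranked_lists: list of lists, each ordering the same features best-first.
    top_fraction: share of all features to keep (0.25 mirrors the 25% example).
    """
    totals = defaultdict(int)
    for ranking in ranked_lists:
        n = len(ranking)
        for position, feature in enumerate(ranking):
            # Assumed scoring scheme: best rank gets n points, worst gets 1.
            totals[feature] += n - position
    keep = max(1, ceil(len(totals) * top_fraction))
    # Sort by summed score, highest first; ties are broken alphabetically.
    ordered = sorted(totals, key=lambda f: (-totals[f], f))
    return ordered[:keep]


# Hypothetical rankings of eight terms, standing in for the output of
# three feature selection methods such as ChiSquare, GainRatio, and Relief.
chi_square = ["config", "property", "xml", "setting", "null", "thread", "test", "ui"]
gain_ratio = ["property", "config", "setting", "xml", "thread", "null", "ui", "test"]
relief = ["config", "setting", "property", "null", "xml", "test", "thread", "ui"]

selected = ensemble_select([chi_square, gain_ratio, relief])
print(selected)
```

With these made-up rankings, the two terms kept (25% of eight) are "config" and "property". In the approach described above, the selected terms would then be used as the features of a prediction model trained on bug reports; that training step is omitted here.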