An Empirical Study of Classifier Combination on Cross-Project Defect Prediction

Bibliographic Details
Main Authors: ZHANG, Yun; LO, David; XIA, Xin; SUN, Jianling
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2015
Subjects: Defect Prediction; Cross-Project; Classifier Combination; Software Engineering
Online Access: https://ink.library.smu.edu.sg/sis_research/3099
https://ink.library.smu.edu.sg/context/sis_research/article/4099/viewcontent/compsac15_combination.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-4099
record_format dspace
spelling sg-smu-ink.sis_research-4099
2019-06-10T09:07:05Z
2015-07-01T07:00:00Z
text
application/pdf
info:doi/10.1109/COMPSAC.2015.58
http://creativecommons.org/licenses/by-nc-nd/4.0/
Research Collection School Of Computing and Information Systems
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Defect Prediction
Cross-Project
Classifier Combination
Software Engineering
description To help developers better allocate testing and debugging effort, many software defect prediction techniques have been proposed in the literature. These techniques predict which classes are more likely to be buggy based on the history of buggy classes. They work well as long as sufficient data is available to train a prediction model. However, there is rarely enough training data for new software projects. To deal with this problem, cross-project defect prediction, which transfers a prediction model trained on data from one project to another, has been proposed and is regarded as a new challenge for defect prediction. So far, only a few cross-project defect prediction techniques have been proposed. To advance the state of the art, in this work we investigate 7 composite algorithms, which integrate multiple machine learning classifiers, to improve cross-project defect prediction. To evaluate the performance of the composite algorithms, we perform experiments on 10 open source software systems from the PROMISE repository, which contain a total of 5,305 instances labeled as defective or clean. We compare the composite algorithms with CODEP Logistic, the latest cross-project defect prediction algorithm proposed by Panichella et al., in terms of two standard evaluation metrics: cost effectiveness and F-measure. Our experimental results show that several algorithms outperform CODEP Logistic: Max performs the best in terms of F-measure, and its average F-measure exceeds that of CODEP Logistic by 36.88%. Bagging J48 performs the best in terms of cost effectiveness, and its average cost effectiveness exceeds that of CODEP Logistic by 15.34%.
format text
author ZHANG, Yun
LO, David
XIA, Xin
SUN, Jianling
author_sort ZHANG, Yun
title An Empirical Study of Classifier Combination on Cross-Project Defect Prediction
title_sort empirical study of classifier combination on cross-project defect prediction
publisher Institutional Knowledge at Singapore Management University
publishDate 2015
url https://ink.library.smu.edu.sg/sis_research/3099
https://ink.library.smu.edu.sg/context/sis_research/article/4099/viewcontent/compsac15_combination.pdf
_version_ 1770572809190244352
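
The description names a "Max" composite algorithm and the F-measure metric without defining either. The following is a minimal sketch, not the authors' implementation. It assumes that Max labels an instance using the highest defect probability reported by any base classifier, and that an instance is predicted defective when that probability reaches 0.5. The function names, the threshold, and the sample numbers are all hypothetical and serve only as illustration.

```python
from typing import List, Sequence


def max_combine(probabilities: Sequence[Sequence[float]]) -> List[float]:
    """Combine classifier outputs by taking the per-instance maximum.

    probabilities[i][j] is classifier i's estimated probability that
    instance j is defective (assumed interpretation of "Max").
    """
    return [max(column) for column in zip(*probabilities)]


def f_measure(predicted: Sequence[bool], actual: Sequence[bool]) -> float:
    """Harmonic mean of precision and recall for the defective class."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical outputs of three base classifiers on four instances.
probs = [
    [0.20, 0.60, 0.10, 0.40],
    [0.70, 0.30, 0.20, 0.45],
    [0.10, 0.55, 0.30, 0.20],
]
combined = max_combine(probs)               # per-instance maximum
predicted = [p >= 0.5 for p in combined]    # assumed 0.5 decision threshold
actual = [True, False, False, True]         # hypothetical ground-truth labels
score = f_measure(predicted, actual)
```

With these made-up numbers the combined scores are 0.70, 0.60, 0.30 and 0.45, so the first two instances are predicted defective; one of them is truly defective and one defective instance is missed, giving precision, recall, and F-measure of 0.5 each.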