Revisiting supervised and unsupervised methods for effort-aware cross-project defect prediction

Cross-project defect prediction (CPDP), aiming to apply defect prediction models built on source projects to a target project, has been an active research topic. A variety of supervised CPDP methods and some simple unsupervised CPDP methods have been proposed. In a recent study, Zhou et al. found th...

Bibliographic Details
Main Authors:	NI, Chao, XIA, Xin, LO, David, CHEN, Xiang, GU, Qing
Format:	text
Language:	English
Published:	Institutional Knowledge at Singapore Management University 2020
Subjects:	Defect prediction; supervised model; unsupervised model; cross-project; Software Engineering
Online Access:	https://ink.library.smu.edu.sg/sis_research/5927 https://ink.library.smu.edu.sg/context/sis_research/article/6930/viewcontent/Revisiting_Supervised_Defect_2020_av.pdf
Institution:	Singapore Management University

id	sg-smu-ink.sis_research-6930
record_format	dspace
spelling	sg-smu-ink.sis_research-6930 2021-05-12T01:23:23Z Revisiting supervised and unsupervised methods for effort-aware cross-project defect prediction NI, Chao XIA, Xin LO, David CHEN, Xiang GU, Qing 2020-06-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/5927 info:doi/10.1109/TSE.2020.3001739 https://ink.library.smu.edu.sg/context/sis_research/article/6930/viewcontent/Revisiting_Supervised_Defect_2020_av.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Defect prediction supervised model unsupervised model cross-project Software Engineering
institution	Singapore Management University
building	SMU Libraries
continent	Asia
country	Singapore
content_provider	SMU Libraries
collection	InK@SMU
language	English
topic	Defect prediction; supervised model; unsupervised model; cross-project; Software Engineering
description	Cross-project defect prediction (CPDP), aiming to apply defect prediction models built on source projects to a target project, has been an active research topic. A variety of supervised CPDP methods and some simple unsupervised CPDP methods have been proposed. In a recent study, Zhou et al. found that simple unsupervised CPDP methods (i.e., ManualDown and ManualUp) have a prediction performance comparable or even superior to complex supervised CPDP methods. Therefore, they suggested that the ManualDown should be treated as the baseline when considering non-effort-aware performance measures (NPMs) and the ManualUp should be treated as the baseline when considering effort-aware performance measures (EPMs) in future CPDP studies. However, in that work, these unsupervised methods are only compared with existing supervised CPDP methods in terms of one or two NPMs and the prediction results of baselines are directly collected from the primary literature. Besides, the comparison has not considered other recently proposed EPMs, which consider context switches and developer fatigue due to initial false alarms. These limitations may not give a holistic comparison between the supervised methods and unsupervised methods. In this paper, we aim to revisit Zhou et al.'s study. To the best of our knowledge, we are the first to make a comparison between the existing supervised CPDP methods and the unsupervised methods proposed by Zhou et al. in the same experimental setting, considering both NPMs and EPMs. We also propose an improved supervised CPDP method EASC and make a further comparison between this method and the unsupervised methods. According to the results on 82 projects in terms of 12 performance measures, we find that when considering NPMs, EASC can achieve similar results with the unsupervised method ManualDown without statistically significant difference in most cases. However, when considering EPMs, our proposed supervised method EASC can statistically significantly outperform the unsupervised method ManualUp with a large improvement in terms of Cliff's delta in most cases. Therefore, the supervised CPDP methods are more promising than the unsupervised method in practical application scenarios, since the limitation of testing resource and the impact on developers cannot be ignored in these scenarios.
format	text
author	NI, Chao XIA, Xin LO, David CHEN, Xiang GU, Qing
author_sort	NI, Chao
title	Revisiting supervised and unsupervised methods for effort-aware cross-project defect prediction
publisher	Institutional Knowledge at Singapore Management University
publishDate	2020
url	https://ink.library.smu.edu.sg/sis_research/5927 https://ink.library.smu.edu.sg/context/sis_research/article/6930/viewcontent/Revisiting_Supervised_Defect_2020_av.pdf
_version_	1770575694807433216

Similar Items