The best of both worlds: integrating semantic features with expert features for defect prediction and localization

To improve software quality, just-in-time defect prediction (JIT-DP) (identifying defect-inducing commits) and just-in-time defect localization (JIT-DL) (identifying defect-inducing code lines in commits) have been widely studied by learning semantic features or expert features respectively, and ind...

Bibliographic Details
Main Authors: NI, Chao, WANG, Wei, YANG, Kaiwen, XIA, Xin, LIU, Kui, LO, David
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2022
Subjects: Software Engineering
Online Access: https://ink.library.smu.edu.sg/sis_research/7729
Institution: Singapore Management University
id sg-smu-ink.sis_research-8732
record_format dspace
spelling sg-smu-ink.sis_research-8732 2023-01-10T02:00:04Z The best of both worlds: integrating semantic features with expert features for defect prediction and localization NI, Chao WANG, Wei YANG, Kaiwen XIA, Xin LIU, Kui LO, David 2022-11-18T08:00:00Z text https://ink.library.smu.edu.sg/sis_research/7729 info:doi/10.1145/3540250.3549165 Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Software Engineering
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Software Engineering
description To improve software quality, just-in-time defect prediction (JIT-DP), which identifies defect-inducing commits, and just-in-time defect localization (JIT-DL), which identifies defect-inducing code lines in commits, have been widely studied by learning semantic features or expert features, respectively, and have indeed achieved promising performance. Semantic features and expert features describe code change commits from different aspects; however, the best of the two kinds of features has not yet been fully explored together in the literature to boost just-in-time defect prediction and localization. Additionally, JIT-DP identifies defects at the coarse commit level, while JIT-DL, as the consequent task of JIT-DP, cannot accurately localize defect-inducing code lines in a commit without JIT-DP. We hypothesize that the two JIT tasks can be combined to boost the accurate prediction and localization of defect-inducing commits by integrating semantic features with expert features. Therefore, we propose a unified model, JIT-Fine, for just-in-time defect prediction and localization that leverages the best of semantic features and expert features. To assess the feasibility of JIT-Fine, we first build a large-scale, line-level, manually labeled dataset, JIT-Defects4J. Then, we make a comprehensive comparison with six state-of-the-art baselines under various settings using ten performance measures grouped into two types: effort-agnostic and effort-aware. The experimental results indicate that JIT-Fine can outperform all state-of-the-art baselines on both JIT-DP and JIT-DL tasks in terms of the ten performance measures, with substantial improvements (i.e., 10%-629% in terms of effort-agnostic measures on JIT-DP, 5%-54% in terms of effort-aware measures on JIT-DP, and 4%-117% in terms of effort-aware measures on JIT-DL).
format text
author NI, Chao
WANG, Wei
YANG, Kaiwen
XIA, Xin
LIU, Kui
LO, David
author_sort NI, Chao
title The best of both worlds: integrating semantic features with expert features for defect prediction and localization
title_sort best of both worlds: integrating semantic features with expert features for defect prediction and localization
publisher Institutional Knowledge at Singapore Management University
publishDate 2022
url https://ink.library.smu.edu.sg/sis_research/7729
_version_ 1770576422685900800