Automating change-level self-admitted technical debt determination


Bibliographic Details
Main Authors: YAN, Meng, XIA, Xin, SHIHAB, Emad, LO, David, YIN, Jianwei, YANG, Xiaohu
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2019
Online Access: https://ink.library.smu.edu.sg/sis_research/4352
https://ink.library.smu.edu.sg/context/sis_research/article/5355/viewcontent/Automating_change_level_self_admitted_tse_2018_afv.pdf
Institution: Singapore Management University
Description
Summary: Self-Admitted Technical Debt (SATD) refers to technical debt that is introduced intentionally. Previous studies that identify SATD at the file level in isolation cannot describe the TD context that spans multiple files. It is therefore more beneficial to identify SATD at the time a change is made. We refer to this type of TD identification as “Change-level SATD Determination”. Identifying SATD at the change level helps to manage and control TD, because tracing the introducing changes reveals the TD context. In this paper, we propose a change-level SATD determination model that extracts 25 features from software changes, divided into three dimensions: diffusion, history, and message. To evaluate the effectiveness of the proposed model, we perform an empirical study on 7 open source projects containing a total of 100,011 software changes. The experimental results show that our model outperforms four baselines in terms of AUC and cost-effectiveness. On average across the 7 projects, our model achieves an AUC of 0.82 and a cost-effectiveness of 0.80, a significant improvement over the comparison baselines. In addition, we found that “Diffusion” is the most discriminative dimension for determining TD-introducing changes.
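Note: the summary reports results as AUC. As a reading aid, the following is a minimal sketch of how AUC can be computed for a change-level classifier, using the standard pairwise-ranking definition. The function name `auc`, the scores, and the labels are hypothetical illustrations, not data or code from the paper.

```python
from typing import Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via pairwise ranking.

    Equals the probability that a randomly chosen positive example
    (here: a TD-introducing change) receives a higher score than a
    randomly chosen negative example; ties count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical model scores for six changes (1 = introduces SATD, 0 = does not).
example_scores = [0.9, 0.4, 0.35, 0.6, 0.2, 0.5]
example_labels = [1, 1, 0, 1, 0, 0]
example_auc = auc(example_scores, example_labels)
```

In this toy input, eight of the nine positive/negative pairs are ranked correctly, so `example_auc` is 8/9. The quadratic loop is chosen for clarity; a sort-based implementation would be preferable for a dataset on the scale of the 100,011 changes mentioned in the summary.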