Neural network based detection of self-admitted technical debt: From performance to explainability

Technical debt is a metaphor that reflects the tradeoff software engineers make between short-term benefits and long-term stability. Self-admitted technical debt (SATD), a variant of technical debt, has been proposed to identify debt that is intentionally introduced during software development, e.g., te...

Bibliographic Details
Main Authors: REN, Xiaoxue; XING, Zhenchang; XIA, Xin; LO, David; WANG, Xinyu; GRUNDY, John
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2019
Subjects:
Online Access: https://ink.library.smu.edu.sg/sis_research/4476
https://ink.library.smu.edu.sg/context/sis_research/article/5479/viewcontent/NN_based_classification_technical_debt_tosem_2019_av.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-5479
record_format dspace
spelling sg-smu-ink.sis_research-5479 2019-12-19T07:05:18Z 2019-03-01T08:00:00Z text application/pdf info:doi/10.1145/3324916 http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Self-admitted technical debt
Convolutional Neural Network
Cross project prediction
Model explainability
Model generalizability
Model adaptability
Software Engineering
description Technical debt is a metaphor that reflects the tradeoff software engineers make between short-term benefits and long-term stability. Self-admitted technical debt (SATD), a variant of technical debt, has been proposed to identify debt that is intentionally introduced during software development, e.g., temporary fixes and workarounds. Previous studies have leveraged human-summarized patterns (n-gram phrases that can be used to identify SATD) or text-mining techniques to detect SATD in source code comments. However, several characteristics of SATD features in code comments, such as vocabulary diversity, project uniqueness, length, and semantic variations, pose a major challenge to the accuracy of pattern-based or traditional text-mining-based SATD detection, especially for cross-project deployment. Furthermore, although traditional text-mining-based methods outperform pattern-based methods in prediction accuracy, the text features they use are less intuitive than human-summarized patterns, which makes the prediction results hard to explain. To improve the accuracy of SATD prediction, especially for cross-project prediction, we propose a Convolutional Neural Network (CNN)-based approach for classifying code comments as SATD or non-SATD. To improve the explainability of our model’s prediction results, we exploit the computational structure of CNNs to identify key phrases and patterns in code comments that are most relevant to SATD. We have conducted an extensive set of experiments with 62,566 code comments from 10 open-source projects and a user study with 150 comments from another three projects. Our evaluation confirms the effectiveness of different aspects of our approach and its superior performance, generalizability, adaptability, and explainability over current state-of-the-art traditional text-mining-based methods for SATD classification.
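The idea of reading key phrases off a CNN's structure can be illustrated with a toy sketch. This is not the authors' model: the word vectors, the single two-token filter, the bias, and the threshold below are all hand-set, hypothetical values, whereas a real model would learn embeddings and many filters of several widths from labeled comments. The sketch only shows the general mechanism: slide a filter over the comment, keep the strongest window (max-over-time pooling), and map that window back to the words that produced it.

```python
import re

# Hypothetical 2-dimensional word vectors; unknown words map to zeros.
EMBEDDINGS = {
    "todo": [1.0, 0.0],
    "hack": [0.9, 0.1],
    "temporary": [0.8, 0.3],
    "fix": [0.2, 0.8],
    "later": [0.1, 0.9],
}
UNKNOWN = [0.0, 0.0]

# One hypothetical filter covering two consecutive tokens (a bigram detector).
FILTER = [[1.0, 0.0], [0.0, 1.0]]
BIAS = -0.5
THRESHOLD = 0.5  # assumed decision threshold on the pooled activation


def tokenize(comment):
    """Lowercase the comment and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", comment.lower())


def window_scores(tokens):
    """Apply the filter to every 2-token window, followed by ReLU."""
    width = len(FILTER)
    vectors = [EMBEDDINGS.get(t, UNKNOWN) for t in tokens]
    scores = []
    for start in range(len(vectors) - width + 1):
        total = BIAS
        for offset in range(width):
            weights = FILTER[offset]
            vector = vectors[start + offset]
            total += sum(w * x for w, x in zip(weights, vector))
        scores.append(max(0.0, total))
    return scores


def classify_with_key_phrase(comment):
    """Return (is_satd, key_phrase) using max-over-time pooling.

    The key phrase is the window whose activation won the pooling step,
    which is what makes the prediction traceable to concrete words.
    """
    tokens = tokenize(comment)
    scores = window_scores(tokens)
    if not scores:
        return False, None
    best = max(range(len(scores)), key=scores.__getitem__)
    if scores[best] <= THRESHOLD:
        return False, None
    return True, " ".join(tokens[best:best + len(FILTER)])


if __name__ == "__main__":
    print(classify_with_key_phrase("TODO: fix this later"))
    print(classify_with_key_phrase("Returns the user name"))
```

With these toy weights, a comment is flagged only when a debt-like word is immediately followed by a repair-like word, and the returned phrase is the window responsible for the decision.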
format text
author REN, Xiaoxue
XING, Zhenchang
XIA, Xin
LO, David
WANG, Xinyu
GRUNDY, John
title Neural network based detection of self-admitted technical debt: From performance to explainability
publisher Institutional Knowledge at Singapore Management University
publishDate 2019