Just-In-Time TODO-missed commits detection

TODO comments play an important role in helping developers to manage their tasks and communicate with other team members. TODO comments are often introduced by developers as a type of technical debt, such as a reminder to add/remove features or a request to optimize the code implementations. These c...

Bibliographic Details
Main Authors: WANG, Haoye; GAO, Zhipeng; HU, Xing; LO, David; GRUNDY, John; WANG, Xinyu
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2024
Subjects:
Online Access: https://ink.library.smu.edu.sg/sis_research/9916
Institution: Singapore Management University
id sg-smu-ink.sis_research-10916
record_format dspace
spelling sg-smu-ink.sis_research-10916 2025-01-02T08:03:58Z Just-In-Time TODO-missed commits detection WANG, Haoye GAO, Zhipeng HU, Xing LO, David GRUNDY, John WANG, Xinyu 2024-11-01T07:00:00Z text https://ink.library.smu.edu.sg/sis_research/9916 info:doi/10.1109/TSE.2024.3405005 Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Technical debt TODO comment Code-comment inconsistency Suboptimal implementation Databases and Information Systems
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Technical debt
TODO comment
Code-comment inconsistency
Suboptimal implementation
Databases and Information Systems
description TODO comments play an important role in helping developers manage their tasks and communicate with other team members. Developers often introduce TODO comments as a form of technical debt, such as a reminder to add or remove features or a request to optimize an implementation. Each of these serves as a notification for developers to revisit a currently suboptimal solution. TODO comments often bring short-term benefits (higher productivity or lower development cost) and indicate that attention needs to be paid to long-term software quality. Unfortunately, due to a lack of knowledge or experience and/or time constraints, developers may sometimes forget about suboptimal implementations or not even be aware of them. Missing TODO comments for these suboptimal solutions may hurt software quality and reliability in the long term. It is therefore beneficial to remind developers of suboptimal solutions whenever they change the code. In this work, we refer to this problem as the task of detecting TODO-missed commits, and we propose a novel approach named TDReminder (TODO comment Reminder) to address it. With the help of TDReminder, developers can identify possible TODO-missed commits just in time when submitting a commit. Our approach has two phases: offline training and online inference. We first embed the code change and the commit message into contextual vector representations using two separate neural encoders; the model then learns the association between these representations automatically. In the online inference phase, TDReminder uses the trained model to compute the likelihood that a commit is a TODO-missed commit. We evaluate TDReminder on datasets crawled from 10k popular Python repositories and 10k popular Java repositories on GitHub. Our experimental results show that TDReminder outperforms a set of benchmarks by a large margin in detecting TODO-missed commits. Moreover, to better help developers use TDReminder in practice, we have incorporated Large Language Models (LLMs) into our approach to provide explainable recommendations. The user study shows that our tool can effectively inform developers not only “when” to add TODOs, but also “where” and “what” TODOs should be added, verifying the value of our tool in practical application.
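The two-phase shape described in the abstract (encode the code change and the commit message separately, learn from both offline, then score a new commit online) can be illustrated with a toy sketch. This is not the TDReminder implementation: hashed bag-of-words vectors stand in for the two neural encoders, plain logistic regression stands in for the learned association model, and the training samples are made up. All function names, the `DIM` constant, and the hyperparameters are assumptions chosen for illustration.

```python
import math
import zlib

DIM = 64  # size of each toy embedding (an arbitrary, assumed value)


def embed(text, dim=DIM):
    """Hashed bag-of-words unit vector; a stand-in for a neural encoder."""
    vec = [0.0] * dim
    for token in text.lower().split():
        vec[zlib.crc32(token.encode("utf-8")) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def features(code_change, commit_message):
    """Encode the two inputs separately, then concatenate the vectors."""
    return embed(code_change) + embed(commit_message)


def predict(weights, bias, x):
    """Logistic score in (0, 1), read as a TODO-missed likelihood."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, epochs=200, lr=0.5):
    """Offline phase: fit logistic-regression weights with SGD."""
    weights, bias = [0.0] * (2 * DIM), 0.0
    for _ in range(epochs):
        for code_change, message, label in samples:
            x = features(code_change, message)
            error = predict(weights, bias, x) - label
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


def score_commit(model, code_change, commit_message):
    """Online phase: score one commit with the trained model."""
    weights, bias = model
    return predict(weights, bias, features(code_change, commit_message))


# Made-up samples: (code change, commit message, 1 if a TODO was missed).
SAMPLES = [
    ("+ return None  # temporary stub", "quick workaround for parser crash", 1),
    ("+ time.sleep(5)", "hack: wait for the service to start", 1),
    ("+ def add(a, b): return a + b", "add helper with unit tests", 0),
    ("- unused_import", "remove dead code", 0),
]

model = train(SAMPLES)
risky = score_commit(model, SAMPLES[0][0], SAMPLES[0][1])
clean = score_commit(model, SAMPLES[2][0], SAMPLES[2][1])
```

In a commit hook, a score such as `risky` could be compared against a chosen threshold to decide whether to remind the developer; the threshold and the hook itself are left out because the abstract does not specify them.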
format text
author WANG, Haoye
GAO, Zhipeng
HU, Xing
LO, David
GRUNDY, John
WANG, Xinyu
author_sort WANG, Haoye
title Just-In-Time TODO-missed commits detection
publisher Institutional Knowledge at Singapore Management University
publishDate 2024
url https://ink.library.smu.edu.sg/sis_research/9916
_version_ 1821237284697014272