Boosting just-in-time defect prediction with specific features of C/C++ programming languages in code changes

Just-in-time (JIT) defect prediction can identify changes as defect-inducing ones or clean ones and many approaches are proposed based on several programming language-independent change-level features. However, different programming languages have different characteristics and consequently may affec...

Bibliographic Details
Main Authors: NI, Chao, XU, Xiaodan, YANG, Kaiwen, LO, David
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2023
Subjects:
Online Access: https://ink.library.smu.edu.sg/sis_research/8623
Institution: Singapore Management University
id sg-smu-ink.sis_research-9626
record_format dspace
spelling sg-smu-ink.sis_research-9626 2024-01-25T06:30:03Z 2023-05-16T07:00:00Z text https://ink.library.smu.edu.sg/sis_research/8623 info:doi/10.1109/MSR59073.2023.00072 Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic C++ programming
C/C++ programming language
Code changes
Defect prediction
Just-in-time
Language independents
Open-source
Quality of software
Software project
Supervised methods
Databases and Information Systems
Programming Languages and Compilers
Software Engineering
description Just-in-time (JIT) defect prediction classifies code changes as defect-inducing or clean, and many approaches have been proposed based on programming language-independent change-level features. However, different programming languages have different characteristics, which may affect the quality of software projects. Meanwhile, the C programming language, one of the most popular languages, is widely used to develop foundational applications (e.g., operating systems, databases, and compilers) in IT companies, and the impact of its change-level characteristics on project quality has not been fully investigated. Additionally, whether open-source C projects and commercial projects share similar important features has not been studied much. To address these limitations, in this paper we investigate the impact of programming language-specific features on the state-of-the-art JIT defect identification approach in an industrial setting. We collect and label the top-10 most starred C projects on GitHub (329,021 commits) and 8 C projects at an ICT company (12,983 commits). We also propose nine C-specific change-level features and focus our investigation on both the open-source C projects on GitHub and the C projects at the ICT company, considering three aspects: (1) the effectiveness of C-specific change-level features in improving the identification of defect-inducing changes, (2) the importance of features in identifying defect-inducing changes in open-source C projects versus commercial C projects, and (3) the effectiveness of combining language-independent features and C-specific features in a real-life setting at the ICT company.
format text
author NI, Chao
XU, Xiaodan
YANG, Kaiwen
LO, David
title Boosting just-in-time defect prediction with specific features of C/C++ programming languages in code changes
publisher Institutional Knowledge at Singapore Management University
publishDate 2023
url https://ink.library.smu.edu.sg/sis_research/8623
_version_ 1789483293604839424