Simple or complex? Together for a more accurate just-in-time defect predictor

Just-In-Time (JIT) defect prediction aims to automatically predict whether a commit is defective or not, and has been widely studied in recent years. In general, most studies can be classified into two categories: 1) simple models using traditional machine learning classifiers with hand-crafted feat...

Bibliographic Details
Main Authors: ZHOU, Xin, HAN, DongGyun, LO, David
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2022
Online Access: https://ink.library.smu.edu.sg/sis_research/7691
https://ink.library.smu.edu.sg/context/sis_research/article/8694/viewcontent/3524610.3527910.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-8694
record_format dspace
spelling sg-smu-ink.sis_research-8694
2023-01-10T03:14:23Z
2022-05-01T07:00:00Z
text
application/pdf
info:doi/10.1145/3524610.3527910
http://creativecommons.org/licenses/by-nc-nd/4.0/
Research Collection School Of Computing and Information Systems
eng
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Deep learning
Semantics
Predictive models
Feature extraction
Databases and Information Systems
description Just-In-Time (JIT) defect prediction aims to automatically predict whether a commit is defective, and has been widely studied in recent years. Most studies fall into two categories: 1) simple models that use traditional machine learning classifiers with hand-crafted features, and 2) complex models that use deep learning techniques to extract features automatically. The hand-crafted features used by simple models are based on expert knowledge but may not fully represent the semantic meaning of commits. The deep learning-based features used by complex models, on the other hand, represent the semantic meaning of commits but may not reflect useful expert knowledge. Simple and complex models therefore seem complementary to some extent. To exploit the advantages of both, we propose a combined model, SimCom, which fuses the prediction scores of one simple model and one complex model. The experimental results show that our approach significantly outperforms the state of the art by 6.0-18.1%. In addition, the results confirm that the simple model and the complex model are complementary to each other.
format text
author ZHOU, Xin
HAN, DongGyun
LO, David
title Simple or complex? Together for a more accurate just-in-time defect predictor
publisher Institutional Knowledge at Singapore Management University
publishDate 2022
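
The description above says that SimCom fuses the prediction scores of one simple model and one complex model, but this record does not state how the scores are combined. The following is a minimal sketch of score-level fusion, assuming a weighted average of the two per-commit defect probabilities followed by a fixed decision threshold. The function names, the weight, the threshold, and the example scores are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def fuse_scores(simple: Sequence[float],
                complex_: Sequence[float],
                weight: float = 0.5) -> List[float]:
    """Weighted average of two models' per-commit defect probabilities.

    `weight` is the share given to the simple model (assumption: 0.5,
    i.e. a plain average; the record does not specify the fusion rule).
    """
    if len(simple) != len(complex_):
        raise ValueError("both models must score the same commits")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return [weight * s + (1.0 - weight) * c
            for s, c in zip(simple, complex_)]


def predict_defective(scores: Sequence[float],
                      threshold: float = 0.5) -> List[bool]:
    """Label a commit defective when its fused score reaches the threshold."""
    return [s >= threshold for s in scores]


# Made-up scores for three commits, for illustration only.
simple_scores = [0.9, 0.2, 0.4]    # e.g. classifier on hand-crafted features
complex_scores = [0.7, 0.1, 0.8]   # e.g. deep model on commit content

fused = fuse_scores(simple_scores, complex_scores)
labels = predict_defective(fused)
```

In this made-up example the third commit is scored below the threshold by the simple model and above it by the complex model; averaging lets the more confident model decide, which is one way the two kinds of model could complement each other.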