How (not) to find bugs: The interplay between merge conflicts, co-changes, and bugs
Context: In a seminal work, Ball et al. [1] investigate if the information available in version control systems could be used to predict defect density, arguing that practitioners and researchers could better understand errors "if [our] version control system could talk". In the meanwhile,...
Saved in:
Main Authors: | , , , , , , , , |
---|---|
Format: | text |
Language: | English |
Published: |
Institutional Knowledge at Singapore Management University
2020
|
Subjects: | |
Online Access: | https://ink.library.smu.edu.sg/sis_research/5625 https://ink.library.smu.edu.sg/context/sis_research/article/6628/viewcontent/HowNotTofindBugs_2020_pv.pdf |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Institution: | Singapore Management University |
Language: | English |
Summary: | Context: In a seminal work, Ball et al. [1] investigate if the information available in version control systems could be used to predict defect density, arguing that practitioners and researchers could better understand errors "if [our] version control system could talk". In the meanwhile, several research works have reported that conflict merge resolution is a time consuming and error-prone task, while other contributions diverge about the correlation between co-change dependencies and defect density. Problem: The correlation between conflicting merge scenarios and bugs has not been addressed before, whilst the correlation between co-change dependencies and bug density has been only investigated using a small number of case studies-which can compromise the generalization of the results. Goal: To address this gap in the literature, this paper presents the results of a comprehensive study whose goal is to understand whether or not (a) conflicting merge scenarios and (b) co-change dependencies are good predictors for bug density. Method: We first build a curated dataset comprising the source code history of 29 popular Java Apache projects and leverage the SZZ algorithm to collect the sets of bug-fixing and bug-introducing commits. We then combine the SZZ results with the set of past conflicting merge scenarios and co-change dependencies of the projects. Finally, we use exploratory data analysis and machine learning models to understand the strength of the correlation between conflict resolution and co-change dependencies with defect density. Findings: (a) conflicting merge scenarios are not more prone to introduce bugs than regular commits, (b) there is a negligible to a small correlation between co-change dependencies and defect density-contradicting previous studies in the literature. |
---|