Early prediction of merged code changes to prioritize reviewing tasks

Modern Code Review (MCR) is widely used by open source and proprietary software projects. Inspecting code changes costs reviewers considerable time and effort, since they need to comprehend patches and are often assigned many code changes to review. Note that a code change might be e...

Bibliographic Details
Main Authors: FAN, Yuanrui, XIA, Xin, LO, David, LI, Shanping
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2018
Subjects: Code review; Predictive model; Features; Computer and Systems Architecture; Software Engineering
Online Access: https://ink.library.smu.edu.sg/sis_research/3989
https://ink.library.smu.edu.sg/context/sis_research/article/4991/viewcontent/Early_prediction_of_merged_code_changes_to_prioritize_reviewing_tasks_afv.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-4991
record_format dspace
spelling sg-smu-ink.sis_research-4991 2020-01-15T05:45:19Z
2018-12-01T08:00:00Z text application/pdf info:doi/10.1007/s10664-018-9602-0 http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Code review
Predictive model
Features
Computer and Systems Architecture
Software Engineering
description Modern Code Review (MCR) is widely used by open source and proprietary software projects. Inspecting code changes costs reviewers considerable time and effort, since they need to comprehend patches and are often assigned many code changes to review. A code change might eventually be abandoned, which wastes that time and effort. Thus, a tool that predicts early on whether a code change will be merged can help developers prioritize the changes they inspect, accomplish more on a tight schedule, and avoid wasting reviewing effort on low-quality changes. In this paper, motivated by these needs, we build a merged code change prediction tool. Our approach first extracts 34 features from code changes, grouped into 5 dimensions: code, file history, owner experience, collaboration network, and text. We then leverage machine learning techniques such as random forest to build a prediction model. To evaluate the performance of our approach, we conduct experiments on three open source projects (Eclipse, LibreOffice, and OpenStack), containing a total of 166,215 code changes. Across the three datasets, our approach achieves statistically significant improvements over random guess classifiers and over two prediction models proposed by Jeong et al. (2009) and Gousios et al. (2014) in terms of several evaluation metrics. We also study the important features that distinguish merged code changes from abandoned ones.
format text
author FAN, Yuanrui
XIA, Xin
LO, David
LI, Shanping
author_sort FAN, Yuanrui
title Early prediction of merged code changes to prioritize reviewing tasks
title_sort early prediction of merged code changes to prioritize reviewing tasks
publisher Institutional Knowledge at Singapore Management University
publishDate 2018
url https://ink.library.smu.edu.sg/sis_research/3989
https://ink.library.smu.edu.sg/context/sis_research/article/4991/viewcontent/Early_prediction_of_merged_code_changes_to_prioritize_reviewing_tasks_afv.pdf
_version_ 1770574112626835456