Cross-Project Build Co-change Prediction

Build systems orchestrate how human-readable source code is translated into executable programs. In a software project, source code changes can induce changes in the build system (aka. build co-changes). It is difficult for developers to identify when build co-changes are necessary due to the comple...

Full description

Saved in:
Bibliographic Details
Main Authors: XIA, Xin, David LO, MCINTOSH, Shane, SHIHAB, Emad, HASSAN, Ahmed
Format: text
Language:English
Published: Institutional Knowledge at Singapore Management University 2015
Subjects:
Online Access:https://ink.library.smu.edu.sg/sis_research/3078
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Singapore Management University
Language: English
id sg-smu-ink.sis_research-4078
record_format dspace
spelling sg-smu-ink.sis_research-40782016-02-05T06:30:05Z Cross-Project Build Co-change Prediction XIA, Xin David LO, MCINTOSH, Shane SHIHAB, Emad HASSAN, Ahmed Build systems orchestrate how human-readable source code is translated into executable programs. In a software project, source code changes can induce changes in the build system (aka. build co-changes). It is difficult for developers to identify when build co-changes are necessary due to the complexity of build systems. Prediction of build co-changes works well if there is a sufficient amount of training data to build a model. However, in practice, for new projects, there exists a limited number of changes. Using training data from other projects to predict the build co-changes in a new project can help improve the performance of the build co-change prediction. We refer to this problem as cross-project build co-change prediction. In this paper, we propose CroBuild, a novel cross-project build co-change prediction approach that iteratively learns new classifiers. CroBuild constructs an ensemble of classifiers by iteratively building classifiers and assigning them weights according to its prediction error rate. Given that only a small proportion of code changes are build co-changing, we also propose an imbalance-aware approach that learns a threshold boundary between those code changes that are build co-changing and those that are not in order to construct classifiers in each iteration. To examine the benefits of CroBuild, we perform experiments on 4 large datasets including Mozilla, Eclipse-core, Lucene, and Jazz, comprising a total of 50,884 changes. On average, across the 4 datasets, CroBuild achieves a F1-score of up to 0.408. We also compare CroBuild with other approaches such as a basic model, AdaBoost proposed by Freund et al., and TrAdaBoost proposed by Dai et al.. On average, across the 4 datasets, the CroBuild approach yields an improvement in F1-scores of 41.54%, 36.63%, and 36.97% over the basic model, AdaBoost, and TrAdaBoost, respectively. 2015-03-06T08:00:00Z text https://ink.library.smu.edu.sg/sis_research/3078 info:doi/10.1109/SANER.2015.7081841 Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Software Engineering
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Software Engineering
spellingShingle Software Engineering
XIA, Xin
David LO,
MCINTOSH, Shane
SHIHAB, Emad
HASSAN, Ahmed
Cross-Project Build Co-change Prediction
description Build systems orchestrate how human-readable source code is translated into executable programs. In a software project, source code changes can induce changes in the build system (aka. build co-changes). It is difficult for developers to identify when build co-changes are necessary due to the complexity of build systems. Prediction of build co-changes works well if there is a sufficient amount of training data to build a model. However, in practice, for new projects, there exists a limited number of changes. Using training data from other projects to predict the build co-changes in a new project can help improve the performance of the build co-change prediction. We refer to this problem as cross-project build co-change prediction. In this paper, we propose CroBuild, a novel cross-project build co-change prediction approach that iteratively learns new classifiers. CroBuild constructs an ensemble of classifiers by iteratively building classifiers and assigning them weights according to its prediction error rate. Given that only a small proportion of code changes are build co-changing, we also propose an imbalance-aware approach that learns a threshold boundary between those code changes that are build co-changing and those that are not in order to construct classifiers in each iteration. To examine the benefits of CroBuild, we perform experiments on 4 large datasets including Mozilla, Eclipse-core, Lucene, and Jazz, comprising a total of 50,884 changes. On average, across the 4 datasets, CroBuild achieves a F1-score of up to 0.408. We also compare CroBuild with other approaches such as a basic model, AdaBoost proposed by Freund et al., and TrAdaBoost proposed by Dai et al.. On average, across the 4 datasets, the CroBuild approach yields an improvement in F1-scores of 41.54%, 36.63%, and 36.97% over the basic model, AdaBoost, and TrAdaBoost, respectively.
format text
author XIA, Xin
David LO,
MCINTOSH, Shane
SHIHAB, Emad
HASSAN, Ahmed
author_facet XIA, Xin
David LO,
MCINTOSH, Shane
SHIHAB, Emad
HASSAN, Ahmed
author_sort XIA, Xin
title Cross-Project Build Co-change Prediction
title_short Cross-Project Build Co-change Prediction
title_full Cross-Project Build Co-change Prediction
title_fullStr Cross-Project Build Co-change Prediction
title_full_unstemmed Cross-Project Build Co-change Prediction
title_sort cross-project build co-change prediction
publisher Institutional Knowledge at Singapore Management University
publishDate 2015
url https://ink.library.smu.edu.sg/sis_research/3078
_version_ 1770572802506620928