Collective personalized change classification with multiobjective search
Many change classification techniques have been proposed to identify defect-prone changes. These techniques consider all developers' historical change data to build a global prediction model. In practice, since developers have their own coding preferences and behavioral patterns, which causes d...
Main Authors: | XIA, Xin; LO, David; WANG, Xinyu; YANG, Xiaohu |
---|---|
Format: | text |
Language: | English |
Published: | Institutional Knowledge at Singapore Management University, 2016 |
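Subjects: | Cost effectiveness; developer; machine learning; multiobjective genetic algorithm; personalized change classification (PCC); Computer Sciences; Software Engineering |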
Online Access: | https://ink.library.smu.edu.sg/sis_research/3610 https://ink.library.smu.edu.sg/context/sis_research/article/4611/viewcontent/CollectivePersonalizedChangeClassMultiobjectiveSearch_2016.pdf |
Institution: | Singapore Management University |
id |
sg-smu-ink.sis_research-4611 |
---|---|
record_format |
dspace |
spelling |
sg-smu-ink.sis_research-4611 2020-01-13T09:10:24Z Collective personalized change classification with multiobjective search XIA, Xin; LO, David; WANG, Xinyu; YANG, Xiaohu Many change classification techniques have been proposed to identify defect-prone changes. These techniques consider all developers' historical change data to build a global prediction model. In practice, because developers have their own coding preferences and behavioral patterns, which lead to different defect patterns, a separate change classification model for each developer can help to improve performance. Jiang, Tan, and Kim refer to this problem as personalized change classification, and they propose PCC+ to solve it. A software project has a number of developers; for a given developer, building a prediction model based not only on his/her own change data but also on other relevant developers' change data can further improve the performance of change classification. In this paper, we propose a more accurate technique named collective personalized change classification (CPCC), which leverages a multiobjective genetic algorithm. For a project, CPCC first builds a personalized prediction model for each developer based on his/her historical data. Next, for each developer, CPCC combines these models by assigning different weights to them with the purpose of maximizing two objective functions (i.e., F1-score and cost effectiveness). To further improve the prediction accuracy, we propose CPCC+ by combining CPCC with PCC proposed by Jiang, Tan, and Kim. To evaluate the benefits of CPCC+ and CPCC, we perform experiments on six large software projects from different communities: Eclipse JDT, Jackrabbit, Linux kernel, Lucene, PostgreSQL, and Xorg. The experimental results show that CPCC+ can discover up to 245 more bugs than PCC+ (468 versus 223 for PostgreSQL) if developers inspect the top 20% of lines of code that are predicted buggy. In addition, CPCC+ can achieve F1-scores of 0.60-0.75, which are statistically significantly higher than those of PCC+ on all six projects. 2016-12-01T08:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/3610 info:doi/10.1109/TR.2016.2588139 https://ink.library.smu.edu.sg/context/sis_research/article/4611/viewcontent/CollectivePersonalizedChangeClassMultiobjectiveSearch_2016.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Cost effectiveness; developer; machine learning; multiobjective genetic algorithm; personalized change classification (PCC); Computer Sciences; Software Engineering |
institution |
Singapore Management University |
building |
SMU Libraries |
continent |
Asia |
country |
Singapore |
content_provider |
SMU Libraries |
collection |
InK@SMU |
language |
English |
topic |
Cost effectiveness; developer; machine learning; multiobjective genetic algorithm; personalized change classification (PCC); Computer Sciences; Software Engineering |
spellingShingle |
Cost effectiveness; developer; machine learning; multiobjective genetic algorithm; personalized change classification (PCC); Computer Sciences; Software Engineering XIA, Xin; LO, David; WANG, Xinyu; YANG, Xiaohu Collective personalized change classification with multiobjective search |
description |
Many change classification techniques have been proposed to identify defect-prone changes. These techniques consider all developers' historical change data to build a global prediction model. In practice, because developers have their own coding preferences and behavioral patterns, which lead to different defect patterns, a separate change classification model for each developer can help to improve performance. Jiang, Tan, and Kim refer to this problem as personalized change classification, and they propose PCC+ to solve it. A software project has a number of developers; for a given developer, building a prediction model based not only on his/her own change data but also on other relevant developers' change data can further improve the performance of change classification. In this paper, we propose a more accurate technique named collective personalized change classification (CPCC), which leverages a multiobjective genetic algorithm. For a project, CPCC first builds a personalized prediction model for each developer based on his/her historical data. Next, for each developer, CPCC combines these models by assigning different weights to them with the purpose of maximizing two objective functions (i.e., F1-score and cost effectiveness). To further improve the prediction accuracy, we propose CPCC+ by combining CPCC with PCC proposed by Jiang, Tan, and Kim. To evaluate the benefits of CPCC+ and CPCC, we perform experiments on six large software projects from different communities: Eclipse JDT, Jackrabbit, Linux kernel, Lucene, PostgreSQL, and Xorg. The experimental results show that CPCC+ can discover up to 245 more bugs than PCC+ (468 versus 223 for PostgreSQL) if developers inspect the top 20% of lines of code that are predicted buggy. In addition, CPCC+ can achieve F1-scores of 0.60-0.75, which are statistically significantly higher than those of PCC+ on all six projects. |
format |
text |
author |
XIA, Xin; LO, David; WANG, Xinyu; YANG, Xiaohu |
author_facet |
XIA, Xin; LO, David; WANG, Xinyu; YANG, Xiaohu |
author_sort |
XIA, Xin |
title |
Collective personalized change classification with multiobjective search |
title_short |
Collective personalized change classification with multiobjective search |
title_full |
Collective personalized change classification with multiobjective search |
title_fullStr |
Collective personalized change classification with multiobjective search |
title_full_unstemmed |
Collective personalized change classification with multiobjective search |
title_sort |
collective personalized change classification with multiobjective search |
publisher |
Institutional Knowledge at Singapore Management University |
publishDate |
2016 |
url |
https://ink.library.smu.edu.sg/sis_research/3610 https://ink.library.smu.edu.sg/context/sis_research/article/4611/viewcontent/CollectivePersonalizedChangeClassMultiobjectiveSearch_2016.pdf |
_version_ |
1770573346114633728 |
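
The description above says that CPCC weights the per-developer models and searches for weights that maximize two objectives, F1-score and cost effectiveness. The following is a minimal sketch of how one candidate weight vector could be scored, not the authors' implementation. Everything in it is an assumption made for illustration: the function names, the weighted-average combination rule, the 0.5 decision threshold, the reading of cost effectiveness as "buggy changes reached within a 20% lines-of-code budget", and the toy data. The genetic search itself is omitted; only the objective evaluation and a Pareto-dominance check are shown.

```python
from typing import List, Sequence


def combine_scores(model_scores: Sequence[Sequence[float]],
                   weights: Sequence[float]) -> List[float]:
    """Weighted average of the per-developer model scores for each change."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_changes = len(model_scores[0])
    return [
        sum(w * scores[i] for w, scores in zip(weights, model_scores)) / total
        for i in range(n_changes)
    ]


def f1_score(scores: Sequence[float], labels: Sequence[int],
             threshold: float = 0.5) -> float:
    """F1-score of the 'buggy' class; the threshold is an assumed cut-off."""
    predicted = [s >= threshold for s in scores]
    tp = sum(1 for p, y in zip(predicted, labels) if p and y)
    fp = sum(1 for p, y in zip(predicted, labels) if p and not y)
    fn = sum(1 for p, y in zip(predicted, labels) if not p and y)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def bugs_found_in_budget(scores: Sequence[float], labels: Sequence[int],
                         loc: Sequence[int], budget: float = 0.2) -> int:
    """Buggy changes reached when inspecting changes in descending score
    order until `budget` of the total lines of code has been spent."""
    limit = budget * sum(loc)
    spent = 0
    found = 0
    for i in sorted(range(len(scores)), key=lambda k: scores[k], reverse=True):
        if spent + loc[i] > limit:
            break
        spent += loc[i]
        found += labels[i]
    return found


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if objective vector `a` Pareto-dominates `b` (maximization)."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


# Toy data (hypothetical): scores from two developers' models on four changes.
MODEL_SCORES = [[0.9, 0.3, 0.6, 0.1], [0.6, 0.9, 0.2, 0.3]]
LABELS = [1, 1, 0, 0]      # 1 = buggy change
LOC = [10, 5, 40, 45]      # lines of code touched by each change


def evaluate(weights: Sequence[float]) -> tuple:
    """Both objective values for one candidate weight vector."""
    combined = combine_scores(MODEL_SCORES, weights)
    return (f1_score(combined, LABELS),
            bugs_found_in_budget(combined, LABELS, LOC))


if __name__ == "__main__":
    own_model_only = evaluate([1.0, 0.0])
    both_models = evaluate([1.0, 1.0])
    print(own_model_only, both_models, dominates(both_models, own_model_only))
```

A multiobjective genetic algorithm would generate many such weight vectors, evaluate each one this way, and keep the candidates that no other candidate dominates.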