Collective personalized change classification with multiobjective search

Many change classification techniques have been proposed to identify defect-prone changes. These techniques consider all developers' historical change data to build a global prediction model. In practice, since developers have their own coding preferences and behavioral patterns, which causes d...

Bibliographic Details
Main Authors: XIA, Xin; LO, David; WANG, Xinyu; YANG, Xiaohu
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2016
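Subjects: Cost effectiveness; developer; machine learning; multiobjective genetic algorithm; personalized change classification (PCC); Computer Sciences; Software Engineering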
Online Access: https://ink.library.smu.edu.sg/sis_research/3610
https://ink.library.smu.edu.sg/context/sis_research/article/4611/viewcontent/CollectivePersonalizedChangeClassMultiobjectiveSearch_2016.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-4611
record_format dspace
spelling sg-smu-ink.sis_research-4611
2020-01-13T09:10:24Z
2016-12-01T08:00:00Z
text
application/pdf
info:doi/10.1109/TR.2016.2588139
http://creativecommons.org/licenses/by-nc-nd/4.0/
Research Collection School Of Computing and Information Systems
eng
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Cost effectiveness
developer
machine learning
multiobjective genetic algorithm
personalized change classification (PCC)
Computer Sciences
Software Engineering
description Many change classification techniques have been proposed to identify defect-prone changes. These techniques consider all developers' historical change data to build a global prediction model. In practice, since developers have their own coding preferences and behavioral patterns, which cause different defect patterns, a separate change classification model for each developer can help to improve performance. Jiang, Tan, and Kim refer to this problem as personalized change classification, and they propose PCC+ to solve it. A software project has a number of developers; for a given developer, building a prediction model based not only on his/her own change data but also on other relevant developers' change data can further improve the performance of change classification. In this paper, we propose a more accurate technique named collective personalized change classification (CPCC), which leverages a multiobjective genetic algorithm. For a project, CPCC first builds a personalized prediction model for each developer based on his/her historical data. Next, for each developer, CPCC combines these models by assigning different weights to them with the purpose of maximizing two objective functions (i.e., F1-score and cost effectiveness). To further improve the prediction accuracy, we propose CPCC+ by combining CPCC with PCC, proposed by Jiang, Tan, and Kim. To evaluate the benefits of CPCC+ and CPCC, we perform experiments on six large software projects from different communities: Eclipse JDT, Jackrabbit, Linux kernel, Lucene, PostgreSQL, and Xorg. The experimental results show that CPCC+ can discover up to 245 more bugs than PCC+ (468 versus 223 for PostgreSQL) if developers inspect the top 20% of the lines of code that are predicted buggy. In addition, CPCC+ can achieve F1-scores of 0.60-0.75, which are statistically significantly higher than those of PCC+ on all six projects.
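Illustrative note (not part of the original record): the description says CPCC weights the per-developer models and scores each weighting on two objectives. The Python sketch below shows one way such a candidate weight vector might be evaluated. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the 0.5 decision threshold, the weighted-average combination rule, and the exact cost-effectiveness definition (share of buggy changes reached within a 20% lines-of-code budget) are all assumptions made here. The genetic search itself is not shown.

```python
from typing import List, Sequence, Tuple


def combine_scores(weights: Sequence[float], model_scores: Sequence[float]) -> float:
    """Weighted average of the scores that each developer's model gives one change."""
    total = sum(weights)
    if total == 0:
        return 0.0
    return sum(w * s for w, s in zip(weights, model_scores)) / total


def f1_score(predicted: Sequence[int], actual: Sequence[int]) -> float:
    """Harmonic mean of precision and recall for the buggy class (label 1)."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def cost_effectiveness(scores: Sequence[float], loc: Sequence[int],
                       actual: Sequence[int], budget: float = 0.20) -> float:
    """Assumed definition: fraction of buggy changes reached when changes are
    inspected in descending score order until the LOC budget would be exceeded."""
    limit = budget * sum(loc)
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    inspected, found = 0, 0
    for i in order:
        if inspected + loc[i] > limit:
            break  # stop at the first change that does not fit the budget
        inspected += loc[i]
        found += actual[i]
    total_bugs = sum(actual)
    return found / total_bugs if total_bugs else 0.0


def evaluate(weights: Sequence[float], per_change_scores: Sequence[Sequence[float]],
             loc: Sequence[int], actual: Sequence[int],
             threshold: float = 0.5) -> Tuple[float, float]:
    """Return (F1-score, cost effectiveness) for one candidate weight vector."""
    combined = [combine_scores(weights, row) for row in per_change_scores]
    predicted = [int(s >= threshold) for s in combined]
    return f1_score(predicted, actual), cost_effectiveness(combined, loc, actual)


def dominates(a: Tuple[float, float], b: Tuple[float, float]) -> bool:
    """Pareto dominance when both objectives are maximized."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points: List[Tuple[float, float]]) -> List[Tuple[float, float]]:
    """Keep only the objective pairs that no other pair dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    # Hypothetical toy data: 4 changes scored by 2 per-developer models.
    scores = [[0.9, 0.2], [0.8, 0.4], [0.1, 0.7], [0.3, 0.1]]
    loc = [10, 40, 10, 40]
    actual = [1, 0, 1, 0]
    candidates = [(1, 0), (1, 1), (0, 1)]
    results = [evaluate(w, scores, loc, actual) for w in candidates]
    print(pareto_front(results))
```

A multiobjective genetic algorithm would generate and mutate many weight vectors and retain those on the Pareto front; the fixed candidate list above only stands in for that search.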
format text
author XIA, Xin
LO, David
WANG, Xinyu
YANG, Xiaohu
title Collective personalized change classification with multiobjective search
publisher Institutional Knowledge at Singapore Management University
publishDate 2016