Assessing code clone harmfulness: Indicators, factors, and counter measures
Code clones are identical or similar code in software projects. On one hand, developers clone code to achieve higher productivity and thus clones inherently exist; on the other hand, code clones demand extra effort to maintain the consistency between clone instances and may introduce bugs, and thus...
Main Authors: | HU, Bin; WU, Yijian; PENG, Xin; SUN, Jun; ZHAN, Nanjie; WU, Jun |
---|---|
Format: | text |
Language: | English |
Published: |
Institutional Knowledge at Singapore Management University
2021
|
Subjects: | |
Online Access: | https://ink.library.smu.edu.sg/sis_research/6193 https://doi.org/10.1109/SANER50967.2021.00029 |
Institution: | Singapore Management University |
id |
sg-smu-ink.sis_research-7196 |
---|---|
record_format |
dspace |
spelling |
sg-smu-ink.sis_research-7196 2022-01-21T05:18:43Z 2021-03-01T08:00:00Z text https://ink.library.smu.edu.sg/sis_research/6193 info:doi/10.1109/SANER50967.2021.00029 https://doi.org/10.1109/SANER50967.2021.00029 Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University |
institution |
Singapore Management University |
building |
SMU Libraries |
continent |
Asia |
country |
Singapore |
content_provider |
SMU Libraries |
collection |
InK@SMU |
language |
English |
topic |
code clone harmfulness clone analysis clone evolution consistent changes Software Engineering |
description |
Code clones are identical or similar code in software projects. On the one hand, developers clone code to achieve higher productivity, so clones inherently exist; on the other hand, code clones demand extra effort to keep clone instances consistent and may introduce bugs, and are therefore often considered harmful to software maintenance and quality. We believe that not all code clones have the same level of harmfulness. A systematic way of assessing the harmfulness level of cloned code would facilitate informed decisions on how to deal with clones. We propose a model for clone harmfulness level assessment with four quantitative indicators that can be extracted from the evolution history of the clones. Specifically, we gather information, such as code clone changes and bug-fixes related to clone divergence and re-synchronization, to find objective evidence that a clone harms software quality or brings potential risks even if no bugs are found. The assessment model consists of four harmfulness levels of clones determined by the four indicators. We also derive three harmfulness factors from the intrinsic properties of clones that potentially affect their harmfulness. We conduct a large-scale empirical study with five open-source and three industry systems and find that 61.0-84.7% of the clones are not harmful in terms of consistent maintenance overhead. We find evidence in the evolution history that several factors, such as spread of clone instances, number of clone instances, and number of developers, have non-trivial correlation with clone harmfulness levels. We also propose six counter measures for clone harmfulness mitigation based on the observation of the harmfulness factors, and have collected useful feedback from industrial software architects and senior developers through an interview meeting. |
format |
text |
author |
HU, Bin WU, Yijian PENG, Xin SUN, Jun ZHAN, Nanjie WU, Jun |
title |
Assessing code clone harmfulness: Indicators, factors, and counter measures |