Assessing code clone harmfulness: Indicators, factors, and counter measures

Code clones are identical or similar code in software projects. On one hand, developers clone code to achieve higher productivity and thus clones inherently exist; on the other hand, code clones demand extra effort to maintain the consistency between clone instances and may introduce bugs, and thus...

Bibliographic Details
Main Authors: HU, Bin, WU, Yijian, PENG, Xin, SUN, Jun, ZHAN, Nanjie, WU, Jun
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2021
Subjects: code clone harmfulness; clone analysis; clone evolution; consistent changes; Software Engineering
Online Access: https://ink.library.smu.edu.sg/sis_research/6193
https://doi.org/10.1109/SANER50967.2021.00029
Institution: Singapore Management University
id sg-smu-ink.sis_research-7196
record_format dspace
spelling sg-smu-ink.sis_research-7196 2022-01-21T05:18:43Z 2021-03-01T08:00:00Z text info:doi/10.1109/SANER50967.2021.00029 Research Collection School Of Computing and Information Systems eng
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic code clone harmfulness
clone analysis
clone evolution
consistent changes
Software Engineering
description Code clones are identical or similar code in software projects. On one hand, developers clone code to achieve higher productivity, and thus clones inherently exist; on the other hand, code clones demand extra effort to keep clone instances consistent and may introduce bugs, and thus are often considered harmful for software maintenance and quality. We believe that not all code clones have the same level of harmfulness. A systematic way of assessing the harmfulness level of cloned code would facilitate informed decisions on how to deal with clones. We propose a model for clone harmfulness level assessment with four quantitative indicators that can be extracted from the evolution history of the clones. Specifically, we gather information, such as code clone changes and bug fixes related to clone divergence and re-synchronization, to find objective evidence that a clone harms software quality or brings potential risks even if no bugs are found. The assessment model consists of four harmfulness levels of clones determined by the four indicators. We also derive three harmfulness factors from the intrinsic properties of clones that potentially affect the harmfulness of clones. We conduct a large-scale empirical study with five open-source and three industry systems and find that 61.0% to 84.7% of the clones are not harmful in terms of consistent maintenance overhead. We find evidence in the evolution history that several factors, such as spread of clone instances, number of clone instances, and number of developers, have a non-trivial correlation with clone harmfulness levels. We also propose six counter measures for clone harmfulness mitigation based on the observation of the harmfulness factors, and have collected useful feedback from industrial software architects and senior developers through an interview meeting.
format text
author HU, Bin
WU, Yijian
PENG, Xin
SUN, Jun
ZHAN, Nanjie
WU, Jun
title Assessing code clone harmfulness: Indicators, factors, and counter measures
publisher Institutional Knowledge at Singapore Management University
publishDate 2021
url https://ink.library.smu.edu.sg/sis_research/6193
https://doi.org/10.1109/SANER50967.2021.00029
_version_ 1770575845519261696