Automatic defect categorization

Defects are prevalent in software systems. In order to understand defects better, industry practitioners often categorize bugs into various types. One common kind of categorization is IBM’s Orthogonal Defect Classification (ODC). ODC proposes various orthogonal classifications of defects based on...

Bibliographic Details
Main Authors: THUNG, Ferdian, LO, David, JIANG, Lingxiao
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2012
Subjects: Software Engineering
Online Access: https://ink.library.smu.edu.sg/sis_research/1681
https://ink.library.smu.edu.sg/context/sis_research/article/2680/viewcontent/wcre12defects.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-2680
record_format dspace
spelling sg-smu-ink.sis_research-2680 2017-02-05T03:35:20Z Automatic defect categorization THUNG, Ferdian LO, David JIANG, Lingxiao
2012-10-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/1681 info:doi/10.1109/WCRE.2012.30 https://ink.library.smu.edu.sg/context/sis_research/article/2680/viewcontent/wcre12defects.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Software Engineering
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Software Engineering
spellingShingle Software Engineering
THUNG, Ferdian
LO, David
JIANG, Lingxiao
Automatic defect categorization
description Defects are prevalent in software systems. To understand defects better, industry practitioners often categorize bugs into various types. One common categorization is IBM’s Orthogonal Defect Classification (ODC). ODC proposes various orthogonal classifications of defects based on a wide range of information about them, such as their symptoms and semantics, their root cause analysis, and more. With these category labels, developers can better perform post-mortem analysis to find the common characteristics of the defects that plague a particular software project. Despite the benefits of these categories, the labels are often missing for many software systems. To address this problem, we propose a text mining solution that categorizes defects into various types by analyzing both text from bug reports and code features from bug fixes. To this end, we manually analyzed data on 500 defects from three software systems and classified them according to ODC. In addition, we propose a classification-based approach that automatically classifies defects into three supercategories composed of ODC categories: control and data flow, structural, and non-functional. Our empirical evaluation shows that the automatic classification approach labels defects with an average accuracy of 77.8% using the SVM multiclass classification algorithm.
format text
author THUNG, Ferdian
LO, David
JIANG, Lingxiao
author_facet THUNG, Ferdian
LO, David
JIANG, Lingxiao
author_sort THUNG, Ferdian
title Automatic defect categorization
title_short Automatic defect categorization
title_full Automatic defect categorization
title_fullStr Automatic defect categorization
title_full_unstemmed Automatic defect categorization
title_sort automatic defect categorization
publisher Institutional Knowledge at Singapore Management University
publishDate 2012
url https://ink.library.smu.edu.sg/sis_research/1681
https://ink.library.smu.edu.sg/context/sis_research/article/2680/viewcontent/wcre12defects.pdf
_version_ 1770571454441586688
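
The description in this record outlines classifying defects into three supercategories with a multiclass SVM. The following is a minimal sketch of that general idea, not the authors' implementation. It assumes bag-of-words features from report text only (the code features from bug fixes are omitted), a one-vs-rest linear SVM without a bias term trained by subgradient descent on the hinge loss, and a tiny hypothetical training set. All function names, hyperparameters, and example reports are illustrative assumptions.

```python
from collections import Counter

# Hypothetical toy bug reports. The labels are the three supercategories
# named in the description; none of this data comes from the paper.
TRAIN_DOCS = [
    "null pointer dereference inside loop",
    "uninitialized variable read before assignment branch",
    "missing method on interface class",
    "wrong inheritance hierarchy between class modules",
    "slow response under heavy load",
    "typo within documentation comment",
]
TRAIN_LABELS = [
    "control and data flow",
    "control and data flow",
    "structural",
    "structural",
    "non-functional",
    "non-functional",
]


def tokenize(text):
    """Lowercase and keep purely alphabetic whitespace-separated tokens."""
    return [tok for tok in text.lower().split() if tok.isalpha()]


def build_vocab(docs):
    """Map every training token to a feature index."""
    tokens = sorted({tok for doc in docs for tok in tokenize(doc)})
    return {tok: idx for idx, tok in enumerate(tokens)}


def vectorize(doc, vocab):
    """Sparse term-count vector; out-of-vocabulary tokens are ignored."""
    counts = Counter(tok for tok in tokenize(doc) if tok in vocab)
    return {vocab[tok]: float(n) for tok, n in counts.items()}


def dot(weights, features):
    return sum(weights[idx] * val for idx, val in features.items())


def train_binary_svm(vectors, targets, dim, epochs=50, lr=0.1, lam=0.01):
    """Linear SVM (no bias): L2-regularized hinge loss, subgradient descent."""
    weights = [0.0] * dim
    for _ in range(epochs):
        for features, target in zip(vectors, targets):
            margin = target * dot(weights, features)
            # Regularization step: shrink every weight slightly.
            weights = [w * (1.0 - lr * lam) for w in weights]
            # Hinge-loss step: only examples inside the margin contribute.
            if margin < 1.0:
                for idx, val in features.items():
                    weights[idx] += lr * target * val
    return weights


def train(docs, labels):
    """One-vs-rest: fit one binary SVM per category."""
    vocab = build_vocab(docs)
    vectors = [vectorize(doc, vocab) for doc in docs]
    model = {}
    for category in sorted(set(labels)):
        targets = [1.0 if label == category else -1.0 for label in labels]
        model[category] = train_binary_svm(vectors, targets, len(vocab))
    return model, vocab


def predict(model, vocab, doc):
    """Return the category whose binary SVM gives the highest score."""
    features = vectorize(doc, vocab)
    return max(model, key=lambda category: dot(model[category], features))


def accuracy(model, vocab, docs, labels):
    """Fraction of documents whose predicted category matches the label."""
    hits = sum(predict(model, vocab, d) == l for d, l in zip(docs, labels))
    return hits / len(docs)


if __name__ == "__main__":
    model, vocab = train(TRAIN_DOCS, TRAIN_LABELS)
    print(predict(model, vocab, "pointer dereference inside branch"))
```

In this toy setup each category uses its own vocabulary, so the classes separate trivially. Real bug reports share many words across categories, and a realistic evaluation would measure accuracy on held-out data rather than on the training set.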