Automatic Fine-Grained Issue Report Reclassification

Issue tracking systems are valuable resources during software maintenance activities. These systems contain different categories of issue reports such as bug, request for improvement (RFE), documentation, refactoring, task etc. While logging issue reports into a tracking system, reporters can indica...

Full description

Saved in:
Bibliographic Details
Main Authors: Kochhar, Pavneet Singh, Thung, Ferdian, LO, David
Format: text
Language:English
Published: Institutional Knowledge at Singapore Management University 2014
Subjects:
Online Access:https://ink.library.smu.edu.sg/sis_research/2439
https://ink.library.smu.edu.sg/context/sis_research/article/3439/viewcontent/iceccs15_reclassification.pdf
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Singapore Management University
Language: English
Description
Summary:Issue tracking systems are valuable resources during software maintenance activities. These systems contain different categories of issue reports such as bug, request for improvement (RFE), documentation, refactoring, task etc. While logging issue reports into a tracking system, reporters can indicate the category of the reports. Herzig et al. Recently reported that more than 40% of issue reports are given wrong categories in issue tracking systems. Among issue reports that are marked as bugs, more than 30% of them are not bug reports. The misclassification of issue reports can adversely affects developers as they then need to manually identify the categories of various issue reports. To address this problem, in this paper we propose an automated technique that reclassifies an issue report into an appropriate category. Our approach extracts various feature values from a bug report and predicts if a bug report needs to be reclassified and its reclassified category. We have evaluated our approach to reclassify more than 7,000 bug reports from HTTP Client, Jackrabbit, Lucene-Java, Rhino, and Tomcat 5 into 1 out of 13 categories. Our experiments show that we can achieve a weighted precision, recall, and F1 (F-measure) score in the ranges of 0.58-0.71, 0.61-0.72, and 0.57-0.71 respectively. In terms of F1, which is the harmonic mean of precision and recall, our approach can substantially outperform several baselines by 28.88%-416.66%.