It's not a bug, it's a feature: Does misclassification affect bug localization?

Bibliographic Details
Main Authors: KOCHHAR, Pavneet Singh; LE, Tien-Duy B.; LO, David
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2014
Online Access: https://ink.library.smu.edu.sg/sis_research/2606
https://ink.library.smu.edu.sg/context/sis_research/article/3606/viewcontent/msr14_misclassification.pdf
Institution: Singapore Management University
Description
Summary: Bug localization refers to the task of automatically processing bug reports to locate the source code files that are responsible for the bugs. Many bug localization techniques have been proposed in the literature. These techniques are often evaluated on issue reports that are marked as bugs by their reporters in issue tracking systems. However, Herzig et al. recently found that a substantial number of issue reports marked as bugs are not bugs but other kinds of issues, such as refactorings, requests for enhancement, documentation changes, and test case creation. Herzig et al. report that these misclassifications affect bug prediction, namely the task of predicting which files are likely to be buggy in the future. In this work, we investigate whether these misclassifications also affect bug localization. To do so, we analyze issue reports that have been manually categorized by Herzig et al. and apply a bug localization technique to recover a ranked list of candidate buggy files for each issue report. We then evaluate whether the quality of the ranked lists for issue reports marked as bugs is the same as that for real bug reports. Our findings indicate that additional cleaning steps need to be performed on issue reports before they are used to evaluate bug localization techniques.