Will this localization tool be effective for this bug? Mitigating the impact of unreliability of information retrieval based bug localization tools

Information retrieval (IR) based bug localization approaches process a textual bug report and a collection of source code files to find buggy files. They output a ranked list of files sorted by their likelihood to contain the bug. Recently, several IR-based bug localization tools have been proposed....

Bibliographic Details
Main Authors:	LE, Tien-Duy B., THUNG, Ferdian, LO, David
Format:	text
Language:	English
Published:	Institutional Knowledge at Singapore Management University 2017
Subjects:	Bug localization Bug reports Effectiveness prediction Information retrieval Text classification Computer Sciences Software Engineering
Online Access:	https://ink.library.smu.edu.sg/sis_research/3704 https://ink.library.smu.edu.sg/context/sis_research/article/4706/viewcontent/LocalizationToolBug_2017_afv.pdf
Institution:	Singapore Management University

id	sg-smu-ink.sis_research-4706
record_format	dspace
spelling	sg-smu-ink.sis_research-4706 2020-01-23T08:15:40Z 2017-08-01T07:00:00Z text application/pdf info:doi/10.1007/s10664-016-9484-y http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng
institution	Singapore Management University
building	SMU Libraries
continent	Asia
country	Singapore
content_provider	SMU Libraries
collection	InK@SMU
language	English
topic	Bug localization Bug reports Effectiveness prediction Information retrieval Text classification Computer Sciences Software Engineering
description	Information retrieval (IR) based bug localization approaches process a textual bug report and a collection of source code files to find buggy files. They output a ranked list of files sorted by their likelihood of containing the bug. Recently, several IR-based bug localization tools have been proposed. However, no tool can successfully localize faults within a small number of the most suspicious program elements for every input bug report. Therefore, it is difficult for developers to decide which tool would be effective for a given bug report. Furthermore, for some bug reports, no bug localization tool would be useful. Even a state-of-the-art bug localization tool outputs many ranked lists in which the buggy files appear very low. This can cause developers to distrust bug localization tools. In this work, we build an oracle that automatically predicts whether a ranked list produced by an IR-based bug localization tool is likely to be effective. We consider a ranked list to be effective if a buggy file appears in the top-N positions of the list. If a ranked list is unlikely to be effective, developers need not waste time checking the recommended files one by one. In such cases, it is better for developers to use traditional debugging methods or to request further information to localize bugs. To build this oracle, our approach extracts features that can be divided into four categories: score features, textual features, topic model features, and metadata features. We build a separate prediction model for each category and combine them to create a composite prediction model, which is used as the oracle. We name this solution APRILE, which stands for Automated PRediction of IR-based Bug Localization’s Effectiveness. We further integrate APRILE with two other components that are learned using our bagging-based ensemble classification (BEC) method. We refer to this extension of APRILE as APRILE+. We have evaluated APRILE+ on predicting the effectiveness of three state-of-the-art IR-based bug localization tools on more than three thousand bug reports from AspectJ, Eclipse, SWT, and Tomcat. APRILE+ achieves an average precision, recall, and F-measure of 77.61%, 88.94%, and 82.09%, respectively. Furthermore, APRILE+ outperforms a baseline approach by Le and Lo and APRILE by up to 17.43% and 10.51% in F-measure, respectively.
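The description's definition of an effective ranked list (a buggy file appears within the top-N positions) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `is_effective`, the file names, and the chosen values of N are all hypothetical and serve only to show how such a label might be computed for one bug report.

```python
from typing import Iterable, Sequence


def is_effective(ranked_files: Sequence[str], buggy_files: Iterable[str], n: int) -> bool:
    """Return True if any known buggy file is among the first n ranked files."""
    buggy = set(buggy_files)
    return any(path in buggy for path in ranked_files[:n])


# Hypothetical ranked list for one bug report (file names are made up).
ranked = ["ui/Dialog.java", "core/Parser.java", "core/Lexer.java", "util/Strings.java"]
buggy = {"core/Lexer.java"}

print(is_effective(ranked, buggy, n=1))  # buggy file is at rank 3, outside the top 1
print(is_effective(ranked, buggy, n=3))  # buggy file is inside the top 3
```

Labels of this kind would be the targets that an oracle such as the one described tries to predict before a developer inspects the list.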
format	text
author	LE, Tien-Duy B. THUNG, Ferdian LO, David
author_sort	LE, Tien-Duy B.
title	Will this localization tool be effective for this bug? Mitigating the impact of unreliability of information retrieval based bug localization tools
title_sort	will this localization tool be effective for this bug? mitigating the impact of unreliability of information retrieval based bug localization tools
publisher	Institutional Knowledge at Singapore Management University
publishDate	2017
url	https://ink.library.smu.edu.sg/sis_research/3704 https://ink.library.smu.edu.sg/context/sis_research/article/4706/viewcontent/LocalizationToolBug_2017_afv.pdf
_version_	1770573676553437184


Similar Items