A discriminative model approach for accurate duplicate bug report retrieval

Bug repositories are usually maintained in software projects. Testers or users submit bug reports to identify various issues with systems. Sometimes two or more bug reports correspond to the same defect. To address the problem with duplicate bug reports, a person called a triager needs to manually l...

Bibliographic Details
Main Authors:	SUN, Chengnian, LO, David, WANG, Xiaoyin, KHOO, Siau-Cheng
Format:	text
Language:	English
Published:	Institutional Knowledge at Singapore Management University 2010
Subjects:	Distribution Maintenance Enhancement Management Reliability Software Engineering
Online Access:	https://ink.library.smu.edu.sg/sis_research/3721 https://ink.library.smu.edu.sg/context/sis_research/article/4723/viewcontent/A_discriminative_model_approach_for_accurate_dupli.pdf
Institution:	Singapore Management University

id	sg-smu-ink.sis_research-4723
record_format	dspace
spelling	sg-smu-ink.sis_research-4723 2017-09-13T04:55:20Z
2010-05-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/3721 info:doi/10.1145/1806799.1806811 https://ink.library.smu.edu.sg/context/sis_research/article/4723/viewcontent/A_discriminative_model_approach_for_accurate_dupli.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Distribution Maintenance Enhancement Management Reliability Software Engineering
institution	Singapore Management University
building	SMU Libraries
continent	Asia
country	Singapore
content_provider	SMU Libraries
collection	InK@SMU
language	English
topic	Distribution Maintenance Enhancement Management Reliability Software Engineering
description	Bug repositories are usually maintained in software projects. Testers or users submit bug reports to identify various issues with systems. Sometimes two or more bug reports correspond to the same defect. To address the problem of duplicate bug reports, a person called a triager needs to manually label these bug reports as duplicates and link them to their "master" reports for subsequent maintenance work. However, in practice a considerable number of duplicate bug reports are submitted daily, and requesting triagers to manually label these bugs could be highly time-consuming. To address this issue, several techniques have recently been proposed that use various similarity-based metrics to detect candidate duplicate bug reports for manual verification. Automating triaging has proved challenging, as two reports of the same bug could be written in various ways. There is still much room for improvement in the accuracy of the duplicate detection process. In this paper, we leverage recent advances in using discriminative models for information retrieval to detect duplicate bug reports more accurately. We have validated our approach on three large software bug repositories from Firefox, Eclipse, and OpenOffice. We show that our technique could result in 17–31%, 22–26%, and 35–43% relative improvement over state-of-the-art techniques on the OpenOffice, Firefox, and Eclipse datasets, respectively, using commonly available natural language information only.
format	text
author	SUN, Chengnian LO, David WANG, Xiaoyin KHOO, Siau-Cheng
author_facet	SUN, Chengnian LO, David WANG, Xiaoyin KHOO, Siau-Cheng
author_sort	SUN, Chengnian
title	A discriminative model approach for accurate duplicate bug report retrieval
title_sort	discriminative model approach for accurate duplicate bug report retrieval
publisher	Institutional Knowledge at Singapore Management University
publishDate	2010
url	https://ink.library.smu.edu.sg/sis_research/3721 https://ink.library.smu.edu.sg/context/sis_research/article/4723/viewcontent/A_discriminative_model_approach_for_accurate_dupli.pdf
_version_	1770573701890179072

Similar Items