Does deep learning improve the performance of duplicate bug report detection? An empirical study

Do Deep Learning (DL) techniques actually help to improve the performance of duplicate bug report detection? Prior studies suggest that they do, if the duplicate bug report detection task is treated as a binary classification problem. However, in realistic scenarios, the task is often viewed as a ra...

Bibliographic Details
Main Authors:	JIANG, Yuan; SU, Xiaohong; TREUDE, Christoph; SHANG, Chao; WANG, Tiantian
Format:	text
Language:	English
Published:	Institutional Knowledge at Singapore Management University 2023
Subjects:	Duplicate bug report detection; Deep learning; Information retrieval; Similarity measure; Realistic evaluation; Software Engineering
Online Access:	https://ink.library.smu.edu.sg/sis_research/8785 https://ink.library.smu.edu.sg/context/sis_research/article/9788/viewcontent/yuanjiang23.pdf
Institution:	Singapore Management University

id	sg-smu-ink.sis_research-9788
record_format	dspace
spelling	sg-smu-ink.sis_research-9788 2024-05-30T08:57:07Z 2023-04-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/8785 info:doi/10.1016/j.jss.2023.111607 https://ink.library.smu.edu.sg/context/sis_research/article/9788/viewcontent/yuanjiang23.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University
institution	Singapore Management University
building	SMU Libraries
continent	Asia
country	Singapore
content_provider	SMU Libraries
collection	InK@SMU
language	English
topic	Duplicate bug report detection; Deep learning; Information retrieval; Similarity measure; Realistic evaluation; Software Engineering
description	Do Deep Learning (DL) techniques actually help to improve the performance of duplicate bug report detection? Prior studies suggest that they do, if the duplicate bug report detection task is treated as a binary classification problem. However, in realistic scenarios, the task is often viewed as a ranking problem, which predicts potential duplicate bug reports by ranking based on similarities with existing historical bug reports. There is little empirical evidence to support that DL can be effectively applied to detect duplicate bug reports in the ranking scenario. Therefore, in this paper, we investigate whether well-known DL-based methods outperform classic information retrieval (IR) based methods on the duplicate bug report detection task. In addition, we argue that both IR- and DL-based methods suffer from incompletely evaluating the similarity between bug reports, resulting in the loss of important information. To address this problem, we propose a new method that combines IR and DL techniques to compute textual similarity more comprehensively. Our experimental results show that the DL-based method itself does not yield high performance compared to IR-based methods. However, our proposed combined method improves on the MAP metric of classic IR-based methods by a median of 7.09%–11.34% and a maximum of 17.228%–28.97%.
format	text
author	JIANG, Yuan; SU, Xiaohong; TREUDE, Christoph; SHANG, Chao; WANG, Tiantian
author_sort	JIANG, Yuan
title	Does deep learning improve the performance of duplicate bug report detection? An empirical study
title_sort	does deep learning improve the performance of duplicate bug report detection? an empirical study
publisher	Institutional Knowledge at Singapore Management University
publishDate	2023
url	https://ink.library.smu.edu.sg/sis_research/8785 https://ink.library.smu.edu.sg/context/sis_research/article/9788/viewcontent/yuanjiang23.pdf
_version_	1814047529534226432
