Combining word embedding with information retrieval to recommend similar bug reports

Similar bugs are bugs that require handling of many common code files. Developers can often fix similar bugs in less time and with higher quality, since they can focus on fewer code files. Therefore, similar bug recommendation is a meaningful task that can improve development efficiency. Rocha e...

Bibliographic Details
Main Authors: YANG, Xinli, LO, David, XIA, Xin, BAO, Lingfeng, SUN, Jianling
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2016
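Subjects: Information Retrieval; Recommendation Systems; Similar Bugs; Word Embedding; Computer Sciences; Software Engineering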
Online Access: https://ink.library.smu.edu.sg/sis_research/3559
https://ink.library.smu.edu.sg/context/sis_research/article/4560/viewcontent/Combining_Word_Embedding_2016_av.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-4560
record_format dspace
spelling sg-smu-ink.sis_research-4560 2021-04-23T07:08:32Z Combining word embedding with information retrieval to recommend similar bug reports YANG, Xinli LO, David XIA, Xin BAO, Lingfeng SUN, Jianling Similar bugs are bugs that require handling of many common code files. Developers can often fix similar bugs in less time and with higher quality, since they can focus on fewer code files. Therefore, similar bug recommendation is a meaningful task that can improve development efficiency. Rocha et al. propose the first similar bug recommendation system, named NextBug. Although NextBug performs better than a state-of-the-art duplicate bug detection technique, REP, its performance is not optimal and thus more work is needed to improve its effectiveness. Technically, it is also rather simple, as it relies only upon a standard information retrieval technique, i.e., cosine similarity. In this paper, we propose a novel approach to recommend similar bugs. The approach combines a traditional information retrieval technique and a word embedding technique, and takes bug titles and descriptions as well as bug product and component information into consideration. To evaluate the approach, we use datasets from two popular open-source projects, i.e., Eclipse and Mozilla, each of which contains bug reports whose bug IDs range from 1 to 400000. The results show that our approach improves the performance of NextBug statistically significantly and substantially for both projects.
2016-10-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/3559 info:doi/10.1109/ISSRE.2016.33 https://ink.library.smu.edu.sg/context/sis_research/article/4560/viewcontent/Combining_Word_Embedding_2016_av.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Information Retrieval Recommendation Systems Similar Bugs Word Embedding Computer Sciences Software Engineering
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Information Retrieval
Recommendation Systems
Similar Bugs
Word Embedding
Computer Sciences
Software Engineering
spellingShingle Information Retrieval
Recommendation Systems
Similar Bugs
Word Embedding
Computer Sciences
Software Engineering
YANG, Xinli
LO, David
XIA, Xin
BAO, Lingfeng
SUN, Jianling
Combining word embedding with information retrieval to recommend similar bug reports
description Similar bugs are bugs that require handling of many common code files. Developers can often fix similar bugs in less time and with higher quality, since they can focus on fewer code files. Therefore, similar bug recommendation is a meaningful task that can improve development efficiency. Rocha et al. propose the first similar bug recommendation system, named NextBug. Although NextBug performs better than a state-of-the-art duplicate bug detection technique, REP, its performance is not optimal and thus more work is needed to improve its effectiveness. Technically, it is also rather simple, as it relies only upon a standard information retrieval technique, i.e., cosine similarity. In this paper, we propose a novel approach to recommend similar bugs. The approach combines a traditional information retrieval technique and a word embedding technique, and takes bug titles and descriptions as well as bug product and component information into consideration. To evaluate the approach, we use datasets from two popular open-source projects, i.e., Eclipse and Mozilla, each of which contains bug reports whose bug IDs range from 1 to 400000. The results show that our approach improves the performance of NextBug statistically significantly and substantially for both projects.
format text
author YANG, Xinli
LO, David
XIA, Xin
BAO, Lingfeng
SUN, Jianling
author_facet YANG, Xinli
LO, David
XIA, Xin
BAO, Lingfeng
SUN, Jianling
author_sort YANG, Xinli
title Combining word embedding with information retrieval to recommend similar bug reports
title_short Combining word embedding with information retrieval to recommend similar bug reports
title_full Combining word embedding with information retrieval to recommend similar bug reports
title_fullStr Combining word embedding with information retrieval to recommend similar bug reports
title_full_unstemmed Combining word embedding with information retrieval to recommend similar bug reports
title_sort combining word embedding with information retrieval to recommend similar bug reports
publisher Institutional Knowledge at Singapore Management University
publishDate 2016
url https://ink.library.smu.edu.sg/sis_research/3559
https://ink.library.smu.edu.sg/context/sis_research/article/4560/viewcontent/Combining_Word_Embedding_2016_av.pdf
_version_ 1770573328945250304
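
The description in this record names three signals: an information retrieval score over bug titles and descriptions, a word embedding score, and bug product and component information. The record does not say how the paper weights or combines them, so the following is only a minimal Python sketch under stated assumptions: the three scores are simply added, the product and component matches contribute 0.5 each, the embedding of a report is the mean of its word vectors, and the tiny `EMBEDDINGS` table, the `recommend` function, and the example bug reports are all hypothetical.

```python
import math
from collections import Counter

def tokenize(text):
    return text.lower().split()

def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def tfidf_vectors(docs):
    """One TF-IDF vector per tokenized document, using a smoothed IDF."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]

def mean_embedding(tokens, embeddings):
    """Average the vectors of known tokens; returns a dict index -> value."""
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        return {}
    dim = len(known[0])
    return {i: sum(vec[i] for vec in known) / len(known) for i in range(dim)}

def recommend(query, candidates, embeddings, top_k=3):
    """Rank candidates by IR score + embedding score + metadata score.

    The unweighted sum and the 0.5/0.5 metadata split are assumptions,
    not the combination used in the paper.
    """
    texts = [tokenize(b["title"] + " " + b["description"])
             for b in [query] + candidates]
    tfidf = tfidf_vectors(texts)
    q_emb = mean_embedding(texts[0], embeddings)
    scored = []
    for i, bug in enumerate(candidates, start=1):
        ir = cosine(tfidf[0], tfidf[i])
        emb = cosine(q_emb, mean_embedding(texts[i], embeddings))
        meta = (0.5 * (bug["product"] == query["product"])
                + 0.5 * (bug["component"] == query["component"]))
        scored.append((ir + emb + meta, bug["id"]))
    scored.sort(reverse=True)
    return scored[:top_k]

# Hypothetical two-dimensional word vectors, for illustration only.
EMBEDDINGS = {
    "crash": [0.9, 0.1], "freeze": [0.8, 0.2], "hang": [0.85, 0.15],
    "font": [0.1, 0.9], "color": [0.2, 0.8],
}

query = {"id": 1, "title": "Editor crash",
         "description": "crash when saving file",
         "product": "Platform", "component": "UI"}
candidates = [
    {"id": 2, "title": "Editor freeze",
     "description": "hang when saving large file",
     "product": "Platform", "component": "UI"},
    {"id": 3, "title": "Wrong font color",
     "description": "font color ignored in preferences",
     "product": "JDT", "component": "Text"},
]

ranked = recommend(query, candidates, EMBEDDINGS)
for score, bug_id in ranked:
    print(bug_id, round(score, 3))
```

In this toy data, bug 2 shares words, nearby word vectors, and both metadata fields with the query, so it ranks above bug 3, which matches on none of the words and only weakly on the embedding score. A real setup would swap the hand-written table for vectors trained on a bug report corpus.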