Automated query reformulation for efficient search based on query logs from Stack Overflow

As a popular Q&A site for programming, Stack Overflow is a treasure for developers. However, the number of questions and answers on Stack Overflow makes it difficult for developers to efficiently locate the information they are looking for. There are two gaps leading to poor search results: the g...

Bibliographic Details
Main Authors: CAO, Kaibo, CHEN, Chunyang, BALTES, Sebastian, TREUDE, Christoph, CHEN, Xiang
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2021
Subjects: Data Mining; Deep Learning; Query Logs; Query Reformulation; Stack Overflow; Software Engineering
Online Access: https://ink.library.smu.edu.sg/sis_research/8848
https://ink.library.smu.edu.sg/context/sis_research/article/9851/viewcontent/icse21a.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-9851
record_format dspace
spelling sg-smu-ink.sis_research-9851 2024-06-13T09:17:11Z Automated query reformulation for efficient search based on query logs from Stack Overflow CAO, Kaibo CHEN, Chunyang BALTES, Sebastian TREUDE, Christoph CHEN, Xiang As a popular Q&A site for programming, Stack Overflow is a treasure for developers. However, the number of questions and answers on Stack Overflow makes it difficult for developers to efficiently locate the information they are looking for. There are two gaps leading to poor search results: the gap between the user's intention and the textual query, and the semantic gap between the query and the post content. Therefore, developers have to constantly reformulate their queries by correcting misspelled words, adding limitations to certain programming languages or platforms, etc. As query reformulation is tedious for developers, especially for novices, we propose an automated software-specific query reformulation approach based on deep learning. With query logs provided by Stack Overflow, we construct a large-scale query reformulation corpus, including the original queries and corresponding reformulated ones. Our approach trains a Transformer model that can automatically generate candidate reformulated queries when given the user's original query. The evaluation results show that our approach outperforms five state-of-the-art baselines, and achieves a 5.6% to 33.5% boost in terms of ExactMatch and a 4.8% to 14.4% boost in terms of GLEU. 2021-05-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/8848 info:doi/10.1109/ICSE43902.2021.00116 https://ink.library.smu.edu.sg/context/sis_research/article/9851/viewcontent/icse21a.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Data Mining Deep Learning Query Logs Query Reformulation Stack Overflow Software Engineering
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Data Mining
Deep Learning
Query Logs
Query Reformulation
Stack Overflow
Software Engineering
description As a popular Q&A site for programming, Stack Overflow is a treasure for developers. However, the number of questions and answers on Stack Overflow makes it difficult for developers to efficiently locate the information they are looking for. There are two gaps leading to poor search results: the gap between the user's intention and the textual query, and the semantic gap between the query and the post content. Therefore, developers have to constantly reformulate their queries by correcting misspelled words, adding limitations to certain programming languages or platforms, etc. As query reformulation is tedious for developers, especially for novices, we propose an automated software-specific query reformulation approach based on deep learning. With query logs provided by Stack Overflow, we construct a large-scale query reformulation corpus, including the original queries and corresponding reformulated ones. Our approach trains a Transformer model that can automatically generate candidate reformulated queries when given the user's original query. The evaluation results show that our approach outperforms five state-of-the-art baselines, and achieves a 5.6% to 33.5% boost in terms of ExactMatch and a 4.8% to 14.4% boost in terms of GLEU.
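The description reports gains in ExactMatch without defining the metric. As a rough illustration only, the sketch below assumes ExactMatch counts a query as a hit when one of the top-k generated candidates equals the logged reformulation after simple normalization. The function names, the normalization step, and the sample queries are all hypothetical and are not taken from the paper or from Stack Overflow's logs.

```python
from typing import Sequence


def normalize(query: str) -> str:
    """Lowercase and collapse whitespace (an assumed normalization)."""
    return " ".join(query.lower().split())


def exact_match_at_k(
    candidates: Sequence[Sequence[str]],
    references: Sequence[str],
    k: int = 1,
) -> float:
    """Fraction of queries whose reference appears among the top-k candidates."""
    if len(candidates) != len(references):
        raise ValueError("candidates and references must have the same length")
    if not references:
        return 0.0
    hits = 0
    for cands, ref in zip(candidates, references):
        target = normalize(ref)
        # A hit requires an exact string match after normalization.
        if any(normalize(c) == target for c in cands[:k]):
            hits += 1
    return hits / len(references)


# Hypothetical model outputs (ranked candidates) and logged reformulations.
candidates = [
    ["python read json file", "read json file"],
    ["java string compare", "java compare strings"],
    ["git undo last commit"],
]
references = [
    "python read json file",
    "java compare strings",
    "git revert commit",
]

em_at_1 = exact_match_at_k(candidates, references, k=1)
em_at_2 = exact_match_at_k(candidates, references, k=2)
```

In this toy data only the first query is matched by its top candidate, while the second is matched once two candidates are considered, so the score rises with k. GLEU, the other reported metric, rewards partial n-gram overlap and is not reproduced here.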
format text
author CAO, Kaibo
CHEN, Chunyang
BALTES, Sebastian
TREUDE, Christoph
CHEN, Xiang
author_sort CAO, Kaibo
title Automated query reformulation for efficient search based on query logs from Stack Overflow
title_sort automated query reformulation for efficient search based on query logs from stack overflow
publisher Institutional Knowledge at Singapore Management University
publishDate 2021
url https://ink.library.smu.edu.sg/sis_research/8848
https://ink.library.smu.edu.sg/context/sis_research/article/9851/viewcontent/icse21a.pdf
_version_ 1814047593261432832