Chatbot4QR: Interactive query refinement for technical question retrieval

Technical Q&A sites (e.g., Stack Overflow (SO)) are important resources for developers seeking knowledge about technical problems. The search engines provided by Q&A sites and existing information retrieval approaches have limited ability to retrieve relevant questions when queries are imprecis...

Bibliographic Details
Main Authors: ZHANG, Neng, HUANG, Qiao, XIA, Xin, ZOU, Ying, LO, David, XING, Zhenchang
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2022
Subjects:
Online Access: https://ink.library.smu.edu.sg/sis_research/5926
https://ink.library.smu.edu.sg/context/sis_research/article/6929/viewcontent/Chatbot4QR_av.pdf
Institution: Singapore Management University
Language: English
id sg-smu-ink.sis_research-6929
record_format dspace
spelling sg-smu-ink.sis_research-6929 2022-04-18T10:50:52Z Chatbot4QR: Interactive query refinement for technical question retrieval 2022-04-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/5926 info:doi/10.1109/TSE.2020.3016006 https://ink.library.smu.edu.sg/context/sis_research/article/6929/viewcontent/Chatbot4QR_av.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Stack Overflow
Chatbot
Interactive Query Refinement
Question Retrieval
Software Engineering
description Technical Q&A sites (e.g., Stack Overflow (SO)) are important resources for developers seeking knowledge about technical problems. The search engines provided by Q&A sites and existing information retrieval approaches have limited ability to retrieve relevant questions when queries are imprecisely specified, for example when they omit important technical details (e.g., the user's preferred programming language). Although many automatic query expansion approaches have been proposed to improve query quality by adding relevant terms, they do not identify the missing information. Moreover, without user involvement, existing query expansion approaches may introduce unexpected terms and lead to undesired results. In this paper, we propose an interactive query refinement approach for question retrieval, named Chatbot4QR, which assists users in recognizing and clarifying technical details missing from their queries and thus retrieves more relevant questions. Chatbot4QR automatically detects missing technical details in a query and generates several clarification questions (CQs) to interact with the user and capture the details they overlooked. To ensure the accuracy of CQs, we design a heuristic-based approach for CQ generation after building two kinds of technical knowledge bases: a manually categorized result of 1,841 technical tags in SO and the multiple version-frequency information of the tags. We collect 1.88 million SO questions as the repository for question retrieval. To evaluate Chatbot4QR, we conduct six user studies with 25 participants on 50 experimental queries. The results show that: (1) on average, 60.8% of the CQs generated for a query are useful for helping the participants recognize missing technical details; (2) Chatbot4QR responds to the participants within ~1.3 seconds of receiving a query; (3) the refined queries contribute to retrieving more relevant SO questions than nine baseline approaches, and for more than 70% of the participants who have preferred techniques on the query tasks, Chatbot4QR significantly outperforms the state-of-the-art word embedding-based retrieval approach with an improvement of at least 54.6% in terms of Pre@k and NDCG@k; and (4) for 48%-88% of the assigned query tasks, the participants obtain more desired results after interacting with Chatbot4QR than by searching directly with Web search engines (e.g., the SO search engine and Google) using the original queries.
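The description reports retrieval quality in terms of Pre@k and NDCG@k but this record does not define them. The sketch below assumes the standard information-retrieval definitions (precision over the top k results, and discounted cumulative gain with a log2 rank discount normalized by the ideal ordering); the paper's exact formulation may differ. The function names and the sample relevance list are illustrative only and do not come from the paper. As a simplification, the ideal ordering is computed from the retrieved list itself.

```python
import math
from typing import Sequence


def precision_at_k(relevance: Sequence[int], k: int) -> float:
    """Fraction of the top-k results judged relevant (relevance > 0)."""
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(1 for r in relevance[:k] if r > 0) / k


def dcg_at_k(relevance: Sequence[int], k: int) -> float:
    """Discounted cumulative gain: the gain at rank i (1-based) is divided by log2(i + 1)."""
    return sum(r / math.log2(i + 2) for i, r in enumerate(relevance[:k]))


def ndcg_at_k(relevance: Sequence[int], k: int) -> float:
    """DCG@k normalized by the DCG@k of the same judgments sorted best-first."""
    ideal = dcg_at_k(sorted(relevance, reverse=True), k)
    return dcg_at_k(relevance, k) / ideal if ideal > 0 else 0.0


# Hypothetical judgments for five retrieved questions (1 = relevant, 0 = not).
judged = [1, 0, 1, 0, 0]
print(precision_at_k(judged, 5))
print(ndcg_at_k(judged, 5))
```

Under these assumed definitions, moving a relevant question from rank 3 to rank 2 leaves Pre@5 unchanged but raises NDCG@5, which is why the two measures are usually reported together.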
format text
author ZHANG, Neng
HUANG, Qiao
XIA, Xin
ZOU, Ying
LO, David
XING, Zhenchang
author_sort ZHANG, Neng
title Chatbot4QR: Interactive query refinement for technical question retrieval
publisher Institutional Knowledge at Singapore Management University
publishDate 2022
url https://ink.library.smu.edu.sg/sis_research/5926
https://ink.library.smu.edu.sg/context/sis_research/article/6929/viewcontent/Chatbot4QR_av.pdf
_version_ 1770575694585135104