I know what you are searching for: Code snippet recommendation from Stack Overflow posts
Stack Overflow has been heavily used by software developers to seek programming-related information. More and more developers use Community Question and Answer forums, such as Stack Overflow, to search for code examples of how to accomplish a certain coding task. This is often considered to be more...
Main Authors: | GAO, Zhipeng; XIA, Xin; LO, David; GRUNDY, John C.; ZHANG, Xindong; XING, Zhenchang |
---|---|
Format: | text |
Language: | English |
Published: | Institutional Knowledge at Singapore Management University, 2023 |
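Subjects: | Software engineering; Software evolution; Maintaining software; Artificial Intelligence and Robotics; Software Engineering; Theory and Algorithms |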
Online Access: | https://ink.library.smu.edu.sg/sis_research/8507 https://ink.library.smu.edu.sg/context/sis_research/article/9510/viewcontent/2210.15845.pdf |
Institution: | Singapore Management University |
id |
sg-smu-ink.sis_research-9510 |
---|---|
record_format |
dspace |
spelling |
sg-smu-ink.sis_research-9510; 2024-01-22T15:11:45Z; 2023-01-01T08:00:00Z; text; application/pdf; https://ink.library.smu.edu.sg/sis_research/8507; info:doi/10.1145/3550150; https://ink.library.smu.edu.sg/context/sis_research/article/9510/viewcontent/2210.15845.pdf; http://creativecommons.org/licenses/by-nc-nd/4.0/; Research Collection School Of Computing and Information Systems; eng; Institutional Knowledge at Singapore Management University |
institution |
Singapore Management University |
building |
SMU Libraries |
continent |
Asia |
country |
Singapore |
content_provider |
SMU Libraries |
collection |
InK@SMU |
language |
English |
topic |
Software engineering Software evolution Maintaining software Artificial Intelligence and Robotics Software Engineering Theory and Algorithms |
description |
Stack Overflow has been heavily used by software developers to seek programming-related information. More and more developers use Community Question and Answer forums, such as Stack Overflow, to search for code examples of how to accomplish a certain coding task. This is often considered to be more efficient than working from source documentation, tutorials, or full worked examples. However, due to the complexity of these online Question and Answer forums and the very large volume of information they contain, developers can be overwhelmed by the sheer volume of available information. This makes it hard to find and/or even be aware of the most relevant code examples to meet their needs. To alleviate this issue, in this work, we present a query-driven code recommendation tool, named Que2Code, that identifies the best code snippets for a user query from Stack Overflow posts. Our approach has two main stages: (i) semantically equivalent question retrieval and (ii) best code snippet recommendation. During the first stage, for a given query question formulated by a developer, we first generate paraphrase questions for the input query as a way of query boosting and then retrieve the relevant Stack Overflow posted questions based on these generated questions. In the second stage, we collect all of the code snippets within questions retrieved in the first stage and develop a novel scheme to rank code snippet candidates from Stack Overflow posts via pairwise comparisons. To evaluate the performance of our proposed model, we conduct a large-scale experiment to evaluate the effectiveness of the semantically equivalent question retrieval task and best code snippet recommendation task separately on Python and Java datasets in Stack Overflow. We also perform a human study to measure how real-world developers perceive the results generated by our model. 
Both the automatic and human evaluation results demonstrate the promising performance of our model, and we have released our code and data to assist other researchers. |
format |
text |
author |
GAO, Zhipeng XIA, Xin LO, David GRUNDY, John C. ZHANG, Xindong XING, Zhenchang |
author_sort |
GAO, Zhipeng |
title |
I know what you are searching for: Code snippet recommendation from Stack Overflow posts |
publisher |
Institutional Knowledge at Singapore Management University |
publishDate |
2023 |
url |
https://ink.library.smu.edu.sg/sis_research/8507 https://ink.library.smu.edu.sg/context/sis_research/article/9510/viewcontent/2210.15845.pdf |
_version_ |
1789483255593959424 |
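
The two-stage approach summarized in the description (question retrieval boosted by paraphrases, then snippet ranking via pairwise comparisons) can be illustrated with a minimal Python sketch. This is not the Que2Code implementation: the record does not describe the underlying models, so `paraphrase`, `jaccard`, and `prefer` are hypothetical placeholders (a synonym swap, token overlap, and a "shorter snippet wins" rule), and `POSTS` is made-up toy data.

```python
from itertools import combinations

# Toy stand-in for Stack Overflow posts; all data here is hypothetical.
POSTS = [
    {"title": "How to reverse a list in Python",
     "snippets": ["xs[::-1]", "list(reversed(xs))"]},
    {"title": "How do I read a file line by line",
     "snippets": ["for line in open(p): print(line)"]},
    {"title": "Reverse a Python list in place",
     "snippets": ["xs.reverse()"]},
]


def tokens(text):
    """Lowercased whitespace tokens as a set."""
    return set(text.lower().split())


def paraphrase(query):
    """Placeholder for a learned paraphrase generator (assumption):
    returns the query plus one variant with a few synonyms swapped."""
    swaps = {"invert": "reverse", "array": "list"}
    reworded = " ".join(swaps.get(w, w) for w in query.lower().split())
    return [query, reworded]


def jaccard(a, b):
    """Token-overlap similarity, standing in for a semantic matcher."""
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


def retrieve(query, posts, k=2):
    """Stage 1: score each post by its best match over all query variants."""
    variants = paraphrase(query)
    return sorted(
        posts,
        key=lambda p: max(jaccard(v, p["title"]) for v in variants),
        reverse=True,
    )[:k]


def prefer(query, a, b):
    """Placeholder pairwise comparator (assumption): shorter snippet wins.
    A real system would use a trained model that also looks at the query."""
    return len(a) <= len(b)


def rank_snippets(query, posts):
    """Stage 2: compare every pair of candidates and sort by win count."""
    candidates = [s for p in posts for s in p["snippets"]]
    wins = {s: 0 for s in candidates}
    for a, b in combinations(candidates, 2):
        wins[a if prefer(query, a, b) else b] += 1
    return sorted(candidates, key=lambda s: wins[s], reverse=True)


def recommend(query, posts=POSTS, k=2):
    """Run both stages and return snippets ordered best first."""
    return rank_snippets(query, retrieve(query, posts, k))


if __name__ == "__main__":
    print(recommend("invert an array in python"))
```

Counting wins over all pairs is only one simple way to turn pairwise preferences into a ranking; it was chosen here because it needs no training and makes the role of the comparator easy to see.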