Generating question titles for Stack Overflow from mined code snippets

Stack Overflow is heavily used by software developers to seek programming-related information from peers over the internet. The Stack Overflow community recommends that users include the related code snippet when creating a question, to help others better understand it an...

Bibliographic Details
Main Authors: GAO, Zhipeng, XIA, Xin, GRUNDY, John, LO, David, LI, Yuan-Fang
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2020
Subjects: Stack Overflow; question generation; question quality; sequence-to-sequence; Software Engineering
Online Access: https://ink.library.smu.edu.sg/sis_research/5622
https://ink.library.smu.edu.sg/context/sis_research/article/6625/viewcontent/Generating_Question_Titles_2020_av.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-6625
record_format dspace
spelling sg-smu-ink.sis_research-6625 2021-05-11T08:43:02Z
2020-10-01T07:00:00Z text application/pdf info:doi/10.1145/3401026 http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Stack Overflow
question generation
question quality
sequence-to-sequence
Software Engineering
description Stack Overflow is heavily used by software developers to seek programming-related information from peers over the internet. The Stack Overflow community recommends that users include the related code snippet when creating a question, to help others better understand it and offer their help. Previous studies have shown that a significant number of these questions are of low quality and unattractive to potential experts on Stack Overflow. These poorly asked questions are less likely to receive useful answers and hinder the overall process of knowledge generation and sharing. One reason for low-quality questions on Stack Overflow (SO) is that many developers cannot clarify and summarize the key problems behind their code snippets, because they lack knowledge of the problem and its terminology and/or have poor writing skills. In this study, we propose an approach that helps developers write high-quality questions by automatically generating question titles for a code snippet with a deep sequence-to-sequence learning approach. Our approach is fully data-driven and uses an attention mechanism to perform better content selection, a copy mechanism to handle the rare-words problem, and a coverage mechanism to eliminate the word-repetition problem. We evaluate our approach on Stack Overflow datasets covering a variety of programming languages (e.g., Python, Java, JavaScript, C# and SQL), and our experimental results show that it significantly outperforms several state-of-the-art baselines in both automatic and human evaluation. We have released our code and datasets so that other researchers can verify their ideas and to inspire follow-up work.
format text
author GAO, Zhipeng
XIA, Xin
GRUNDY, John
LO, David
LI, Yuan-Fang
title Generating question titles for Stack Overflow from mined code snippets
publisher Institutional Knowledge at Singapore Management University
publishDate 2020
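
The description names three mechanisms (attention, copy, coverage) without giving formulas. As a rough illustration only, the sketch below shows how one decoding step could combine them in the style of pointer-generator networks. It is not the authors' implementation: the function `decode_step`, the mixing weight `p_gen`, the min-overlap coverage penalty, and all numbers are assumptions chosen for this example.

```python
import math


def softmax(scores):
    """Turn raw attention scores into a probability distribution."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def decode_step(attn_scores, vocab_probs, source_tokens, p_gen, coverage):
    """One hypothetical decoding step (all names are illustrative).

    attn_scores   : raw attention score per source (code) token
    vocab_probs   : generator distribution over a fixed vocabulary
    source_tokens : tokens of the code snippet
    p_gen         : assumed probability of generating rather than copying
    coverage      : attention mass accumulated over earlier steps
    """
    # Attention: decide which code tokens matter for the next title word.
    attention = softmax(attn_scores)

    # Copy: blend the vocabulary distribution with attention over the
    # source, so rare identifiers absent from the vocabulary can be emitted.
    final = {word: p_gen * p for word, p in vocab_probs.items()}
    for token, a in zip(source_tokens, attention):
        final[token] = final.get(token, 0.0) + (1.0 - p_gen) * a

    # Coverage: penalise attending again to tokens already attended to,
    # which discourages repeated words in the generated title.
    coverage_loss = sum(min(a, c) for a, c in zip(attention, coverage))
    new_coverage = [c + a for a, c in zip(attention, coverage)]
    return final, coverage_loss, new_coverage


# Example: the identifier "groupby" is not in the toy vocabulary,
# but it can still receive probability through the copy path.
source_tokens = ["df", ".", "groupby", "(", "col", ")"]
attn_scores = [0.1, 0.0, 2.0, 0.0, 0.5, 0.0]
vocab_probs = {"how": 0.5, "to": 0.3, "use": 0.2}

final, loss, cov = decode_step(
    attn_scores, vocab_probs, source_tokens, p_gen=0.6, coverage=[0.0] * 6
)
```

In this sketch the first step has zero coverage loss because nothing has been attended to yet; repeating the same attention pattern on a later step would be penalised in full.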