Supporting information needs of developers through web Q&A discussions

Programming is evolving because of the prevalence of the Web. Developers now routinely search the Web for information to solve the problems they encounter during software development tasks. However, existing studies investigated the information needs...

Bibliographic Details
Main Author: Li, Jing
Other Authors: Sun Aixin
Format: Theses and Dissertations
Language: English
Published: 2018
Subjects: DRNTU::Engineering::Computer science and engineering
Online Access: http://hdl.handle.net/10356/74202
Institution: Nanyang Technological University
id sg-ntu-dr.10356-74202
record_format dspace
spelling sg-ntu-dr.10356-74202
2023-03-04T00:52:58Z
Supporting information needs of developers through web Q&A discussions
Li, Jing
Sun Aixin
School of Computer Science and Engineering
DRNTU::Engineering::Computer science and engineering
Doctor of Philosophy (SCE)
2018-05-08T01:53:59Z
2018-05-08T01:53:59Z
2018
Thesis
Li, J. (2018). Supporting information needs of developers through web Q&A discussions. Doctoral thesis, Nanyang Technological University, Singapore.
http://hdl.handle.net/10356/74202
10.32657/10356/74202
en
175 p.
application/pdf
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic DRNTU::Engineering::Computer science and engineering
description Programming is evolving because of the prevalence of the Web. Developers now routinely search the Web for information to solve the problems they encounter during software development tasks. Existing studies have investigated the information needs of developers on the Web through qualitative analysis and questionnaire surveys. However, little is known about developers' micro-level information behaviors and needs on the Web during software development. For example, how often do developers refine existing queries or create new ones, and how many web pages do they open after a search? To fill this gap, we conducted an empirical study of how developers seek and use web resources at the micro level. The empirical study revealed three key insights: first, developers might have an incomplete or even incorrect understanding of their needs; second, there is a gap between the producers and consumers of software documentation; third, many important pieces of information that developers need are not explicitly documented in software documentation. These insights motivated further studies on supporting developers' information needs. More specifically, the contributions of this thesis are: (1) Understanding information needs of developers: We developed a video scraping tool to automatically extract developers' behavioral data from task videos. We conducted a micro-level quantitative analysis of the developers' information behaviors, including patterns of keyword sources, keyword refinement, web pages visited, context switching, and information flow. The outcomes of this micro-level quantitative analysis provided three important insights for supporting developers' information needs. (2) Discovering learning resources: To bridge the information gap in the first insight, we developed the LinkLive technique to recommend more correlated learning resources when developers know less. LinkLive uses multiple features, including hyperlink co-occurrences in web Q&A discussions, the locations (e.g., question, answer, or comment) in which hyperlinks are referenced, and the votes for the posts and comments in which hyperlinks are referenced. A large-scale evaluation shows that our technique recommends correlated web resources with satisfactory precision and recall in an open setting. (3) Answering programming questions: To bridge the information gap in the second insight, we proposed a novel deep-learning-to-answer framework, named QDLinker, for answering programming questions with software documentation. QDLinker leverages the large volume of discussions in Community-based Question Answering (CQA) to bridge the semantic gap between programmers' questions and software documentation. Through extensive experiments, we show that QDLinker significantly outperforms baselines based on traditional retrieval models and Web search services dedicated to software documentation. (4) Distilling crowdsourced negative caveats: To bridge the information gap in the third insight, we proposed DISCA, a novel approach to automatically distilling desirable Application Program Interface (API) negative caveats from unstructured web Q&A discussions. The quantitative and qualitative evaluations show that DISCA can greatly augment the official API documentation.
author2 Sun Aixin
format Theses and Dissertations
author Li, Jing
title Supporting information needs of developers through web Q&A discussions
publishDate 2018
url http://hdl.handle.net/10356/74202