An empirical study on developer interactions in StackOverflow

StackOverflow provides a popular platform where developers post and answer questions. Recently, Treude et al. manually labeled 385 questions in StackOverflow and grouped them into 10 categories based on their contents. They also analyzed how tags are used in StackOverflow. In this study, we extend their...

Bibliographic Details
Main Authors: WANG, Shaowei, LO, David, JIANG, Lingxiao
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2013
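Subjects: developer forum mining; latent dirichlet allocation (LDA); developer interaction mining; Software Engineering
Online Access: https://ink.library.smu.edu.sg/sis_research/1811
https://ink.library.smu.edu.sg/context/sis_research/article/2810/viewcontent/sac13stackoverflow.pdf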
Institution: Singapore Management University
id sg-smu-ink.sis_research-2810
record_format dspace
spelling sg-smu-ink.sis_research-2810 2017-02-05T07:05:45Z
2013-03-01T08:00:00Z text application/pdf info:doi/10.1145/2480362.2480557 http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic developer forum mining
latent dirichlet allocation (LDA)
developer interaction mining
Software Engineering
description StackOverflow provides a popular platform where developers post and answer questions. Recently, Treude et al. manually labeled 385 StackOverflow questions and grouped them into 10 categories based on their contents; they also analyzed how tags are used on the site. In this study, we extend their work to gain a deeper understanding of how developers interact with one another on such a question-and-answer website. First, we analyze the distributions of developers who ask and answer questions, and investigate whether the StackOverflow community is segregated into questioners and answerers. Second, we perform automated text mining to find the kinds of topics developers ask about. We use Latent Dirichlet Allocation (LDA), a well-known topic modeling approach, to analyze the contents of tens of thousands of questions and answers and produce five topics. Our topic modeling strategy offers a perspective on categorizing StackOverflow questions that differs from that of Treude et al.: each question can be assigned to several topics with different probabilities, and the learned topic model can automatically assign a new question to several categories with varying probabilities. Finally, we show the distributions of questions and developers across the topics generated by LDA.
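The idea that a question belongs to several topics with different probabilities, and that a learned model can place a new question the same way, can be illustrated with a small sketch. This is not the authors' implementation: the record does not say which LDA tool or settings the study used. The code below is a minimal collapsed Gibbs sampler in plain Python, and the function names (`fit_lda`, `topic_mixture`, `sample_topic`), the hyperparameters, the two-topic setting, and the toy questions are all assumptions made up for illustration.

```python
import random
from collections import Counter

# Invented toy questions (already tokenized); not data from the study.
QUESTIONS = [
    "python list sort error".split(),
    "python dict key error".split(),
    "sort python list of dict".split(),
    "sql join two tables".split(),
    "sql query slow join index".split(),
    "index on sql tables".split(),
]


def sample_topic(rng, doc_counts, model, word):
    """Draw a topic for `word` from the collapsed Gibbs conditional."""
    alpha, beta = model["alpha"], model["beta"]
    weights = [
        (doc_counts[t] + alpha)
        * (model["topic_word"][t][word] + beta)
        / (model["topic_total"][t] + model["vocab_size"] * beta)
        for t in range(model["num_topics"])
    ]
    return rng.choices(range(model["num_topics"]), weights=weights)[0]


def fit_lda(docs, num_topics, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Learn topic-word counts from tokenized documents by Gibbs sampling."""
    rng = random.Random(seed)
    model = {
        "num_topics": num_topics, "alpha": alpha, "beta": beta,
        "vocab_size": len({w for doc in docs for w in doc}),
        "topic_word": [Counter() for _ in range(num_topics)],
        "topic_total": [0] * num_topics,
    }
    doc_counts = [[0] * num_topics for _ in docs]
    # Start from a random topic assignment for every word occurrence.
    assignments = [[rng.randrange(num_topics) for _ in doc] for doc in docs]
    for d, doc in enumerate(docs):
        for word, topic in zip(doc, assignments[d]):
            doc_counts[d][topic] += 1
            model["topic_word"][topic][word] += 1
            model["topic_total"][topic] += 1
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, word in enumerate(doc):
                # Remove the current assignment, then resample it.
                old = assignments[d][i]
                doc_counts[d][old] -= 1
                model["topic_word"][old][word] -= 1
                model["topic_total"][old] -= 1
                new = sample_topic(rng, doc_counts[d], model, word)
                assignments[d][i] = new
                doc_counts[d][new] += 1
                model["topic_word"][new][word] += 1
                model["topic_total"][new] += 1
    return model


def topic_mixture(doc, model, iters=50, seed=1):
    """Estimate topic probabilities for a new document.

    The learned topic-word counts stay fixed; only the new document's
    own topic assignments are resampled.
    """
    rng = random.Random(seed)
    k = model["num_topics"]
    topics = [rng.randrange(k) for _ in doc]
    counts = [topics.count(t) for t in range(k)]
    for _ in range(iters):
        for i, word in enumerate(doc):
            counts[topics[i]] -= 1
            topics[i] = sample_topic(rng, counts, model, word)
            counts[topics[i]] += 1
    total = len(doc) + k * model["alpha"]
    return [(c + model["alpha"]) / total for c in counts]


if __name__ == "__main__":
    lda = fit_lda(QUESTIONS, num_topics=2)
    # One probability per topic for a question the model has not seen.
    print(topic_mixture("python sql query error".split(), lda))
```

The returned list holds one probability per topic and sums to 1, which is the sense in which a single question can fall into several categories at once. For a corpus of the size described in the abstract, an established LDA library would be a more practical choice than this sampler.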
format text
author WANG, Shaowei
LO, David
JIANG, Lingxiao
title An empirical study on developer interactions in StackOverflow
publisher Institutional Knowledge at Singapore Management University
publishDate 2013
url https://ink.library.smu.edu.sg/sis_research/1811
https://ink.library.smu.edu.sg/context/sis_research/article/2810/viewcontent/sac13stackoverflow.pdf
_version_ 1770571594670800896