Expert as a service: Software expert recommendation via knowledge domain embeddings in stack overflow

Question answering (Q&A) communities have gained momentum recently as an effective means of knowledge sharing over the crowds, where many users are experts in the real-world and can make quality contributions in certain domains or technologies. Although the massive user-generated Q&A data pr...

Full description

Saved in:
Bibliographic Details
Main Authors: HUANG, Chaoran, YAO, Lina, WANG, Xianzhi, BENATALLAH, Boualem, SHENG, Quan Z.
Format: text
Language:English
Published: Institutional Knowledge at Singapore Management University 2017
Subjects:
Online Access:https://ink.library.smu.edu.sg/sis_research/4046
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Singapore Management University
Language: English
Description
Summary:Question answering (Q&A) communities have gained momentum recently as an effective means of knowledge sharing over the crowds, where many users are experts in the real-world and can make quality contributions in certain domains or technologies. Although the massive user-generated Q&A data present a valuable source of human knowledge, a related challenging issue is how to find those expert users effectively. In this paper, we propose a framework for finding such experts in a collaborative network. Accredited with recent works on distributed word representations, we are able to summarize text chunks from the semantics perspective and infer knowledge domains by clustering pre-trained word vectors. In particular, we exploit a graph-based clustering method for knowledge domain extraction and discern the shared latent factors using matrix factorization techniques. The proposed clustering method features requiring no post-processing of clustering indicators and the matrix factorization method is combined with the semantic similarity of the historical answers to conduct expertise ranking of users given a query. We use Stack Overflow, a website with a large group of users and a large number of posts on topics related to computer programming, to evaluate the proposed approach and conduct extensively experiments to show the effectiveness of our approach.