CQARank: Jointly Model Topics and Expertise in Community Question Answering

Community Question Answering (CQA) websites, where people share expertise on open platforms, have become large repositories of valuable knowledge. To bring the best value out of these knowledge repositories, it is critically important for CQA services to know how to find the right experts, retrieve...

Full description

Saved in:
Bibliographic Details
Main Authors: YANG, Liu, QIU, Minghui, GOTTOPATI, Swapna, ZHU, Feida, JIANG, Jing, SUN, Huiping, CHEN, Zhong
Format: text
Language:English
Published: Institutional Knowledge at Singapore Management University 2013
Subjects:
Online Access:https://ink.library.smu.edu.sg/sis_research/2232
https://ink.library.smu.edu.sg/context/sis_research/article/3232/viewcontent/CIKM_13.pdf
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Singapore Management University
Language: English
Description
Summary:Community Question Answering (CQA) websites, where people share expertise on open platforms, have become large repositories of valuable knowledge. To bring the best value out of these knowledge repositories, it is critically important for CQA services to know how to find the right experts, retrieve archived similar questions and recommend best answers to new questions. To tackle this cluster of closely related problems in a principled approach, we proposed Topic Expertise Model (TEM), a novel probabilistic generative model with GMM hybrid, to jointly model topics and expertise by integrating textual content model and link structure analysis. Based on TEM results, we proposed CQARank to measure user interests and expertise score under different topics. Leveraging the question answering history based on long-term community reviews and voting, our method could find experts with both similar topical preference and high topical expertise. Experiments carried out on Stack Overflow data, the largest CQA focused on computer programming, show that our method achieves significant improvement over existing methods on multiple metrics.