EnTagRec(++): An enhanced tag recommendation system for software information sites

Software engineers share experiences with modern technologies using software information sites, such as Stack Overflow. These sites allow developers to label posted content, referred to as software objects, with short descriptions, known as tags. Tags help to improve the organization of questions an...

Bibliographic Details
Main Authors: WANG, Shawei, LO, David, VASILESCU, Bogdan, SEREBRENIK, Alexander
Format: text
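Language: English
Published: Institutional Knowledge at Singapore Management University 2018
Subjects: Software information sites; Recommendation systems; Tagging; Computer and Systems Architecture; Software Engineering
Online Access: https://ink.library.smu.edu.sg/sis_research/4127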
https://ink.library.smu.edu.sg/context/sis_research/article/5130/viewcontent/EnTagRec_An_enhanced_tag_recommendation_system_for_software_information_sites.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-5130
record_format dspace
spelling sg-smu-ink.sis_research-5130 2018-09-21T03:21:33Z
2018-04-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/4127 info:doi/10.1007/s10664-017-9533-1 https://ink.library.smu.edu.sg/context/sis_research/article/5130/viewcontent/EnTagRec_An_enhanced_tag_recommendation_system_for_software_information_sites.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Software information sites;Recommendation systems;Tagging Computer and Systems Architecture Software Engineering
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Software information sites;Recommendation systems;Tagging
Computer and Systems Architecture
Software Engineering
description Software engineers share experiences with modern technologies using software information sites, such as Stack Overflow. These sites allow developers to label posted content, referred to as software objects, with short descriptions, known as tags. Tags help to improve the organization of questions and simplify the browsing of questions for users. However, tags assigned to objects tend to be noisy and some objects are not well tagged. For instance, 14.7% of the questions that were posted in 2015 on Stack Overflow needed tag re-editing after the initial assignment. To improve the quality of tags in software information sites, we propose EnTagRec (++), which is an advanced version of our prior work EnTagRec. Different from EnTagRec, EnTagRec (++) does not only integrate the historical tag assignments to software objects, but also leverages the information of users, and an initial set of tags that a user may provide for tag recommendation. We evaluate its performance on five software information sites, Stack Overflow, Ask Ubuntu, Ask Different, Super User, and Freecode. We observe that even without considering an initial set of tags that a user provides, it achieves Recall@5 scores of 0.821, 0.822, 0.891, 0.818 and 0.651, and Recall@10 scores of 0.873, 0.886, 0.956, 0.887 and 0.761, on Stack Overflow, Ask Ubuntu, Ask Different, Super User, and Freecode, respectively. In terms of Recall@5 and Recall@10, averaging across the 5 datasets, it improves upon TagCombine, which is the prior state-of-the-art approach, by 29.3% and 14.5% respectively. Moreover, the performance of our approach is further boosted if users provide some initial tags that our approach can leverage to infer additional tags: when an initial set of tags is given, Recall@5 is improved by 10%.
format text
author WANG, Shawei
LO, David
VASILESCU, Bogdan
SEREBRENIK, Alexander
title EnTagRec(++): An enhanced tag recommendation system for software information sites
publisher Institutional Knowledge at Singapore Management University
publishDate 2018
url https://ink.library.smu.edu.sg/sis_research/4127
https://ink.library.smu.edu.sg/context/sis_research/article/5130/viewcontent/EnTagRec_An_enhanced_tag_recommendation_system_for_software_information_sites.pdf
_version_ 1770574344854962176
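
note The description above reports Recall@5 and Recall@10 but does not define the metric. The following is a minimal sketch of one common definition, in which each object's score is the fraction of its true tags that appear among the top-k recommended tags, and the reported figure is the mean over all objects. The function names and the toy data are illustrative assumptions and are not taken from the paper; some evaluations use a variant that caps the denominator at k, which this sketch does not do.

```python
from typing import Iterable, List, Sequence


def recall_at_k(recommended: Sequence[str], true_tags: Iterable[str], k: int) -> float:
    """Fraction of one object's true tags found among its top-k recommendations.

    Assumed definition (not confirmed by the record): |top-k ∩ truth| / |truth|.
    """
    truth = set(true_tags)
    if not truth:
        raise ValueError("true_tags must not be empty")
    hits = len(set(recommended[:k]) & truth)
    return hits / len(truth)


def mean_recall_at_k(
    predictions: List[Sequence[str]],
    ground_truth: List[Iterable[str]],
    k: int,
) -> float:
    """Average recall_at_k over all objects in a data set."""
    if len(predictions) != len(ground_truth) or not predictions:
        raise ValueError("need one non-empty, equally sized entry per object")
    scores = [recall_at_k(p, t, k) for p, t in zip(predictions, ground_truth)]
    return sum(scores) / len(scores)


# Toy data (hypothetical): ranked recommendations and true tags for two posts.
predictions = [
    ["python", "pandas", "numpy"],
    ["java", "spring", "maven"],
]
ground_truth = [
    {"python", "numpy"},
    {"java", "hibernate"},
]

print(mean_recall_at_k(predictions, ground_truth, k=2))
print(mean_recall_at_k(predictions, ground_truth, k=3))
```

On the toy data, raising k from 2 to 3 lets the first post recover its second true tag, which is the same effect that makes the Recall@10 scores in the description higher than the Recall@5 scores.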