Extracting similar technology comparisons from crowd discussion on Stack Overflow

Bibliographic Details
Main Author: Lin, Tian
Other Authors: Jiang Xudong
Format: Final Year Project
Language: English
Published: 2018
Online Access: http://hdl.handle.net/10356/75383
Institution: Nanyang Technological University
Description
Summary: Developers today can choose from many technologies when deciding which ones to adopt for their software projects. Technologies in the same category provide similar functionality, yet each excels in different features when compared with the others. When choosing technologies to adopt, developers tend to turn to online resources such as assessment platforms or community reviews to understand the technology landscape. Reviews on these platforms are mostly opinion-based, and the information is sometimes out of date. In addition, reviews and comparisons are scattered across many sites, making it hard to get a centralized view of the aggregated opinions of the crowd. This report exploits the fact that posts on Stack Overflow are tagged by users with the most closely related technologies to increase the precision of search results. Tags are generated by users when they post content online; they classify content and group it for better organization. A program based on the Word2Vec model is developed to learn the relations among these tags, extract similar tags that fall into the same comparable categories, and perform comparative opinion mining to locate sentences containing pairs of similar tags. Finally, a website (https://similartagsheroku.herokuapp.com) is developed so that the community can access, reference and evaluate our findings on similar technologies and comparative opinions for particular technologies at any time. Based on these ideas, this project will first gather, clean and organize Stack Overflow tag data. Next, the data will be fed into a Word2Vec model to reconstruct the relationships between tags, so that similar tags can be filtered by comparable category and grouped by proximity. Lastly, comparative opinion mining will be conducted on similar technology pairs to aggregate community opinions and reviews. In the final stage, a website will be deployed to host the findings of this project.
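
The two core steps described in the summary (grouping similar tags, then locating sentences that compare them) might be sketched as follows. This is only an illustration under stated assumptions, not the report's implementation: the report trains a Word2Vec model, whereas this sketch substitutes plain tag co-occurrence counts with cosine similarity so that it runs with the Python standard library alone. The toy posts, the cue-word list, and all function names are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical toy data: each inner list is the tag set of one post.
POSTS = [
    ["python", "django", "orm"],
    ["python", "flask", "orm"],
    ["python", "django", "rest"],
    ["python", "flask", "rest"],
    ["java", "spring", "orm"],
    ["java", "spring", "rest"],
]

# Assumed cue words that hint at a comparative sentence.
COMPARATIVE_CUES = re.compile(
    r"\b(than|better|worse|faster|slower|easier|heavier|lighter|versus|vs)\b",
    re.IGNORECASE,
)


def cooccurrence_vectors(posts):
    """Map each tag to a Counter of the tags it appears together with."""
    vectors = defaultdict(Counter)
    for tags in posts:
        for tag in tags:
            for other in tags:
                if other != tag:
                    vectors[tag][other] += 1
    return dict(vectors)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar(tag, vectors):
    """Return the tag whose co-occurrence profile is closest to `tag`."""
    scores = [
        (other, cosine(vectors[tag], vectors[other]))
        for other in vectors
        if other != tag
    ]
    return max(scores, key=lambda pair: pair[1])[0]


def mentions(tag, sentence):
    """True if `tag` occurs in `sentence` as a standalone token."""
    pattern = r"(?<![\w+#-])" + re.escape(tag) + r"(?![\w+#-])"
    return re.search(pattern, sentence, re.IGNORECASE) is not None


def comparative_sentences(text, tag_a, tag_b):
    """Keep sentences that mention both tags and contain a cue word."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [
        s for s in sentences
        if mentions(tag_a, s) and mentions(tag_b, s) and COMPARATIVE_CUES.search(s)
    ]


if __name__ == "__main__":
    vectors = cooccurrence_vectors(POSTS)
    partner = most_similar("django", vectors)
    text = "Django is heavier than Flask. I use Django daily. Flask is nice."
    print(partner, comparative_sentences(text, "django", partner))
```

In this toy data, tags that are used in the same contexts (same language, same kinds of tasks) end up with near-identical co-occurrence profiles, which is the same intuition that a trained embedding model captures at a much larger scale.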