Product name and associated user sentiment retrieval from tweets

With a growing section of the web dedicated to the reviews, discussions and advertisement of products through micro-blogging, it has become imperative to develop techniques for automated Product name extraction from user generated short texts. In this report, we propose a system for mining Tweets to...

Bibliographic Details
Main Author: Saraf, Avnish
Other Authors: Gao Cong
Format: Final Year Project
Language: English
Published: 2015
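Subjects: DRNTU::Engineering::Computer science and engineering::Information systems::Information storage and retrieval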
Online Access: http://hdl.handle.net/10356/63473
Institution: Nanyang Technological University
id sg-ntu-dr.10356-63473
record_format dspace
spelling sg-ntu-dr.10356-63473 2023-03-03T20:53:28Z Product name and associated user sentiment retrieval from tweets Saraf, Avnish Gao Cong School of Computer Engineering DRNTU::Engineering::Computer science and engineering::Information systems::Information storage and retrieval Bachelor of Engineering (Computer Science) 2015-05-14T02:08:23Z 2015-05-14T02:08:23Z 2015 2015 Final Year Project (FYP) http://hdl.handle.net/10356/63473 en Nanyang Technological University 50 p. application/pdf
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic DRNTU::Engineering::Computer science and engineering::Information systems::Information storage and retrieval
description With a growing share of the web dedicated to reviews, discussions and advertisement of products through microblogging, it has become imperative to develop techniques for automated product name extraction from user-generated short texts. In this report, we propose a system for mining tweets to extract product name mentions and the corresponding sentiment towards the product. We survey the information retrieval research landscape and decide on a hybrid method for product name extraction. Our novel method combines fuzzy dictionary matching with a CRF-based Named Entity Recognition (NER) approach to handle the inconsistencies of user-generated short texts during extraction. We also examine the widely studied field of sentiment mining. We begin by studying existing work and then propose a machine-learning-based approach that operates at sentence-level granularity and is adapted for microblogs. For evaluation, we generate a dataset of 2,032 tweets, manually annotated with the associated sentiment and the product name mentions. Evaluation on this data shows that our hybrid method outperforms all the existing methods and achieves a precision of 0.95, recall of 0.98 and F1 score of 0.97, along with a sentiment analysis accuracy of 69%. We also provide an extensive comparison of our algorithm with one of the most popular NER systems available, Stanford NER, and show that our method produces a 38% improvement over it for user-generated micro text. A detailed analysis of the individual components is also provided to establish the synergistic performance of the hybrid method compared with the fuzzy dictionary matching method and the CRF method individually.
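The fuzzy dictionary matching idea mentioned in the description can be illustrated with a minimal sketch. This is not the report's implementation: the record does not give the dictionary, the similarity measure, or the threshold, so the product list, the function name `fuzzy_product_mentions`, the use of Python's `difflib.SequenceMatcher`, and the 0.8 cutoff are all assumptions made for illustration. The CRF-based NER half of the hybrid method is not shown.

```python
import difflib

# Hypothetical product dictionary (an assumption; the report's dictionary
# is not part of this record).
PRODUCT_DICTIONARY = ["iphone 6", "galaxy s5", "nexus 5", "xbox one"]


def fuzzy_product_mentions(tweet, dictionary=PRODUCT_DICTIONARY, cutoff=0.8):
    """Return (dictionary entry, matched tweet span) pairs.

    A span matches when its character-level similarity to a dictionary
    entry is at least `cutoff` (an assumed threshold).
    """
    tokens = tweet.lower().split()
    mentions = []
    for entry in dictionary:
        width = len(entry.split())
        # Slide a window with as many tokens as the dictionary entry has.
        for start in range(len(tokens) - width + 1):
            span = " ".join(tokens[start:start + width])
            score = difflib.SequenceMatcher(None, span, entry).ratio()
            if score >= cutoff:
                mentions.append((entry, span))
    return mentions


if __name__ == "__main__":
    # A misspelled mention, as is common in user-generated short texts.
    print(fuzzy_product_mentions("Loving my new iphoen 6 so much"))
```

Approximate matching of this kind tolerates the misspellings and informal variants typical of tweets, at the cost of occasional false positives when the cutoff is set too low.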
author2 Gao Cong
author_facet Gao Cong
Saraf, Avnish
format Final Year Project
author Saraf, Avnish
author_sort Saraf, Avnish
title Product name and associated user sentiment retrieval from tweets
publishDate 2015
url http://hdl.handle.net/10356/63473
_version_ 1759857277872898048