Product name and associated user sentiment retrieval from tweets
With a growing section of the web dedicated to the review, discussion and advertisement of products through micro-blogging, it has become imperative to develop techniques for automated product name extraction from user-generated short texts. In this report, we propose a system for mining Tweets to...
Main Author: | Saraf, Avnish |
---|---|
Other Authors: | Gao Cong |
Format: | Final Year Project |
Language: | English |
Published: | 2015 |
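Subjects: | DRNTU::Engineering::Computer science and engineering::Information systems::Information storage and retrieval |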
Online Access: | http://hdl.handle.net/10356/63473 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-63473 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-63473 2023-03-03T20:53:28Z Product name and associated user sentiment retrieval from tweets Saraf, Avnish Gao Cong School of Computer Engineering DRNTU::Engineering::Computer science and engineering::Information systems::Information storage and retrieval Bachelor of Engineering (Computer Science) 2015-05-14T02:08:23Z 2015-05-14T02:08:23Z 2015 2015 Final Year Project (FYP) http://hdl.handle.net/10356/63473 en Nanyang Technological University 50 p. application/pdf |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
DRNTU::Engineering::Computer science and engineering::Information systems::Information storage and retrieval |
spellingShingle |
DRNTU::Engineering::Computer science and engineering::Information systems::Information storage and retrieval Saraf, Avnish Product name and associated user sentiment retrieval from tweets |
description |
With a growing section of the web dedicated to the review, discussion and advertisement of products through micro-blogging, it has become imperative to develop techniques for automated product name extraction from user-generated short texts. In this report, we propose a system for mining Tweets to extract product name mentions and the corresponding sentiments towards the product. We survey the information retrieval research landscape and adopt a hybrid method for product name extraction. Our novel method combines a fuzzy dictionary matching approach with a CRF-based Named Entity Recognition approach to handle the inconsistencies of user-generated short texts during extraction. We also examine the widely studied field of sentiment mining: we begin by studying existing work and then propose a machine-learning-based approach that operates at sentence-level granularity and is adapted for micro-blogs. For evaluation, we generate a dataset of 2,032 Tweets, manually annotated with the associated sentiment and the product name mentions. Evaluation on this data shows that our hybrid method outperforms all the existing methods, achieving a Precision of 0.95, Recall of 0.98 and F1 score of 0.97, along with 69% accuracy for sentiment analysis. We also provide an extensive comparison of our algorithm with one of the most popular NER systems available, the Stanford NER, and show that our method produces a 38% improvement over it for user-generated micro text. A detailed analysis of the individual components is also provided to establish the synergistic performance of the hybrid method compared with the fuzzy dictionary matching method and the CRF method individually. |
author2 |
Gao Cong |
author_facet |
Gao Cong Saraf, Avnish |
format |
Final Year Project |
author |
Saraf, Avnish |
author_sort |
Saraf, Avnish |
title |
Product name and associated user sentiment retrieval from tweets |
title_short |
Product name and associated user sentiment retrieval from tweets |
title_full |
Product name and associated user sentiment retrieval from tweets |
title_fullStr |
Product name and associated user sentiment retrieval from tweets |
title_full_unstemmed |
Product name and associated user sentiment retrieval from tweets |
title_sort |
product name and associated user sentiment retrieval from tweets |
publishDate |
2015 |
url |
http://hdl.handle.net/10356/63473 |
_version_ |
1759857277872898048 |
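
The description above mentions fuzzy dictionary matching as one half of the hybrid extraction method. The record does not give the report's algorithm, dictionary or thresholds, so the following is only a minimal sketch of the general idea, using Python's standard `difflib` module. The dictionary entries, the function names and the 0.85 threshold are all hypothetical, and the CRF-based NER half is not shown.

```python
from difflib import SequenceMatcher

# Hypothetical product dictionary; the report's actual dictionary
# is not given in this catalog record.
PRODUCT_DICTIONARY = ["iphone 6", "galaxy s5", "nexus 5"]


def similarity(a: str, b: str) -> float:
    """Return a similarity ratio between 0 and 1 for two strings."""
    return SequenceMatcher(None, a, b).ratio()


def fuzzy_product_mentions(tweet: str, threshold: float = 0.85):
    """Return (dictionary entry, matched text, score) for approximate matches.

    Each dictionary entry is compared against every window of tokens in the
    tweet that has the same number of tokens as the entry. The threshold is
    an assumed value chosen for illustration only.
    """
    tokens = tweet.lower().split()
    matches = []
    for entry in PRODUCT_DICTIONARY:
        width = len(entry.split())
        for start in range(len(tokens) - width + 1):
            window = " ".join(tokens[start:start + width])
            score = similarity(window, entry)
            if score >= threshold:
                matches.append((entry, window, round(score, 2)))
    return matches


if __name__ == "__main__":
    # A misspelled mention, as is common in user-generated short texts.
    print(fuzzy_product_mentions("Loving my new iphne 6 so much"))
```

An approximate comparison of this kind tolerates the misspellings and abbreviations typical of Tweets, which an exact dictionary lookup would miss. In a hybrid system such matches would presumably be combined with the output of a sequence labeller, but how the report combines them is not stated here.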