Product name detection from user forums

People today tend to use acronyms rather than full names when referring to smartphones or other products on forums. This leads to a growing number of naming conventions (e.g., full names, acronyms, or names that refer to more than one product) used when refe...

Bibliographic Details
Main Author: Peh, Wei Leng
Other Authors: Sun Aixin
Format: Final Year Project
Language: English
Published: 2014
Subjects: DRNTU::Engineering
Online Access: http://hdl.handle.net/10356/58980
Institution: Nanyang Technological University
id sg-ntu-dr.10356-58980
record_format dspace
spelling sg-ntu-dr.10356-58980 2023-03-03T20:24:09Z Product name detection from user forums Peh, Wei Leng Sun Aixin School of Computer Engineering DRNTU::Engineering Bachelor of Engineering (Computer Science) 2014-04-17T08:20:10Z 2014-04-17T08:20:10Z 2014 2014 Final Year Project (FYP) http://hdl.handle.net/10356/58980 en Nanyang Technological University 65 p. application/pdf
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic DRNTU::Engineering
spellingShingle DRNTU::Engineering
Peh, Wei Leng
Product name detection from user forums
description People today tend to use acronyms rather than full names when referring to smartphones or other products on forums. This leads to a growing number of naming conventions (e.g., full names, acronyms, or names that refer to more than one product) for any given product. This report therefore presents techniques to automatically identify product names in online forums. The first technique combines natural language processing tools, standard matching of noun phrases against a phone database list and acronyms, and a rule-based method that further filters the list of phone names after extraction. The second technique uses a user pattern analysis model to extract possible phone names from a forum. In the results, the rule-based approach extracts more than 75% of the phone names. Its drawback is that too many unnecessary nouns are extracted as mobile names, so the result contains too many false positives. The pattern-based approach detects and extracts fewer mobile names, and further research on user pattern analysis is needed for it. Further improvement is therefore needed. First, more rules need to be defined to filter out unnecessary words. Second, special words that appear no more than 15 times in each thread can be extracted. Third, to supplement the user pattern analysis model, a list of categories of words that are rarely used in product names can be defined. Lastly, product names can be manually annotated in one XML thread and then extracted to train the rest of the data. As more refinements are made, the proposed techniques are expected to achieve better performance in identifying product names automatically.
author2 Sun Aixin
author_facet Sun Aixin
Peh, Wei Leng
format Final Year Project
author Peh, Wei Leng
author_sort Peh, Wei Leng
title Product name detection from user forums
title_sort product name detection from user forums
publishDate 2014
url http://hdl.handle.net/10356/58980
_version_ 1759853004523044864
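
The first technique in the description above (matching candidate phrases against a phone list and acronyms, then filtering with rules) can be illustrated with a minimal sketch. This is not the report's implementation: the phone list, the acronym table, the ambiguous-word rule, and every name below are hypothetical, and plain token n-grams stand in for the noun phrases that the report obtains from natural language processing tools.

```python
import re

# Hypothetical lookup data; the report's actual phone database and
# acronym list are not part of this record.
PHONE_NAMES = {"samsung galaxy s4", "samsung galaxy note", "iphone 5s"}
ACRONYMS = {
    "sgs4": "samsung galaxy s4",
    "s4": "samsung galaxy s4",
    "note": "samsung galaxy note",
}
# Hypothetical filter rule: short names that are also everyday words are
# rejected when they occur on their own, to limit false positives.
AMBIGUOUS = {"note"}

MAX_NGRAM = 3  # longest candidate phrase, in tokens


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def detect_product_names(post):
    """Return canonical product names found in a post, in order of first mention."""
    tokens = tokenize(post)
    found = []
    i = 0
    while i < len(tokens):
        match = None
        # Try the longest candidate phrase first so full names win over acronyms.
        for n in range(min(MAX_NGRAM, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + n])
            if phrase in AMBIGUOUS:
                continue  # rule-based filter: drop ambiguous candidates
            if phrase in PHONE_NAMES:
                match = (phrase, n)
                break
            if phrase in ACRONYMS:
                match = (ACRONYMS[phrase], n)
                break
        if match is None:
            i += 1
            continue
        name, length = match
        if name not in found:
            found.append(name)
        i += length  # skip past the tokens that were just matched
    return found
```

For example, `detect_product_names("Please note the S4 is cheaper")` resolves `S4` to its full name and ignores the everyday word "note". A filter of this kind trades recall for precision, which is the tension the description reports for the rule-based approach.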