Spam review detection

As more people depend heavily on information presented on the web, user-generated content such as reviews can easily influence the purchase decisions of other consumers. As a result, fake reviews are frequently posted to popular online review websites to mislead consumers....

Bibliographic Details
Main Author: Tan, Hui Min.
Other Authors: School of Computer Engineering
Format: Final Year Project
Language: English
Published: 2013
Subjects: DRNTU::Engineering::Computer science and engineering
Online Access: http://hdl.handle.net/10356/54968
Institution: Nanyang Technological University
id sg-ntu-dr.10356-54968
record_format dspace
spelling sg-ntu-dr.10356-54968 2023-03-03T20:38:01Z Spam review detection Tan, Hui Min. School of Computer Engineering DRNTU::Engineering::Computer science and engineering Bachelor of Engineering (Computer Science) 2013-11-20T05:58:56Z 2013-11-20T05:58:56Z 2013 2013 Final Year Project (FYP) http://hdl.handle.net/10356/54968 en Nanyang Technological University 97 p. application/pdf
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic DRNTU::Engineering::Computer science and engineering
description As more people depend heavily on information presented on the web, user-generated content such as reviews can easily influence the purchase decisions of other consumers. As a result, fake reviews are frequently posted to popular online review websites to mislead consumers. Several studies have addressed spam review detection. However, most research focuses on a specific review website, such as Amazon or Yelp. This raises the question of whether the features suggested in these research papers perform equally well in other domains, such as TripAdvisor. In this project, a series of progressive phases was employed to implement an algorithm that detects these spam reviews with reference to the suggested set of features and procedures. In total, three types of features were chosen for the study: N-gram features, review-centric features, and user behavior features. In the experiments, N-gram features generally achieved better accuracy than review-centric features, with the difference in accuracy ranging from 10% to 30%. User behavior features consistently outperformed the other two feature sets, with an average accuracy of 60% and above. Despite the limitations of this project, the findings show that features relating to user behavior give the best accuracy of the three sets, which suggests that they are more versatile.
author2 School of Computer Engineering
format Final Year Project
author Tan, Hui Min.
title Spam review detection
publishDate 2013
url http://hdl.handle.net/10356/54968
_version_ 1759856194869002240