Modeling customer review ratings through Kmeans clustering and SVM classification

Bibliographic Details
Main Author: Lim, Gina Qian Ying
Other Authors: Chen Songlin
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2022
Online Access: https://hdl.handle.net/10356/159096
Institution: Nanyang Technological University
Description
Summary: In the digital era, consumers' decisions are highly influenced by online reviews, making it important for businesses to understand such electronic word of mouth in order to satisfy their customers. However, online reviews are rarely straightforward and often contain mixed sentiments, making it hard to gather useful insights manually from large volumes of data. Moreover, digital texts such as online reviews are, in their raw form, difficult for computers to ingest and analyze for customer sentiment. Further, although the analysis of sentiment in digital texts has been widely studied, texts with neutral polarity have been largely ignored, with the majority of studies focusing on the binary problem of understanding reviews with explicit positive or negative polarity only. This study proposes an integrated machine learning model that combines clustering and classification techniques to understand the sentiments found in customer reviews with neutral polarity. A support vector machine is used for classification, together with k-means clustering. N-grams are used in post-processing to analyze the results. The results show that the model is efficient in classifying reviews with mixed sentiments, and that analysis of the clustering results allows the text content of mixed-sentiment reviews to be understood.
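
The N-gram post-processing step mentioned in the summary can be illustrated with a minimal sketch. This is not the author's implementation: the record does not state the tokenization, the N-gram length, or how cluster labels are produced, so the function names `ngrams` and `top_ngrams_by_cluster`, the sample reviews, and the hard-coded cluster labels (standing in for k-means output) are all assumptions made for illustration.

```python
import re
from collections import Counter, defaultdict


def ngrams(text, n=2):
    """Return the word n-grams of `text` (lowercased, letters and apostrophes only)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def top_ngrams_by_cluster(reviews, labels, n=2, k=3):
    """Count n-grams separately for each cluster label and keep the k most frequent."""
    counts = defaultdict(Counter)
    for review, label in zip(reviews, labels):
        counts[label].update(ngrams(review, n))
    return {label: counter.most_common(k) for label, counter in counts.items()}


# Hypothetical reviews; the labels stand in for cluster assignments that a
# clustering step such as k-means would produce (assumed, not from the study).
reviews = [
    "Battery life is great but the screen is dim",
    "Great battery life, average camera",
    "Delivery was slow but customer service was helpful",
    "Slow delivery, helpful customer service",
]
labels = [0, 0, 1, 1]

summary = top_ngrams_by_cluster(reviews, labels, n=2, k=2)
for label, phrases in summary.items():
    print(label, phrases)
```

Inspecting the most frequent phrases in each cluster is one simple way to see what topics a group of mixed-sentiment reviews has in common; the study's actual analysis may differ.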