Overcoming the J-shaped Distribution of Product Reviews
While product review systems that collect and disseminate opinions about products from recent buyers (Table 1) are valuable forms of word-of-mouth communication, evidence suggests that they are overwhelmingly positive. Kadet notes that most products receive almost five stars. Chevalier and Mayzlin a...
Saved in:
Main Authors: | , , |
---|---|
Format: | text |
Language: | English |
Published: |
Institutional Knowledge at Singapore Management University
2009
|
Subjects: | |
Online Access: | https://ink.library.smu.edu.sg/sis_research/780 http://dx.doi.org/10.1145/1562764.1562800 |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Institution: | Singapore Management University |
Language: | English |
id |
sg-smu-ink.sis_research-1779 |
---|---|
record_format |
dspace |
spelling |
sg-smu-ink.sis_research-17792010-11-26T07:24:03Z Overcoming the J-shaped Distribution of Product Reviews HU, Nan ZHANG, Jennifer Pavlou, Paul While product review systems that collect and disseminate opinions about products from recent buyers (Table 1) are valuable forms of word-of-mouth communication, evidence suggests that they are overwhelmingly positive. Kadet notes that most products receive almost five stars. Chevalier and Mayzlin also show that book reviews on Amazon and Barnes & Noble are overwhelmingly positive. Is this because all products are simply outstanding? However, a graphical representation of product reviews reveals a J-shaped distribution (Figure 1) with mostly 5-star ratings, some 1-star ratings, and hardly any ratings in between. What explains this J-shaped distribution? If products are indeed outstanding, why do we also see many 1-star ratings? Why aren't there any product ratings in between? Is it because there are no "average" products? Or, is it because there are biases in product review systems? If so, how can we overcome them?<p>The J-shaped distribution also creates some fundamental statistical problems. Conventional wisdom assumes that the average of the product ratings is a sufficient proxy of product quality and product sales. Many studies used the average of product ratings to predict sales. However, these studies showed inconsistent results: some found product reviews to influence product sales, while others did not. The average is statistically meaningful only when it is based on a unimodal distribution, or when it is based on a symmetric bimodal distribution. However, since product review systems have an asymmetric bimodal (J-shaped) distribution, the average is a poor proxy of product quality.<p>This report aims to first demonstrate the existence of a J-shaped distribution, second to identify the sources of bias that cause the J-shaped distribution, third to propose ways to overcome these biases, and finally to show that overcoming these biases helps product review systems better predict future product sales.<p>We tested the distribution of product ratings for three product categories (books, DVDs, videos) with data from Amazon collected between February--July 2005: 78%, 73%, and 72% of the product ratings for books, DVDs, and videos are greater or equal to four stars (Figure 1), confirming our proposition that product reviews are overwhelmingly positive.<p>Figure 1 (left graph) shows a J-shaped distribution of all products. This contradicts the law of "large numbers" that would imply a normal distribution. Figure 1 (middle graph) shows the distribution of three randomly-selected products in each category with over 2,000 reviews. The results show that these reviews still have a J-shaped distribution, implying that the J-shaped distribution is not due to a "small number" problem. Figure 1 (right graph) shows that even products with a median average review (around 3-stars) follow the same pattern. 2009-10-01T07:00:00Z text https://ink.library.smu.edu.sg/sis_research/780 info:doi/10.1145/1562764.1562800 http://dx.doi.org/10.1145/1562764.1562800 Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Communication Technology and New Media Computer Sciences |
institution |
Singapore Management University |
building |
SMU Libraries |
continent |
Asia |
country |
Singapore Singapore |
content_provider |
SMU Libraries |
collection |
InK@SMU |
language |
English |
topic |
Communication Technology and New Media Computer Sciences |
spellingShingle |
Communication Technology and New Media Computer Sciences HU, Nan ZHANG, Jennifer Pavlou, Paul Overcoming the J-shaped Distribution of Product Reviews |
description |
While product review systems that collect and disseminate opinions about products from recent buyers (Table 1) are valuable forms of word-of-mouth communication, evidence suggests that they are overwhelmingly positive. Kadet notes that most products receive almost five stars. Chevalier and Mayzlin also show that book reviews on Amazon and Barnes & Noble are overwhelmingly positive. Is this because all products are simply outstanding? However, a graphical representation of product reviews reveals a J-shaped distribution (Figure 1) with mostly 5-star ratings, some 1-star ratings, and hardly any ratings in between. What explains this J-shaped distribution? If products are indeed outstanding, why do we also see many 1-star ratings? Why aren't there any product ratings in between? Is it because there are no "average" products? Or, is it because there are biases in product review systems? If so, how can we overcome them?<p>The J-shaped distribution also creates some fundamental statistical problems. Conventional wisdom assumes that the average of the product ratings is a sufficient proxy of product quality and product sales. Many studies used the average of product ratings to predict sales. However, these studies showed inconsistent results: some found product reviews to influence product sales, while others did not. The average is statistically meaningful only when it is based on a unimodal distribution, or when it is based on a symmetric bimodal distribution. However, since product review systems have an asymmetric bimodal (J-shaped) distribution, the average is a poor proxy of product quality.<p>This report aims to first demonstrate the existence of a J-shaped distribution, second to identify the sources of bias that cause the J-shaped distribution, third to propose ways to overcome these biases, and finally to show that overcoming these biases helps product review systems better predict future product sales.<p>We tested the distribution of product ratings for three product categories (books, DVDs, videos) with data from Amazon collected between February--July 2005: 78%, 73%, and 72% of the product ratings for books, DVDs, and videos are greater or equal to four stars (Figure 1), confirming our proposition that product reviews are overwhelmingly positive.<p>Figure 1 (left graph) shows a J-shaped distribution of all products. This contradicts the law of "large numbers" that would imply a normal distribution. Figure 1 (middle graph) shows the distribution of three randomly-selected products in each category with over 2,000 reviews. The results show that these reviews still have a J-shaped distribution, implying that the J-shaped distribution is not due to a "small number" problem. Figure 1 (right graph) shows that even products with a median average review (around 3-stars) follow the same pattern. |
format |
text |
author |
HU, Nan ZHANG, Jennifer Pavlou, Paul |
author_facet |
HU, Nan ZHANG, Jennifer Pavlou, Paul |
author_sort |
HU, Nan |
title |
Overcoming the J-shaped Distribution of Product Reviews |
title_short |
Overcoming the J-shaped Distribution of Product Reviews |
title_full |
Overcoming the J-shaped Distribution of Product Reviews |
title_fullStr |
Overcoming the J-shaped Distribution of Product Reviews |
title_full_unstemmed |
Overcoming the J-shaped Distribution of Product Reviews |
title_sort |
overcoming the j-shaped distribution of product reviews |
publisher |
Institutional Knowledge at Singapore Management University |
publishDate |
2009 |
url |
https://ink.library.smu.edu.sg/sis_research/780 http://dx.doi.org/10.1145/1562764.1562800 |
_version_ |
1770570710550315008 |