7 Reasons Why Total and Average Number of Reviews are Misleading
September 8, 2018
Brand owners who realize the impact reviews have on purchase intent in their category try to improve this area.
One of the first questions we usually hear from our clients is "What is the optimal number of reviews?".
The starting point of the discussion is usually a value of around 20, as many believe that conversion dips below this number. In many stores, however, it is very difficult for some niche products to reach even 5. On the other hand, 20 becomes unattractive if competitors have gathered thousands of reviews (as seen in China).
So what is the optimal number of reviews? Instead of sticking to specific values, check on your competition store-by-store and set a target that will make your products look competitive.
After our clients figure out what number of reviews they want to gather, they face another dilemma: "How to synthesise information about the reviews of hundreds of products in dozens of stores?".
Unfortunately, there is one intuitive yet wrong answer they usually seek first: an average or a total.
7 Reasons why the average and the total number of reviews are wrong metrics.
Obviously, each metric has its pros and cons, therefore in order to criticise the metric, we need to first define what the expected use of such metric is.
Most of our clients want to understand how strong their brand is (namely attractive for shoppers) and what the opportunities are to boost its attractiveness further.
Summing up all the review numbers (or averaging them) results in metrics which are ultimately not delivering on defined expectations. They do not provide a clear picture of how good the brand is and which products are underperforming.
Success criteria for total number of reviews per product: above 20 = Green, below 20 = Red.
In the above example, Brand A has over three times more reviews than Brand B. The average number of reviews looks even more stunning: Brand A has only 3 products vs. 5 products of Brand B, so the average number of reviews of Brand A is over five times greater than that of Brand B. Both metrics clearly indicate the success of Brand A. The point is that Brand A has only 1 product that is well reviewed; the rest require intervention.
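The comparison can be reproduced with a few lines of Python. This is a minimal sketch: the review counts below are hypothetical numbers chosen to match the pattern described (they are not the original figures), and treating exactly 20 reviews as a pass is an assumption.

```python
from statistics import mean

THRESHOLD = 20  # success criterion from the text; counting exactly 20 as a pass is an assumption

# Hypothetical review counts per product (illustrative only, not the article's data).
reviews = {
    "Brand A": [300, 5, 10],
    "Brand B": [22, 21, 20, 15, 12],
}

def summarize(counts, threshold=THRESHOLD):
    """Return total, average, and how many products meet the threshold."""
    passing = sum(1 for c in counts if c >= threshold)
    return {
        "total": sum(counts),
        "average": mean(counts),
        "passing": passing,
        "share_passing": passing / len(counts),
    }

summary = {brand: summarize(counts) for brand, counts in reviews.items()}

for brand, s in summary.items():
    print(
        f"{brand}: total={s['total']}, average={s['average']:.1f}, "
        f"meeting threshold={s['passing']}/{len(reviews[brand])}"
    )
```

With these illustrative numbers, Brand A wins on both total and average, yet only one of its three products meets the threshold, while three of Brand B's five products do.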
1. Consumers / Shoppers never look at the total number of reviews for a brand (or the average); they look product by product.
If you want your products to stay competitive, you need to ensure they all (one-by-one) exceed a specific threshold of review numbers and not simply the overall number of reviews for these products.
2. Total number of reviews per brand is not comparable against competition.
The fact that one brand has more reviews than another has a very limited interpretation. The key driver of such a difference may simply be a different number of listed products. A metric that depends heavily on other metrics is difficult to interpret.
The total number of reviews is directly impacted by the number of products. Instead of raising the number of reviews per product, which should be the primary objective, one could just as well raise the total by listing more products.
3. Total or average number of reviews does not support differentiated number-of-reviews objectives for different products.
Let's assume you decide to set one number-of-reviews objective for the bestsellers of your brand and a different objective for newly launched variants. Synthesising the data with a total or an average would be highly misleading. The total/average score does not tell you at all whether the performance of products is satisfactory or requires intervention and, if so, where.
Example: Some products of Brand A sell well and are expected to gain disproportionately more reviews than others (especially since competing variants are also popular and effective in gaining reviews), whereas new variants may need some time to reach that high number of reviews. The combined or average number of reviews will blur that perspective.
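A short sketch of how differentiated objectives could be checked product by product. The segment names, target values, and review counts are all hypothetical assumptions made for illustration.

```python
# Hypothetical objectives per product segment (assumed values, not recommendations).
TARGETS = {"bestseller": 100, "new": 10}

# Hypothetical products of one brand.
products = [
    {"name": "Product 1", "segment": "bestseller", "reviews": 250},
    {"name": "Product 2", "segment": "bestseller", "reviews": 60},
    {"name": "Product 3", "segment": "new", "reviews": 12},
    {"name": "Product 4", "segment": "new", "reviews": 3},
]

def needs_attention(items, targets):
    """Return names of products that are below the objective of their own segment."""
    return [p["name"] for p in items if p["reviews"] < targets[p["segment"]]]

average = sum(p["reviews"] for p in products) / len(products)
flagged = needs_attention(products, TARGETS)

print(f"average reviews per product: {average}")
print(f"below their own objective: {flagged}")
```

In this sketch the average is well above the objective for new variants and not far below the one for bestsellers, so it says nothing useful; the per-segment check points directly at one bestseller and one new variant.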
4. Portfolio choices might be improperly impacted by maximizing the total/average number of reviews.
Mature products have usually gathered a significant number of reviews over months or even years. Delisting such products will almost always drive the total/average number of reviews down. This can encourage e-commerce managers to artificially maintain product cards for products which are no longer available.
Example: The brand manager of Brand A may be hesitant to discontinue Product 2, as it is a powerhouse for the total number of reviews metric.
5. Product cards temporarily removed by e-retailer drive total (or even sometimes average) number of reviews down.
Looking at the table above, it is easy to imagine the misleading results which would follow should one product be temporarily removed (e.g. Product 2 of Brand A). Whenever an e-retailer takes a product card down (some do it while products are out of stock), all reviews for this product disappear, driving the total number of reviews down (sometimes significantly). Such a situation would raise an alarm for your analysts, but the wrong one: the problem is obviously with distribution, not with reviews.
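To make the effect concrete, here is a minimal sketch comparing how the total and a threshold-based share react when one heavily reviewed product card disappears. The product names and review counts are hypothetical, and counting exactly 20 reviews as a pass is an assumption.

```python
THRESHOLD = 20  # assumed success criterion per product

# Hypothetical listed product cards and their review counts.
listed = {"Product 1": 8, "Product 2": 300, "Product 3": 25, "Product 4": 30}

def total_reviews(cards):
    return sum(cards.values())

def share_meeting(cards, threshold=THRESHOLD):
    """Fraction of listed product cards that meet the threshold."""
    return sum(1 for n in cards.values() if n >= threshold) / len(cards)

# The e-retailer temporarily takes down the card of Product 2.
after_removal = {name: n for name, n in listed.items() if name != "Product 2"}

total_change = total_reviews(after_removal) / total_reviews(listed) - 1
share_change = share_meeting(after_removal) - share_meeting(listed)

print(f"total reviews: {total_reviews(listed)} -> {total_reviews(after_removal)} ({total_change:.0%})")
print(f"share meeting threshold: {share_meeting(listed):.0%} -> {share_meeting(after_removal):.0%}")
```

With these assumed numbers the total collapses by more than 80%, while the share of products meeting the threshold moves by only a few percentage points. In a portfolio of four products the share still shifts noticeably; with more listed products the effect of a single card would be smaller.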
6. Products with a high number of reviews might distort the weak performance of other products.
If there are some products with a very high number of reviews, they might drive the total number of reviews up and conceal the fact that some products have very low numbers of reviews or none at all. There are various solutions aimed at motivating shoppers to leave reviews; however, if there is no clear picture as to which products need such attention, such gaps may remain.
7. Products with multiple variants distort the total number of reviews.
Product variants which share the same reviews (e.g. shades and sizes) may significantly bias both metrics (total and average). Regardless of whether the reviews are double counted or cleaned, there will always be a challenge of the right interpretation of such data.
How to measure the number of reviews properly?
We recommend the following approach:
- Understand what the minimum number of reviews should be in order to stay competitive (success criteria) in your category/countries. This can be a single number applied to every product in all stores and countries, or you may decide to differentiate it by store and country.
- eStoreCheck checks if a given success criterion is reached and flags products as True or False (above or below the accepted minimum number of reviews per product).
- Dashboards calculate the percentage of products which deliver on success criteria and those that are creating issues.
Example: see the color coding and the line "% of products that meet success criteria" for reference.
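The three steps above can be sketched in code as follows. This is only an illustration of the logic, not how eStoreCheck or its dashboards are implemented; the store names, minimums, and review counts are hypothetical.

```python
from collections import defaultdict

DEFAULT_MINIMUM = 20               # assumed default success criterion
STORE_MINIMUMS = {"Store Y": 50}   # hypothetical stricter minimum for one store

# Hypothetical observations: (store, product, number of reviews).
rows = [
    ("Store X", "Product 1", 35),
    ("Store X", "Product 2", 4),
    ("Store X", "Product 3", 20),
    ("Store Y", "Product 1", 1200),
    ("Store Y", "Product 2", 45),
]

def flag_rows(observations):
    """Flag each product card True/False against the minimum of its store."""
    flagged = []
    for store, product, count in observations:
        minimum = STORE_MINIMUMS.get(store, DEFAULT_MINIMUM)
        flagged.append((store, product, count >= minimum))
    return flagged

def percent_meeting(flagged):
    """Percentage of product cards per store that meet the success criterion."""
    by_store = defaultdict(list)
    for store, _product, ok in flagged:
        by_store[store].append(ok)
    return {store: 100 * sum(oks) / len(oks) for store, oks in by_store.items()}

flags = flag_rows(rows)
dashboard = percent_meeting(flags)
attention = [(store, product) for store, product, ok in flags if not ok]

for store, pct in dashboard.items():
    print(f"{store}: {pct:.0f}% of products meet success criteria")
print("Needs attention:", attention)
```

The list of cards flagged False doubles as the to-do list: it names exactly which product in which store needs more reviews.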
Benefits of the above approach?
- Looking through shoppers' eyes, i.e. product by product: a clear picture of how close the brand is to reaching the desired success criteria.
- Natural link to products that require attention ("False" on success criteria).
- Listing / delisting of a single product card seldom changes the metric significantly.