Comparison of Major Supermarkets in Australia — Part 4 (Price Comparison: Woolworths vs Coles)

Periculosa Libertas
6 min readSep 1, 2021

--

Jupyter notebook file (here)

In the previous part, I used RapidFuzz to assist me in identifying and matching identical products between Coles and Woolworths. The following section details how I prepared the Coles DataFrame (with the corresponding matched items from Woolworths) to be merged with the Woolworths DataFrame, in what is essentially an inner join in SQL — but using Pandas.

Preprocessing and Inner Merge

The CSV file after manual matching (with ‘Y’ values in the Manual Match column for matches)

The objective here is to affect the matching done previously on to the ‘Brand_Product_Size’ column by replacing the value in the ‘Brand_Product_Size’ column with the one in ‘WOW Product Match’, if the item matched is the correct one. Since, the ‘Brand_Product_Size’ column is the concatenation of ‘Brand’ , ‘Product Name’ and ‘Package Size’ of Coles’ products, changing the ‘Brand_Product_Size’ values to the way Woolworths names their product would enable the ‘Brand_Product_Size’ to be used as the ‘key’ on which both DataFrames can be inner joined.

The following transformations were done:

  1. If the ‘Match Score’ column is 100, the values in ‘Brand_Product_Size’ is replaced with the ‘WOW Product Match’ value.
  2. If the ‘Manual Match’ column has a ‘Y’, the values in ‘Brand_Product_Size’ is replaced with the ‘WOW Product Match’ value.
  3. Irrelevant columns were removed e.g. Category, Product URL etc.

This DataFrame (originally a Coles DataFrame which had Woolworths product matches inserted in new columns) is then inner merged (joined) with the Woolworths DataFrame, with the key being ‘Brand_Product_Size’. After the preliminary error check, the category columns are removed, and the products duplicated in multiple categories are also removed.

Preliminary Error Check

Before proceeding to the actual comparison, I outputted a CSV file which had mismatched brand names — something that is a red-flag for being an incorrect manual match. Sure enough, there were matched items which had different brands which was the result of human error. These errors were then rectified and all of the prior code re-run.

I matched Woolworths’ private label ‘Macro’ with Melrose

However, there were other errors which were due to Coles’ data integrity issues. In the following pictures, it’s evident that a mistake was made by Coles. As far as I know, Birch & Waite isn’t an exclusive brand of Coles or part of their private labels.

Further Error Checking

This was actually done much later in the analyses. Every time an error was observed, it was corrected and the code re-run.

A new column which displays the percentage discount versus the competition was created. Once the percentage values were sorted from largest to smallest, anything unusually large (i.e. >50%) was examined for being erroneous manual match.

The following errors were found and rectified:

Items cheapest at Coles — % discount vs Woolworths
Different branded item was mistakenly matched
Different packaging size was mistakenly matched
Items cheapest at Woolworths — % discount vs Coles
A 9 pack bundle was matched with one single item
Pack matched with individual item
This took browsing through both websites and examining all the details to conclude that they are totally different items
Different package size
Different package size

Woolworths vs Coles — Price Comparison

Total Matched Identical Products in Sample:

  • 4764

Same Price Products:

  • 2477 products have identical price (52%)
  • 2012 products have the same price and the same ‘on special’ status
  • 465 products have the same price but different ‘on special’ status

Same Price, Different ‘on special’ Status:

  • 460 products were from Coles and ‘on special’ to price match Woolworths.
  • 5 products were from Woolworths and ‘on special’ to price match Coles.

Different Price Products:

  • 2287 products have different price (48% of sample)
  • 1307 products were cheaper at Woolworths
  • 980 products were cheaper at Coles
  • 649 products had the same ‘on specials’ status

Different Price, Same ‘on special’ Status:

  • 423 were cheaper at Woolworths
  • 226 were cheaper at Coles

Different Price, Different ‘on special’ Status:

  • 884 were cheaper at Woolworths
  • 754 were cheaper at Coles

Products ‘on special’:

  • 2661 products had the same ‘on special’ status (55.8% of sample)
  • 2103 products had different ‘on special’ status (44.2% of sample)
  • 1798 products ‘on special’ at Coles
  • 1465 products ‘on special’ at Woolworths

Products ‘on special’ and cheapest:

  • 865 / 1798 products ‘on special’ at Coles were the cheapest (48%)
  • 1234 / 1465 products ‘on special at Woolworths were the cheapest (84%)

Products ‘on special’ but More Expensive Than Rival:

  • At the time the data was collected, the following items were ‘on special’ at Coles but Woolworths had cheaper prices:
This is no longer the case, after checking the prices on 1st September, 2021
  • At the time the data was collected, the following item was ‘on special’ at Woolworths but Coles had cheaper prices:
No longer the case, both supermarkets have price matched and increased the price to $3

Discount % Comparison

The frequency of varying degrees of discounts (%) vs. the competitor:

Woolworths vs Coles — Distribution of the % in Discounts when being cheaper than competition

Since, approximately half of the products have identical prices, the histogram of the percentage of discounts when cheaper than the competition provides us with an idea of the pricing strategy of both Woolworths and Coles.

Woolworths has a huge spike in numbers with a 50% discount versus the same product at Coles. The other cluster is around the 20–30% region.

Coles seems to have a much higher concentration of products centred around the low to mid 20% region, with the 50% discount spike being of significance.

The discounts histogram does not differentiate whether the items are ‘on special’. After a deeper look, I can confirm that all of the items with discounts of 50% and above, are indeed all ‘on special’, at both supermarkets. There is one lone exception from Coles — most likely due to human error:

Exactly 50% off, no ‘on special’ tag — most likely human error

A comparison of the two histogram above clearly shows that Woolworths have more large discounted items than Coles:

Summary Insights:

Although you can find items at each supermarket which are cheaper than their competitor, generally, Woolworths provides better value to consumers.

Coles’ strategy seems to involve having more items ‘on special’ (1798 items on special at Coles vs. 1465 at Woolworths) but the actual magnitude of the discounts are less than Woolworths. This fact was clearly demonstrated by the fact that only 48% of the products ‘on special’ at Coles were cheaper than Woolworths. In comparison, 84% of the products ‘on special’ at Woolworths were cheaper than Coles. Furthermore, the ‘Same Price, Different “on special” Status’ section confirms that Coles has to put products ‘on special’ just to price match Woolworths.

--

--

No responses yet