De-averaging and the Challenge of Error Margin

By Muhammad Zubair

One-size-fits-all approaches are outdated. Marketing plans designed for the “average” consumer have fallen out of fashion because they are usually relevant to only a handful of consumers. For instance, the average pack size bought in the tea category could be 333g, yet no consumer is ACTUALLY buying this size: some buy a smaller pack and some a larger one, and the average merely happens to work out to 333g.

Furthermore, there are two issues with averages in the presence of outliers:

  1. Averages mask outliers – it is hard to spot outliers in averages
  2. Outliers skew averages, so the average doesn’t even represent typical behaviour

So we get the worst of both worlds: the average shows us neither the typical behaviour nor the unusual behaviour.
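A small sketch makes both points concrete. The pack sizes below are made up purely for illustration (they are chosen so that the mean lands near the 333g mentioned earlier); they are not taken from any real tea-category data.

```python
from statistics import mean, median

# Hypothetical pack sizes in grams bought by nine households (illustrative only).
pack_sizes = [100, 100, 100, 200, 200, 200, 200, 950, 950]

avg = mean(pack_sizes)        # pulled upwards by the two 950g purchases
typical = median(pack_sizes)  # the middle household's pack size

print(f"mean   = {avg:.1f}g")
print(f"median = {typical}g")
print("anyone buying the mean size?", avg in pack_sizes)
```

In this toy data the mean sits well above what the middle household buys, no household buys the mean size, and the two large purchases are invisible once everything is collapsed into one number.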

De-averaging means recognizing that there are many small consumer segments that all require different marketing experiences.

So we may well find that most consumers buy both Ariel detergent and Lux soap. However, one specific consumer group buys Ariel with Dettol, and another segment buys Lux soap with Brite detergent; looking at the data through an “overall” lens would never allow us to discover this. Naturally, one must be wise enough not to push this beyond the point of diminishing returns, but “different marketing paths for different people” has been the theme for many years now.
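As a rough sketch of how the overall lens hides segment-level pairings, the snippet below counts brand pairs first across all households and then within each segment. The segment labels (A, B, C) and the household counts are hypothetical, invented only to mirror the brands named above.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical households: (segment label, brands bought). Illustrative only.
baskets = (
    [("A", {"Ariel", "Lux"})] * 6
    + [("B", {"Ariel", "Dettol"})] * 2
    + [("C", {"Lux", "Brite"})] * 2
)

overall = Counter()
by_segment = defaultdict(Counter)
for segment, basket in baskets:
    # Sort the brands so each pair is always counted under the same key.
    for pair in combinations(sorted(basket), 2):
        overall[pair] += 1
        by_segment[segment][pair] += 1

top_overall = overall.most_common(1)[0][0]
top_by_segment = {seg: c.most_common(1)[0][0] for seg, c in by_segment.items()}

print("overall:", top_overall)
print("by segment:", top_by_segment)
```

The overall count surfaces only the Ariel and Lux pairing; the other two pairings appear only once the data is split by segment.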

The availability of big data, coupled with high computing power, has allowed data analysts to slice and dice data. This approach, while increasingly popular, is the direct opposite of the traditional statistical way of looking at data. Statisticians have long stressed the importance of margin of error (MOE) and level of significance (LOS) in any analysis. In layman's terms, they question the validity of any analysis unless the following two questions are adequately answered:

  1. How close is the estimate to the true (actual) value? – called the margin of error
  2. How stable is the estimate (that is, if we were to repeat the survey, how likely are we to get the same results)? – this is what the level of significance, and the associated confidence level, addresses

Invariably, we are told that sampling works best at an OVERALL level, and that as you start thin-slicing the data, you compromise on both MOE and LOS. Statistics is truly a funny subject: the same sample that has a high MOE (and low LOS) at a cell level becomes an efficient sample on an overall basis, because precision depends on the number of respondents behind each estimate. So whenever we attempt analysis at a micro-level, research software generates tables with a caution note saying “LOW BASE” in red font. This has discouraged market researchers from doing micro-level analysis.
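To see why the same survey looks efficient overall and shaky at cell level, here is a minimal sketch using the textbook margin-of-error formula for a proportion, MOE = z * sqrt(p(1 - p) / n). It assumes simple random sampling, the worst-case proportion p = 0.5 and a 95% confidence level (z = 1.96); the base sizes of 2,000 and 50 are hypothetical.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate MOE for a proportion p estimated from n respondents.

    Assumes simple random sampling; z = 1.96 corresponds to 95% confidence.
    """
    return z * math.sqrt(p * (1 - p) / n)

overall = margin_of_error(2000)  # hypothetical total sample
cell = margin_of_error(50)       # hypothetical thin slice of that same sample

print(f"overall (n=2000): +/- {overall:.1%}")
print(f"cell    (n=50):   +/- {cell:.1%}")
```

Under these assumptions the overall estimate is good to roughly two percentage points, while the 50-respondent cell carries a margin of about fourteen points, which is the arithmetic behind the “LOW BASE” warning.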

Meanwhile, a whole new world of analytics is emerging. Its selling point is its ability to identify small clusters of consumers, and its practitioners rarely speak of terms like MOE and LOS; these terms are simply not in their vocabulary. Analysts, or data scientists as they are more fashionably called now, are more concerned with the actionability of their recommendations than with the statistical accuracy of their findings. Accordingly, it is IT professionals who are leading the data science field in the global arena, while statisticians (and market researchers) have become mere bystanders.

So, to truly practice de-averaging, one needs a new set of skills: the ability to take a reasonable level of risk, fully knowing that the findings are based on relatively small samples and that actions therefore need to be taken with caution and in consultation with the people on the ground.

By Muhammad Zubair, CEO Foresight Research (Pvt.) Ltd. – Pakistan