James Rohde and Eric Perz
In case you have been lost at sea, big data is very trendy right now. We’re told it’s the next big thing, one that will tell us all we could ever need to know. This hype, driven of course by researchers, is accompanied by what some are calling the death knell of surveys.
Big data…really? What makes it so ‘big’?
There are many, including us, who find the term ‘big data’ to be unclear or confusing. IBM does a good job explaining big data with their Four V’s framework, referring to its volume, velocity, variety, and veracity. If you write off volume as an obvious dimension and velocity (say, real-time decisioning) as a different conversation altogether, the discussion boils down to variety and veracity.
Today we log clicks, motion, SKUs, biometrics, time on site, cookies, transactions, lat/long, conversations, impressions, demographics, contact preferences, and even meatloaf consumed in the past 3 months.
Thankfully, skilled analysts are effective at filtering out the garbage and cleaning up messy fields, even as the amount of garbage and messy data grows. Big data is not really the stifling conundrum that the name may connote. Regardless, big data has won the buzzword battle, so in this commentary we will continue to use it. We just had to get our grievances out there.
Like it or not, large data sets are more accessible than ever before and provide information that cannot be replicated with surveys. Moreover, there are certainly surveys that could be replaced by big data and smart analytics. Sorry, it’s a fact. This is driving some in our industry to an irrational fear that the survey is doomed; it’s not that simple, and it’s only true in a limited way.
For some studies, yes, the survey is going to be deemed the less reliable source of information, but that is no different from any other new methodology finding its way into the survey-driven world. Accessible big data is not something researchers need to fear. Having not only more data but largely more accurate data at our disposal (at least, once it has been processed by analysts) adds huge strength to what our precious surveys really do better than anything else: getting to the why.
I’m not talking about open-ends or literally asking about importance on a scale; this oversimplification works to the detriment of our industry by creating false positives. To shorten what is a long argumentative topic: open-ends and importance-scales can have their place but it is a much smaller place than they currently occupy.
Big Data vs. Survey
On its own, big data can and does offer a great deal of utility. Through various sources, assuming you can link them, you can see nearly everything about individual actions:
- What website did they visit?
- What have they purchased?
- When did they purchase?
- What did they watch right before they purchased?
- How did they make the purchase?
- Were they exposed to a display ad?
These are clear strengths of a big dataset, since they are all questions that give notoriously unreliable results when asked in a survey. Where these datasets often (but not always) begin to run into problems is where we begin to look for context to help us understand the whys, beyond mere inference or imputation.
- What do they think of our product?
- What do they think of our brand?
- Why did they make the purchase (gift, new hobby, replacing an old model, etc.)?
- Has somebody recommended us to them?
- How did their interest in our product evolve?
- Do they remember any of the advertisements we’ve hurled their way?
For the sake of full disclosure, there are some data sources that claim to provide some of the information that I’m placing in the survey space. However, we are not talking about where information CAN come from but where it SHOULD come from.
When and How, not If
If I am looking to make strategic decisions about my brand, I would no more want Facebook stats on positive brand comments to understand brand loyalty than I would survey data on brand impressions to drive my ad spend. Modelling with big data is in many ways an attempt to represent reality in a structured mathematical formula. You have to ask yourself: do I have the right data to do that? Taking a step back and considering the best source of information, rather than confusing the task at hand with data for data’s sake, gets us to the best decision with the most efficient use of time.
Big Data & Survey, BFF
Instead of trying to show up big data’s capabilities (a silly notion, as surveys could never compete given sample expenses), we should leverage the strengths of big data and survey data to fortify each other. Within our overall research efforts, that combination creates a superior methodology that is of more use than either working independently.
James Rohde is Research Supervisor and Eric Perz is Associate Director, Measurement and Analytics at M.A.R.C. USA.