by Kathy Frankovic, former director of surveys at CBS News and a member of ESOMAR’s Professional Standards Committee
Election polling is the most visible part of market, opinion and social research. It carries the heavy burden of getting things right, but its previous successes have also brought high and perhaps unearned expectations for its accuracy. This year, and the U.S. presidential election in particular, provides a good example of what happens when people forget the limitations of polls: that sampling and non-response matter, and that ascribing too much precision to polling estimates in times of change can make pundits and journalists look as silly as the pollsters they berate.
We’ve all seen the discussion about what happened in the U.S. last week: was there a “late surge,” were people misrepresenting their vote intention (the “shy Tories/Trump voters”), could pollsters have missed some important groups, did everyone put too much confidence in poll results? We have also seen claims for “new” methods to replace polling – silver-bullet solutions for a problem that may or may not exist.
The precision people wanted to see in polls this year made polling aggregators and pundits far more sure of what would happen than was realistic. Polls do not have absolute accuracy, and even the best pundits can misread them. This year Nate Silver (fivethirtyeight.com), lionized after previous elections for the accuracy of his poll-based algorithms for estimating the probability of the outcome, gave Democrat Hillary Clinton a nearly 70% probability of winning (to his credit, that figure had dropped in the last week from a higher one, but it was still a clear prediction). Other aggregators (like pollster.com) put the odds of a Clinton victory even higher, above 98%.
To be clear: nearly all final U.S. 2016 pre-election polls showed a small national lead for Clinton. And she carried the national popular vote by about two percentage points over Republican nominee Donald Trump (now with a counted two million vote lead in the national vote totals). But the national vote count (and national polls) say little about what happens in individual states, and that’s what matters. Had Clinton won the necessary Electoral College votes, we would be having a very different discussion about polling today, asking how pollsters could have done even better rather than calling the pre-election poll results a “massive, historical, epic disaster.” While there are methodological issues with the 2016 election polls, the industry should not be “reeling.”
That over-reliance on numbers made this year’s post-election commentary even more apocalyptic than necessary, as seen in the already-noted descriptions of a “reeling” profession and a “massive, historical and epic disaster.” Those descriptions are not accurate: see Sean Trende of RealClearPolitics, another poll aggregator, and The New York Times’ Nate Cohn on this. Even Nate Silver has called the situation a “failure of conventional wisdom more than a failure of polling.”
The individual final pre-election polls ranged from a six-point Clinton lead to a three-point margin for Republican victor Donald Trump. Pre-election comparisons are complicated because some polls included the third-party candidates (Gary Johnson for the Libertarians and Jill Stein for the Green Party) and others did not. When included, the third parties received 5 to 9% combined (they have received about 5% of the actual vote). The polls also varied in their estimates of undecided voters: that percentage was as low as 1% and as high as 9%, depending on the poll.
The Clinton national vote win mattered little, as Trump carried Michigan, Pennsylvania and Wisconsin (state-level polling varied widely in quality, and the accuracy gap was particularly noticeable in Wisconsin). Those three states have a total of 46 electoral votes, and put Trump over the top in the electoral vote count. Just over 100,000 votes made the difference. [By contrast, Clinton leads in California by more than 3,000,000 votes: an excess of votes cast in the wrong place.]
This structural peculiarity of the American political system is not especially popular. In 2013, the Gallup Poll and others found six in ten Americans, Republicans, Democrats and independents alike, supporting the abolition of the Electoral College and instead choosing Presidents by the national popular vote. [Of course, after this election, Republicans are likely to change their minds and think the Electoral College is quite a good thing, just as they did in 2000, when Democrat Al Gore won the popular vote, but lost the Presidency to George W. Bush.]
In an election this close, there are lots of explanations. Some have nothing to do with polls. Campaigns make decisions affecting small groups of voters who are hard to track in polls. Television advertising can matter (the Trump campaign poured money into Wisconsin, while the Clinton campaign took the state for granted and the candidate herself never visited). The Trump campaign also admitted it wanted to suppress turnout of key Clinton groups (college-educated women, blacks, young liberals) by reminding them of Bill Clinton’s past womanizing and earlier Hillary Clinton statements she later disavowed. Votes cast by young voters and black voters did decline this year, and overall Clinton received far fewer votes than Barack Obama in 2012.
But pre-election polls aren’t off the hook. National polls overestimated Clinton’s popular vote by about the same amount that they underestimated Barack Obama’s margin in 2012. Many state polls in critical states, especially in the Midwest, were off by more, and had Clinton clearly ahead in states that Trump carried.
An American Association for Public Opinion Research (AAPOR) panel, formed before the election, will review the election polls. Like the British Polling Council panel that followed the 2015 general election in the UK, its results won’t be available for several months, but serious post-election investigations (beginning with the 1948 report that followed the election that gave us “Dewey Defeats Truman”) nearly always suggest worthwhile improvements in methodology, and those suggestions are often adopted. Pollsters themselves will be conducting internal reviews to see if they can match future results even more closely. Any systematic error will be identified and – as happens all the time – learned from.
WHAT WE ALREADY KNOW
But we know some things now.
There was a late surge: The exit polls show movement towards Trump nationally and in critical states in the final days before the election (CNN provides an excellent set of tabulations). Across the country, Clinton led by two points among those who made up their minds before the last week of the campaign, and lost to Trump by five among the 13% who made up their minds in the last week. More than 10% of those who decided in the last week didn’t choose either Trump or Clinton. Similarly, about 10% of voters in the three important Rust Belt states decided in the final days, and Trump decisively led among them: by 11 points in Michigan, 16 points in Pennsylvania, and 27 points in Wisconsin.
Eleven days before the election, FBI Director James Comey told Congress he was reopening the investigation into Clinton’s private email server; a week later he said there was nothing new. The episode put an issue that had long bedeviled Clinton back before the public, after it may have receded from most voters’ minds. The shift was missed by the polls: many state polls were completed days before the election, before the full impact of these events could be measured.
There may have been shy Trump voters: Many polls saw little or no change in the last week, though the ABC News/Washington Post tracking poll showed movement first towards and then away from Trump; its final poll matched the polling average. Could the final movement of voters towards Trump in the last week, as indicated in the exit polls, mean that some voters had felt uncomfortable announcing a vote for Trump earlier? So far there is no direct evidence for it, and there are few differences in Trump support between telephone and online polls in general.
Did pollsters interview good samples? The Trump campaign’s suppression efforts, noted earlier, may have turned some “likely voters” into no-shows. Other voters may not have been in the polls at all. This year there was not just a gender gap, but also a race gap, a marriage gap, an age gap, a religious gap, a rural-urban gap, and an education gap, particularly among white voters. Less-educated white voters overwhelmingly supported Donald Trump, and if they were missing from the polls, it was Trump voters who were missing.
Exit polls have a known education and age response bias (perhaps not a surprise when those polls require respondents to fill out paper questionnaires), and it is easy to speculate that at least some less-educated voters could have been absent in pre-election polls of all types.
Years ago, we learned that young people, minorities and urban residents (in other words, people who move frequently) were most likely to have only mobile phones, not landlines. Polls whose samples included mobile phone numbers were better at gauging support for Barack Obama. Mobile phone numbers are now a routine part of telephone polling samples.
Single-digit response rates for telephone surveys mean more weighting and modeling, and that increases the possibility of error. Online polls have coverage issues, lack the scientific justification of probability sampling, and require significant modeling, but this year they performed as well as, or even better than, phone polls. (This is quite different from recent British examples – the 2015 election and the referendum on whether or not the United Kingdom should exit the European Union.)
The “Gold Standard” — probability telephone surveys — might better be called the “Silver Standard”: as we have seen, it can tarnish and needs to be frequently reviewed and polished. Achieving that standard requires significant time and energy to reach potential respondents, but the days and weeks that this can take limit the news value of polls, and it costs much more than news organizations today are willing – and able – to spend.
There probably is no replacement for the survey questionnaire, no silver bullet. Big data helps to target groups, but dependent as it is on data collection, even big data may not be able to measure the exact size of each group.
The problem was interpretation: This year’s real failing was interpretation, an error committed by both pollsters and pundits, before and after the election. Maybe it’s more accurate to call it over-interpretation.
Pollsters overpromise. They cite data showing how accurate they were in the past when it may very well be that they were simply lucky. They don’t manage expectations, and they violate the truth of what they know – that polling (and all survey research) is subject to error. They give in to the temptation to report a 2-point, 3-point, or 4-point margin as a clear lead (and I am not blameless here).
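To see why such margins are rarely “clear,” here is a back-of-the-envelope sketch (assuming, purely for illustration, a simple random sample of about 1,000 respondents and the conventional 95% confidence level):

```python
import math

n = 1000   # hypothetical sample size of a typical national poll
p = 0.5    # worst-case share for one candidate
z = 1.96   # multiplier for a 95% confidence level

moe_share = z * math.sqrt(p * (1 - p) / n)  # margin of error on one candidate's share
moe_lead = 2 * moe_share                    # the lead (difference of two shares) is roughly twice as uncertain

print(f"about +/- {moe_share:.1%} on each share, +/- {moe_lead:.1%} on the lead")
# about +/- 3.1% on each share and +/- 6.2% on the reported lead
```

On those assumptions, sampling error alone puts roughly six points of uncertainty on a reported lead, before non-response, weighting or late shifts add anything; a 2- or 3-point margin is well within that noise.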
And then reporters believe them – or decide on their own that polls are super-predictors. But a national poll says little about what will happen in Wisconsin. The election horserace is news, and that is not going to change. But reporting could be a lot better, and poll results could be expressed with less certainty. [There may have been some improvement over the years. In 1948, Life Magazine described Thomas Dewey as the “next president” in its pre-election issue. But Newsweek’s pre-printed, pre-distributed and then-recalled commemorative issue featuring “Madam President” is now on sale on eBay.]
We have to do a better job of talking about polls and training journalists. Just this year ESOMAR joined with AAPOR and WAPOR (the World Association for Public Opinion Research) and worked with the Poynter Institute to produce an internationally focused online course for journalists; the course will be promoted especially in France, the Netherlands and Germany, given their upcoming elections.
Much about this election can be explained, but pollsters still have a lot to answer for. So do the rest of us, who forgot that polls are only estimates and can be wrong. We must make sure that those who conduct, promote and comment on this most public face of research provide realistic estimates, and do not expect to deliver a Rolls-Royce for the price of a Ford.
ONLINE COURSE FOR JOURNALISTS: UNDERSTANDING AND INTERPRETING OPINION POLLS
AAPOR, ESOMAR and WAPOR have launched the first-ever international online course to help journalists improve media reporting about polls and survey results.
Aimed at journalists, media students, bloggers, voters and anyone who wants to know how and why polls are conducted, the course is hosted by Poynter, an online training source for journalists.
This course will help journalists understand and interpret opinion polls. It will enable them to assess poll quality and explain why polls covering the same election can produce different results and why the outcome of an election might deviate from the result ‘predicted’ by the polls.
Developed by an international expert team and funded by ESOMAR, WAPOR and AAPOR, the course is free of charge. Go to:
For more information contact:
Professional.email@example.com or firstname.lastname@example.org
Kathy Frankovic is a polling consultant and former director of surveys at CBS News and a member of ESOMAR’s Professional Standards Committee
By Alexander Shashkin
As we know, people do not always do what they say, and this is especially true of online behavior. Combined with the fact that people do not remember much of what they do online, this means traditional research methods alone cannot tell us how people choose and buy products on the internet.
Passive behavioral data help to overcome this difficulty. More and more researchers have access to such data and are experimenting with its possible applications. Still, there is a need to conceptualize the use of behavioral data and to build more cases that demonstrate its business value.
Our experience with tracking data at OMI started almost three years ago, when we created a large user-centric panel in Russia on the back of our access panel of over 1,000,000 people. Desktop and mobile trackers were voluntarily installed by over 30,000 participants. The panel now runs on EnjoyTracking software, and we have three consecutive years of cross-device behavioral data. It includes URLs and search queries (clickstream data) from desktops as well as data from mobile browsers and apps. The clickstream data is enriched with the socio-demographic variables known from the panelists’ profiles.
Before analyzing the cases, I would like to draw your attention to the ‘building blocks’ that we use for behavioral data analysis (see the Table below):
This means that, in addition to socio-demographics, researchers can use behavioral variables (such as site visits, search terms, or app usage) to define the target audience. Alongside the behavioral data we can ask clarifying survey questions, so that results can be delivered in their usual formats (ratings, indices, etc.).
For example, suppose you need to find out which sites are popular among mothers with children aged 3-6 in order to choose a web portal for a special project and make recommendations on its content. You would then follow three steps (a minimal code sketch follows the list below):
- Define the target audience as “mothers with 3-6-year-old kids”.
- Build a website top list for this audience (by reach).
- Add an Affinity Index for the websites.
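As an illustration only, here is a minimal sketch of what steps 2 and 3 might look like in code, assuming a simplified clickstream extract with panelist IDs, visited domains and a flag for membership of the target audience (all names and figures are hypothetical, not OMI’s actual pipeline):

```python
import pandas as pd

# Hypothetical clickstream extract: one row per panelist-domain visit.
clicks = pd.DataFrame({
    "panelist_id": [1, 1, 2, 3, 3, 4, 5],
    "domain": ["kids-portal.example", "news.example", "kids-portal.example",
               "news.example", "shop.example", "kids-portal.example", "news.example"],
    "in_target": [True, True, True, False, False, True, False],  # e.g. mothers with kids aged 3-6
})

def reach(df, panelist_ids):
    """Share of the given panelists who visited each domain at least once."""
    visitors = df[df["panelist_id"].isin(panelist_ids)].groupby("domain")["panelist_id"].nunique()
    return visitors / len(panelist_ids)

target_ids = clicks.loc[clicks["in_target"], "panelist_id"].unique()
all_ids = clicks["panelist_id"].unique()

reach_target = reach(clicks, target_ids)  # step 2: website top by reach within the target audience
reach_total = reach(clicks, all_ids)

# step 3: Affinity Index = reach in the target audience / reach in the total panel, x100
affinity = (reach_target / reach_total * 100).round()
print(reach_target.sort_values(ascending=False))
print(affinity.sort_values(ascending=False))
```

In a real project the target flag would, of course, come from profile data or a screening question rather than from the clickstream itself.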
As a result, you would have a full picture of the online behavior of a particular audience (such as the mothers with kids in our example) and would know where to find them, so as to deliver your message more effectively.
When it comes to defining the target audience through visited websites and search queries, the most time-consuming task is the manual or partly automated classification (building a code-frame) and coding of those queries and of the content visited during the relevant web sessions.
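Purely as an illustration of that coding step, a crude keyword-based coder might look like the sketch below; the code-frame and queries are invented, and real code-frames are far larger and typically combined with manual review:

```python
# Hypothetical code-frame: category -> keywords that assign a query to it.
CODE_FRAME = {
    "problem": ["headache", "symptoms", "how to treat"],
    "brand": ["brandx"],
    "retail": ["buy online", "pharmacy near me"],
}

def code_query(query: str) -> str:
    """Assign a search query to the first matching code-frame category."""
    q = query.lower()
    for category, keywords in CODE_FRAME.items():
        if any(keyword in q for keyword in keywords):
            return category
    return "uncoded"  # left for manual coding

queries = ["How to treat a headache fast", "BrandX reviews", "weather tomorrow"]
print([code_query(q) for q in queries])  # ['problem', 'brand', 'uncoded']
```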
You can carry out more complex research studies, assembling them like a construction set from the ‘LEGO blocks’ described in Table 1. I would like to share two real examples of such studies:
- Digital segmentation and media optimization for a pharmaceutical brand. The objectives were:
- to describe the online audience of a certain pharmaceutical product;
- to perform a digital segmentation;
- to optimize the online advertising strategy.
The audience for the client’s product was defined as people performing searches for related keywords (we called this set a thesaurus). The set of relevant searches was first brainstormed; we then found panelists who had actually entered these search queries and looked at the other relevant searches they performed in the same web sessions. The audience was segmented according to their searches: for example, the behavior of those who searched for the problem was significantly different from that of those who searched for the brand. Each behavioral segment was described in terms of owned, paid and earned digital channel usage.
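A minimal, hypothetical sketch of how panelists could be assigned to such segments once their queries have been coded against the thesaurus (continuing the kind of coding shown earlier; all data invented):

```python
from collections import Counter

# Hypothetical input: thesaurus categories of each panelist's coded search queries.
coded_queries = {
    "p01": ["problem", "problem", "brand"],
    "p02": ["brand", "brand", "retail"],
    "p03": ["problem"],
}

def assign_segment(categories):
    """Assign a panelist to the segment of their most frequent query category."""
    most_common_category, _ = Counter(categories).most_common(1)[0]
    return most_common_category

segments = {pid: assign_segment(cats) for pid, cats in coded_queries.items()}
print(segments)  # {'p01': 'problem', 'p02': 'brand', 'p03': 'problem'}
```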
The study also made it possible to rank different web resources within each channel, so the brand’s digital presence could be optimized: fully actionable results, feeding straight into media planning, were delivered.
- Path to purchase for a mobile device. The objective was:
- to understand the strategies consumers use to search for and buy mobile devices online, which would allow the client to target its communication at particular stages of the sales funnel.
First, we selected people from our user-centric panel who had performed relevant searches or visited relevant websites during the previous six months. We realized that the purchase itself might happen offline, so to establish whether an offline purchase had taken place, and which offline factors had played a role, we surveyed the respondents whose online history we were following.
At the second stage we classified websites related to the topic into different categories (owned/paid/earned, shops, etc.). We then looked at the share of usage of each category of site among the segments relevant to the client: those who purchased online versus offline, those who made an expensive purchase, and various socio-demographic and geographic segments.
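As a minimal sketch of the share-of-usage calculation at this stage, assume a simplified visit log in which each site has already been mapped to a category and each panelist to a segment (all names and numbers are invented):

```python
import pandas as pd

# Hypothetical visit log: one row per visit, with pre-assigned site category and panelist segment.
visits = pd.DataFrame({
    "panelist_id":   [1, 1, 2, 2, 3, 3, 3],
    "segment":       ["bought_online", "bought_online", "bought_offline",
                      "bought_offline", "bought_online", "bought_online", "bought_online"],
    "site_category": ["shop", "earned", "owned", "shop", "paid", "shop", "earned"],
})

# Share of visits accounted for by each site category within each segment.
share_of_usage = pd.crosstab(visits["segment"], visits["site_category"],
                             normalize="index").round(2)
print(share_of_usage)
```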
We also analyzed the path to purchase of the most interesting segments qualitatively, following each person’s steps URL by URL. This analysis was followed by a series of in-depth interviews (IDIs) to understand the reasons behind certain steps in the search and purchase process.
To summarize, online behavior tracking is a powerful way to describe and understand the online audience of a brand or product. Researchers are able to 1) define the ‘internet behavioral profiles’ and consideration sets of consumers in order to build a digital segmentation, 2) better understand a brand’s or product’s potential audience on the internet, and 3) optimize the online media strategy. Knowing the general media consumption of a certain audience is important for media planning, but knowing the media consumption around and during the search for brand-relevant information is crucial for understanding consumers’ decision-making. Combining behavioral data with survey research and qualitative analysis helps to establish the place of the internet in the purchase journey and helps brands develop successful digital strategies based on facts, not only words.
Alexander Shashkin, PhD in Sociology, is CEO of Online Market Intelligence (OMI).