by Kathy Frankovic, former director of surveys at CBS News and a member of ESOMAR’s Professional Standards Committee
Election polling is the most visible part of market, opinion and social research. It carries the heavy burden of getting things right, but its previous successes have also brought high and perhaps unearned expectations for its accuracy. This year, and the U.S. presidential election in particular, provides a good example of what happens when people forget the limitations of polls, that sampling and non-response may matter, and that ascribing too much precision to polling estimates in times of change can make pundits and journalists look as silly as the pollsters they berate.
We’ve all seen the discussion about what happened in the U.S. last week: was there a “late surge,” were people misrepresenting their vote intention (the “shy Tories/Trump voters”), could pollsters have missed some important groups, did everyone put too much confidence in poll results? We have also seen claims for “new” methods to replace polling – single-bullet solutions for a problem that may or may not exist.
The precision people wanted to see in polls this year made polling aggregators and pundits far more sure of what would happen than may have been realistic. Polls do not have absolute accuracy, and even the best pundits can misread them. This year Nate Silver (fivethirtyeight.com), lionized after previous elections for the accuracy of his algorithms, which use polls to produce probabilities of the outcome, gave Democrat Hillary Clinton a nearly 70% probability of winning (to his credit, that probability had dropped in the last week from a higher likelihood, but it was still a clear prediction). Other aggregators (like pollster.com) put the odds of a Clinton victory even higher, above 98%.
To be clear: nearly all final U.S. 2016 pre-election polls showed a small national lead for Clinton. And she carried the national popular vote by about two percentage points over Republican nominee Donald Trump (now with a counted two million vote lead in the national vote totals). But the national vote count (and national polls) say little about what happens in individual states, and that’s what matters. Had Clinton won the necessary Electoral College votes, we would have been having a very different discussion about polling today than we are, asking how pollsters could have done better, rather than calling the pre-election poll results a “massive, historical, epic disaster.” While there are methodological issues with the 2016 election polls, the industry should not be “reeling.”
That over-reliance on numbers made this year’s post-election commentary even more apocalyptic than necessary, as seen in the already-noted descriptions of a “reeling” profession and a “massive, historical and epic disaster.” And that’s not true. See Sean Trende of RealClearPolitics, another poll aggregator, and The New York Times’ Nate Cohn on this. Even Nate Silver has called the situation a “failure of conventional wisdom more than a failure of polling.”
The individual final pre-election polls ranged from a six-point lead for Clinton to a three-point margin for Republican victor Donald Trump. Pre-election comparisons are complicated because some polls included third party candidates (Gary Johnson for the Libertarians and Jill Stein for the Green Party) and others did not. When included, third parties received 5 to 9% combined (and have received about 5% of the actual vote). The polls also varied in estimating undecided voters: that percentage was as low as 1% and as high as 9%, depending on the poll.
The Clinton national vote win mattered little, as Trump carried Michigan, Pennsylvania and Wisconsin (state-level polling varied widely in quality, and the accuracy gap was particularly noticeable for Wisconsin). Those three states have a total of 46 electoral votes, and put Trump over the top in the electoral vote count. Just over 100,000 votes made the difference. [By contrast, Clinton leads in California by more than 3,000,000 votes: an excess of votes cast in the wrong place.]
This structural peculiarity of the American political system is not especially popular. In 2013, the Gallup Poll and others found six in ten Americans, Republicans, Democrats and independents alike, supporting the abolition of the Electoral College and instead choosing Presidents by the national popular vote. [Of course, after this election, Republicans are likely to change their minds and think the Electoral College is quite a good thing, just as they did in 2000, when Democrat Al Gore won the popular vote, but lost the Presidency to George W. Bush.]
In an election this close, there are lots of explanations. Some have nothing to do with polls. Campaigns make decisions affecting small groups of voters who are hard to track in polls. Television advertising can matter (the Trump campaign poured money into Wisconsin, while the Clinton campaign took the state for granted and the candidate herself never visited). The Trump campaign also admitted it wanted to suppress turnout of key Clinton groups (college-educated women, blacks, young liberals) by reminding them of Bill Clinton’s past womanizing and earlier Hillary Clinton statements she later disavowed. Votes cast by young voters and black voters did decline this year, and overall Clinton received far fewer votes than Barack Obama in 2012.
But pre-election polls aren’t off the hook. National polls overestimated Clinton’s popular vote margin by about the same amount that they underestimated Barack Obama’s margin in 2012. Many state polls in critical states, especially in the Midwest, were off by more, and had Clinton clearly ahead in states that Trump carried.
An American Association for Public Opinion Research (AAPOR) panel, formed before the election, will review the election polls. Like the British Polling Council panel following the 2015 general election in the UK, its results won’t be available for several months, but serious post-election investigations (beginning with the 1948 report that followed the election that gave us “Dewey Defeats Truman”) nearly always suggest worthwhile improvements in methodology, which many pollsters adopt. Pollsters themselves will be conducting internal reviews, to see how they could have matched the results more closely. Any systematic error will be identified and – as happens all the time – learned from.
WHAT WE ALREADY KNOW
But we know some things now.
There was a late surge. This year, the exit polls show movement towards Trump nationally and in critical states in the final days before the election (CNN provides an excellent set of tabulations). Across the country, Clinton led by two points among those who made up their minds before the last week of the campaign, and lost to Trump by five among the 13% who made up their minds in the last week. And more than 10% of those who decided in the last week didn’t choose either Trump or Clinton. Similarly, about 10% of voters in the three important Rust Belt states decided in the final days, and Trump decisively led with them: by 11 points in Michigan, 16 points in Pennsylvania, and 27 points in Wisconsin.
Eleven days before the election, FBI Director James Comey told Congress he was reopening the investigation into Clinton’s private email server. One week later, he said that there was nothing new. The episode put an issue that had long bedeviled Clinton back before the public, after it may have receded from most voters’ minds. The shift was missed by the polls. Many state polls were completed days before the election, before the full impact of these events could be measured.
There may have been shy Trump voters: Many polls saw little or no change in the last week, though the ABC News/Washington Post tracking poll showed movement first towards and then away from Trump. Its final poll matched the polling average. Could that final movement of voters to Trump in the last week, as indicated in the exit polls, be an indication that some voters felt uncomfortable announcing a vote for Trump earlier? So far there is no direct evidence for it, and there are few differences between telephone and online polls in general in Trump support.
Did pollsters interview good samples?: Trump suppression efforts, noted earlier, may have turned some “likely voters” into no-shows. Other voters may not have been in the polls at all. This year, there was not just a gender gap, but also a race gap, a marriage gap, an age gap, a religious gap, a rural-urban gap, and an education gap, particularly amongst white voters. Those less educated white voters overwhelmingly supported Donald Trump, and if they were missing from the polls, it was Trump voters who were missing.
Exit polls have a known education and age response bias (perhaps not a surprise when those polls require respondents to fill out paper questionnaires), and it is easy to speculate that at least some less-educated voters could have been absent in pre-election polls of all types.
Years ago, we learned that young people, minorities and urban residents (in other words, people who move frequently) were most likely to have only mobile phones, not landlines. Polls with samples of mobile phone numbers were better at gauging support for Barack Obama. Mobile phones are now a routine part of telephone polling.
Single-digit response rates for telephone surveys mean more weighting and modeling, and that increases the possibility of error. Online polls have coverage issues, lack the scientific justification of probability sampling, and require significant modeling, but this year they performed as well as, or even better than, phone polls. (This is quite different from recent British examples: the 2015 election and the referendum on whether or not the United Kingdom should exit the European Union.)
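One common way to see why heavier weighting raises the possibility of error is Kish's approximation for effective sample size. The sketch below is only an illustration: the weights are hypothetical, the function names are not from any pollster's toolkit, and the margin-of-error formula assumes simple random sampling, which real polls only approximate.

```python
import math


def kish_effective_sample_size(weights):
    """Kish approximation: (sum of weights)^2 / (sum of squared weights)."""
    total = sum(weights)
    sum_of_squares = sum(w * w for w in weights)
    return total * total / sum_of_squares


def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for one proportion p with sample size n."""
    return z * math.sqrt(p * (1 - p) / n)


# Hypothetical poll of 1,000 interviews: an over-represented group is
# weighted down (0.75) and an under-represented group is weighted up (2.0).
weights = [0.75] * 800 + [2.0] * 200

n_eff = kish_effective_sample_size(weights)
moe_unweighted = margin_of_error(0.5, len(weights))
moe_weighted = margin_of_error(0.5, n_eff)

print(f"Effective sample size: {n_eff:.0f}")
print(f"Margin of error, unweighted: +/-{moe_unweighted * 100:.1f} points")
print(f"Margin of error, weighted:   +/-{moe_weighted * 100:.1f} points")
```

With these made-up weights, 1,000 interviews behave roughly like 800, so the margin of error widens even though the number of completed interviews has not changed.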
The “Gold Standard,” probability telephone surveys, might be better called the “Silver Standard,” as we have seen it can be tarnished and needs to be frequently reviewed and polished. Achieving that “gold standard” requires significant time and energy to reach potential respondents, but the days and weeks that can take limit the news value of polls, and the effort would cost much more than news organizations today are willing, and able, to spend.
There probably is no replacement for the survey questionnaire, no silver bullet. Big data helps target groups, but as dependent on data collection as it is, even it may not be able to measure the exact size of each.
The problem was interpretation: This year’s real problem was one of interpretation, an error committed by both pollsters and pundits, both before and after the election. Maybe it’s more accurate to call it the over-interpretation problem.
Pollsters overpromise. They cite data that shows how accurate they were in the past when it may very well have been only that they were lucky. They don’t manage expectations, and violate the truth of what they know – that polling (and all survey research) is subject to error. They give in to the temptation to report a 2-point, 3-point, or 4-point margin as a clear lead (and I am not blameless here).
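To illustrate why a margin of a few points is rarely a clear lead, here is a rough sketch of the sampling error on the gap between two candidates in a single poll. It assumes simple random sampling and a 95% confidence level; the poll figures and function names are hypothetical, and real polls with weighting would carry somewhat more uncertainty.

```python
import math


def lead_margin_of_error(p_a, p_b, n, z=1.96):
    """Approximate 95% margin of error on the lead (p_a - p_b) when both
    shares come from the same sample of size n (multinomial variance)."""
    lead = p_a - p_b
    variance = (p_a + p_b - lead ** 2) / n
    return z * math.sqrt(variance)


def is_clear_lead(p_a, p_b, n, z=1.96):
    """True only if the lead is larger than its own margin of error."""
    return abs(p_a - p_b) > lead_margin_of_error(p_a, p_b, n, z)


# Hypothetical poll: 1,000 respondents, candidate A at 46%, candidate B at 43%.
moe = lead_margin_of_error(0.46, 0.43, 1000)
print(f"Lead: 3.0 points, margin of error on the lead: +/-{moe * 100:.1f} points")
print("Clear lead?", is_clear_lead(0.46, 0.43, 1000))
```

Under these assumptions the error on the lead is nearly double the familiar plus-or-minus three points quoted for a single candidate's share, so a three-point margin in a poll of 1,000 is well within sampling noise.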
And then reporters believe them – or decide on their own to think polls are super-predictors. But a national poll says little about what will happen in Wisconsin. The election horserace is news, and that is not going to change. But reporting could be a lot better, and poll results expressed with less certainty. [There may have been some improvement over the years. In 1948, Life Magazine described Thomas Dewey as the “next president” in its pre-election issue. But Newsweek’s pre-printed, pre-distributed and then-recalled commemorative issue featuring “Madam President” is now on sale on eBay.]
We have to do a better job in talking about polls and training journalists. Just this year ESOMAR joined with AAPOR and WAPOR (the World Association for Public Opinion Research) and worked with the Poynter Institute to produce an internationally-focused online course for journalists. Given their upcoming elections, the course will be promoted especially in France, the Netherlands and Germany.
Much about this election can be explained, but pollsters still have a lot to answer for. So do the rest of us, who forgot polls are only estimates and can be wrong. We must make sure that conductors, exponents and commentators of this most public face of research provide realistic estimates, and do not expect to provide a Rolls Royce for the price of a Ford.
ONLINE COURSE FOR JOURNALISTS: UNDERSTANDING AND INTERPRETING OPINION POLLS
AAPOR, ESOMAR and WAPOR have launched the first-ever international online course to help journalists improve media reporting about polls and survey results.
Aimed at journalists, media students, bloggers, voters and anyone who wants to know how and why polls are conducted, the course is hosted by Poynter, an online training source for journalists.
This course will help journalists understand and interpret opinion polls. It will enable them to assess poll quality and explain why polls covering the same election can produce different results and why the outcome of an election might deviate from the result ‘predicted’ by the polls.
Developed by an international expert team, and funded by ESOMAR, WAPOR and AAPOR, the course is free of charge. Go to:
For more information contact:
Professional.firstname.lastname@example.org or email@example.com
Kathy Frankovic is a polling consultant and former director of surveys at CBS News and a member of ESOMAR’s Professional Standards Committee
By Reg Baker
As our profession evolves into new practices, so must our ICC/ESOMAR International Code on Market and Social Research. As the ICC/ESOMAR Code is of vital importance to our profession, all ESOMAR members can vote on it in a Referendum, which will be open until 31 October 2016. In this article, Reg Baker, who was part of the project team revising the ICC/ESOMAR Code, addresses one of the concerns that came to light in the revision process.
Thus far, the newly revised version of the ICC/ESOMAR Code has been mostly well received by ESOMAR members with one notable exception: use of the word data subject in place of respondent. As one member queried, “What’s that all about?”
There are two answers to that question. The simplest (and perhaps least satisfying) explanation is that data privacy legislation worldwide is migrating toward the use of the term. Given current and widespread concerns about privacy and the increasing use and misuse of personal data, linking the Code and the guidelines that support it to the relevant legal concepts and terminology makes good sense.
But, there also is another much more relevant explanation that grew out of the ongoing evolution and diversification of research methods and practices. When the vast majority of research was done with surveys and focus groups—that is, asking questions and recording answers—the term respondent was an accurate description of how individuals participated in research. In some of our recent guidelines we refer to this as active research, defined as “the collection of data through direct interaction with an individual.”
More recently we have seen an increase in the use of passive methods, meaning “the collection of personal data by observing, measuring or recording an individual’s actions or behaviour.” There still may be an interaction with the individual, for example to gain consent, but there no longer are questions and answers. In this context the term respondent seems odd, and so we moved to research participant, to cover people who take part in both active and passive methods.
Enter big data, or as we describe it in the revised Code, secondary data, defined as “data collected for another purpose and subsequently used in research.” With secondary data researchers generally do not interact with those individuals whose personal data we might acquire and analyse as part of our research, so defining them either as respondents or even research participants makes no sense. Hence the term, data subject, defined simply as “any individual whose personal data is used in research.”
Of course, we could continue to use three different terms, each in their specific context and sometimes in combination. To those of us who work on the teams that develop guidelines, this seems to add complexity without adding value. And so, over the coming months as we go back to update our guidelines to reflect the enhancements in the new Code we plan to use the single term data subject to signal anyone whose personal data is used in research, regardless of how it was obtained.
Reg Baker, Consultant to the ESOMAR Professional Standards Committee and Executive Director of MRII
WHY YOU NEED TO VOTE FOR THE NEW CODE: