
For months, virtually all the big United Kingdom polling institutions had been showing the two main parties neck and neck ahead of the parliamentary elections. Everyone was thus stunned by the huge seven-point gap between the two parties when the results poured in on election night. Interestingly, while the pre-election surveys had been so wrong, the exit polls were spot on.

How could the polling institutions have erred so badly, considering that many of them have decades of experience and most are affiliated with highly reputable professional organisations such as the British Polling Council? The debacle was so serious that both the British Polling Council and the Market Research Society quickly announced independent inquiries into the pollsters’ methods.

The problem is that this kind of collective prediction fiasco has been happening more and more frequently. In March, a similar scenario unfolded with the Israeli elections, as the incumbent party and its prime minister again emerged with a very clear win when the polls had predicted a dead heat with the opposition. Last year, polls on the Scottish independence referendum likewise failed to predict the final, resounding 10-point win for ‘No’. The same happened, to some extent, with the 2014 congressional elections in the US.

Have the polling organisations lost their mastery of collecting and analysing opinion data? Have people been deceiving the pollsters, or perhaps just making up their minds at the last moment? What has been happening?

The first important thing to keep in mind when discussing surveys is the size of the samples from which the generalised results are extracted. The difficulties and costs of collecting data (interviewing) are such that most polls are conducted with only a few thousand people. Further data cuts reduce the sample size to typically about 1,000; this then implies an “error margin” of about 3 per cent. Hence, if the poll says “40 per cent of the population is with X”, and the sample size is 1,000, what it essentially means is: “We can say with high confidence that 37 to 43 per cent of the population is with X.” To reduce the error to 1 per cent, the sample size would need to be increased to about 10,000.
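For readers who want to check the arithmetic, here is a minimal sketch in Python of the standard margin-of-error formula for a proportion; it assumes the usual 95 per cent confidence level and the worst-case 50/50 split, which is how the 3 per cent and 1 per cent figures above arise:

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion p estimated
    from a simple random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)

print(f"n = 1,000:  +/- {margin_of_error(1_000):.1%}")   # roughly 3 per cent
print(f"n = 10,000: +/- {margin_of_error(10_000):.1%}")  # roughly 1 per cent
```

Because the error shrinks only with the square root of the sample size, cutting it from 3 per cent to 1 per cent requires roughly ten times as many interviews, which is why pollsters rarely go beyond a thousand or so respondents.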

The second important thing to understand, and this relates more directly to the recent problems, is that people have increasingly been polled online and/or over mobile phones. Until recent years, surveys were conducted mainly via landline phones and, when feasible, face to face. Online surveys are obviously much cheaper than phone interviews (requiring far fewer person-hours), and landlines are fast disappearing from our lifestyles. But why would online and mobile-phone polling be suspect? Because for online surveys, one does not know who exactly filled in the form, and cell-phone users do not always represent the population well. Indeed, the fact that exit polls, which are most often conducted face to face at voting stations, have consistently given accurate results confirms the large uncertainties attached to cell-phone and web-based surveys.

Thirdly, one must note the “bandwagon effect” that seems to have played a role among polling institutions. Indeed, it later surfaced that some survey results were not published because they seemed “out of line” with the “mainstream” ones. Releasing only surveys that mesh well with the prevailing “understanding” is a serious bias that should ring a loud alarm among the big survey organisations.

And fourthly, there is the “devil that we know” effect, whereby people huff and puff against the party in power, but then, shortly before voting, decide that “the risk of changing horses is greater than the shortcomings we’ve been complaining about”...

Needless to say, polling institutions have been aware of these complicating factors and have modified their analysis tools accordingly, but the recent failures seem to clearly indicate the need for bigger overhauls of the surveying methods.
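One standard analysis tool of this kind is weighting: respondents from groups that are over-represented in the sample count for a little less, and those from under-represented groups count for a little more, so that the sample’s demographic mix matches the known population. The sketch below illustrates the idea with made-up age groups and figures; it is not any polling institution’s actual method.

```python
# Illustrative post-stratification weighting; all shares and responses are invented.
population_share = {"18-34": 0.30, "35-54": 0.35, "55+": 0.35}

# (age group, supports party X) for each respondent in a hypothetical online sample
# in which younger respondents are over-represented.
sample = ([("18-34", True)] * 250 + [("18-34", False)] * 150 +
          [("35-54", True)] * 140 + [("35-54", False)] * 160 +
          [("55+", True)] * 100 + [("55+", False)] * 200)

sample_share = {g: sum(1 for grp, _ in sample if grp == g) / len(sample)
                for g in population_share}

# Weight = population share / sample share, so over-represented groups count less.
weights = {g: population_share[g] / sample_share[g] for g in population_share}

raw = sum(1 for _, yes in sample if yes) / len(sample)
weighted = (sum(weights[grp] for grp, yes in sample if yes) /
            sum(weights[grp] for grp, _ in sample))

print(f"Raw support for X:      {raw:.1%}")       # about 49 per cent
print(f"Weighted support for X: {weighted:.1%}")  # about 47 per cent
```

Even this simple correction shifts the headline number by a couple of points, which is precisely the kind of adjustment that can make or break a forecast in a close race; but it only works if the pollster knows, and correctly models, which groups are missing from the sample.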

One potential solution is surveying via social media. Indeed, recent statistical analyses of Facebook data (posts, “likes” and “friends” networks) have revealed remarkable information about people that they may not even admit to their real friends. Likewise for Twitter, which I described in a recent column as “an instant X-ray or pulse reading of communities at various scales”. These social media really deal in “big data”, offering millions of views and attitudes, not just a few thousand answers to a handful of questions.

Pollsters have taken note of this potentially game-changing treasure trove of information, but they have also remarked that the views expressed on social media come with large uncertainties of their own, not to mention the thorny issue of privacy (does anyone have the right to count and use my Facebook “likes” and my Twitter “retweets”?).

Quantitative social science is difficult, for various reasons, including human factors, but it has made huge progress in recent times. We have become more demanding of its results, in terms of both accuracy and speed of delivery. With new tools and techniques, and greater scrutiny, future results should be much improved. The next few big elections will be interesting to watch.

Nidhal Guessoum is a professor at the American University of Sharjah. You can follow him on Twitter at: www.twitter.com/@NidhalGuessoum.