In defence of polling...
Seemingly, this take is going to be a little controversial: the polling industry got pretty much all the main stories of the July 4th 2024 general election correct.
There has been some chat of a polling miss: a charge that the British polling industry, as a collective, failed to give an accurate account of the 2024 UK general election.
This is not, as it goes, the view of the British Polling Council.
Nor is it my view. Here are all the things which happened on July 4th which the majority of pollsters and election modellers accurately anticipated:
The Conservatives suffered their worst ever defeat at a British general election.
A large number of sitting Cabinet ministers (and would-be future leadership contenders) lost their seats.
Labour won a landslide majority without putting on a Blair-level vote share.
The Liberal Democrats added dozens upon dozens of seats to their parliamentary total, without significantly growing their vote share.
Support for the Scottish National Party nosedived, with Labour replacing them as the largest party in Scotland.
Reform UK arrived as a serious electoral force, and outperformed UKIP’s 2015 result in terms of both vote share and seats won.
The Greens recorded their best ever result, and significantly expanded their vote share and parliamentary representation.
And what of the MRPs?
This election saw a record number of pollsters putting MRP models out – nine, by my count. Now, while
most models were pretty off the mark in terms of the scale of the Labour victory (in fact, all of them overestimated the number of Labour seat wins),
few of them properly captured the extent of the Liberal Democrat advance, and
some had the Conservatives dropping well below 100 seats,
a handful of MRPs were pretty darn close to getting the details of the election correct – including those from YouGov, More in Common, and Focaldata.
I am actually very proud of what we managed to achieve this election with our YouGov MRP. Every party’s seat and vote share numbers were within our stated intervals, and we called 92% of constituencies correctly. A large number of those we got wrong we had listed as tossups.
To put that into context, 92% is only one percentage point less than was managed by YouGov’s 2017 General Election model which swept MRP into the British election mainstream by correctly anticipating – much against the conventional wisdom at the time – that Theresa May would lose her majority.
Overall, though, I wouldn’t say this election was a great success for MRP models in general – even ours had some notable issues. Which is actually a helpful demonstration of the fact that they are not silver bullets, blessed with an automatic capability to turn polling data into accurate election projections. They, just like any other model of voting and elections, require a huge amount of work and expertise, and are subject to error and mishaps.
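For readers less familiar with the technique, here is a toy sketch of the post-stratification step that gives MRP its name. The demographic cells, support probabilities, and counts below are entirely made up for illustration – this is not YouGov’s model, or any pollster’s.

```python
# Toy sketch of post-stratification: combine modelled support for each
# demographic "cell" with census counts of how many such voters live in
# one constituency. All numbers are invented for illustration.
cell_support = {          # estimated P(vote Labour) by (age band, education)
    ("18-34", "degree"): 0.55,
    ("18-34", "no degree"): 0.45,
    ("35-64", "degree"): 0.40,
    ("35-64", "no degree"): 0.32,
    ("65+", "degree"): 0.28,
    ("65+", "no degree"): 0.24,
}
cell_counts = {           # how many voters each cell contains (census-style data)
    ("18-34", "degree"): 9_000,
    ("18-34", "no degree"): 12_000,
    ("35-64", "degree"): 14_000,
    ("35-64", "no degree"): 20_000,
    ("65+", "degree"): 6_000,
    ("65+", "no degree"): 11_000,
}

# Constituency-level estimate = census-weighted average of the cell estimates.
electorate = sum(cell_counts.values())
labour_share = sum(cell_support[c] * n for c, n in cell_counts.items()) / electorate
print(f"Estimated Labour share: {labour_share:.1%}")  # 37.0% with these toy numbers
```

The hard part – and where the expertise, and the scope for error, comes in – is the multilevel regression that produces those cell-level estimates in the first place, and the quality of the survey data feeding it.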
How did pollsters do on the headline vote figures?
Onto the vote shares, and what the polling industry had to say about those versus how it actually all panned out.
Some national share projections significantly underestimated the Conservative party vote, with five pollsters having Sunak’s party at 20% or lower in their final call.
But 15 by my count did not produce such wide errors, and 10 had the eventual figure of 24% within a 2-point error margin. That, I would say, does not constitute an industry miss.
The polling was also largely pretty accurate on the Liberal Democrat and Green Party national shares, and nigh on every pollster estimated the Reform UK vote share within 3 percentage points (the majority were within two). Again, no polling miss here, I would argue.
But what should be of concern for us as pollsters is that, with notable exceptions being Verian, Norstat, and Ipsos, we did not paint an accurate picture of the level of support for Labour.
Thus, by extension, we overestimated the size of the Labour lead over the Conservatives, and therefore the size of their victory in terms of the popular vote.
On the former (overstating the Labour share), the polling average heading into voting day had Labour at around 39-40%. The eventual final Labour vote share on July 4th was 35%. This is, by any reasonable measurement, a bit of a miss.
Not a large one, and not an election defining one – nor indeed one which changed the narrative of the result we eventually saw – but a miss nonetheless which we as an industry have to now work hard to diagnose, understand, and correct (more on that below).
On the latter (overstating the Labour lead), while it is undeniably true, gauging polling accuracy purely by looking at implied leads is not, in my opinion, good or fair form. Polling leads are a product of, not the intended measurement of, vote share point estimates – the thing we actually measure.
Our job as pollsters is to try, to the best of our ability, to accurately capture support for parties within an acceptable, pre-stated margin of statistical uncertainty or error. We are not trying to estimate polling leads – those are a by-product of the task we set ourselves.
The reason why I don’t think it is fair to judge polling accuracy based on lead size is that even a one-point error in party shares (which would be a very impressive result for a pollster) inevitably counts double in terms of estimating a lead.
Let’s suppose Pollster X is covering a given election, and has Party A measured at 45% of the vote in their final call. But, as it turns out, Party A actually won 43%. Equally, Pollster X had Party A’s main rival, Party B, measured at 35%, but they actually got 37%.
In this scenario, Pollster X got the election correct – Party A won, and won comfortably. As well, neither party has been estimated incorrectly by the pollster, as both came in within two percentage points of where Pollster X said they were.
However, the lead would have been ‘mismeasured’ by four points: Party A had a projected winning margin of 10 points over Party B in the poll, when in actual fact it was a six-point victory.
But would it be fair to criticise Pollster X for that, given they accurately measured support for both Party A and Party B within any reasonable bounds of statistical uncertainty? I would argue that it would not be fair, and this is why pollsters ought not to be judged on measuring leads.
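To make the arithmetic explicit, here is a minimal sketch using the hypothetical Pollster X numbers above – illustrative only, not real polling data.

```python
# Hypothetical Pollster X example: two small, opposite-signed errors in the
# party shares compound into a much larger error in the implied lead.
polled = {"A": 45, "B": 35}   # Pollster X's final-call shares (%)
actual = {"A": 43, "B": 37}   # the eventual result (%)

# Per-party errors: each is only 2 points, within reasonable bounds.
errors = {party: polled[party] - actual[party] for party in polled}
print(errors)       # {'A': 2, 'B': -2}

# Implied lead error: the two errors point in opposite directions,
# so they add up rather than cancel.
lead_error = (polled["A"] - polled["B"]) - (actual["A"] - actual["B"])
print(lead_error)   # 4 points (a 10-point projected lead vs a 6-point actual one)
```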
Back to GE2024: I am therefore resistant, for the above reason, to arguments that pollsters mismeasured Labour’s lead over the Conservatives. That’s not a fair stick to beat us with. But it doesn’t get around the fact that nigh on all of us failed to accurately measure and report the Labour vote.
So, what might have caused the Labour miss?
Before progressing here, I should say plainly that the following are my thoughts and my thoughts alone, and do not represent any official post-election post-mortem on the part of YouGov. The same can be said of anything I write on my Substack rather than on my YouGov page, really. That’s why the two things co-exist!
Anyway, one thing which is worth noting is that the polling industry did pick up on some pretty hefty fragmentation of the Labour vote in the run-up to the election itself. Vote shares of 40%+, even pushing 45%, were pretty common in the week of the 17th of June. But by the time of their final calls in the week of the 1st of July, many pollsters had Labour sub-40%.
This fragmentation looked like it continued right up until the day of the vote itself, which is definitely an important part of the story of the overestimation of the Labour vote share – things were happening past the point at which we were collecting and reporting data.
But that isn’t the only source of error, I don’t think. Another, more concerning, factor is that we failed to properly register the extent of the fragmentation within certain elements of Labour’s voter coalition – most consequentially, the Muslim vote.
Taking the YouGov MRP as an example, while we anticipated that Jeremy Corbyn had a very good chance of winning his Islington North constituency and (notionally) taking it off Labour, we did not anticipate any of the Labour defeats in Birmingham Perry Barr, Blackburn, Dewsbury and Batley, Leicester East, or Leicester South.
What all the above seats have in common is a particularly large Muslim population. Each, bar Leicester East, was won by an independent candidate standing on a strong pro-Palestine platform.
A number of further Labour MPs, including the new Health Secretary Wes Streeting and the Victims and Safeguarding minister Jess Phillips, came within a hair’s breadth of losing their seats to similar candidates.
Recent analysis from the Economist suggested that for every 1 percentage point increase in Muslim population in a given constituency, Labour’s vote went down by an average of 0.7 points. This is, by the way, a continuation of a pattern spotted in the 2024 local election results by Matt Singh (among others). Labour’s current struggle with Muslim voters is very much real.
Evidence from Leicester, Brent, and Harrow also suggests ongoing problems for Labour in capturing the Hindu vote – a community we have known for some time now to be much less predisposed to Labour than other ethnic minority groups.
Should we now be saying the same of the Muslim vote? Has it reached a point where we should simply no longer assume by default that Labour can bank support from Britain’s Muslim communities? If so, how can pollsters improve their methods to properly capture that? These are questions we must ask of ourselves and answer by the time the next general election rolls around.
Anything else?
We may also have missed things – and probably did – regarding turnout and younger voters. Along with ethnic minority voting behaviour, these are two further areas where we know polls tend to struggle.
The stars, I think, aligned against the industry in this regard on July 4th – significant and important movements and moments in all the areas that we struggle most in.
Keep an eye out for full and proper diagnoses of polling performances by pollsters as the weeks and months progress. I’ve offered some thoughts and hunches here, but I do so from my post-election escape to Borneo without having seen or worked with any underlying data (yes, I left my laptop at home!). I could therefore be quite wrong about all the above, or only telling part of the story. More to come!