Pollsters around the world have failed to predict the outcomes of many important elections over the years. In the US, Franklin Roosevelt defeated Alf Landon in 1936, Harry Truman defeated Thomas Dewey in 1948, and Donald Trump defeated Hillary Clinton in 2016 – all against the predictions of the major opinion polls. In India, many opinion polls have performed poorly – either in identifying the winner or in estimating the winning margin – in most of the general elections of this century.

The situation is no better in the UK. Major opinion polls performed miserably in the general elections of 2015 and 2017, and in the Brexit referendum of 2016. While most opinion polls in the 2017 UK snap election failed to predict that Theresa May would lose her Commons majority and that, in effect, Brexit would be stalled indefinitely, YouGov’s Multilevel Regression and Post-stratification (MRP)-based election model consistently predicted that the most likely outcome of the election would be a hung Parliament.

The model estimated that the Conservatives would win 269-334 seats (with a mid-point estimate of 302 seats), and that Labour would win 238-302 seats (with a mid-point estimate of 269 seats) – the actual figures came out to be 318 and 261, respectively, which was a tremendous performance amid the debacle of most other opinion polls. What’s more, YouGov’s MRP model correctly predicted the shock results in seats such as Kensington and Canterbury. A US polling company, Langer Research Associates, also said that it had tested an MRP model in the 2016 US presidential election, and that the model correctly predicted the outcome.

No wonder that YouGov employed the MRP technique once again for its opinion polls in the 2019 UK election, and interest in this sort of poll among the media and the public was much greater this time around. YouGov’s second and final MRP poll of the 2019 general election showed that the Conservatives could win 339 seats with a vote share of 43 per cent, and the Labour Party could bag 231 seats with 34 per cent of the vote. The actual result was broadly in the same direction (Conservatives: 43.6 per cent of votes, 365 seats; Labour: 32.2 per cent of votes, 203 seats).

It is well known that opinion polls often predict the overall vote share without much error, but the estimation of seats is often quite erroneous. Extensive tactical voting as in the UK, switching between parties, and the uneven geographical spread of party support across the country, as happens in any multi-party system, make predictions of seat shares very prone to error. Usually, pollsters conduct research on a sample of people and try to ensure that their sample is representative of the whole population with respect to different socio-economic parameters.

In MRP, a large nationwide poll is carried out instead, but the sample need not be representative of the whole population. Lots of other data about the people in the sample are recorded, and a mathematical model is devised to represent how various groups of people are likely to vote. The idea of an MRP model came from a 1997 research article by Professor Andrew Gelman of Columbia University and Thomas C. Little of Morgan Stanley Dean Witter. The first part is to fit a model to do the adjustment, and inferences for the population are made in the second part, which is called post-stratification.
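To give a feel for what the "multilevel" part does, here is a minimal sketch in Python. It is not YouGov's model and not a fitted hierarchical regression: it uses a simple shrinkage formula as a stand-in, in which the estimate for a group with few respondents is pulled towards the overall rate, while a group with many respondents keeps an estimate close to its own data. The group names, the counts and the `prior_strength` parameter are all invented for illustration.

```python
# Simplified stand-in for multilevel estimation (partial pooling).
# All names and numbers here are hypothetical.

def partially_pooled_rate(supporters, respondents, overall_rate, prior_strength=50.0):
    """Shrink a group's raw support rate towards the overall rate.

    prior_strength acts like a number of 'pseudo-respondents' who vote
    at the overall rate; small groups are shrunk more than large ones.
    """
    return (supporters + prior_strength * overall_rate) / (respondents + prior_strength)

# Invented sample: (respondents backing a party, total respondents) per group
groups = {"group_small": (4, 5), "group_large": (400, 1000)}

total_supporters = sum(s for s, _ in groups.values())
total_respondents = sum(n for _, n in groups.values())
overall = total_supporters / total_respondents

# Raw rates use each group's data alone; pooled rates borrow from the whole sample
raw = {g: s / n for g, (s, n) in groups.items()}
pooled = {g: partially_pooled_rate(s, n, overall) for g, (s, n) in groups.items()}

for g in groups:
    print(g, round(raw[g], 3), round(pooled[g], 3))
```

In this toy sample, the five-person group's raw rate of 80 per cent is pulled most of the way towards the overall rate, whereas the thousand-person group barely moves. A real MRP model achieves a similar effect through a regression with group-level terms rather than a fixed formula.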

While it is difficult to tell how, in a traditional opinion poll, the national vote share translates into the number of seats, MRP allows one to predict local results from national surveys. This model is then used to estimate the probability that a voter with specified characteristics will vote in favour of the Conservatives, Labour, or some other party. Using data from the UK Office for National Statistics, the British Election Study, and past election results, YouGov has estimated the number of voters of each socio-economic and demographic type in each constituency.

Turnout is assessed from voters’ demographics, based on analysis of 2015 and 2017 British Election Study data. Combining the model probabilities and estimated census counts, a reasonably accurate estimate of the number of votes each party is expected to get in each constituency is prepared, which essentially allows the poll to make granular predictions. MRP was used by other agencies as well in the 2019 UK election. A week before the election, a poll by Datapraxis, also using an MRP model and based on 500,000 online interviews, predicted that Johnson would win a majority of 38 in Parliament. Several websites set up to guide tactical voting also used MRP models.
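The post-stratification step, in which model probabilities are combined with estimated counts of each voter type in a constituency, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two voter types, the probabilities and the constituency counts are invented, the probabilities are taken as already produced by some fitted model, and turnout is ignored.

```python
# Hypothetical post-stratification sketch; every number below is invented.

def poststratify(cell_probs, cell_counts):
    """Weight modelled vote probabilities by voter-type counts.

    cell_probs:  {voter_type: {party: probability}} (assumed to come from a fitted model)
    cell_counts: {voter_type: estimated number of voters of that type in the constituency}
    Returns {party: estimated vote share in the constituency}.
    """
    total = sum(cell_counts.values())
    shares = {}
    for cell, count in cell_counts.items():
        for party, prob in cell_probs[cell].items():
            shares[party] = shares.get(party, 0.0) + prob * count / total
    return shares

# The same national model probabilities apply to every constituency
cell_probs = {
    "young_graduate": {"Con": 0.25, "Lab": 0.55, "Other": 0.20},
    "older_non_graduate": {"Con": 0.60, "Lab": 0.25, "Other": 0.15},
}

# Constituencies differ only in their mix of voter types
constituency_a = {"young_graduate": 30000, "older_non_graduate": 10000}
constituency_b = {"young_graduate": 10000, "older_non_graduate": 30000}

shares_a = poststratify(cell_probs, constituency_a)
shares_b = poststratify(cell_probs, constituency_b)

print(max(shares_a, key=shares_a.get), shares_a)
print(max(shares_b, key=shares_b.get), shares_b)
```

Although both constituencies use the same national probabilities, the different mix of voter types gives them different predicted winners. This is how a single national survey can yield seat-level estimates.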

For example, an anti-Brexit campaign group, Best for Britain, used an MRP model to advise ‘Remainers’ how to vote tactically in their constituencies. Certainly, MRP will become more popular after its success in predicting the 2019 UK election. However, MRP has been facing criticism as well – not everybody in the business thinks it is the holy grail. For example, Nate Silver of FiveThirtyEight opined, “MRP is the Carmelo Anthony of election forecasting methods”, and that’s not a compliment – Carmelo Anthony being an American professional basketball player for the Portland Trail Blazers, who is considered by many to be much overrated.

According to FiveThirtyEight, blending or smoothing ‘polls’ using methods such as MRP means running a model rather than a poll. Will this technique be used to devise opinion polls in India in the near future? MRP is certainly not going to be the final answer for opinion polls. But it is encouraging that people are trying to make opinion polls more realistic by using sophisticated statistical techniques and various other datasets. These approaches are in the right direction, at least.

(The writer is Professor of Statistics, Indian Statistical Institute, Kolkata)