Some thoughts on modelling

One of the most controversial aspects of the British Government’s response to the Coronavirus crisis has been its reliance on modelling of the pandemic. There have been suggestions that Boris Johnson preferred to listen to experts in this arena rather than to those in more traditional branches of public health. In the end the Government’s strategy changed, but there will be many questions to ask about the UK’s initial reliance on modelling and the efficacy of those models.

This is not the first time that predictions generated by computer models have hit the headlines: just think about climate science – all the predictions of our collective climate future are based on models. These headlines don’t always do us any favours. The Oxford University study that generated headlines claiming half of the population could already have been exposed to the virus was reported irresponsibly. If half the population had already been exposed and so few were sick, what was there to worry about? It turned out there was a lot to worry about, as this result was generated by an extreme set of initial assumptions.

I’m currently writing a book on strategy, and this potential for dangerous news stories was one of the reasons I’d been thinking about modelling as part of a chapter on risk assessment. I’d concluded that if there is one thing that would improve our ability to assess risk, it’s a better understanding of the flood of predictive data coming from models and algorithms. A basic competency in how data sets are collected and analysed, and how conclusions are reached and presented, is fundamental to good risk assessment and to making good decisions in the modern world.

A model is some — admittedly often complex — maths that calculates an outcome based on the best thinking available in whatever area we are trying to predict. I’ve used modelling a lot in my professional life as a race boat navigator with software that predicts the optimal route in changing weather.

The first thing to point out is that the weather routing software doesn’t predict the probability of an outcome. It doesn’t do risk assessment. It doesn’t output: “the chance of a top three position is 40% on the northern route and 60% on the southern route”. Instead, it just shows the fastest way to sail from A to B based on the parameters (weather and accompanying boat performance) that it’s given. It doesn’t assign a probability to the chances of this outcome being correct for a very good reason — in the limited world view of the model there is no uncertainty. For instance, it doesn’t compute the possibility of the boat being damaged in particularly bad weather.
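
To make this concrete, here is a toy sketch of the kind of calculation involved. It is not the real routing software, and every number in it – the distances, the forecast winds and the made-up boat polar – is invented purely for illustration. The point is that the model simply turns its input parameters into an answer; there is no probability anywhere.

```python
# Toy sketch of what routing software does at its core: given forecast wind
# along each candidate route and a boat-speed "polar", compute elapsed time
# and pick the fastest route. No probabilities, no uncertainty -- just
# arithmetic on the parameters it is given. All numbers are invented.

def boat_speed(wind_knots: float) -> float:
    """Invented polar: boat speed in knots as a function of wind speed."""
    return min(0.6 * wind_knots, 12.0)  # tops out at 12 knots

def route_time(legs) -> float:
    """legs is a list of (distance_nm, forecast_wind_knots) tuples; returns hours."""
    return sum(dist / boat_speed(wind) for dist, wind in legs)

# Two candidate routes, each a series of legs with made-up distances and winds.
northern = [(120, 18), (200, 10), (80, 14)]
southern = [(150, 22), (210, 16), (90, 20)]

times = {"northern": route_time(northern), "southern": route_time(southern)}
print({name: round(hours, 1) for name, hours in times.items()})
print("fastest route:", min(times, key=times.get))
```

Everything the output depends on is baked into those inputs; if the forecast or the polar is wrong, the answer is wrong, and nothing in the calculation will tell you so.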

And that’s the first point I want to make about modelling — the algorithms very rarely, if ever, take everything into account. This is the first problem, and the very first question that needs to be asked when presented with the output from any model that’s predicting the future:

Does the prediction include the influence of everything that might affect the outcome?

In the case of the weather routing software the answer is definitely not; apart from not taking into account the chance of damage, it also makes assumptions about the boat speed in different wind conditions. The results are also dependent on the accuracy of the weather forecast – and I’m sure we’re all aware of its limitations. Who hasn’t left the house without an umbrella or jacket and needed both before the end of the day? There are many good and well understood reasons why weather forecasts are inaccurate. These throw light on more systemic problems with many of the other things we try to model.

Firstly, the atmosphere is an incredibly complex system with many different variables: temperature, pressure, humidity, wind speed, wind direction and so on. The models must combine all these variables in complex calculations based on our understanding of the physics that controls their interaction. Unfortunately, we still don’t completely understand all the physics of the atmosphere. We’re getting better at it, but there are still many questions to be answered. So the next question for our model is:

Is the science behind the prediction mature, well understood and accurate?

Of more concern is that even if we did completely understand all these processes down to the molecular or atomic level, we still couldn’t model the weather perfectly because we can’t capture enough detail in the starting conditions. We are getting better at this, with satellites, weather balloons, aircraft and drones all adding to the traditional stalwarts of global met data collection — the land stations and ships — but it’s still a far from complete picture. And if we don’t know the full set of starting conditions, and know them accurately, then our predictions will have limited accuracy.

This is generally true: the bigger the data set, the better. Any experiment repeated a thousand times is more likely to provide an accurate answer than one repeated five times. However, resources are not infinite, and no one is going to collect data indefinitely. So data sets will be limited, and we have to decide whether the size of any particular data set is sufficient for the analysis to be credible.
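
A toy simulation makes the point; the true value, the noise level and the sample sizes below are all made up, but the pattern is general – the more measurements we average over, the closer the estimate tends to sit to the truth.

```python
# Toy illustration of why sample size matters: estimating a "true" value of
# 10.0 from noisy measurements. The noise level is invented, but the pattern
# holds in general -- bigger samples give steadier estimates.
import random

random.seed(1)
TRUE_VALUE = 10.0

def estimate(n: int) -> float:
    """Average of n noisy measurements of TRUE_VALUE."""
    return sum(TRUE_VALUE + random.gauss(0, 2.0) for _ in range(n)) / n

for n in (5, 100, 1000):
    print(f"{n:5d} measurements -> estimate {estimate(n):.2f}")
```

The five-measurement estimate can easily land a long way from the truth; the thousand-measurement one rarely does.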

It’s also true that there are many more ways to generate bad data than there are to get good data. If instruments are involved, they need to be of adequate quality, properly calibrated and maintained. If people are involved, then anything that they report themselves, by filling in a form or survey, is less reliable than data obtained by experiment. And experiments with people are better when they are double-blind (so both experimenter and experimentee are unaware of what’s being done and to whom) and randomised (so the experimental groups are selected at random) to limit the chance of confounding variables. So, the next question to ask of any prediction that you come across is:

Is any data used accurate and comprehensive enough to provide a useful prediction?

It would be bad enough if this were it; but there is an even worse problem for weather forecasting (and some other predictions) in the so-called butterfly effect, an idea developed by the American mathematician and meteorologist Edward Lorenz, a pioneer of chaos theory. Chaos theory describes dynamic systems that are particularly sensitive to their starting conditions. In weather terms, the classic example is a butterfly flapping its wings: this tiny effect ripples out into the atmosphere and subsequently influences the creation of a tornado several weeks later.

It’s not just that a tiny change in the atmospheric starting conditions can radically alter the outcome, but that even with perfect knowledge of every single molecule in the atmosphere at any given moment in time, we still couldn’t forecast the weather completely accurately because we can’t control all the butterflies, birds, people, animals or volcanoes that can influence the outcome as time goes on.
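
Lorenz’s point is easy to demonstrate with the logistic map – a standard textbook example of a chaotic system, not a weather model, but the effect is the same. In the sketch below (my illustration, with arbitrary parameter values), two starting points that differ by one part in a billion have completely parted company within a few dozen steps.

```python
# Sensitivity to initial conditions in the logistic map x -> r*x*(1-x),
# a standard textbook chaotic system (not a weather model). Two starting
# points differing by one part in a billion diverge completely within a
# few dozen iterations.
r = 3.9                      # parameter chosen in the map's chaotic regime
x1, x2 = 0.5, 0.5 + 1e-9     # almost identical starting conditions

for step in range(1, 51):
    x1 = r * x1 * (1 - x1)
    x2 = r * x2 * (1 - x2)
    if step % 10 == 0:
        print(f"step {step:2d}: x1={x1:.6f}  x2={x2:.6f}  diff={abs(x1 - x2):.6f}")
```

No amount of extra computing power fixes this; if the starting conditions are even fractionally wrong, the forecast eventually becomes worthless.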

So in the case of the weather forecast and many other unstable systems, what we understand from the mature science is that there will always be limits on the accuracy of the predictions. That gives us one more question:

Does the science behind the prediction envision and allow for a completely accurate answer?

These four questions make a good first test when you see a prediction that’s been generated by a forecaster. In the case of a claim that there is a fifty percent chance of rain today, the answer to all four questions is no – and yet as a navigator I rely almost completely on the weather forecast to develop a strategy. And this is one of the most important points about models: if you are familiar with them and understand their limitations and assumptions, then even bad models can be helpful – but when you don’t, they can be dangerous.

The problem is that we are all exposed to modelling data that could have a huge impact on our lives – so the four questions are intended to provide a way to deal with that data, to get some sense of its value, and to work out how to treat it when we have no expertise in the area. I have zero healthcare or epidemiological knowledge, and yet as the outbreak gathered momentum in the news cycle, I needed a way to assess the predictions that were making headlines and to make a judgement on how to act. So I applied the four questions to the Coronavirus models.

Does the prediction include the influence of everything that might affect the outcome?

No. It was easy to find an example of something that wasn’t included – we don’t have any idea whether there is a seasonal element to transmission of the virus.

Is the science behind the prediction mature, well understood and accurate?

Epidemiology appears to be a well-resourced and mature discipline – however, its need for assumptions about human behaviour will always leave room for doubt, and some people appear even to question its status as a science.

Is any data used accurate and comprehensive enough to provide a useful prediction?

No, not at this stage of the pandemic. For one thing, the amount of testing varies massively from country to country, which completely undermines the data on the number of confirmed cases; even the number of deaths is prone to errors from uneven reporting.

Does the science behind the prediction envision and allow for a completely accurate answer?

Probably not; chaos theory is a consideration in epidemiology, and I found some evidence that certain epidemic systems are susceptible to chaotic outcomes.

These are the answers I came up with after half an hour on Google as the outbreak gathered pace in the news cycle. The references I found seemed credible to me, but I’m not vouching for them. In my opinion – and when it comes to working out how the outbreak affects me, that’s the one that counts – they were good enough to reach a conclusion. In this case, I felt that I shouldn’t put too much weight on the modelling, not least because there was a far better data source: the news bulletins out of China, Italy, Germany and Spain. It was very clear which public health strategies were working and which weren’t – the best data came from the real-time experiments being run in the countries that got it before us.

It seemed clear that the British Government’s strategy was wrong and that minimum social contact and – where possible – isolation was the best individual response. It felt almost inevitable that the NHS would be overwhelmed. It was going to be a bad time to get sick, and the priority for a few weeks at least was to stay well.

Fortunately, the Government changed its strategy, and this was partly because my reaction was far from unique. The UK public and business response led the political leadership right up to the big U-turn on 23rd March. While there’s evidence that the output from high-quality modelling was useful in the hands of professionals who understood what they were dealing with, it’s a tragedy (but hardly a surprise) that Boris Johnson and his leadership team weren’t among that number.