In late 2018, when I still worked at the Met Office, I sent a document to some people there which explained why I thought AI would come to dominate weather forecasting, and why weather forecasting organisations should be looking at AI, urgently. Today, the 28th of July 2023, there is a leader on the subject in The Economist as well as an extended article in its Science and Technology section.
Neural networks are likely, in due course, to provide better weather forecasts than current numerical models. If this is true then weather forecasting organisations that don’t use them will be replaced by ones that do. Even though this is only a possibility, weather forecasting organisations should be investigating these techniques, today.
[…] NN models are likely to be highly successful for weather prediction. However they will not be trivial to design and deploy: cargo cult NN approaches are not going to work.
If NN models are successful then they will largely displace hand-crafted physics-based models (GCM models such as the UM). Weather forecasting is a service, and consumers of the service care only about how good the forecasts are rather than how they are produced.
If this happens then organisations involved in weather forecasting, such as the Met Office, will need to adopt NN models or cease to exist: NNs are an existential threat to weather forecasting organisations.
This means that such organisations should be investigating NN models very seriously now so that, in the likely case that they are successful, they are not left behind.
The traditional approach [to weather forecasting] is to understand the physics and write a system which numerically solves the equations to a greater or lesser degree of accuracy. This has been pretty successful of course.
An alternative approach is to not do that at all, but rather build a system which can, itself, learn to simulate the weather: a system which can be trained to simulate the weather, in other words, based on observations. As far as I’m aware such an approach has not been tried on any significant scale.
There is copious training data. There is obviously a really huge amount of data which can be used to drive a model, which NNs love. But NN models need training data in general: they need to be told how well they did so they can correct their weights. And weather is almost the best example of this it’s possible to think of: if we want to predict, say, rainfall in 24 hours’ time, then, if we wait 24 hours, we know how much rain actually fell, and we can use that data to teach the model how to do better. And this is true for everything, all the time: every time the model makes any prediction about the state at some future time then, at that future time, we know what the state actually is and can use that information to train the model. This is the sort of situation NN people dream about.
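The forecast-verify-train loop described above can be sketched in a few lines. This is a toy illustration only, under invented assumptions: the “atmosphere” is a fixed linear map and the “model” is a single linear layer trained by gradient descent, both stand-ins for the real thing. The point it shows is that every day’s verification supplies a free training signal.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the atmosphere: the "true" 24-hour evolution is a
# fixed linear map (purely illustrative, not real dynamics).
n = 8
true_transition = rng.normal(scale=0.3, size=(n, n))

# Toy stand-in for the NN: a single linear layer, initially knowing nothing.
weights = np.zeros((n, n))
learning_rate = 0.02

for day in range(5000):
    today = rng.normal(size=n)            # today's observed state
    forecast = weights @ today            # issue a 24-hour forecast
    tomorrow = true_transition @ today    # ...24 hours pass, weather happens...
    error = forecast - tomorrow           # verification: the free training signal
    # gradient step on the squared forecast error
    weights -= learning_rate * np.outer(error, today)

# After enough days the learned map approaches the true dynamics.
print(np.linalg.norm(weights - true_transition))
```

Nothing here resembles a real forecasting system, but the shape of the loop is the point: predict, wait, compare with what actually happened, update, repeat — and, as the next section argues, it is exactly this iteration that climate prediction cannot do.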
[…] Hand-crafted models are more likely to remain sane than NN models in the early stages. There’s no rule that says that an NN won’t get some mad idea into its head and start, occasionally, making predictions which are completely physically insane.
While NN models are an almost perfect fit for weather forecasting they are, perhaps surprisingly, a terrible fit for climate modelling. This is for two reasons.
Sparseness of training data. NNs are likely to work for weather prediction because the training data is so copious: if you want to predict the weather a given time ahead then you simply predict, wait until that amount of time has elapsed and you have training data, and then you iterate this process. You can’t do that for climate: if you want to predict the climate a century ahead you can neither wait for a century for the training data nor can you iterate the process.
Opacity of NN models. Even if climate modelling by an NN is technically practical it’s an absolutely terrible answer to the questions people actually want to answer. If I run some NN model and it predicts 4 degrees of warming by 2100 the first thing people will ask is ‘why does it predict that?’. And the best answer to that question is ‘because some opaque blob of weights which neither I nor any human understands told me that’, which is a terrible answer: it’s essentially the same as ‘a voice in my head told me’. Given the political sensitivity of climate modelling this is not going to be an answer anyone will accept, and nor should they.
So climate modelling is a really good example of a place where a transparent physics-based model is the only reasonable answer. And that’s ultimately because the people who are interested in climate are not just interested in a statistically-good prediction (whatever that even means in this case): they’re interested in why the prediction is what it is. Climate modelling requires hand-crafted physics-based models, and there’s no way around that.
Here is an excerpt from The Economist’s leader:
The application of machine learning and other forms of artificial intelligence (AI) will improve things further. The supercomputers used for NWP calculate the next days’ weather on the basis of current conditions, the laws of physics and various rules of thumb; doing so at a high resolution eats up calculations by the trillion with ridiculous ease. Now machine-learning systems trained simply on past weather data can more or less match their forecasts, at least in some respects. If advances in AI elsewhere are any guide, that is only the beginning.
Well, I am not some unique genius: many people could, and probably did, see what was coming when I wrote the 2018 document. I predicted that neural network approaches would come to dominate weather forecasting, and it looks like they will.
But what I also realised remains, I think, important, and is not addressed at all in the articles in The Economist. And that is this:
- AI, in the form of neural networks, is not a suitable approach to climate prediction, both because the training data is inadequate and, more importantly, because it is critical that climate models not only predict the climate but allow people to understand why they predict what they predict, rather than simply being an opaque blob;
- currently, climate models, at least in the Met Office and I am sure elsewhere, are to a great extent parasitic on weather models, sharing a great deal of their code with those models.
This means that if weather forecasting becomes dominated by opaque NN models, climate modellers will have to bear the entire cost of funding development of their models. Chances are they can’t do that.
An even worse outcome would be that climate modellers leap into using opaque NN models without thinking through what this means. This would hand the climate denialists who increasingly dominate the politics of the UK a weapon which they would certainly not hesitate to use.
When I sent the 2018 document to people in the Met Office I did not even receive an acknowledgement: I am quite sure nobody read it. I think this says a great deal about the nature of organisations like the Met Office.
Despite how all this might read, I’m not at all embittered by it: if I cared about the Met Office in 2018 I certainly don’t now, four years later. If anything, I’m rather pleased that what I thought, in 2018, would happen does indeed seem to be happening. Most importantly I want the other thing I realised in 2018 — that climate modelling isn’t well-suited to NN approaches and that organisations which do both weather and climate modelling need to worry about this as NN approaches to weather forecasting eat physics-based approaches alive — to exist in some form that is accessible to people. That’s why this article exists.
Note that I used the term ‘neural network’, abbreviated to ‘NN’ in the document, as I did not then (and do not now) want to lazily consider neural networks to be the same thing as AI. ↩
UM, the Unified Model, was the model the Met Office used for both weather and climate modelling in 2018. ↩