I've used this example when teaching students about models used for predictive power vs. models used to better understand mechanisms:
You have a model that predicts with great accuracy that rowdy teens will TP your house this Friday night, so you sit up late waiting to scare them off.
You have a model with less predictive power, but more discernible parameters. It tells you the parameter for whether or not houses have their front lights turned on has a high impact on likelihood of TP. You turn your front lights on and go to bed early.
Sometimes we want models that produce highly accurate predictions, sometimes we want models that provide mechanistic insights that allow for other types of action. They're different simplifications/abstractions of reality that have their time and place, and can lead you astray in their own ways.
Can't you run the model in reverse? Brute-force through various random parameters to figure out which ones make a difference? Sure, it could have absurd dimensionality, but then it's unlikely one could even grasp where to begin. After all, AlphaGo couldn't write a book for humans about how to play Go as well as it plays.
That's what model interpretability research is. You can train an interpretable model from the uninterpretable teacher, you can look at layer activations and how they correspond to certain features, or apply a hundred other domain-specific methods depending on your architecture. [0]
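For a concrete (if toy) picture of the student-model route: the sketch below distills an opaque regressor into a small decision tree by querying the teacher and fitting the tree to its predictions rather than to the original labels. All names and data here are made up for illustration.

```python
# Toy sketch: distill an opaque "teacher" into an interpretable student (illustrative only).
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.tree import DecisionTreeRegressor, export_text

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(5000, 3))                   # hypothetical features: [lights_on, hour, day_of_week]
y = ((X[:, 0] < 0.5) & (X[:, 1] > 0.7)).astype(float)   # hidden "TP risk" rule we pretend not to know

teacher = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=1000, random_state=0).fit(X, y)

# Query the teacher densely and fit a shallow tree to its *predictions*, not the labels.
X_probe = rng.uniform(0, 1, size=(20000, 3))
student = DecisionTreeRegressor(max_depth=3).fit(X_probe, teacher.predict(X_probe))

print(export_text(student, feature_names=["lights_on", "hour", "day_of_week"]))
```

The student is lossy, which is exactly the trade-off the thread is circling: you get a readable rule, but only an approximation of what the teacher actually does.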
Sadly, some insight is always lost. In a noisy world, even with the best regularization, fitting some of the noise (or higher-order features that describe it) is inevitable when maximizing prediction accuracy, especially if your chosen architecture lacks the right tools to model the signal (like transformers adapting to missing registers [1]) and yet has plenty of parameters to spare.
What's worse, a bad explanation is often much worse than none. If your loan has been denied by a fully opaque black box, you may be offered recourse to get an actual human on the case. If they've trained an interpretable student [2], either by intentional manipulation or by pure luck, it may have obscured the effect of some meta-feature likely corresponding to something like race, thus whitewashing the stochastically racist black box. [3]
This reminds me of another thing I use when teaching: a perfect model of the entire world would be just as inscrutable as the world itself.
I think having multiple layers of abstraction can be really useful, and I've done it myself for some agent-based models with high levels of complexity. In some sense, these approaches can also be thought of as "in silico experiments".
You have a model that is complex and relatively inscrutable, just like the real world, but unlike the real world, you can run lots of "experiments" quite cheaply!
At some point we always need some sort of model of the world. You don’t want to call it physics? That’s fine, but you are still modeling the world
Interestingly though, physical models are usually expressed as mathematical equations. Which is an arbitrary way of modeling
A Neural Network could technically “discover” different models, just through optimizing predictions for whatever we want the model to do
We might not be able to distill the NN into nice compact equations, but they might still form a pretty good model of whatever phenomenon is fed to it through observational data
Note that just by picking the input data and the output expectations, we are already defining a model
My comment was in the context of the article, which states that AI doesn’t need physics
Having models doesn’t imply what you probably mean by “knowing about physics”, it just means having a representation of something that is not that something (like a map of the world is a map, not the world)
So, depending on how the dog’s understanding of the world works, it maybe doesn’t need any models, I can’t really know
But I do know, that if I want to describe anything, using any symbols whatsoever, then I’m implicitly creating a model of what I’m trying to describe with the symbols
So, if I’m trying to communicate and understand physical phenomena using AI, then I’m implicitly creating a physics model, whether I want to call it that or not
> Interestingly though, physical models are usually expressed as mathematical equations. Which is an arbitrary way of modeling
Not exactly. The particular symbols used in mathematics may be arbitrary but the mathematical structures and relationships are not.
There’s some interesting research on how the limits of predictability may come down to what is mathematically provable in a formal system. That may sound kind of weird, but consider that any computable model can be modeled as a Turing machine, which is essentially a process that manipulates symbols according to a set of rules—not too different than mathematics itself. The difference is that based on certain assumptions of internal consistency (that cannot be proven if the system is in fact consistent), the mathematical manipulation of symbols can be used to make predictions about the behavior of Turing machines. There’s a very deep connection between the two and neural networks are still just another form of this, perhaps not as human-interpretable however.
No. That's precisely the issue here. NN's do not "model the world" in anything remotely like the traditional meaning of that term.
Modelling the world historically means identifying objects/phenomena and proposed causal relationships between them. Without the relationships - e.g. if we add heat to the system, it will move more - there's no model. You may still be able to get predictions, and they may still be useful, but you are not defining a model.
A charitable interpretation of nico is that he was saying a well-trained NN is itself a model of the world. If it can tell you what a system will do given some inputs, then it functions as a model. While internally it isn't creating a model that we could understand, it does "model the world" in the sense that we can treat it as a model
Consider for a moment replacing the NN with another person, who forms a model of the world that is very useful for prediction.
Now our lead experimenter asks this person "what will happen if the global average temperature increases by N degreesC?" and they get an answer.
Can we say that the lead experimenter has built a model? They have not, certainly not in the sense that they have any access to it. The person who replaced the NN may have (and indeed, probably has) built some sort of model, but that's a very different claim.
Explainability in NN/ML systems is a hot topic, and many people (not all!) would say that if the NN/ML system cannot explain why adjusting parameter X will cause changes in parameters A, M and T, then you have no access to anything that merits being called a model.
A consequence of this is that if the person who replaced the NN can explain themselves (e.g. answer the X -> A,M,T coupling), then even the experimenter can probably be said to "have a model". But if all that can be said is "I don't know and/or I can't explain, you just need to trust me that this coupling is real", then the claim that a model has been built is on unstable ground.
A truism in the computer modeling communities of the 1970s and 1980s was “the product of a modeling exercise is not the computer model, but the modeler”.
The insight gained by rigorously modeling a system in computer code produces a person (the modeler) who can provide valuable insight when asked questions about the system. In policy analysis, the modeler’s insight can often provide quick and dirty and auditable (and often correct) analyses/answers about the modeled system without ever running the developed formal computer model. The exercise of the formal development of a computer model credentials the modeler as having gained a level of rigorous systems-level expertise. And the scope and detail of that modeler knowledge is certified in the depth and breadth of the computer model itself (and the currency and accuracy of the input data sets).
Nice to have such a human analyst around when important policy decisions need to be made, since such policy decisions should be made and implemented by humans who can explain the confidence that exists regarding the knowledge that supports the given decision. The decision makers can then point to the analysts for the estimate of the degree of confidence that can be ascribed to the policy analysis that supports the decision. That’s how it’s supposed to work, and that philosophy is formalized in existing decision processes for complex technical systems such as transportation, telecommunications, power, military systems, etc. You know, the important stuff…
> Now our lead experimenter asks this person "what will happen if the global average temperature increases by N degreesC?" and they get an answer
What symbols or language did the lead experimenter use to ask the person the question? And what does degrees, temperature and global mean?
All of those things require models to be communicated between the components of your system
Any symbolic communication is necessarily a model of what it is trying to represent
Of course, if there isn’t someone to interpret it, it’s just symbols. But interpreting a meaning behind symbols implies the symbols represent a model of the meanings that are being communicated
I’m saying the person setting up the NN is doing so by following some model of what outputs to expect from the inputs
To train a NN you need a training set of data, that data follows a certain order or pattern that represents the model that the person has
As long as the NN behaves as expected by the person setting it up, then it is useful as a model
How useful? That depends on what you are trying to do, your model and the data
If you want to take it to its extreme, all language is a model. Even what I’m expressing right now with this text. It’s just a model of what I’m thinking and the meanings I’m trying to convey
> To train a NN you need a training set of data, that data follows a certain order or pattern that represents the model that the person has
I have been repeatedly assured by people who manage to sound authoritative about NN's and ML here on HN that, despite my instincts to the contrary, this claim is no longer true. I continue to doubt it, but there you have it.
Your claims about language are quite interesting, and quite controversial.
No, you obviously need data. But several times when I've challenged the unacknowledged role that data pre-identification and classification is playing, I've been told that this isn't needed any more. You can just throw unorganized, unclassified data at these things (so they claim), and it will "figure it out".
In some sense yes, and if you are only interested in good predictions this might work out well. My concern, maybe due to my limited understanding, is that this is not theory driven and therefore does not really provide understanding of the underlying process.
> and therefore does not really provide understanding of the underlying process
What is “the underlying process”?
For example, Newton was able to model gravity quite successfully without ever being able to “understand the underlying process”. In fact, physics today still doesn’t have a good grasp on what gravity is. Yet we use the models and equations all the time
In a way, physics is also a collection of black boxes, perhaps just seemingly more elegant boxes
Gravity can be modeled as "things on the earth's surface accelerate downward at about 10 m/s², regardless of their mass". This works very well. Gravity can be modeled as "planets orbit the sun according to Kepler's laws". This also works very well.
Newton realized that these phenomena could be explained as arising from the same underlying process of an inverse square law. This is a much more useful model, and allows predictions that allow us to do things like space flight, even if it is not complete.
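As a quick back-of-the-envelope illustration (standard constants, nothing from the thread): the inverse-square law reproduces the surface value as a special case and keeps predicting where the constant-g model stops.

```python
# Newton's inverse-square law: the "~10 m/s^2 at the surface" model falls out as a special case.
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
M_EARTH = 5.972e24   # kg
R_EARTH = 6.371e6    # m, mean radius

def g(r):
    """Gravitational acceleration at distance r from Earth's centre."""
    return G * M_EARTH / r**2

print(g(R_EARTH))          # ~9.8 m/s^2: the surface model
print(g(R_EARTH + 400e3))  # ~8.7 m/s^2 at roughly ISS altitude: same law, a new prediction
```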
The simple ones: advection, latent heat release/absorption from water changing phases, and the Coriolis force. If you need an AI for this, please take a course on differential equations.
The hard ones: droplet/ice crystal formation, cloud feedback on radiative transfer, evaporation at air-sea boundaries. If you can train a model for these processes, please, please tell someone.
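For a sense of what "the simple ones" look like without any AI, here is a minimal sketch of 1-D advection stepped with a first-order upwind finite-difference scheme (a toy, not a production solver):

```python
# du/dt + c * du/dx = 0, solved with a first-order upwind scheme (stable when c*dt/dx <= 1).
import numpy as np

nx, c, dx, dt = 200, 1.0, 1.0, 0.5                  # grid points, wind speed, spacing, time step
u = np.exp(-0.01 * (np.arange(nx) - 50.0) ** 2)     # initial blob of "moisture" centred at x = 50

for _ in range(100):
    u[1:] = u[1:] - c * dt / dx * (u[1:] - u[:-1])  # upwind difference; RHS uses the old values

print(np.argmax(u))  # the blob has been carried downwind by roughly c*100*dt/dx = 50 grid points
```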
"what is the underlying process" is another way of saying, all we have is models. We don't really understand anything. Even gravity. Yet, we can model gravity extremely well for practical purposes.
Exactly. Taking it further, I don’t understand how my hands work. Yet here I am typing away, without even having a good model for it, except just the language I’m using to describe what I’m doing
Physics wants to open the black boxes until it can no longer figure out how to pry the remaining boxes open!
It's not useful to draw a false equivalence between AI-style "the model predicts, that's good enough" and science as a whole which cares very much about the underlying structure.
Why are you assuming that people want to stop understanding the AI models?
If anything, AI researchers are digging deeper into the models too
And people in physics are starting to use AI tools to model physical phenomena
I think that it’s a never ending task to understand all the black boxes. Definitely not possible by a single person. But also at some level you get to circular references. There is no fixed point in the universe, there is no point 0 or origin that we can find. Everything is relative to something else
Wouldn't you be able to somehow couple it with another model that takes the NN data and somehow untangles its convoluted logic into an isomorphic human-readable equation, i.e. a model that has one task, and that is translating NN logic into human equations?
The training data could be real physics in a simulator, held up against evolutionarily driven AI logic that competes against it with various goals. Those attempts are then evaluated, and if given a high score, marked as isomorphic; given enough runs you'd get a dataset.
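Something in that spirit already exists under the name symbolic regression (and related ideas like SINDy-style sparse fitting). A minimal, hypothetical sketch: probe the black box and fit a sparse combination of human-readable candidate terms to its outputs (the `black_box` function here just stands in for a trained NN):

```python
# Hypothetical sketch: recover a readable equation from a black box by sparse fitting.
import numpy as np

def black_box(x):                      # stand-in for the trained NN / simulator
    return 3.0 * x**2 - 0.5 * np.sin(x)

x = np.linspace(-2, 2, 500)
y = black_box(x)

# Library of human-readable candidate terms.
terms = {"1": np.ones_like(x), "x": x, "x^2": x**2, "sin(x)": np.sin(x), "exp(x)": np.exp(x)}
A = np.column_stack(list(terms.values()))
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

for name, c in zip(terms, coef):
    if abs(c) > 1e-3:                  # keep only the terms that matter
        print(f"{c:+.3f} * {name}")    # recovers roughly +3.000*x^2 and -0.500*sin(x)
```

In practice the hard part is choosing the term library, and getting anything sparse out of a genuinely high-dimensional model, which is where this tends to break down.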
I think at some point it is worth admitting that there are variables you can't account for. Like the precise geography - the models model an area 1 mile square as a single vector, maybe even more coarse. They don't model every tree, rock, and bush. In a neural net you can just have "weight goop" which accounts for the net effect of these unmodeled features, but in a traditional model adding "fudge factors" and extrapolating back from the model to points of interest is tricky.
That's like saying chemistry has nothing to understand because we know Schrodinger's wave equation. Or that we understand biology and psychology for the same reason.
Yeah, I think that makes sense. Those systems are similar in that they are made of zillions of tiny parts, and there's no way to pull a "one equation to rule them all" out of them. We could, given infinite compute, but it's just not feasible.
Interesting timing. Microsoft Aurora[0] was announced yesterday although that got very little attention here[1]. Too recent to be added to the article.
> The first step is potentially changing the way data is assimilated into AI-based models. At present, they almost universally use a set of initial conditions produced by a physics model. That is, a model like the ECMWF spends an enormous amount of computing power to collect data from buoys, surface stations, weather balloons, airplanes, ships, satellites, and many other sources and then synthesizes a set of initial conditions for grid points across the planet. All models then take this as the beginning "state" of the planet's weather and forecast from that.
So this is essentially learning the time-stepping part of the physical model, not deriving predictions from raw data. While still interesting and probably still complex, this is far less impressive than the title led me to believe.
You think the difficult part is merging observations with the last forecast? I guess it's a very underdetermined problem, but isn't the loss function (compare the forecast grid with later observations) the same whether you're doing grid_t0 -> grid_t1 or (observations, grid_t0) -> grid'_t0 -> grid_t1? I don't know enough about ML to know how much complexity the extra step adds, but doesn't seem like a massive difference.
Observation assimilation is a huge field in and of itself. Observables have biases that have to be accounted for in assimilation; they also have finite resolution, so observation operators need to be taken into account.
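To make that a bit more concrete, here's a toy, entirely made-up sketch of an assimilation-style update with a sparse, biased observation operator; real systems (3D-Var/4D-Var, ensemble Kalman filters) estimate the gain properly instead of hard-coding it:

```python
# Toy assimilation step: nudge a model grid toward sparse, biased point observations.
import numpy as np

rng = np.random.default_rng(1)
truth = np.sin(np.linspace(0, 2 * np.pi, 100))      # "true" field on a 100-point grid

H = np.zeros((5, 100))                               # observation operator: 5 point sensors
H[np.arange(5), [10, 30, 50, 70, 90]] = 1.0
obs = H @ truth + 0.2 + rng.normal(0, 0.05, 5)       # sparse, biased, noisy observations

background = np.zeros(100)                           # previous forecast ("first guess")
K = 0.5 * H.T                                        # crude fixed gain, purely illustrative
analysis = background + K @ (obs - H @ background)   # pull the grid toward the observations

print(abs(analysis - truth).mean(), abs(background - truth).mean())
```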
I just assume every AI headline is one damn company or another trying to juice their stock price by finding someplace in their product line to shove an LLM. I'm right more often than not.
> For example, deep learning weather models have proven to be excellent at forecasting the tracks of hurricanes. But while these models are better at predicting where hurricanes will go, they tend to be lower-performing on the intensity changes of such storms relative to physics-based models.
For context, intensity changes are the current Big Problem. Otis [1] had its track predicted almost exactly, but its explosive intensification from a tropical storm to a Cat 5 was totally unpredicted. Possibly some of the ~$12bn damage could have been avoided if Mexico had known that in advance.
I've said this before in another comment some months back, but I'll repeat: my worry is that these models aren't learning some comprehensive new climate dynamics model with parameterisations [2], but only fitting what the Earth has historically done. And if AI weather prediction is only learning what climate dynamics do 95% of the time, it's almost by definition not useful for predicting extreme weather and it will get less accurate the more the climate changes. You're just going to get more Otises.
[2] much as I would welcome, with open arms, some accurate AI-generated black-box parameterisations for e.g. subgrid precipitation - might be more explainable than the FORTRAN black-box parameterisations we have now :)
My naive understanding is that the majority of temperature data comes from where humans are: the surface. Hurricanes are 3d, extending up for miles. The models go almost entirely off the surface temperatures, with very very sparse balloon data (which is a poor sample, since a balloon will follow the air it's put in). Wouldn't the whole volume, or at least a little of it, need to be observed, since the energy in that volume is what's powering the hurricane, not the energy on the surface? I would assume this is why the models have trouble.
Satellites also measure the temperature/height of clouds, and there's also some data from aircraft. A lot of commercial aircraft automatically report the temperature/pressure as they fly. The only problem is that a lot of their flight is in the stratosphere, but they give good data in their climb/descent.
Compared to weather balloons, it's quite a bit more data. In the U.S. there are only 91 weather balloon launching sites, so that's 182 observations per day. AMDAR has 700 aircraft, and each one probably makes about 4 flights per day, and they get a temperature profile going up and going down, so that's 5600 profiles per day. There are about 450 airports in the U.S. with regular commercial service, and the majority of these are covered.
One thing these models have going for them is they won’t come at the problem with preconceived notions.
A sensor that’s off by some constant factor is feeding bad data to a physics model, and results are deemed incorrect if they don’t match the future value of that sensor. AI, on the other hand, could self-correct for such issues because the data doesn’t mean anything to it, only the patterns do.
I can only assume current models include everything even tangentially relevant, from albedo to topography and ocean currents. But that doesn’t mean they include everything relevant, just everything people consider relevant.
Eh, atmospheric CO2 (and other GHGs) is just another input parameter. I don't see any reason that the model wouldn't be able to incorporate all of the data. If the climate system is just getting more chaotic, well, you're still running multiple projections, and you'll see that in increased variance.
Because we have well-measured inputs for CO2 between basically 250 PPM and $current_day PPM. Do you see how this statement
> Eh, atmospheric CO2 (and other GHGs) is just another input parameter
only works for CO2 inputs outside our measurements, if the climate response to those inputs is linear (and thus predictable from the responses we have already seen)?
I claim that the climate response to CO2 forcing is, in fact, strongly nonlinear, and further that it's nonlinear for other "unusual inputs" - not just CO2 - things like sea surface temperature or unusually low pressure troughs. So-called extreme weather. I can't bring up good citations at the moment, sorry, but here's a somewhat exaggerated thought experiment:
Take an AI model trained on weather measurements ~1970-2024. Also take a model of the primitive fluid equations on a rotating sphere. What predictions might you expect from each one for an asteroid hitting an empty patch of the West Pacific?
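A toy version of that thought experiment (with an entirely made-up response function) shows the failure mode: a model fit only on the observed range extrapolates the smooth trend and misses the regime change.

```python
# Made-up illustration of the extrapolation worry; nothing here is a real climate relationship.
import numpy as np
from sklearn.neural_network import MLPRegressor

def climate_response(co2):
    # hypothetical response: smooth trend plus a sharp regime change above ~500 ppm
    return 0.01 * co2 + 0.05 * np.maximum(co2 - 500.0, 0.0) ** 1.5

rng = np.random.default_rng(0)
co2_train = rng.uniform(250, 420, 2000)              # roughly the observed CO2 range
model = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000, random_state=0)
model.fit(co2_train.reshape(-1, 1), climate_response(co2_train))

for co2 in (300.0, 400.0, 600.0):
    print(co2, round(float(model.predict([[co2]])[0]), 2), round(float(climate_response(co2)), 2))
# In-range predictions track the made-up truth; at 600 ppm the model never saw the regime change.
```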