This is maybe the best response so far. We could say that there's no real modelling capability inside these LLMs, and that thinking is the ability to build such models, generate predictions from them, reject the wrong ones, and so on.
But then we need something other than opening up the LLM to look for the "model-generating structure" or whatever you want to call it. There must be some sort of experiment that shows, externally, that the thing doesn't behave the way a modelling machine would.