I agree. However, I think there is a meta-learning gap that someone may figure out how to fill. Imagine you could take the state-of-the-art driving AI, deploy it, and then wait until it makes a mistake. And then... suppose you could just explain to it in English what it did wrong, the way you would when interacting with a multi-modal LLM. The missing component (for now) is an AI that takes your feedback and adjusts the weights in the driving model to fix that mistake. Not just adding more training data, but correcting based on a more fundamental understanding and abstraction of what just occurred, the key takeaways, etc., and then making sure not to repeat that mistake. Just like a new human driver would learn.
It's possible someone might figure out a way to create a training loop that uses a multi-modal LLM to generate synthetic training data based on the situation you just explained, then updates the driving model by training on this new data until its performance on the task improves.
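A minimal sketch of what that loop could look like, assuming a hypothetical `generate_synthetic_examples` function standing in for the multi-modal LLM call and a toy network standing in for the real driving model (none of these names come from any actual system):

```python
import torch
import torch.nn as nn

# HYPOTHETICAL stand-in for a multi-modal LLM call. A real system would
# prompt the LLM with the sensor logs plus the human's English explanation
# and have it emit labeled (observation, corrected_action) pairs.
def generate_synthetic_examples(explanation: str, n: int = 64):
    observations = torch.randn(n, 16)       # toy 16-dim sensor embedding
    corrected_actions = torch.zeros(n, 2)   # e.g. [steer, brake] targets
    corrected_actions[:, 1] = 1.0           # "you should have braked"
    return observations, corrected_actions

# Toy policy network standing in for the actual driving model.
policy = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

def incorporate_feedback(explanation: str, target_loss: float = 0.01,
                         max_steps: int = 500) -> float:
    """Fine-tune the policy on LLM-generated data until it fits the correction."""
    obs, actions = generate_synthetic_examples(explanation)
    for _ in range(max_steps):
        optimizer.zero_grad()
        loss = loss_fn(policy(obs), actions)
        loss.backward()
        optimizer.step()
        if loss.item() < target_loss:   # crude proxy for "mistake fixed"
            break
    return loss.item()

final_loss = incorporate_feedback(
    "You failed to brake for the cyclist at the crosswalk.")
print(f"post-feedback training loss: {final_loss:.4f}")
```

The hard open problems are hidden in the stub: getting the LLM to produce faithful training pairs from an English explanation, and fine-tuning on them without catastrophically forgetting everything else the driving model knows.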