> I'm not seeing the comparison because what you're describing is not at all an internal or emergent process.
And (1) that process isn't an autocomplete because it isn't reliably text->text. It could be visual->sound, sound->action, visual->action, visual->visual, or any combination, as well as proactively seeking out new stimuli. Modern LLMs aren't limited to text; they just happen to leverage a large model of language as their most novel element.
(2) We're not talking about internal or emergent processes, we're talking about intelligence. No human is born intelligent. They go through a pretty similar process to LLMs where a lot of data gets dumped on them and they start responding to it.
Humans obviously need air to breathe. If you give us scuba lessons and extensive gear we can go underwater and keep breathing air, but the fundamental operation doesn't change. It's the same with LLMs. When you send an image to an LLM, it's not parsed by the LLM itself but handed off to a separate program that converts it into a text format, which is then sent back to the same old autocomplete process.
And yes, humans are born intelligent. Have children and it's the most amazing and beautiful thing watching them begin to create out of nothing. For instance, with absolutely zero prompting or external guidance, children will begin to engage in intentional make-believe play, like starting to share their food with their favorite toy.
But this 'something from nothing' is also essentially required if you think about it. Try to put yourself back in primitive man's shoes some tens of thousands of years ago. You know basically nothing about the world around you, yet in the blink of an eye we've discovered the secrets of the atom, put a man on the Moon, created mathematics from nothing, created all the underlying technology and infrastructure required for us to have our little debate here online, and so much more.