These models can generate new text. If their abilities generalise at all (and there is plenty of evidence that they DO generalise on some level), then there is no obstacle in principle to generating new proofs, although the further a proof is from the training data, the less likely success becomes.
For an obvious example of generalisation: the models can write code that is not present in their training data. If you ask one to write some specific, though easy, function, it is very unlikely that exact function appears verbatim in the dataset, and yet the model can adapt.
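To make that concrete, here is a hypothetical example (my own, not from any dataset) of the kind of "specific but easy" request meant here: trivial to write, yet the exact combination of constraints is unlikely to occur verbatim in a training corpus, so producing it requires at least some generalisation.

    # Hypothetical "specific but easy" request: return the third-largest
    # distinct even number in a list, or None if there are fewer than three.
    def third_largest_even(numbers):
        evens = sorted({n for n in numbers if n % 2 == 0}, reverse=True)
        return evens[2] if len(evens) >= 3 else None

    print(third_largest_even([7, 4, 10, 2, 8, 8, 3]))  # -> 4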