Humans and machines are distinct with respect to copyright law. A human memorizing a book is legal. A machine scanning a book is creating a new copy, which is (in at least some cases) illegal. It is not obvious that machines are allowed to learn from things just because humans are.
I tend to favour the view that in this case it is legal (by way of the de minimis doctrine), but I don't think it's a trivial question.
A human memorizing a book is legal; a human reciting that book aloud for an audience without a license is not (performing a play, for example, requires a license to the work's performing rights).
Distribution is where the issue arises, not consumption and the construction of a mental model.
I acknowledge the parallels are imperfect and this all needs to be worked out in court. But it’s possible that at the pace LLMs are developing, by the time courts start addressing these questions we’ll already be questioning whether the distinction between machines and people is as big as we thought.