They’re sort of separate. In a sense you could say that the ChatGPT model is a lossily compressed version of its training corpus. We acknowledge that a jpeg of a copyrighted image is a violation. If the model can recite Harry Potter word for word, even imperfectly, this is evidence that the model itself is an encoding of the book (among other things).
You hear people saying that a trained model can’t be a violation because humans can recite poetry, etc, but a transformer model is not human, and very philosophically and economically importantly, human brains can’t be copied and scaled.
They're very separate in terms of what seems to have happened in this case. This lawsuit isn't about memory or LLMs being archival/compression software (imho, a very far reach) or anything like that. The plaintiffs took a bit of text that was generated by ChatGPT and accused OpenAI of violating their IP rights, using the output as proof. As far as I understand, the method by which ChatGPT arrived at the output, or how Game of Thrones is "stored" within it, is irrelevant; the authors allege that the output text itself is infringing regardless of circumstance and therefore OpenAI should pay up. If it's eventually found that the short summary does infringe the copyright of the full work, there is absolutely nothing preventing the authors (or someone else who could later refer to this case) from suing anyone who wrote a similar summary, with or without the use of AI.
> You hear people saying that a trained model can’t be a violation because humans can recite poetry, etc
Also worth noting that, if a person performs a copyrighted work from memory - like a poem, a play, or a piece of music - that can still be a copyright violation. "I didn't copy anything, I just memorized it" isn't the get-out-of-jail-free card some people think it is.
A jpeg of a copyrighted image can be copyright infringement, but isn't necessarily. A trained model can be copyright infringement, but isn't necessarily. A human reciting poetry can be copyright infringement, but isn't necessarily.
The means of reproduction are immaterial; what matters is whether a specific use is permitted or not. That a reproduction of a work is found to be infringing in one context doesn't mean it is always infringing in all contexts; conversely, that a reproduction is considered fair use doesn't mean all uses of that reproduction will be considered fair.
I would guess that if a poet sued someone who was reciting his poetry commercially, that is, for pay (say, selling tickets specifically for it), he might very well win. So reciting poetry probably could be copyright infringement at a certain scale.
And since AI companies are commercial entities, I would lean towards the view that their doing it in general, even if not to repeat specific works, could be infringement too.