LLMs, impressive as they are technologically, are trained on pirated content. That isn't much different from illegally obtaining and processing data, especially for commercial purposes. In that sense the processing isn't individual consumption, but it is still consumption — for model-training purposes.
Besides the piracy itself, the US legal system has a concept called "fruit of the poisonous tree". The logic behind the term is that if the source (the "tree") of the evidence, or the evidence itself, is tainted, then anything gained from it (the "fruit") is tainted as well.
That would mean that all LLM creations would be considered illegal. This would also apply to generated training data, in case a model were ordered to be destroyed.
Back in the day, Microsoft learned a lot by dumpster-diving AT&T for computer manuals. Let's say that was at least trespassing, and probably theft. Can the government order the destruction of Windows? (Please?) Or is that a bridge too far for the rule of law?
Let's say I steal a novel - a good one. I read it very carefully. I learn how the author does foreshadowing, constructs plots, and makes believable characters. I go out and write a novel. It's not plagiarism - it's different from the one I read - but it has similar stylistic elements. Can the government order the destruction of all copies of my novel?
In non-criminal-trial contexts, there has to come a point where "fruit of the poisonous tree" meets some kind of limit - after N generations, or after sufficient transformation, or something. (For that matter, I'm not sure "fruit of the poisonous tree" is a thing outside of prosecution evidence at criminal trials.)