I think the conclusion is slightly wrong: they're not trying to hide the training process. They'll probably wind up vigorously defending that in court however they can; they're in big trouble if they can't train like that.
They're trying to avoid reproducing copyrighted text, which is a totally separate (and arguably more clear-cut) legal question. Input vs output.
This is what I've been wondering. Does Fair Use apply here at all? Sure, the models were trained on copyrighted material. But wouldn't the generative part of the AI count as transformative?