> If an AI model recognizably regurgitates the essay it trained on then it has ...

kod · 2025-03-20T16:08:20 1742486900

> So long as the resulting model does not contain copies, it is not infringement

That's not true.

The article specifically deals with training by scraping sites. That necessarily involves copying content from the server to the machine(s) doing the scraping and training. If the site's TOS incorporates robots.txt or otherwise denies a license for such activity, it is arguably infringement. Sourcehut's TOS, for example, specifically prohibits using automated tools to obtain information for profit.