Hm. That to me seems to be quite a badly written law. Is a copyright notice written in plain German in the website footer 'machine-readable'? Is there some definition of 'machine-readable' somewhere?
It's also far from clear to me whether a court would find training an LLM to constitute text and data mining 'for the purpose of gathering information, in particular regarding patterns, trends and correlations'.
> That to me seems to be quite a badly written law.
It's pretty normal for a law not to be specific about the technicalities, so that it doesn't have to be updated whenever the software changes. The de facto standard to prevent bots from scraping your sites has been robots.txt for almost 30 years.
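To make the robots.txt point concrete, here is a minimal sketch of how a well-behaved crawler could check such a reservation, using Python's standard `urllib.robotparser`. The robots.txt content, the bot name `ExampleTrainingBot`, the URLs and the helper `may_fetch` are all made up for illustration. Whether a court would treat this as the 'machine-readable format' the law asks for is a separate question.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the bot name "ExampleTrainingBot" is invented
# for this example and does not refer to any real crawler.
ROBOTS_TXT = """\
User-agent: ExampleTrainingBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    image = "https://example.com/gallery/cat.png"
    print(may_fetch(ROBOTS_TXT, "ExampleTrainingBot", image))  # False
    print(may_fetch(ROBOTS_TXT, "SomeOtherBot", image))        # True
```

Note that this only tells a crawler what the site operator asked for. Nothing technically stops a bot from ignoring it.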
If artists didn't mind Google scraping their images, putting them on Google's own site, adding ads and making billions, I really don't see them having much of a justification to call out StableDiffusion for "stealing" their stuff. In general, artists would be in a lot of trouble if taking stuff from the Internet were outlawed, as that's where they get all their reference images from too.
Either way, I am sure we'll see quite a few lawsuits going forward; laws are always open to interpretation, especially when new technology arrives. But long term I really see copyright in general being in a lot of trouble, since derivatives and remixes are becoming completely trivial with AI. Where the original work stops and the copyright violation starts is rather difficult to decide when you can just wander around latent space and create literally thousands of similar images in minutes, with as much or as little variation as you want.
My argument is that although robots.txt is a machine-readable way of asserting reservation of use, it's not the only machine-readable way, and the law does not seem to place a burden on the rights-holder to choose a particular 'machine-readable format'.
A court would likely conclude that a watermark on an image is not 'machine-readable' (I say likely because OCR technology makes it possible that a court could find otherwise). But because the law does not require a specific method, I think a copyright notice in the footer, or in an image caption, might well be found to be 'machine-readable'.
On balance, I agree that there are a lot of things coming up in the very near future that we are woefully underprepared for when it comes to using tools in this way to generate art. The answer is not simply to try to lock all the art away from the robots, but I don't know what the answer actually is.
"A reservation of use in the case of works which are available online is effective only if it is made in a machine-readable format."