Lots of people ironically put the Getty watermark on pictures and memes they make, to satirically imply that they are pulling stock photos off the internet with the print-screen function instead of paying for them.
Memes generally would not fall under the category of non-copyrighted material; most of the time they’re very much copyrighted material being used without permission. And even a wholly original work that an artist sarcastically puts a Getty watermark on and then licenses under Creative Commons or something would fall into very murky territory – the Getty watermark itself is the intellectual property of Getty. The original image author might plead fair use as satire, but satirical intentions aren’t really a defence available to DALL-E.
So even if we’re assuming these were wholly original works that the author placed under something like a Creative Commons license, the fact that they incorporated an image the author had no rights to would at the very least create a fairly tangled copyright situation. Any really rigorous evaluation of the copyright status of every image in the training set would tend to reject such images as not worth the risk of litigation.
But the more likely scenario here is that they did minimal filtering, at best, of the training set for copyright status.
You could argue that mocking the Getty logo like that is some form of fair use, which would be a backdoor through which it can end up as a legitimate element of a public domain work, in which case it would be fair game.
I agree with you that it is also possible that people posted Getty thumbnails to some sites as though they are public domain, and that is how the AIs learned the watermark.
Fair use does not make a work public domain; it merely helps the creator of the derivative work defend their case in court. But neither the original nor the derivative becomes public domain after a successful fair use defense.
Not a lawyer, of course, but I think slapping the Getty logo on a work, claiming "fair use", and then releasing the work into the public domain would be a case of misrepresentation, because Getty still has a copyright claim on part of your work. Regardless of the copyright status, it still looks like a clear trademark violation to me.
You can produce a public domain work using content that you have fair use rights to. The original owner of the content you are using fairly has no claim of ownership. You would have to assert that right in court if the owner of the copyright came after you, but that does not preclude the possibility of making a public-domain work with other copyrights used in fair use.
Obviously, that would not entitle anyone to rip those elements from your work and use them in a way that was not fair use. The Getty watermark could fall into this category: public domain pictures using the watermark fairly (for transformative commentary/satire purposes) could go into the network, which uses that information to produce infringing images.
Trademarks are a different story, but trademark protections are a lot narrower than you might think.
The point is that it's very conceivable that the neural network is being trained to infringe copyrights even though it is trained entirely on public-domain images.