Hacker News

First off, you may be unaware, but the subject of browser caches is well-trodden in copyright law, and caching has been ruled not to be the same as other methods of downloading, so it's not applicable here.

Googlebot allows creators to restrict what it crawls. OpenAI does not. It also allows creators to have their work removed from the "transformed data" (i.e. be de-indexed). AI models do not.

Googlebot at no point attempts to create an alternative to the original input content, which is the entire point of ML models.

Y'all always stick to abstract analogies, because when it comes to actual details humans and ML models are extremely different, and those analogies don't hold up at low levels.



Then we should put the same restrictions on OpenAI as we do on humans. So if a website has a paywall, OpenAI must pay to view the content. However, since this data is being given to an AI that is not an individual, there should probably be different licensing so copyright holders can extract value from their works in the final model in some way. Maybe some payment before, and some after.

Even non-paywalled content should receive compensation if that work is copyrighted. OpenAI should not be able to profit off the work of others at a mass scale in this fashion.


First, we should not give AI models the same rights as humans, because they're not humans. We should place far MORE restrictions on AI models.

Second, we should force OpenAI to cut deals with each content creator whose content they want to make use of.


1. Correct. 2. Yes, since even if GPT-4 is producing transformative content from "itself", a new medium of profit was created (training AI models), which has not been done before.

(Example of what I mean:) Even though a library is free, you are not expected to go into the library, read 10,000 books within a few hours, put the books back on the shelf, and walk out like it is fine. Humans can listen to, watch, and read only a finite amount of content in their lifetime; GPT-4 can read at a scale that could eventually match all humans on earth reading at the same time.

Streaming services for music only exist because you cannot scrape all the music in the catalog. You will listen to a few songs, and the artists and record labels behind them get a larger share of the payout than the rest of the catalog you won't listen to.



