
> "The law, in its majestic equality, permits rich and poor alike to massively-plagiarize anything they want after investing at least $100,000,000 on a computational pipeline to statistically launder its origins and details."

-- Cyberpunk Anatole France

____

If I were to steel-man your comment, it would be something like: "Scraping and training must be fair-use because people can be building all sorts of systems with ethical and valuable purposes. What you generate from a trained system can easily infringe, but that's a separate thing."

Also, where does the GNU Public License fall in terms of "anti-human copyright maximalization"? Is it bad because it uses fire, or is it good because it fights fire with fire?



>it would be something like: "Scraping and training must be fair-use because

It wouldn't be "fair use". It makes no copies. "Fair use" is the horseshit the courts dreamt up so they could pretend copyright wasn't broken when a copy absolutely needed to be made.

This makes no copies, so it doesn't even need "fair use". Instead, there are people who believe that because they made something long ago, they and their descendants into the far future are entitled to tax everyone who might ever come across that thing, let alone actually want copies of it.

Your argument must sound intelligent to you, but it starts from a premise of "of course copyright is the only non-lunatic policy people could ever imagine", and goes from there. You can't even think in any other terms.

> Also, where does the GNU Public License fall in terms of "anti-human copyright maximalization"? Is it bad because it uses fire, or is it good because it fights fire with fire?

Stallman is clever to twist the rules a little to get a comparatively sane result from them, but there are others who aren't clever enough to even recognize that that's what he's doing. So, in their minds "what about the gnu license" seems like a gotcha. I won't name those people, but their username starts with Terr and ends with an underscore.


> Your argument must sound intelligent to you, but [...] You can't even think in any other terms.

> others who aren't clever enough [...] I won't name those people, but their username starts with Terr and ends with an underscore.

https://news.ycombinator.com/newsguidelines.html

____________

> It wouldn't be "fair use". It makes no copies.

Incorrect. The real-world behavior we're discussing involves unambiguous copies: LLM companies scrape and retain the data in a huge training corpus, since they want to train a new iteration of the model when they adjust the algorithms.

That accumulation is analogous to photocopying books and magazines that you borrow/buy before returning/selling them again, and arranging your new copies into a clubhouse or company break-room. Such a thing is not usually considered "fair use."

In a hypothetical world where all content is merely streamed into a model, the question of whether model weights can be considered a copy with a special form of lossy compression is... separate, and much trickier.

> Your argument [...] starts from a premise of "of course copyright is the only non-lunatic policy people could ever imagine"

Nope, it's just the context of the discussion, because it's the status quo we're living with and the one we're faced with incrementally changing. If you're going to rage-post about it, at least stop and direct that rage appropriately.

> Stallman is clever to twist the rules a little to get a comparatively sane result from them, but [you don't] recognize that that's what he's doing.

I already described the GPL as "fighting fire with fire"; I don't understand how the idiom didn't make sense to you.



