This whole AI learns like a human is trajectory of thought pushed by AI companie...

DrScientist · 2025-01-03T13:52:55 1735912375

It's both complex and extremely simple for the same reason - it's a human judgement in the end.

Just because you can't define something mathematically, doesn't mean it isn't obvious to most people in 99% of cases.

Reminds me of the endless games in tax law/avoidance/evasion and the almost pointless attempt to define something absolutely in words. To be honest you could simplify the whole thing by having a 'taking the piss' test - if the jury thinks you are obviously 'taking the piss' then you are guilty - and if you whine about the law not being clear and how it's unfair because you don't know whether or not you are breaking the law - well, don't take the piss then - don't pretend you don't know whether something is an aggressive tax dodge or not.

If you create some fake IP, and license it from some shell company in a low tax regime to nuke your profits in the country you are actually doing business in - let's not pretend we all can't see what you're doing there - you are taking the piss.

Same goes for what some tech companies are doing right now - every reasonable person can see they are taking the piss - and highly paid lawyers arguing technicalities isn't going to change that.

Kim_Bruning · 2025-01-04T01:30:53 1735954253

> Consider how code infringement is not about code itself but about what it does. If you saw a somewhat original implementation of something and then rewrite it in a different language by yourself, there is a high chance it's still copyright infringement.

Actually, if you rewrite it in a different language, you're well on your way to making it an independent expression (though beware Structure, Sequence and Organization, unless you're implementing an API: see Google v. Oracle). Copyright protects specific expressions, not functionality.

> Compare that to people prompting directly with the name of the artist they want to replicate. This is direct copyright infringement in both essence and intention, no matter the resulting image.

As far as I'm aware, an artist's style is not something that is protected by law; copyright protects specific works.

If you did want to protect artistic styles, how would you go about legally defining them?

omnimus · 2025-01-04T09:31:28 1735983088

The fact that LLMs are generating any images is purely thanks to a database of source images that are copyright protected. It's a form of sophisticated automated photobashing. Photobashing is a gray zone but often legal because of the other artist doing the (often original) work.

When you prompt for a Miyazaki image, this image can only exist thanks to his protected work being in the database (where he doesn't want to be); otherwise the user wouldn't get the Miyazaki image they wanted.

We will see how that all plays out, but I think if Miyazaki took this to court there would be a solid case on the grounds that the resulting images breach the copyright of the source, are not original works, and are created with bad intent that goes against the protections of the original author.

The current direction at least seems to be that the resulting images cannot be copyrighted and are automatically in the public domain, making them difficult to use commercially.

Kim_Bruning · 2025-01-04T14:54:58 1736002498

Actually, while I just said "there is no database", maybe you're working from a very different mental model from mine...

What do you mean by "database" in this context? What information do you think is being stored (and how)?

omnimus · 2025-01-04T17:45:15 1736012715

I understand what the model is and how you get to it. I know the training data is not stored. But as far as I understand, the model is closer to a derived intermediary from the training data. Like a database index or, like you said, a form of compression.

That's why I tend to call training data + model the database on purpose. Because to non-programmers it makes more sense. To me there is an intentional sleight of hand in hiding the fact that the only reason LLMs can work as they do now is because of the source data. The way it's usually marketed, it seems like the model is a program that generalised principles of drawing from looking at other drawings, and that's why it can draw like Miyazaki when it wants to. Not that it can draw Miyazaki because it preprocessed every Miyazaki drawing, stemmed patterns out of it and can mash them with other patterns (from the database).

That's why I intentionally say database, to lead these discussions back to what I see as the core of these technologies.

chii · 2025-01-05T04:06:51 1736050011

What you're describing as a database would be what I call information.

Kim_Bruning · 2025-01-04T14:22:35 1736000555

There's no such database, AFAICT.

If you've ever worked with open source models (e.g. one of the Stable Diffusion models or models based on them, using tools such as AUTOMATIC1111 or ComfyUI), you can inspect them yourself and simply see. If you haven't done so already, see if you can figure out the installation instructions for one of the tools and try!

Meanwhile ...

Ok, fine, I've heard some crazy compression conspiracy theories, but they're a bit too crazy to be credible.

I've also heard stories about these models being intelligent - a little artist living in your computer. I think that's going a bit too far in another direction.

In reality, I think it's better to install the software and take your time to learn about the way these models are actually built and work.

[ btw: If Miyazaki were to take this to court with the argument you put forward, he wouldn't get very far. "Please remove my images from your systems in whatever form you are holding them". The response from the defense would simply be: "We don't actually have them, and you are quite welcome to inspect all our systems". ]

(Incidentally, I've been here before. I play with synths as a hobby! ;-)

omnimus · 2025-01-04T09:07:52 1735981672

I don't believe a rewrite in a different language is a specific expression.

We will see, because we are well on our way to LLMs being able to translate whole codebases to a different stack without a hitch. If that's OK, then any of the copyleft, open-core or leaked codebases are up for grabs.

Kim_Bruning · 2025-01-04T13:16:36 1735996596

A hand rewrite (or intelligent rewrite in general) will tend to become unique pretty quickly, especially when you start leaning into language features of the target language for improved efficiency. Your Structure and Organization will be different.

If you order an LLM (or a human) to do a straight 1:1 translation, you'll sort of pass one test (it's a completely different language after all!), but fail to show much difference wrt structure, sequence or organization. I'm also not entirely sure how good of an idea it is technically. If you start iterating on it you can probably get much better results anyway. But then you're doing real creative work!

Terr_ · 2025-01-03T21:38:37 1735940317

> This whole AI learns like a human is trajectory of thought pushed by AI companies.

My retort towards the "it would be legal if a human did it" argument is that if the model gets personhood, then those companies are guilty of enslaving children.

> Compare that to people prompting directly with the name of the artist they want to replicate.

In that case, I would emphasize that the infringement is being done by the model. It's not illegal or infringing to ask for an unlicensed, copyright-infringing work. (Although it might become that way, if big corporations start lobbying for it.)