RFC: Banning "AI"-backed (LLM/GPT/whatever) contributions to Gentoo (mail-archive.com)
168 points by Lockal on April 15, 2024 | 155 comments


Reading through the linked thread at https://github.com/pkgxdev/pantry/issues/5358 I'm in total agreement with Gentoo.


Yes, something like that absolutely should be avoided at all cost.

Genuinely in disbelief at what I was reading in that thread. How do people think that an autogenerated nonsense description is better than not describing the software at all?


Feels like this is just one of many pets.com moments to come


AI, but for dogs!

[still gets $30M in VC funding because of course it does]


I just can't stop giggling at the fact that the first response was a wordy, yet completely useless, answer from a magic robot.


AllTrails does this too, to an extent. When you create a route it asks a couple of questions (think checkboxes). But it then writes them out as human-written text, as a way to make you believe someone has actually curated a piece of text describing this hike, which actually ends up being an overgrown trail on someone's private property.

What's worse is that they have the condensed information and decide to make it harder to consume.


Getting rid of nonsense descriptions is very sensible. Trying to set some kind of policy to prevent that is also sensible.

Banning developers from internally using Copilot, a personal tool that is essentially a glorified autocomplete so you don't have to type as much, all because of "copyright infringement," "ethics" and "energy waste" concerns, is dumb. It is unrelated to the problem at hand, unenforceable, bizarrely overreaching, unnecessarily divisive and also dumb and I hate it.


It’s like when these AIs were released someone threw a giant rock in the pond. It’s still rippling, but it’ll eventually settle down, because there’s obviously a middle ground, which is what you’ve described. I think we will get there.


How are any of those unrelated?


He can't even invert a binary tree on a whiteboard.


That bot responding to the issue about bots writing nonsense… Satire is dead.


Wait … that’s a real bot? Like not a joke bot that shows up in AI related conversations? Holy moley.


It's also inherited the internet's passive aggressiveness. Check out this line:

> Haha, I'm glad you're finding this interaction amusing!


It's damaging to reputation, and that's a concern I don't see being considered here. We had a vote on our project to stop using AI-generated images [0].

A central reason was "reputation". It's a PR move because ordinary people are coming to associate AI with really bad things. If they see us using AI for icons, why would they not assume the rest of the content is AI generated too? It doesn't matter whether the general public are right or wrong.

I think what we're hearing in this thread is mostly a developer viewpoint. Try to see it like the customers see it.

Of course using AI makes many tasks easier. Changing tack is a schlep because I have to go back and replace the old instances. Honestly I'm almost regretting the decision and shudder to think what this accumulated debt would be like with a couple million lines of code and docs instead of a dozen pictures.

Seems very courageous of them to make a policy choice early.

[0] https://cybershow.uk/blog/posts/nomoreai


> If they see us using AI for icons, why would they not assume the rest of the content is AI generated too?

This was raised in that linked thread and is a completely valid concern. Who could look at that package manager and feel confident about its other design decisions even if it wasn't largely AI generated itself?


> It doesn't matter whether the general public are right or wrong. I think what we're hearing in this thread is mostly a developer viewpoint. Try to see it like the customers see it.

As an open source developer, I have complex feelings about users who act like they're "customers," entitled to make demands of me. (Unless they are, in fact, paying customers. Paying customers are great, but they're often not the goal of most of my projects.)

The stuff I write on my own time, I write first and foremost for myself. I'm happy to share it with other people, but that's a gift. Sometimes sharing will get a little scene going, and we can all have fun together. But I certainly don't spend $400+/year to "code sign" my binaries for MacOS and Windows. 90% of the time, if I hear from a Windows user, it's because they're upset with me and they want something. And I will often deliberately avoid advertising certain tools, because I have zero desire to spend my life providing free tech support.

I'd rather have 100 users who were into what I was trying to do, than 10,000 unpaying "customers" where I had to worry about things like "It doesn't matter whether the general public are right or wrong." I do that at work, not for free on the weekend.

So if I want to enable CoPilot's auto-complete when designing a new file format, then eh, I really don't care if this makes some user suddenly upset and unwilling to use my project. I'm happy to label things clearly so that they know what they're getting and they can make an informed decision. But they're not my customers unless they paid me money.


Sadly, looks like they've removed the generated stuff; they had some gloriously bonkers stuff. The generated image for RabbitMQ, in particular, was just amazing.


Is pkgxdev related in any way to Gentoo?


This doesn't seem to be related to Gentoo? The linked mailing list message explicitly says their policy won't apply to upstream packages.


I can't say how strongly I disagree with the ethical/copyright concerns raised here.

The idea that intelligences - whether they be human, artificial or alien - should be forbidden from learning from code freely shared on the internet goes against everything I like about open source.

I think it's fair that no one should be able to use reproduced copyright code verbatim, whether that be by a human memorizing something or a computer copying it.

But I take the complete opposite view on the ethics of letting machines learn from work. I think this should be encouraged.


> The idea that intelligences [...] should be forbidden from learning from code freely shared on the internet goes against everything I like about open source.

Love it or lump it, this has been the battleground for most free software distribution. Just having a licensed copy of code does not permit you to do anything with it; even the BSD license has restrictive terms for the user. I'm an enormous Copyleft advocate, but arguably Open Source is only enforced by stopping people from using it illegally. If ChatGPT's fate is to turn into a GPL-licensed code launderer, then projects have a great basis for banning it among their contributors.

> I think it's fair that no one should be able to use reproduced copyright code verbatim

Then I don't see how you can be upset at Open Source projects for adopting basic standards. They also want to protect their own license and community, with the main difference being that they're not in it for the money. Again - there was never any point where "Open Source" was synonymous with "do whatever you want with the code unconditionally".


Agreed, although it is sometimes difficult to imagine any sensible application of copyright to code in the first place, apart from code that for some reason or another contains data.

There exist copyrighted algorithms for certain applications, especially AI applications these days, but reengineering should be possible and similarities should be handled as liberally as possible.


> learning

Therein lies the rub. Most of the discourse and debate around LLMs and copyright revolve around the central question of what it means to learn.

Virtually everyone agrees that a human learning from reading code doesn't violate copyright (by somehow copying the knowledge into one's brain), because the human brain is some kind of copyright laundering machine, maybe? I don't really know whether there is any argument for that that can't be reduced to an appeal to common sense.

On the other hand, a DL algorithm learning from processing tokens of code scraped from public sources such as GitHub doesn't present the same kind of obviousness. My personal belief is that it's also learning and shouldn't be forbidden, but I can't deny the negative consequences of that. We're already seeing a lot of bad things come out of the democratization of GPTs.


> Virtually everyone agrees that a human learning from reading code doesn't violate copyright

I don't think this is so universally agreed upon as you think, or at least the implications of learning from copyrighted material. This is why projects like ReactOS and WINE strongly prohibit contributors from reading leaked Windows source code, in case they learn a little bit too much and accidentally reproduce copyrighted material.

> because the human brain is some kind of copyright laundering machine, maybe

Absolutely not. Music is full of legal cases where someone learned and copied a bit too much and went to court over it.

https://en.wikipedia.org/wiki/Pharrell_Williams_v._Bridgepor...

https://abcnews.go.com/Entertainment/jury-reaches-verdict-ed...

https://en.wikipedia.org/wiki/List_of_songs_subject_to_plagi...


> Virtually everyone agrees that a human learning from reading code doesn't violate copyright (by somehow copying the knowledge into one's brain), because the human brain is some kind of copyright laundering machine, maybe? I don't really know whether there is any argument for that apart that can't be reduced to an appeal to common sense.

This isn't really agreed on at all. See https://en.wikipedia.org/wiki/Clean_room_design which wouldn't exist if it was agreed on.

However, note the case law example of the NEC V20 which found in NEC's favor:

> While NEC themselves did not follow a strict clean room approach in the development of their clone's microcode, during the trial, they hired an independent contractor who was only given access to specifications but ended up writing code that had certain similarities to both NEC's and Intel's code. From this evidence, the judge concluded that similarity in certain routines was a matter of functional constraints resulting from the compatibility requirements, and thus were likely free of a creative element.


> because the human brain is some kind of copyright laundering machine

I once came up with the idea of a physical copyright laundering machine. It had three CD-R drives (this shows how long ago I had the idea). You’d insert a CD-ROM to launder, and two blank CD-R discs. To one of the CD-Rs, it would write a one-time pad; to the other it would write the input CD-ROM XOR the one-time pad. A hardware RNG (I wanted to use a quantum process such as radioactive decay for more emphatic indeterminism) generates the one-time pad. It also generates a single random bit which determines which output CD-R gets the key and which one gets the ciphertext. That bit is never revealed to the user (or recorded in any way). The end result is two CD-Rs, one containing random data, the other a copyrighted work encrypted with random data, but it is impossible to know which is which.
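
In code terms, the splitting step is just this (a Python sketch; `secrets` stands in for the hardware RNG, which is the part that actually matters, and the discs are of course physical):

    import secrets

    def launder(work: bytes) -> tuple[bytes, bytes]:
        """Split a work into a one-time pad and the work XORed with that pad.
        Neither output alone reveals the work, and because the assignment bit
        below is discarded, neither disc is identifiable as key or ciphertext."""
        pad = secrets.token_bytes(len(work))              # stand-in for the hardware RNG
        cipher = bytes(a ^ b for a, b in zip(work, pad))  # work XOR one-time pad
        discs = [pad, cipher]
        if secrets.randbits(1):                           # the single random, never-recorded bit
            discs.reverse()
        return discs[0], discs[1]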

I never actually built one of these machines. I wanted to patent it, but gave up when I realised how much patents cost. I also eventually realised that my machine would never work, because it was approaching the law with the mind of a developer not the mind of a judge - I doubt any judge would actually be convinced by my copyright laundering machine, they’d find a way to rule against it, whatever exact way that might be. The law and computing are both systems of rules, but the rules in the former involve far more discretion and flexible interpretation.


> the human brain is some kind of copyright laundering machine, maybe? I don't really know whether there is any argument for that apart that can't be reduced to an appeal to common sense.

Because it's an end not a means. The concept is central in philosophy, law, ethics, education.


It's actually somewhat poignant to me that this is not an obvious point for many circling around this. Accelerationism (in its true pseudo-Deleuzian/Nick Landian form, not the subreddit) is coming for us one way or another.

Maybe all we can hope for is consolation. Like, if the tech elite is successful in this campaign of soft-dehumanization ("we are all LLMs anyway") it will open up some new cultural pathways to greater respect for animals and the environment.

What is a cow but some kind of copyright laundering machine anyway?

Subjectivity as we know it is quite contemporary, articulated in part by things like Kant's kingdom. I think things will start to change again. The world of love, art, particularly human passions might fade away to something different. You can almost feel people hungry for this. It doesn't have to be good or bad; it's not for our subjectivity to understand, after all. But I will say it does feel like we are leaving summertime now, towards a colder future.


> It doesn't have to be good or bad

I think it's inevitably bad because dehumanisation ends in war. Dehumanisation, for me, is synonymous with violence. It isn't just technology but the terrifying post-modern subjectivity that allows 8 billion people to co-exist in relative peace. "Accelerationism" looks set to build still more and better weapons, fewer ways of resisting using them, and less capacity to care, so I fear the colder future you speak of will be a nuclear winter. Of course, as you say, it's like some people want that.


> dehumanisation ends in war

Or humans as pets. If you want to see humanity’s future, consult the nearest dog or cat.


It looks like a knee-jerk reaction by some "AI" hater rather than a well thought out request.

- 7 instances of the word "shit". I don't mind swearing, but it is indicative of the author being maybe a bit too emotional for a technical proposal.

- It is unnecessarily broad. Not using AI to create bug reports? What if you use AI tech to find a bug? Are you not allowed to report it? The stated issue here seems to be mostly about code completion, but it is stretched to everything AI-related, everywhere.

The points raised are copyright, quality and ethics, which are valid, but not specific to AI.

Copyright: You have the same problem when copy-pasting code, and people do that; you can't really single out AI. Instead of banning AI, a more sensible guideline would be to just be aware of copyright when importing code from elsewhere, including AI-generated code, but also copy/pasting from online sources (ex: StackOverflow) and using external libraries. There are tools to check for copyright compliance.

Quality: AI-generated code is often lower quality, but so is code written by bad coders, judge by quality of contribution, not by how it is done. As for the "we can't really rely on all our contributors being aware of the risks", maybe start by picking contributors you can rely on. And if you think they may not be aware of the risks, tell them about the risks rather than saying "you can't do that".

Ethics: I don't know what Gentoo stands for, but I'm guessing it is mostly about making a good source-based Linux distribution. Don't hijack the project for your own goals. Now, I have no problem with a Linux distribution that has "no AI" as one of its core values, but it doesn't have to be Gentoo.


> Not using AI to create bug reports? What if you use AI tech to find a bug? Are you not allowed to report it?

I mean, the hot new trend seems to be people reporting completely imaginary bugs that some AI tool thought it saw to projects, so, eh, I can see where they're coming from there.


No, thank you.

"The goal of Gentoo is to design tools and systems that allow a user to do that work as pleasantly and efficiently as possible, as they see fit." https://www.gentoo.org/get-started/philosophy/

And this is what Torvalds had to say about LLM-enhanced submissions to the kernel. https://www.youtube.com/watch?v=w7-gJicosyA


Torvalds is cautiously optimistic and hopes that AI will be able to spot bugs in the code. That kind of AI I have not seen yet. All the concerning issues from the Gentoo post I can see on a regular basis: for example, plausible-looking BS or AI spam. It is all delivered to my doorstep, so to say. That is the issue.


ChatGPT3.5 already spots bugs, e.g. when I swap the order of conditions in fizzbuzz. An error that a human could make. We've been at the point where AI can help spot bugs for a while already.
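
For concreteness, the kind of swapped-condition bug I mean looks like this (a hypothetical sketch, not my actual prompt): checking `% 3` before `% 15` means "FizzBuzz" is never printed, and a model can point that out.

    def fizzbuzz_buggy(n: int) -> None:
        for i in range(1, n + 1):
            if i % 3 == 0:        # BUG: also swallows multiples of 15
                print("Fizz")
            elif i % 5 == 0:
                print("Buzz")
            elif i % 15 == 0:     # unreachable because of the swap above
                print("FizzBuzz")
            else:
                print(i)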

AI can be used poorly, AI can be used well.


Another problem is how we arrive at a model that can be used poorly or well. There are huge copyright and ethical problems underneath every big model, and I refuse to use models which are trained on copyrighted materials without consent.

Gentoo is right here, and until we pass these hurdles, I don't use any of these systems, even with a 100-foot pole.


Ok, now ask it how many Ms are in ammunition. Just because it can do some things some of the time doesn't mean we'd happily accept contributions from it.


1) It does not need to solve every issue to be useful; it just needs to surface some issue that a human reviewer can then validate. It's seen a lot of code; it can find common issues.

2) The specific issue you're talking about arises because these models don't see letters; they see tokens, which are groups of letters / subwords. The model can't count letters because it can't actually "see" what it's counting. This is being worked on as well.
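
A small sketch of what the model actually operates on, assuming the `tiktoken` library is installed (the exact split depends on the encoding you pick):

    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")    # encoding used by GPT-4-era models
    tokens = enc.encode("ammunition")
    print(tokens)                                 # integer token IDs, not characters
    print([enc.decode([t]) for t in tokens])      # the subword chunks the model "sees"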


Someone committing poor quality LLM generated code and deeming it appropriate for review could create equally bad, if not worse, handwritten code. By extension, anyone who merges poor quality LLM code could merge equally poorly handwritten code. So ultimately it's up to their judgement and about the trust in the contribution process. If poor quality code ended up in the product, then it's the process that failed. Just because someone can hit you with a stick doesn't mean we should cut down the trees — we should educate people to stop hitting others with sticks instead.

"Banning LLM content" is in my opinion an effort spent on the wrong thing. If you want to ensure the quality of the code, you should focus on ensuring the code review and merge process is more thorough in filtering out subpar contributions effectively, instead of wasting time on trying to enforce unenforceable policies. They only give a false sense of trust and security. Would "[x] I solemnly swear I didn't use AI" checkbox give anything more than a false sense of security? Cheaters gonna cheat, and trusting them would be naive, politely said...

Spam... yeah, that is a valid concern, but it's also something that should be solved on organizational level.


Cheaters are gonna cheat, but filtering out the honest/shameless LLM fans is still an improvement. And once you do find out that they lied, you now have a good reason to ban them. Win/win.


Well, Torvalds says in the interview ‘we already have tools such as linters and compilers which speed up the work we do as part of software development’

I get the impression he agrees this road to LLM content is inevitable, but also kind of emphasises the role of the reviewer who takes the final decision.


I had an LLM very patiently explain to me why I crashed prod when I used the wrong conversion factor between ms and µs. Thanks, SI, very cool that one of the more often used units needs Unicode to be entered into code.

LLMs are absolutely helping with catching bugs and improving code quality already.


I had an LLM patiently show me use-after-free bugs in non-existent Asterisk C code it just made up. :D


Obviously that's your fault for not having the code it found the bugs in. Why are you attacking progress?


> Why are you attacking progress?

Progress to where? One should not use "progress" as an unqualified noun to denote a scalar. Progress is a vector, with both magnitude and direction. The direction part is really important.


:DDD


What does their philosophy for the end-user have to do with their development practices?


This seems insane and totally unenforceable.

Even when I write every single line of code myself, I use AI to ask questions about the programming language or the library that I use. Banning that is just handicapping yourself.

I do like the sentiment. You absolutely do not want people to commit code they don't understand themselves, but the solution isn't to outright ban AI. The solution is to have trusted, knowledgeable developers who are aware of the limits of AI and use it appropriately.


If you don't trust it to write… you probably shouldn't trust it to tell you how to write either.


Why? It is clearly better at one thing than the other.


Is it, or are you just having a harder time spotting the errors?


Stack Overflow is also full of very confident wrong answers. People saying "but what if ChatGPT lies to you!" seem to forget the rest of the internet is also filled with outdated or wrong information. You always still need to actually figure out if the result is correct or not.

Heck, even when I come up with a solution myself without needing to research it, I'm still likely to get it wrong or make mistakes. That's why we rely on tests, specs, code review, etc, because humans still make mistakes.


llms like gpt-4 are enormously better than google at finding the relevant functions in an api and telling you how to chain them together to accomplish a given task. sometimes there are errors, but in most cases those are easy to spot, especially if you read the documentation for those functions. for some purposes, testing is enough (and it will happily write you a test suite too)

it's kind of annoying to ferret out the bugs it has carefully concealed in the code it wrote for you, though


> especially if you read the documentation for those functions

Shouldn't that be enough in the first place? RTFM is a real thing.


the trouble is that you don't know which functions to read the manual for. last night, for example, i was trying to find out how to get objdump to disassemble an ms-dos .com file with the correct load address of 0x100 rather than 0. google was unhelpful, but gpt-4 told me that the option i was looking for was called `--adjust-vma`. probably if i'd read the objdump manual i would have been able to find it (though the objdump documentation isn't very clear about what `--adjust-vma` will do on a raw binary file), and technically that's only about 15 pages, and spending the requisite half hour reading it would probably produce returns the next time i use objdump. so it's a reasonable idea

but it's pretty common for gpt-4 to be able to instantly find the thing you need out of 150 pages or 1500 pages or 15000 pages. or the five things you need, and how to combine them. if you want to know what libraries can probably solve your problem, asking gpt-4 is a much better idea than reading the manuals of all possible libraries


It seems far superior to Google.


Using LLMs as a glorified documentation search is the best use case I've found. Sometimes I've forgotten something, or never knew it in the first place, so being able to use human language to describe a function or library I'm looking for is very useful. Once I've done that I move to the real documentation as fast as possible.


The funny thing is that if a function does not exist they often just make one up. But that happened on Stack Overflow too. Most of the SO answers in areas I knew well were incorrect. Only in areas where I am a beginner do the answers usually look correct.


LLMs have mostly replaced Stack Overflow for me. I find I can get to a working solution quicker using ChatGPT compared to Google + Stack Overflow.


It should just be banning the use of AI in code reviews


I don't see what you would even do with an LLM in a code review.


The positive view: generate scenarios in which the code will fail.

The negative view: write plausible-seeming explanations justifying the code as correct.


yep this is what I meant. hallucinating, justifying or missing bad stuff.

additionally, similar to how large PRs are more likely to just be skimmed and replied with a "LGTM!", an LLM missing some bad stuff but still producing a seemingly thorough review would increase the chance of the bad stuff making its way in.

allowing LLMs to write code would be fine if it's truly verified by a human, but let another LLM hallucinate and cloud a person's judgement and you've got a problem


i thought i'd find out, so i picked a quasirandom piece of my own code (http://canonical.org/~kragen/sw/dev3/nonarytbl.py) and gave it to gpt-4 to review, with the prompt:

"Please review the above code. How does it work? Is it well designed? Is it efficient? What are its good points and its bad points? How should it be improved? Is it readable and maintainable?"

i feel like gpt-4's code review (included below) was mostly correct and useful. however, the efficiency concerns in particular are unfounded, and the python approach to handling errors like those cited is to just let the exception propagate, suboptimal though that is for usability. also, i'm not sure i agree with its high opinion of the modularity, usability, and readability

simply pasting gpt-4's partly incorrect analysis into a textarea when someone asks you for a code review would be malpractice, but it could be a useful source of ideas for possible improvements to suggest and possible problems to investigate. also, i've found it to be especially good at coming up with plausible ways that someone could misunderstand my code, which is to say, ways that my code is unclear and therefore could have its clarity improved

overall i'd say that gpt-4 is a lot better at reviewing code than at writing code. i've gotten significant improvements out of its code by asking it to review its own code looking for improvements, then asking it what the code would look like with those improvements. also, having it write test suites, then showing it the results of running the test suites

— ⁂ —

The provided Python script is designed to generate and print arithmetic tables for balanced nonary (base 9) numbers using a unique set of Unicode glyphs. The script covers operations such as multiplication, maximum, addition, and subtraction. Here's a breakdown of how the code works and an evaluation of its design:

### How It Works:

1. `gen_table` Function:

- This function generates a table for a given binary operation (`op`) like multiplication or addition.

- It takes an optional `name` parameter for labeling the operation in the table header.

- The `values` parameter specifies the range of values to be used in generating the table (default is from -4 to 4).

- It constructs each row by applying the operation between all pairs of values.

2. `nonary` Function:

- Converts an integer into its balanced nonary representation using specified glyphs.

- Handles negative numbers and numbers greater than 8 by recursive calls.

3. `layout` Function:

- Formats the table into aligned columns for better readability.

4. `paste` Function:

- Combines multiple tables horizontally with spaces between them.

5. Main Execution Block (`if __name__ == '__main__'`):

- Generates specific tables for different operations and prints them formatted properly.

### Evaluation:

Good Points:

- Modularity: Functions like `gen_table`, `nonary`, and `layout` are modular, making specific tasks well encapsulated.

- Reusability: The functions can be reused or extended easily; e.g., adding new operations or changing glyph sets.

- Readability: Despite some complexity in logic, especially in recursion and formatting, variable names and function names are descriptive which aids understanding.

Areas for Improvement:

1. Efficiency Concerns:

- Recursion in `nonary`: For large numbers, this could lead to deep recursion levels.

- Repeated calculation of widths in `layout`: This could be optimized if performance becomes an issue with larger data sets.

2. Error Handling:

- There’s no error handling if invalid inputs are passed to functions like non-existent glyphs or unsupported operations.

3. Documentation & Comments:

- While there's a detailed module-level docstring explaining high-level concepts, inline comments explaining complex parts (like recursion in `nonary`) would improve maintainability.

- More detailed docstrings on what each function expects as input and what it outputs would help other developers understand their interfaces quickly without diving into code details.

4. Testing & Validation:

- Adding unit tests would ensure that changes do not break existing functionality and help validate edge cases like very large/small numbers or unusual sequences of operations.

5. Scalability & Generalization:

– Currently tailored specifically to balanced nonary system; generalizing this to handle any base might make it more useful as a tool for exploring other numeral systems.

6. User Interface Improvements:

– Providing command-line arguments or interactive prompts could allow users to specify ranges or select specific tables without modifying source code directly.

### Conclusion:

The script is well-designed concerning modularity and separation of concerns but lacks robustness due to missing error handling and testing components. Enhancements towards documentation, user interaction, scalability/generalization could significantly increase its utility while maintaining ease-of-use for educational purposes or numerical system explorations.


Those concerns seem legit? Surprised at the negativity here.


HN thinks LLMs are the early days of the singularity and not a spam generator. I assume this is an echo chamber effect.


I stand against the tide for sure.

I am grievously concerned about the sea temperature being off the charts for a year now and we do not know why. Wasting tremendous amounts of energy to generate bullshit doesn't seem like a prudent action. Yes, it's not yet a large chunk of total energy usage, but we do need to stop the hype before it becomes one.

Because as https://hachyderm.io/@inthehands/112006855076082650 explains it's good for naught else.

> You might be surprised to learn that I actually think LLMs have the potential to be not only fun but genuinely useful. “Show me some bullshit that would be typical in this context” can be a genuinely helpful question to have answered, in code and in natural language — for brainstorming, for seeing common conventions in an unfamiliar context, for having something crappy to react to.

> Alas, that does not remotely resemble how people are pitching this technology.

And then of course there are all the ethical concerns.


How about Christmas lights? Washing machines? TV and video entertainment? Elevators for the first floor? How about for the second floor? Social media posts, like these?

The demand for technology leads to advancements that meet our needs. As we continue to innovate, we must focus on consuming more energy rather than less.

You are eager to decide what is useful and what is not. Can you predict the future? Can you predict the full impact of technologies? Can you see second, third and fourth order effects? Likely not. For instance, many may not have anticipated the significant role smartphones play today.

It concerns me when some individuals attempt to control others' resource usage, potentially leading to authoritarian rule driven by fear. Such actions might result in adverse effects before any noticeable climate changes occur in the near future.


The counterweight elevator is -- by far -- the most energy efficient way for people to live and be supplied. As in, if you need to supply a few thousand people with food and services then cramming them in high rises and surrounding those with facilities in walking distance will consume the least energy. Even more efficient than moving around on bikes or trams. I do not have the book at hand where this was calculated in painstaking detail but I am not sure a human consumes less energy walking up a floor than the elevator does thanks to said counterweight.

Also, if you need to convince people to live in such circumstances then a little convenience goes rather far so that also needs to be considered.

Modern washing machines are certainly more water efficient than hand washing and I wouldn't be surprised, again, if they were more energy efficient too. Once again: humans consume energy too. Edit: and as someone else noted, we should look at the societal effect. Well, it's quite clear the washing machine is an extremely big plus as it automates a time consuming, hard physical task.

So far every order effect of LLMs is terrible, as they are built on the backs of exploited workers and are used to further disenfranchise workers and also artists.

> As we continue to innovate, we must focus on consuming more energy rather than less.

I think in your fervor to put me down with a flippant comment you went too far. You know this is patently untrue, don't you? I mean, since you mentioned washing machines surely you are aware both the United States and the EU are pushing hard for more efficient washing machines? https://energy-efficient-products.ec.europa.eu/ecodesign-and... https://environmentamerica.org/center/media-center/biden-adm...

As for lights, LED lights consume less energy and are safer than the old incandescent bulbs. That's once again progressing towards less energy.

Sure we consume more energy than we did pre-industrial revolution but that doesn't mean we must continue upwards.


> The demand for technology leads to advancements that meet our needs

The things you listed are "wants." Perhaps we could say that washing machines have turned into a need, in much the same way crude oil has. What would the world have been had we tamed nuclear power before oil was commercialized?

> You are eager to decide what is useful and what is not

I think GP is eager to decide what is net beneficial, which is a tradeoff between usefulness and cost (monetary, social, environmental.)

I don't personally care that much about Earth. It's a rock in space, and it will continue existing with or without a functioning ecosystem, but I try to be conservative with my actions, so that the people who do care may continue enjoying it.

At this moment, AI is a "want," not a "need."


> The demand for technology leads to advancements that meet our needs.

Or does the desire for technology lead to advancements that meet our wants? Wants versus needs, and desires versus demands, get confusing sometimes.

> It concerns me when some individuals attempt to control others' resource usage,

From a psychological point of view that's understandable. And it does portend an ugly authoritarianism. From a realist standpoint, it's inevitable if resources are limited. Right now we seem to be limited by the heat capacity of the planetary ecosphere. To avoid that becoming an open conflict I think we need to enrich the debate to talk about appetites rather than needs.


> The demand for technology leads to advancements that meet our needs.

A huge majority of tech advancements are driven by supply rather than demand. Capitalism and modern economics pushes companies to build whatever they can market and sell, they aren't designing new products and tech because consumers already asked for it.


> I am grievously concerned about the sea temperature being off the charts for a year now and we do not know why

The hypothesis I've seen that makes the most sense is that enforcement of ultra low sulfur diesel for marine shipping caused the sea temperature rise [1]. I don't remember which podcast it was now or I would link it, but I first heard this a couple months ago from a researcher who is concerned that this quick rise in sea temperatures combined with the start of an El Niño this year will lead to a very hot summer with more unpredictable weather.

Anecdotally, weather predictions in my area (south eastern US) have been absolute trash. My best guess is that prediction models are way off now based on changes in underlying conditions the models never had to account for.

[1] https://www.carbonbrief.org/analysis-how-low-sulphur-shippin...


Surely, LLMs can be something other than either the coming of the singularity or spam generation? In fact, much of the current use of LLMs today does not fall into either fostering the singularity or generating spam.

Personally, I think LLMs have a place but need supervision for the foreseeable future. In case there are diminishing returns on the horizon, we might need a new AI breakthrough to go further than what we have. But I am absolutely helped in my job by AI assistance. The key, however, is to understand the system well enough to realize when something is off. Engineers won't be wholly replaced by AI anytime soon, but many can already be helped by them.


HN is also full of non-technical people who start companies and can't wait to cut out the technical humans (the most expensive part of their startups) entirely.


These concerns seem like throwing the baby out with the bathwater, which has been correctly pointed out further down the Gentoo mailing list thread. It's nowhere near as clear cut as OP makes it sound.

Let's take for example the point about code laundering by LLMs and their un-traceability. Clean room reimplementation is widely considered ethical in the software world (for example Wine is built this way), but thinking of it in this context, how is it not laundering to avoid lawsuits, technically legal but ethically dubious? Where's the line? How can writing the boilerplate and generic scaffolding be considered laundering? Certainly there are gradations, and a blanket ban is myopic? This all has been discussed in that thread.

What about writing Autowiki-style documentation, filling the gaps nobody wants to work with? Machine-assisted translation? Indirect use of the tools?

Another point, quality. The person committing the code should be responsible for it, LLM-assisted or not. And the reviewer is responsible for verifying that it meets the quality bar before accepting it. It's as simple as that. (although it's addressed down the thread by pointing out the concerns are redundant).


If storing code in a text file and copying the file doesn't magically remove copyright, and compressing it in a zip file then decompressing it doesn't remove copyright, it seems likely that embedding it in a state machine to be retrieved with the prefix of the original file doesn't magically remove copyright. Even if it's slightly corrupted in the process.

If you wanted a copilot style thing that doesn't print out GPL'd software, you wouldn't feed it a load of GPL'd software for it to remember in the first place. What we have is like doing a clean room reimplementation by loading the original in your text editor before you start and deleting parts of it at random.


Seems impossible to enforce, but I applaud the spirit of it.

Pretty soon anyone looking to add "open source contributor" to their GH profile can take a Gentoo issue and ask an AI to cook up a solution, put that on a PR, and send it in.

This will be a nightmare for maintainers. I'm not sure if there is a solution, since AI usage will spread regardless of how good/accurate it is and there's no way for us to differentiate between plausible bullshit and actual contributions, without reading it carefully. Reputation of contributors is probably the best proxy for genuine contributions, but that's a catch 22, so it can't be the only way.


As a maintainer of a Very Large open source project, I have not found this to be the case. I have not found that AI/LLM tools have generated noticeably more noise. The occasional low-effort PR existed before ChatGPT and Copilot, and it continues to exist after it. 'Banning AI' does not absolve your responsibility to review PRs, nor do I believe does it make your job actually easier.

I believe I've noticed only one 'LLM spam' comment on an issue needlessly comparing different JavaScript package managers.


To be clear, I don't think this is true right now. But if the technology does improve just a bit further, it will be easy enough to be spammable and it will be abused.


> The AI bubble is causing huge energy waste.

Pretty ironic coming from a distribution that requires every user to compile everything from source.


When every program is compiled for a specific architecture, doesn't that enable optimizations otherwise impossible? Of course, two users compiling each program twice for the same architecture is wasteful, but optimizations will save cycles down the line. No?


> but optimizations will save cycles down the line. No?

No. In practice, the cycles saved are nowhere near enough to overcome the minutes of CPU time it often takes to build every update of a large software package.

> When every program is compiled for a specific architecture

Every binary distribution already "compiles" for a specific architecture.

The inane variant "tuning" generally renders minuscule speedups. Even if there is performance to be gained, it can be done on a limited, case-by-case basis.


Seems like a reasonable move to me.

I’ve seen folks use ChatGPT to generate code and review it for security flaws. It does often solve the tasks. And it leaves behind many kinds of vulnerabilities: injections, overruns, etc.

Based on the little empirical evidence we have about informal code review [0], it seems that we ought to limit or outright ban generated code. A human can only read so much code before their impact on catching errors significantly drops. OSS project maintainers have enough on their plate and we don’t need to exhaust them with trying to maintain AI generated code.

[0] https://sail.cs.queensu.ca/data/pdfs/EMSE_AnEmpiricalStudyOf...

Update spelling


Absolutely - though for me it's a personal anecdote. I remember within the first week of ChatGPT becoming generally available, I spent 2 hours crafting a response to a pull request, detailing a "security vulnerability". It didn't matter that this wasn't a vulnerability; it didn't matter that the reporter didn't know of what they were speaking.

What mattered was that it ate up the time I had that morning to do meaningful work. Responding to that (and, since then, several more) PRs was not the desired outcome. It also kicked off internal processes, which in turn added more time. If you think about it, it's an interesting hack at least!


Why not evaluate contributions based on how well the code/documentation is written? What does it matter who wrote it, if it's good? Assuming no spamming by bots.


Same reason LLMs aren't so great for regular usage. They do an excellent job of creating highly plausible looking nonsense, which makes it much more likely for things to slip through the cracks. If there was some sort of magical way to ensure code being submitted was flawless, then we'd no longer have bugs in anything, let alone the endless critical flaws that appear in pretty much everything.

For a large highly visible open source project, there's also probably intellectual property concerns. LLMs are being trained wholesale on code that's under IP protections (be that copyright or otherwise) and having that end up in your code base could land you in trouble. 'The AI did it' will probably not be seen as a viable defense.


> highly plausible looking nonsense

Ask the LLM to generate some unit tests for the code it just generated. Then if the test fails, ask the LLM to fix the bug in the code.

I find that this catches and fixes a surprising number of LLM bugs in generated code.
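
Roughly, the loop I mean looks like this (a sketch, assuming the official `openai` Python client and a hypothetical `run_tests` helper that runs pytest in a sandbox and returns any failure output):

    from openai import OpenAI

    client = OpenAI()

    def ask(prompt: str) -> str:
        resp = client.chat.completions.create(
            model="gpt-4o",  # assumed model name; use whatever you have access to
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content

    code = ask("Write a Python function parse_duration(s) turning '2h30m' into seconds.")
    tests = ask("Write pytest unit tests for this code:\n\n" + code)

    for _ in range(3):                     # a few repair rounds, then give up
        failures = run_tests(code, tests)  # hypothetical helper: run pytest, return failures
        if not failures:
            break
        code = ask("These tests fail:\n" + failures + "\n\nFix the code:\n" + code)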


> Then if the test fails, ask the LLM to fix the bug in the code.

If an LLM is intended to generate code, perhaps its standing orders should include "Generate a comprehensive test suite, run it, fix the bugs, then iterate. Send me an email when you've finished."


I feel pretty convinced that that is the next iteration here, and would be quite disappointed if there aren't at least three companies working on it right now. In fact maybe the correct solution is to generate 3 versions of each function using 3 different LLMs and only return if at least 2 of them generate the same results, like back in the very early days of computers.
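
As a toy sketch of the 2-of-3 idea (nothing LLM-specific; it's just majority voting over already-generated candidate implementations):

    from collections import Counter

    def majority_result(candidates, *args):
        """Run each candidate implementation and return a result only if
        at least two of them agree, N-version-programming style."""
        results = [f(*args) for f in candidates]       # results must be hashable for Counter
        value, count = Counter(results).most_common(1)[0]
        if count < 2:
            raise RuntimeError(f"no agreement among candidates: {results}")
        return value

    # e.g. with three independently generated implementations of the same function:
    # seconds = majority_result([impl_a, impl_b, impl_c], "2h30m")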


> and would be quite disappointed if there aren't at least three companies working on it right now

Why does it need three companies working on it? Isn't this just a matter of prompt engineering?

I've never played with an LLM, so I really don't know. Perhaps it requires some ordinary linear code to produce a "comprehensive test suite".

Ultimately, the problem I can't see 3 or 300 companies solving, is the correct interpretation of technical instructions in English. Native English speakers have trouble with that, and I doubt that a machine can out-perform a native speaker. Maybe I'm wrong, and it can; but it needs to also be able to convince me that my doubt is misplaced.

Writing technical specs is like writing laws; you're using vague words to describe something precise. We don't use machines to interpret laws, and I don't trust machines to interpret specs.


Current LLMs. While I agree they sometimes create more problems than they solve (I'm currently debugging a bug caused by GPT-4 code that I initially approved), they are getting better by the day and already 1.5x my productivity.


That's an oddly specific, very round number. You have metrics for it, or is this the same math being used to declare AGI is a year away?


Not scientific at all, but I keep a long to-do list for my current project with dates, and it *feels* like LLMs have allowed me to cut a third of development time.


If you still have the record of previous projects (with similar tasks, of course) it would be really cool to see the before/after.

And if you don't, and therefore cannot quantify, then for future reference simply state "it feels like the LLM makes me more efficient"


I could have worded it a bit better - I don't have the before and after since I've been using ChatGPT Pro for the past year, yet only started my new project[1] 7 months ago. What I meant is: by keeping track of every little task I'm doing each day, I'm able to perceive how much time the LLM saved me. Like I said, it's unscientific. BTW, I'll nonetheless release this list in a future blog post.

[1] https://get-plume.com/


The balance of the past was that if I spent 4 hours making a patch, the busy important unpaid maintainer would spend 15 minutes reviewing it, giving me pointers etc.

And they will thank me for it! I've just saved them 3h45m.

This ratio stopped maintainers getting buried in review requests etc.

With the rise of LLMs there's a risk the ratio will get flipped: someone can make a patch in 5 minutes which needs 15 minutes of review.

That could make being a maintainer much more time-consuming - and mean the job is much less about maintaining code, and much more about dealing with timewasters politely, which ain't many people's idea of a fun hobby.


The solution is simple → don't use automatic code suggestions/PRs. There must be a human involved who suggests a piece of code - and he/she should be fully responsible for it (if there is an error/bug in the code, he/she's at fault).

That should be the way, at least, until AI is proven to be better at writing code than humans.


Treating everything as if a contribution is done by an unknown entity (or worse, a hostile entity) is more work than generally people in a community want to do. Social communities are built on trust.

With the XZ utils backdoor, people are currently removing any contribution done by that attacker. Even if the work done is of limited quantity, evaluating every character and every byte is a huge endeavor requiring a large amount of man-hours. At some point it is often easier to discard code than to do the necessary in-depth analysis and testing to determine with 100% certainty that the code does what it should be doing with no side effects, security issues or legal implications.


and if you trust me in general to be writing decent code, the review is a quick one (and let’s be honest, sometimes de facto skipped or minimal).

If you think I’m an AI, you have to review it so closely you might as well just have written it yourself.


It matters because it can eat up the reviewers’ time. Contributors can generate some AI output in perhaps (amortized) basically no time while the reviewer has to review it for half an hour.

At scale it doesn’t work to simply judge the content on its merit when the content can be generated at no cost. The content becomes indistinguishable from spam, no matter the intent of the contributor.


AIs can make good looking pictures too, where everything looks nice... it's all good...

...until you count how many fingers the humans have.

Same is true for AI generated code.


LLMs may not be superbly good at coding right now - I compile a list of code questions they've all failed, so it's very clear they are not there yet. But they are improving at an amazing pace (and have already improved my productivity in so many ways - reminding me of the syntax of different languages, helping to debug obscure code, writing certain functions (with varying degrees of success), etc).


I do write some niche open source projects, ones which:

- Have been written with CoPilot enabled in my editor, and

- which optionally use GPT 3.5 as a translation API, and

- Which use OpenAI's text-to-speech model to generate spoken dialog files for testing.

I suppose I can try to mark my projects in such a way as to inform Gentoo that it's against their policy to package them.

Overall, I would guess that my CoPilot-assisted code is slightly worse than code I hand-craft. The biggest difference seems to be that with CoPilot I write fewer tiny functions, and I tend to keep more related code in one place. On the other hand, CoPilot makes writing test code extremely quick. And I'm not talking about generic boilerplate here: CoPilot can write non-trivial parser or type inference code that relies heavily on internal project APIs that do not exist outside my project.

Overall, I'd guess that CoPilot allows me to produce twice as much code at 90-95% of the quality. Which since we're talking about open source projects that I maintain in my spare time (and that were painfully over-engineered to begin with), is probably a decent tradeoff.


TFA is not about upstream content, as explicitly mentioned.


My takeaway was that they weren't happy about upstreams, but didn't feel like there was anything that could actually be done:

> We can't do much about upstream projects using it.

As a potential upstream, I could potentially help them by adding some kind of metadata to my package, indicating, "Some portions of this code were written with CoPilot active." And this could allow them to automatically filter out and reject packaging requests from users.

(As an open source author, I'm deeply ambivalent about distro packaging anyways. I release my software as pre-built, standalone binaries specifically to avoid the tarpit of distro packaging politics. If my software is packaged for a distro, it will almost always need to go through someone else, who may or may not do a good job, or keep the software up to date, or break the software in a way that creates more support requests for me.)


They may not like use of generative AI in general, but they are clearly only talking about a policy relating to original contributions to Gentoo specifically.

This is tangential, but as a user I vastly prefer distributions, because I can rely on stuff working in context, on observing the distro conventions, and on automatically receiving security updates. It is a much more pleasant and convenient mode of software sourcing.


> In other words, explicitly forbid people from using ChatGPT, Bard, GitHub Copilot, and so on, to create ebuilds, code, documentation, messages, bug reports and so on for use in Gentoo.

Maybe a naive question but, how will they know?


If the submitter acts as a competent gatekeeper to keep out the crap, no-one will know and neither will they care.


If they can already tell easily whether or not a submission is bad, what is the relevance of an AI ban? It will push out good contributors who use AI responsibly, and presumably have zero effect on people just wanting to abuse AI to get a patch into Gentoo. You still have to do the exact same work to tell whether a contribution is good or not.


> If they can already tell easily whether or not a submission is bad...

There's no implication in what I wrote that this is the case. The fact is that weeding out the crap is a tedious and difficult job, and maintainers (especially volunteer ones) don't deserve the extra burden brought by automated crap generation.

> It will push out good contributors who use AI responsibly...

My entire comment, which is just one sentence long, explains why this will not be the case.

>... and presumably have zero effect on people just wanting to abuse AI to get a patch into Gentoo.

That's not everybody (or if it is, Gentoo is screwed anyway, which does not mean it's not worth trying to stop it happening.)

These points have been made by multiple people in these comments over the last few hours.


It's not about "how".

When it is discovered that your submission is AI generated, it is enough reason to discard the submission without having to review it any further.

Many open communities do the same.


"Many communities depend on AI tools to detect and ban AI content" interesting...

If not depending on AI tools, then depending on a... hunch? So, like modern-era witch hunting?


I mean, I came across a StackOverflow post the other day with "I apologize for the misunderstanding." in the middle of it.

There's a bit of an "intro paragraph, five numbered list items, concluding paragraph" format that you start to notice pretty quickly.


Certainly, but you can prompt the LLM to write following a specific style.


All very well, but the question is, how will you know? And if you can’t reliably differentiate between AI and human contributions, how could this be enforced?


I thought the same at first, but if you look at the discussion page linked at the end, there's an example of some package descriptions that were auto-generated but the result was very inaccurate -- claiming features that aren't even close to what the package offers. Not knowing the contents of the package, you might not notice. And it would definitely trip up any attempt at indexing these descriptions for search.

Then a bot responded to the discussion poster's concerns and it was humorous but also it offered no way to resolve the issue.

So there are one or two cases where a maintainer might notice something off, and this policy would offer a clear-cut way to reject whoever submitted the inaccurate descriptions or to take the AI out of the discussion forum.

But for the cases of copilot-authored code I don't think there's any reliable way to detect or reject it. This probably falls under their "but not upstream changes" caveat.


So in reality it would probably end up as a faster way to reject unhelpful or harmful uses of AI. If you manage to auto-generate correct package descriptions (maybe through human review) nobody has a reason to complain, even if you overuse the word "delve" a bit.


At the same time, if you produce text and code that reads overly like a bot, they may have just cause to dismiss your submission and maybe even ban you, if we're being so teleological about it.

I don't personally agree this particular line in the sand will help in all cases -- it is a difficult standard to apply, determining whether something is AI-created, and this will likely increase the burden on the humans in the loop. But as policies go, it makes sense to have a line drawn in the sand for outright rejecting it on source, not content, especially in the context of a package manager and Linux distribution. The burden on said humans in the loop will be even greater if they don't have a rule in place granting blanket dismissal on this characteristic, especially if they're correct in seeing an increase in AI-produced packaging of unknown binaries.


Isn't that the point of code reviews, though? A human can write incorrect descriptions as well. This one is incredibly easy to catch for any maintainer of the project.

Of course, it'll get harder and harder to spot the problems, but that just brings the "bug" closer to the human-generated level.

Banning AI doesn't fix the problem, especially since the type of person that would have AI generate a description and then not even read it also isn't going to follow the rules.


The problem is that current LLMs are capable of creating junk that's prima-facie plausible in industrial quantities, and finding the sound material within it is a big and growing burden for any organization, especially those that accept contributions from the public.

> The type of person that would have AI generate a description and then not even read it also isn't going to follow the rules.

This is not a sound argument against this rule; it is an argument for the proposition that current LLMs present a threat to the open-source model, despite this rule.


> A human can write incorrect descriptions as well.

Sure they can, but these particular descriptions were completely unrelated to what the package or project at hand actually does.


The author clarifies that there is no way to detect it. He only wants it explicitly stated in the policy: "don't bring that".


Yes, this would be like the Wikipedia policy that uncited material can be deleted at any time.


Human-generated and AI-generated aren’t mutually exclusive categories anyway. For example, a person writing a document (or software package) can start out with some rough human-generated notes, pass them to an AI to flesh out, and then edit the AI’s output to improve it, fixing anything the AI got wrong (or even anything they just don’t agree with, e.g. “this code is correct but it’s not how I’d write it”).

What’s the fundamental difference between that and a human doing it without an AI’s help at all? It could be essentially the same outcome, just with more time and human effort spent.


It's very easy to tell. AI fanboys will show you the most generic and bland pictures, crippled limbs and all, and go "See? You probably can't tell whether it was AI-generated or not."

The code is usually equally braindead. Not that it matters much, of course; most code that humans write today isn't much better. But if you value quality, as many foundational open source projects do, the difference becomes obvious.


But DeepL is apparently fine (as it should be, in my opinion), so I guess it's going to be a dice roll as to what individual maintainers will and won't allow.

I can understand the sentiment behind this proposal, but the issue is way too nuanced and complicated to solve with a few basic rules.


I am confused: how is a translator comparable to an LLM that primarily serves as a generator? Just because it’s powered by some neural network behind the scenes?


It's similar in the sense that it might be wrong sometimes. It's also similar in that a coding LLM may be used to translate code between languages.


That's precisely my point. What does "AI-backed" mean exactly? The RFC doesn't ban LLMs specifically, but "AI-backed" contributions. By my reading, DeepL would technically fall under that category and be banned from use, yet in the linked thread people seem to agree that DeepL usage is fine. So where do you draw the line? How do you define which "AI-backed" contributions are still fine and which aren't? Eventually different contributors will have different opinions and it will get messy.


At this point it’s just playing word games imho.

The RFC specifically mentions tools like Bard, ChatGPT and Co-Pilot as what it considers to be ban-worthy.

If I’m going to be extremely pedantic about it: with Co-Pilot, the actual contents of the contribution are AI-generated. With a translator like DeepL, the contents are authored by the contributor; the translation tool just translates what’s already there.


Both DeepL and LLMs use transformer decoders, a type of generative model.
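
A rough illustration of the point (using Hugging Face transformers; this is obviously not DeepL's actual stack): both a translation model and a text generator end up in the same kind of autoregressive decoding loop.

    from transformers import pipeline

    # Encoder-decoder translation model vs. decoder-only generator,
    # but both go through the same generate() machinery under the hood.
    translate = pipeline("translation_en_to_de", model="t5-small")
    generate = pipeline("text-generation", model="gpt2")

    print(translate("The package builds documentation from source.")[0]["translation_text"])
    print(generate("This package provides", max_new_tokens=20)[0]["generated_text"])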


> is fine apparently

What are you referring to?

I agree that “AI” needs to be more closely defined, as for example using A* search is also AI.


DeepL, as in, the translator? That is very different from random text unrelated to the project.


DeepL is an AI-backed translator. If I use it to help me write my documentation, how is it different?


I do not see how translating your own writing and generating unrelated text could be anything BUT different. And what would "AI-backed" even mean in this case, if all the tool did was translate your OWN text?


> In other words, explicitly forbid people from using ChatGPT, Bard, GitHub Copilot, and so on, to create ebuilds, code, documentation, messages, bug reports and so on for use in Gentoo.

On the one hand, I’m on the fence about this heavy-handed approach. Tons of people, myself included, use AI assistants to create high quality work in less time. Of course I’m also aware of tons of low quality garbage.

On the other hand, I’m all for banning automated submissions, which have been on the rise for the past couple of years and are often thinly veiled (if veiled at all) ads for AI startups. GitHub in particular should allow owners to ban all unsanctioned bots and to report unlabeled bots.


The quality concerns are absolutely justified. The complaints about energy use sound like unfounded, extremely far-fetched arguments just used by people who don't like LLMs for other reasons.

The inference energy cost is likely on the same order of magnitude as the computer + screen used to read the answer (higher wattage, but much shorter time to generate the response than to formulate the request and read it).

The training energy cost is significant only if we ignore that the model is used by many people. For GPT-3, I've seen plausible estimates of ~1 GWh, which would equal about 400 tons of CO2, roughly as much as a single long-distance round-trip flight (total fuel consumption, not per passenger). Estimates for newer models usually ignore the existence and likely use of more efficient accelerators.
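
Back-of-the-envelope, if anyone wants to check the arithmetic (the grid carbon intensity of ~0.4 kg CO2 per kWh is my assumption):

    training_energy_kwh = 1_000_000      # ~1 GWh estimate for GPT-3 training
    grid_kg_co2_per_kwh = 0.4            # assumed average grid carbon intensity
    co2_tons = training_energy_kwh * grid_kg_co2_per_kwh / 1000
    print(co2_tons)                      # ~400 metric tons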


It’s not completely unfounded.

Water consumption skyrocketed while training up the current generation of models [0].

The energy (and water) usage to service requests is not staggering, but it is concerning that it uses as much as it does for the output it generates [1].

[0] https://futurism.com/critics-microsoft-water-train-ai-drough...

[1] https://www.forbes.com/sites/cindygordon/2024/03/12/chatgpt-....


The energy use of Gentoo is already way above that of any other distribution, as it requires users to compile all software themselves.


Hearing complaints about energy consumption from someone using a source-based distribution is quite rich considering they're often wasting much more energy than they need by compiling themselves instead of using a binary distribution.


It's interesting because you can see it as both a very conservative approach and a high-risk stance, which don't seem like common bedfellows.

If you acknowledge up front that AI is unfit for purpose and very likely to introduce serious security problems, then it seems wise. When it turns out the LLMs have all been compromised to insert backdoors into the compiler toolchain, you win by being the last distro left standing. You could look at it as a very high-risk strategy for the same reason, if you think you'll be "left behind". Either way, who dares wins (or dies). Dare to go against the mob, or dare to bet the farm on a principle.

AFAIK Gentoo is one of the more conservative communities. But I'd also expect to see this policy being considered in BSD circles too.


Define "use". What kind of uses? Verbatim code sourced from AI or using the technology in general?

I've used AI to learn about massive codebases. It's a bit stupid but still extremely helpful. The free ChatGPT was capable of explaining the concepts in the code and the file system structure of the project, allowing me to get started much faster. It sure as hell beats being a help vampire on some IRC channel or mailing list.

This technology is literally too good to be banned. We should be working on taking it as far as humanly possible by getting it running locally and completely uncensored.


I am pretty sure that you could use AI to re-phrase your own description. Right?


RFC: Banning contributions written on systems with proprietary software


This is from February, is there any update or progress?


I think it is a more general problem: one can't see in Git what tools have been used to create and validate the code.

It is impossible to use most modern devtools completely without AI. I think it is better to regulate the usage and enforce transparency.


I haven't used Gentoo in a while, but if I went back to it and had to manipulate ebuilds, LLMs would absolutely be involved. Even if it were enforceable, a no-LLM policy would deprive infrequent contributors of one of the most powerful coding tools we've invented. It would be as inane as banning man pages or syntax highlighting. While I understand the goal of high-quality input, the net result would just be to gatekeep the process to experienced contributors, which would be detrimental to the project in the long run. LLMs are here to stay and are way too useful to be ruled out.


In completely unrelated news (/s), Microsoft is supposedly silently (?) installing Copilot on Windows Server 2022 systems?


Not to be too snarky, but this is exactly the proposal and associated conversation that I’d expect to see around Gentoo. It is as impractical as it is unnecessarily standoffish.

There are multiple repliers that very clearly don’t understand how LLMs work. “It’s computers, so I can intuit it!” is typical techie hubris.


AI-generated content is just low-quality stuff, and reviewing it is a huge time sink. You are better off, in terms of manpower and time, banning AI contributions and doing the work yourself.


There's a vast spectrum of contributions. I use Copilot and most of the time it's just autocomplete with a bit more awareness of context.

Like - it suggests the next 10 characters or so, and it's either correct or close enough that it saves me some typing.
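
To be concrete, it's usually something like the snippet below: I type the signature and docstring, and the suggested completion is the one-liner (illustrative, not a transcript of an actual suggestion):

    def ebuild_atom(category: str, package: str, version: str) -> str:
        """Return the ${CATEGORY}/${PN}-${PV} form used to refer to a package."""
        return f"{category}/{package}-{version}"  # <- the suggested completion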

So, in the same way that people complain about CGI in movies (if the CGI is good, they probably haven't even noticed it), the only AI you notice will be the bad stuff.


I am not sure what “AI” means but: Yes, yes. It is past time we do it.

People are producing plausible-sounding bullshit because it is so easy to just crank out and iterate code quickly. And we have made it way too easy to incorporate other people's potentially copyrighted code.

As Donald Knuth would know, back when you had to generate punch cards and stand in line to load them on the mainframe, you spent a lot more time carefully designing and logically working through your code, instead of producing massive amounts of plausible bullshit. So yes, I agree.

You are talking about banning interactive editors with copy/paste, and interactive compilers and debuggers, right?


As a Gentoo cultist… no thanks!


Based



