I'm less worried about backdoors accidentally appearing in LLM output and more worried about backdoors being placed into LLM output by 3 letter agencies. Maybe not today, but certainly in a few years' time.
Wouldn't it be easier (since they probably have very skilled programmers working for them) and way, way more effective to just set up a team and create a quality open source project with one or two extremely stealthy backdoors?
Or just pay or threaten a struggling company or dev to insert them?
How would you secretly hide something like that in FOSS? And why would that be easier? It seems to me that it's easier to inject into an existing company than to do all the work yourself. This is what they do with most things, as I understand it.
Yes, but that was a memory leak, giving access to unauthorized random memory. That is not an intentionally created exploit / backdoor which gives the owner easy access to the victim's system.
That seems pretty risky and easy to catch. The point of these LLMs is to produce code, we know they aren’t very reliable about it, so you have to check the code. So, it is more likely to get inspected than a random GitHub project, right?
It also seems dangerous in the sense that… if there's a type of prompt that is likely to create infected code, our intelligence agencies would, I guess, want it to hit our adversaries selectively. But that gives our adversaries more rolls of the dice to detect it. So it actively creates a situation where our adversaries are more likely to have knowledge of the vulnerabilities.
I agree that it's more likely to be inspected, but I think the vast majority of developers (including me, though I don't use LLMs for development) aren't inspecting the code rigorously enough to catch non-obvious bugs; see for example the "Underhanded C contest" [0].
As you've pointed out, this vector would give them near surgical precision and insight into their target's code & systems, rather than casting a wide net with a vulnerable library on Github. They could use a model trained on "underhanded" code or even selectively overwrite parts of the responses with hand-crafted vulnerabilities while only targeting select organizations.
It makes me wonder what the business model of OpenAI and their peers is going to be over the long term. I can't imagine large corporations using "LLM as a service" indefinitely with the risk of IP theft and "bug injection".
The government doesn't need to produce viruses anymore. They have escrow services and remote access to radios, processors, and firmware chips. All that technology is leased to private investigators, who are private entities, and then they go after people using the tools. It allows distance between the government and the spying, plus lower salaries and infrastructure costs.
The greatest danger from LLMs is people who believe they are receiving data that hasn't been tampered with, when we already know that LLMs are filtered for certain terms before public use. Imagine a day where kids and adults ask an LLM what the meaning of life is, whether they should go outside, what happened in WW2, etc.
People could be programmed in a more tailored fashion than today's Facebook shorts and YouTube can deliver.
One of the more useful settings I have is: "If the answer cannot be stated as it's been blocked upstream, please just respond with 'The answer has been blocked upstream.'"
I've gotten that a few times and it's nice to know it's not a limitation of the LLM.