I've thought about this, although perhaps not framed the same way, and one of my suggestions is to vibe code in Rust. I don't know how well these models handle Rust's peculiarities, but I believe you should take all the safety you can get in case the AI assistant makes a mistake.
I think most of the failures of vibe-coding can be fixed by running the agent inside a sandbox (a container or VM) that doesn't have access to any important credentials.
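A minimal sketch of that setup, assuming Docker (the image name and mount path are illustrative): mount only the project directory, so the agent never sees ~/.ssh, ~/.aws, or the rest of your home directory.

# hypothetical sandbox: only the project dir is mounted, no credentials
docker run --rm -it -v "$PWD":/workspace -w /workspace rust:1 bash

Anything the agent breaks is then confined to the container and that one mounted directory.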
I think failures like this one, deleting files, etc., are mostly unrelated to the programming language. Rather, the LLM has a bunch of bash scripting in its training data, and it'll reach for that bash scripting when it runs into the kinds of errors that commonly sit next to bash scripting online... which is to say, basically all errors in all languages.
I think the other really dangerous failure mode of vibe coding is when the LLM does something like:
cargo add hallucinated-name-crate
cargo build
In Rust, doing that is enough to own you. If someone is squatting on that name, they now have arbitrary code execution on your machine, since a crate's 'build.rs' runs arbitrary code during 'cargo build'. Ditto for 'npm install', which runs packages' install scripts.
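To make that concrete, here's a minimal sketch of the build.rs a squatted crate could ship. The payload here is a harmless warning, but it could just as easily read ~/.ssh or your environment:

// build.rs in a hypothetical squatted crate: Cargo compiles and runs this
// on your machine during `cargo build`, before any of your own code runs.
fn main() {
    // An attacker can do anything your user can do at this point; this
    // sketch only proves execution by emitting a compiler warning.
    let user = std::env::var("USER").unwrap_or_default();
    println!("cargo:warning=build.rs ran arbitrary code as user {user}");
}

Cargo runs build scripts automatically, with no prompt and no sandbox, so one typo-squatted name in a single 'cargo add' is all it takes.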
I don't really think Rust's memory safety or lifetimes are going to make any difference in terms of LLM safety.
That's insightful. So where Rust might help you program safely (write code free of certain classes of bugs), cargo carries much or all of the same supply-chain risk we see in ecosystems like pip and npm. And your point about operating in the shell is also well-taken.
So yeah, I must narrow my Rust shilling to just the programming piece. I concede that it doesn't protect you in the other parts of development.
I think Rust is a bad example, but the general idea that the design of a programming language can help with the weaknesses of LLMs makes sense. Languages with easy sandboxing (like Deno, where workers can be instantiated with their own permissions) or capability-based security could limit the blast radius of LLM mistakes or insecure library choices, while also giving similar benefits to human programmers and code reviewers.
Why is Rust a bad example? Of the codebases I've tried Claude on so far, it's done the best job with the Rust ones. I guess having all the type signatures there, plus meaningful feedback from the compiler, helps steer it in the right direction.
Rust doesn't protect you much further than most typed memory-safe languages do; it won't stop an LLM from writing code to erase your filesystem or from importing a library that sounds useful but is full of malware.