
This. I've had Claude (Sonnet 4) delete an entire file by running `rm filename.rs` when I asked it to remove a single function from that file, which had many functions. I'm sure there's a reasonable probability that it will do much worse.

Sandbox your LLMs, and don't give them tools that you're not OK with them misusing badly. With Claude Code, or anything else capable of editing files without asking for permission first, that means running them in an environment where you've backed up anything you care about to somewhere they can't edit (e.g. a remote git repository).

I've also had Claude (Sonnet 4) search my filesystem for projects that it could use to test a devtool I asked it to develop, and then try to modify those unrelated projects to make them into tests... in place...

These tools are the equivalent of sharp knives with strange designs. You need to be careful with them.



Just to confirm that this is not a rare event: I had the same thing last week (Claude nuked a whole file after I asked it to remove a single test).

Always make sure you are in full control. Removing a file is usually not impactful with git, etc., but Anthropic has even warned that misalignment can cause worse damage.


The LLM can just as well nuke the `.git` directory as it can any other file in the project. Probably best to run it as a separate user with permissions to edit only the files you want it to edit.


I don't always develop code with AI, but when I do, I do it on my production repository!


Maybe only give it access to files residing on a log-structured file system such as NILFS?


To confirm your confirmation, over a month ago I was debugging an issue with Claude Code itself, and it launched another copy of itself in yolo mode which just started tearing up like a powertool at a belt sander race. These coding agents should really only be used in a separate user account.


Same here. Claude definitely can get very destructive if unwatched.

And on the same note, be careful about mentioning files outside of its working scope. It could get the urge to "fix" these later.


Before cursor / claude code etc I thought git was ok, now I love git.


Also, make sure it auto-pushes somewhere else. I use aider a lot, and I have a regular task that backs everything up at regular intervals, just to make sure the LLM doesn't decide to rm -rf .git :-)

Paranoid? me? nahhhhh :-)
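
For anyone wanting the same kind of regular backup task, here is a minimal sketch. It is not the poster's actual setup: the function name `snapshot_repo` and the timestamped directory layout are made up for illustration, and it assumes the backup location is outside the repo and not writable by the agent.

```python
import shutil
import time
from pathlib import Path


def snapshot_repo(repo: Path, backup_root: Path) -> Path:
    """Copy the whole repo, including .git, into a new timestamped directory.

    Hypothetical helper: backup_root must live outside repo.
    """
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = backup_root / f"{repo.name}-{stamp}"
    # symlinks=True copies symlinks as links instead of following them.
    # copytree refuses to overwrite, so an existing snapshot is never clobbered.
    shutil.copytree(repo, dest, symlinks=True)
    return dest
```

Run it from cron or a systemd timer. Pushing to a remote is still the better option when one is available, since a local copy only helps if the agent can't reach it.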


I've had similar behavior through GitHub Copilot. It somehow messed up the diff format to make changes, left a mangled file, said "I'll simply delete the file and recreate it from memory", and then didn't have enough of the original file in context anymore to recreate it. At least Copilot has an easy undo for one step of file changes, although I try to git commit before letting it touch anything.


I think what vibe coding does in some ways is interfere with the make feature/test/change, then commit loop. I started doing one thing, then committing it (in VS Code or the terminal, not Claude Code), then going on to the next thing. If Claude decides to go crazy, I just reset to HEAD and whatever Claude did is undone. Of course there are more complex environments than this that would not be as resilient. But then I guess using new technology comes with the assumption that it will have some bugs in it.


Forget sandboxing. I'd say review every command it puts out and avoid auto-accept. Right now, given inference speeds, running 2 or 3 Claude sessions in parallel and still accepting manually gives me a 10x productivity boost without risking disastrous writes. I know I feel like a caveman not having the agent own the end-to-end code-to-prod push, but the value for me has been in tightening the inner loop. The rest is not a big deal.


Claude Code even lets you whitelist certain mundane commands, e.g. `go test`.

Yes, it could write a system call in a test that breaks you, but the odds of that when running web integration tests are very, very low.
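
To make the whitelist idea concrete, here is a rough sketch of prefix-based command approval. This is not how Claude Code implements its permissions. The names `ALLOWED_PREFIXES` and `is_auto_approved` are invented for illustration, and the list of shell metacharacters is a conservative guess rather than a complete one.

```python
import shlex

# Hypothetical allowlist: each entry is a command prefix as a tuple of tokens.
ALLOWED_PREFIXES = [("go", "test"), ("git", "status"), ("git", "diff")]

# Characters that could chain, substitute, or redirect into another command.
SHELL_META = set(";&|<>`$(){}\n")


def is_auto_approved(command: str) -> bool:
    """Return True only for a single plain command matching an allowed prefix."""
    if any(ch in SHELL_META for ch in command):
        return False  # e.g. "go test ./... && rm -rf .git"
    try:
        tokens = shlex.split(command)
    except ValueError:  # unbalanced quotes
        return False
    return any(tuple(tokens[: len(p)]) == p for p in ALLOWED_PREFIXES)
```

Matching on whole tokens rather than string prefixes means `go testx` is not approved, and rejecting metacharacters up front keeps an approved prefix from smuggling in a second command. As the comment above says, none of this stops the test code itself from doing something destructive.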


To paraphrase the meme: "ain't nobody got time for that"

Just either put it in (or ask it to use) a separate branch or create a git worktree for it.

And if you're super paranoid, there are solutions like devcontainers: https://containers.dev


Same thing happened to me. Was writing database migrations, asked it to try a different approach - and it went lol let's delete the whole database instead. Even worse, it didn't prompt me first like it had been doing, and I 100% didn't have auto-accept turned on.

If work wasn't paying for it, I wouldn't be.


You can create hooks for Claude Code to prevent a lot of this behaviour. Especially if you always work with the same tooling, you can write hooks that block most bad behaviour and run certain things yourself, with Claude continuing afterwards.
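
A hook like that could look roughly like the sketch below. The payload shape (`tool_name`, `tool_input.command`) and the use of exit code 2 to mean "block" are assumptions about the hook interface, so check the Claude Code hooks documentation before relying on them. The `BLOCKED` patterns and the `decide` function are purely illustrative.

```python
import re

# Illustrative patterns for commands the agent should not run unattended.
BLOCKED = [
    re.compile(r"\brm\b"),
    re.compile(r"\bgit\s+reset\s+--hard\b"),
    re.compile(r"\bgit\s+clean\b"),
]


def decide(payload: dict) -> int:
    """Return 0 to allow the tool call, 2 to block it (assumed convention)."""
    if payload.get("tool_name") != "Bash":
        return 0  # this sketch only inspects shell commands
    command = payload.get("tool_input", {}).get("command", "")
    return 2 if any(p.search(command) for p in BLOCKED) else 0


# As a hook script, the entry point would be something like:
#     import json, sys
#     sys.exit(decide(json.load(sys.stdin)))
```

Regex blocklists are easy to evade (a script that calls `os.remove` sails right through), so treat this as a seatbelt on top of backups and sandboxing, not a replacement for them.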


Claude tried to hard-reset a git repo for me once, without first verifying if the only changes present were the ones that it itself had added.
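
The check it skipped is not hard to sketch: compare what `git status --porcelain` reports against the set of paths the agent knows it touched, and only reset if nothing else is dirty. The function names and the `agent_paths` bookkeeping below are hypothetical, and the parser assumes the v1 porcelain format with unquoted paths.

```python
import subprocess


def git_status(repo: str) -> str:
    """Return raw `git status --porcelain` output for the given repo."""
    result = subprocess.run(
        ["git", "-C", repo, "status", "--porcelain"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout  # not stripped: a leading space is part of the format


def changed_paths(porcelain: str) -> set[str]:
    """Extract paths from porcelain v1 lines of the form 'XY path'."""
    paths = set()
    for line in porcelain.splitlines():
        if len(line) < 4:
            continue
        path = line[3:]
        # Renames are reported as "old -> new"; keep the new path.
        if " -> " in path:
            path = path.split(" -> ", 1)[1]
        paths.add(path)
    return paths


def safe_to_reset(porcelain: str, agent_paths: set[str]) -> bool:
    """True only if every dirty path is one the agent itself changed."""
    return changed_paths(porcelain) <= agent_paths
```

For example, `safe_to_reset(git_status("."), {"src/lib.rs"})` would come back false as soon as there is an uncommitted change to any other file, which is exactly the case where a hard reset destroys someone else's work.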



