This. I've had claude (sonnet 4) delete an entire file by running `rm filename.rs` when I asked it to remove a single function from that file, which had many other functions. I'm sure there's a reasonable probability that it will do much worse.
Sandbox your LLMs, and don't give them tools that you're not ok with them misusing badly. With Claude Code - or anything capable of editing files without asking for permission first - that means running it in an environment where anything you care about that it can edit is backed up somewhere else (e.g. in a remote git repository).
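One cheap way to get that kind of isolation is a throwaway container that can only see the project directory. This is only a sketch - the image, mount path, and API-key handling are assumptions on my part, though `@anthropic-ai/claude-code` is the real npm package:

```sh
# Disposable sandbox: only the current project is mounted, so the worst the
# agent can do is trash /work (which should be backed up / pushed elsewhere anyway).
docker run --rm -it \
  -v "$PWD":/work \
  -w /work \
  -e ANTHROPIC_API_KEY \
  node:20 bash -c '
    npm install -g @anthropic-ai/claude-code  # install the CLI inside the container
    claude                                    # host dotfiles, ~/.ssh, other repos are invisible
  '
```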
I've also had claude (sonnet 4) search my filesystem for projects it could use to test a devtool I'd asked it to develop, and then try to modify those unrelated projects to turn them into tests... in place...
These tools are the equivalent of sharp knives with strange designs. You need to be careful with them.
Just to confirm that this is not a rare event: I had the same thing happen last week (Claude nuked a whole file after I asked it to remove a single test).
Always make sure you are in full control. Removing a file is usually not that impactful with git, etc., but even Anthropic has warned that misalignment can cause far worse damage.
The LLM can just as well nuke the `.git` directory as it can any other file in the project. Probably best to run it as a separate user with permissions to edit only the files you want it to edit.
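A minimal sketch of that setup on Linux (the account name and project path are placeholders, parent directories still need to be traversable by that user, and it assumes `claude` is on that user's PATH):

```sh
# One-off: create a dedicated low-privilege account for the agent.
sudo useradd --create-home claude-agent

# Grant it write access to the one project it is allowed to touch, and nothing else.
sudo setfacl -R -m u:claude-agent:rwX ~/projects/myproject
sudo setfacl -d -R -m u:claude-agent:rwX ~/projects/myproject   # default ACL for files it creates

# Run the agent as that user: your ~/.ssh, dotfiles, and other repos stay out of reach.
sudo -u claude-agent -i claude
```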
To confirm your confirmation: over a month ago I was debugging an issue with Claude Code itself, and it launched another copy of itself in yolo mode, which just started tearing things up like a power tool at a belt sander race. These coding agents should really only be used in a separate user account.
Also, make sure it auto-pushes somewhere else. I use aider a lot, and I have a regular task that backs everything up at regular intervals, just to make sure the LLM doesn't decide to `rm -rf .git` :-)
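A minimal version of that idea as a cron job (the paths, schedule, and backup location are placeholders; it could just as well be a push to a second remote the agent never touches):

```sh
# crontab -e: every 10 minutes, write a complete bundle of all refs to a
# directory outside the working tree, so even an `rm -rf .git` costs at most
# ten minutes of history. (% must be escaped inside crontab entries.)
*/10 * * * * cd ~/projects/myproject && git bundle create ~/backups/myproject-$(date +\%Y\%m\%d-\%H\%M).bundle --all
```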
I've had similar behavior through GitHub Copilot. It somehow messed up the diff format to make changes, left a mangled file, said "I'll simply delete the file and recreate it from memory", and then didn't have enough of the original file in context anymore to recreate it. At least Copilot has an easy undo for one step of file changes, although I try to git commit before letting it touch anything.
I think what vibe coding does, in some ways, is interfere with the make-a-feature/test/change, then commit loop. I started doing one thing, then committing it (in VS Code or the terminal, not Claude Code), then going to the next thing. If Claude decides to go crazy, I just reset to HEAD and whatever Claude did is undone. Of course, there are more complex environments than this that would not be resilient. But then I guess using new technology comes with the assumption that it will have some bugs in it.
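Concretely, that loop is just the following (the commit message is a placeholder):

```sh
# After each small change the agent makes and you've reviewed it:
git add -A && git commit -m "add pagination to user list"

# If the next thing it does goes off the rails, drop everything since that
# commit, including any new files it created:
git reset --hard HEAD
git clean -fd
```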
Forget sandboxing. I'd say review every command it puts out and avoid auto-accept. Right now, given inference speeds, running 2 or 3 Claude sessions in parallel while still manually accepting is giving me a 10x productivity boost without risking disastrous writes. I know I feel like a caveman for not having the agent own the end-to-end code-to-prod push, but the value for me has been in tightening the inner loop. The rest is not a big deal.
Same thing happened to me. Was writing database migrations, asked it to try a different approach - and it went lol let's delete the whole database instead. Even worse, it didn't prompt me first like it had been doing, and I 100% didn't have auto-accept turned on.
You can create hooks for Claude Code to prevent a lot of this behavior. Especially if you always work with the same tooling, you can write hooks that block most bad behaviour and execute certain things yourself while Claude continues afterwards.
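For example, here's a sketch of a PreToolUse hook that rejects commands you never want run. The JSON field names, the exit-code convention, and the deny list are assumptions from my reading of the hooks docs (and it assumes `jq` is installed), so check them against the current documentation; the script gets registered under `hooks` in `.claude/settings.json`:

```sh
#!/usr/bin/env bash
# guard-bash.sh - intended to run as a PreToolUse hook for the Bash tool.
# Assumption: the proposed command arrives as JSON on stdin under
# .tool_input.command, and exit code 2 tells Claude Code to refuse the call.
cmd=$(jq -r '.tool_input.command // empty')

case "$cmd" in
  *"rm -rf"*|*"git push --force"*|*"DROP TABLE"*)
    echo "Blocked by hook: refusing to run: $cmd" >&2
    exit 2   # blocking exit code; stderr is shown back to the model
    ;;
esac

exit 0   # anything else goes through as normal
```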
> Why does the author feel confident that Claude won't do this?
I have a guess:
| (I have almost zero knowledge of how the Windows CLI tool actually works. What follows below was analyzed and written with the help of AI. If you are an expert reading this, would love to know if this is accurate)
I'm not sure why this doesn't make people distrust these systems.
Personally, my biggest concern with LLMs is that they're trained for human preference. The result is that you train a machine to make its errors as invisible as possible. Good tools need to make errors loud, not quiet. The less trust you have in them, the more important this is. But I guess they really are like junior devs: junior devs will make mistakes and then try to hide them and let no one know.
This is a spot-on observation. All LLMs have that "fake it till you make it" attitude together with "failure is not an option" - exactly like junior devs on their first job.
Or like those insufferable grindset IndieHackers hustling their way through their 34th project this month. It’s like these things are trained on LinkedIn posts.
Just today I was doing some vibe-coding-ish experiments where I had a todo list and was getting the AI tools to work through the list. Claude decided to do an item that was already checked off, which was something like “write database queries for the app”. It first deleted all of the files in the db source directory and wrote new stuff. I stopped it and asked why it was doing an already completed task, and it responded with something like “oh sorry, I thought I was supposed to do that task; I saw the directory already had files, so I deleted them”.
Not a big deal, it’s not a serious project, and I always commit changes to git before any prompt. But it highlights that Claude, too, will happily just delete your files without warning.
Why would you ask one of these tools why they did something? There's no capacity for metacognition there. All they'll do is roleplay how a human might answer that question. They'll never give you any feedback with predictive power.
They have no metacognition abilities, but they do have the ability to read the context window - at least with how most of these tools work, where the same context is fed to the follow-up request as to the original.
There's two subreasons why that might make asking them valuable. One is that with some frontends you can't actually get the raw context window so the LLM is actually more capable of seeing what happened than you are. The other is that these context windows are often giant and making the LLM read it for you and guess at what happened is a lot faster than reading it yourself to guess what happened.
Meanwhile understanding what happens goes towards understanding how to make use of these tools better. For example what patterns in the context window do you need to avoid, and what bugs there are in your tool where it's just outright feeding it the wrong context... e.g. does it know whether or not a command failed (I've seen it not know this for terminal commands)? Does it have the full output from a command it ran (I've seen this be truncated to the point of making the output useless)? Did the editor just entirely omit the contents of a file you told it to send to the AI (A real bug I've hit...)?
> One is that with some frontends you can't actually get the raw context window so the LLM is actually more capable of seeing what happened than you are. The other is that these context windows are often giant and making the LLM read it for you and guess at what happened is a lot faster than reading it yourself to guess what happened.
I feel like this is some bizarro-world variant of the halting problem. Like... it seems bonkers to me that having the AI re-read the context window would produce a meaningful answer about what went wrong... because it itself is the thing that produced the bad result given all of that context.
It seems like a totally different task to me, which should have totally different failure conditions. Not being able to work out the right thing to do doesn't mean it shouldn't be able to guess why it did what it did. It's also notable here that these are probabilistic approximators: just because it did the wrong thing (with some probability) doesn't mean it's not also capable of doing the right thing (with some probability)... but that's not even necessary here...
You also see behaviour, when using them, where they understand that previous "AI turns" weren't perfect, so they aren't entirely over-indexing on "I did the right thing for sure". Here's an actual snippet of a transcript where, without my intervention, claude realized it had done the wrong thing and attempted to undo it:
> Let me also remove the unused function to clean up the warning:
> * Search files for regex `run_query_with_visibility_and_fields`
> * Delete `<redacted>/src/main.rs`
> Oops! I made a mistake. Let me restore the file:
It more or less succeeded, too. `jj undo` is objectively the wrong command to run here, but it was running with a prompt asking it to commit after every terminal command, which meant it had just committed prior to this, so the undo worked basically as intended.
> They have no metacognition abilities, but they do have the ability to read the context window.
Sure, but so can you-- you're going to have more insight into why they did it than they do-- because you've actually driven an LLM and have experience from doing so.
It's gonna look at the context window and make something up. The result will sound plausible but have no relation to what it actually did.
A fun example is to just make up the context window yourself, then ask the AI why it did the things above, and watch it gaslight you. "I was testing to see if you were paying attention", "I forgot that a foobaz is not a bazfoo.", etc.
I've found it to be almost universally the case that the LLM isn't better than me, just faster. That applies here, it does a worse job than I would if I did it, but it's a useful tool because it enables me to make queries that would cost too much of my time to do myself.
If the query returns something interesting, or just unexpected, that's at least a signal that I might want to invest my own time into it.
I ask it why when it acts stupid, and then ask it to summarize what just happened and how to avoid it into claude.md.
With varied success: sometimes it works, sometimes it doesn't. But the more of these Claude.md patches I let it write, the more unpredictable it becomes after a while.
Sometimes we can clearly identify the misunderstanding. Usually it just mixes prior prompts into something different it can act on.
So after a while I ask it to summarize its changes in the file. And this is where it usually starts making the same mistakes again.
It's magical thinking all the way down: convinced they have the one true prompt to unlock LLMs' true potential, finding comfort in finding the right model for the right job, assuming the most benevolent of intentions from the companies backing LLMs, etc.
I can't say I necessarily blame this behavior though. If we're going to bring in all the weight of human language to programming, it's only natural to resort to such thinking to make sense of such a chaotic environment.
Claude will do this. I've seen it create "migration scripts" to make wholesale file changes -- botch them -- and have no recourse. It's obviously _not great_ when this happens. You can mitigate this by running these agents in sandboxed environments and/or frequently checkpointing your code, ideally in an SCM like git.
I haven't used Claude Code, but Claude 4 Opus has happily suggested deleting entire databases. I haven't yet given it permission to run commands without me pressing the button.
Why does the author feel confident that Claude won't do this?