Never mind designing _systems_ that account for this, even just debugging such er...

fragmede · 2025-03-19T22:21:10 1742422870

For that case, it sounds more like having your tools commit for you after each change, as is the default for Aider, is the real winner. "git log -p" would have exposed that crazy import in minutes instead of hours.

commit early, commit often.
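
To illustrate why one commit per change makes a stray edit easy to spot, here is a minimal Python sketch that mimics what `git log -p` shows: the lines added at each step. It does not call git. The snapshot labels, the file contents, and the `shadow_utils` import are all hypothetical and stand in for whatever the tool actually changed.

```python
import difflib
from typing import List, Tuple


def added_lines_per_step(
    snapshots: List[Tuple[str, str]]
) -> List[Tuple[str, List[str]]]:
    """For each snapshot after the first, list the lines it added."""
    steps = []
    for (_, before), (label, after) in zip(snapshots, snapshots[1:]):
        diff = difflib.ndiff(before.splitlines(), after.splitlines())
        # ndiff prefixes added lines with "+ "
        added = [line[2:] for line in diff if line.startswith("+ ")]
        steps.append((label, added))
    return steps


# Hypothetical file contents after each tool edit (one "commit" per edit).
snapshots = [
    ("initial", "import os\nimport logging\n\ndef main():\n    pass\n"),
    ("edit 1", "import os\nimport logging\n\ndef main():\n    logging.info('start')\n"),
    ("edit 2", "import os\nimport logging\nimport shadow_utils\n\ndef main():\n    logging.info('start')\n"),
    ("edit 3", "import os\nimport logging\nimport shadow_utils\n\ndef main():\n    logging.info('start')\n    return 0\n"),
]

steps = added_lines_per_step(snapshots)
for label, added in steps:
    print(label, added)

# Which step introduced the unexpected import?
culprits = [label for label, added in steps
            if any("shadow_utils" in line for line in added)]
```

Each step adds only a line or two, so the odd import stands out on its own instead of being buried in one large uncommitted diff.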

danenania · 2025-03-20T00:08:52 1742429332

I’m working on an AI coding agent[1], and by default all changes accumulate in a sandbox that is isolated from the project.

Auto-commit is also enabled (by default) when you do apply the changes to your project, but I think keeping them separated until you review is better for higher stakes work and goes a long way to protect you from stray edits getting left behind.

1 - https://github.com/plandex-ai/plandex

ezyang · 2025-03-20T00:38:36 1742431116

One problem with keeping the changes separate is the LLM usually wants to test the code with the incremental new changes. So you need a working tree that has all the new changes. But then... why not use the real one?

danenania · 2025-03-20T01:09:32 1742432972

Plandex can tentatively apply the changes in order to execute commands (tests, builds, or whatever), then commit if they succeed or roll back if they fail.
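
As a rough illustration of the "apply tentatively, then commit or roll back" idea, here is a small Python sketch. It is not how Plandex implements it (Plandex is built on git, per the thread); this version uses a plain directory copy as the rollback point. The function name `apply_tentatively`, the `check` callback, and the file names are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable, Dict


def apply_tentatively(project: Path, changes: Dict[str, str],
                      check: Callable[[Path], bool]) -> bool:
    """Write changes into project, run check, keep them only if it passes."""
    backup = Path(tempfile.mkdtemp()) / "backup"
    shutil.copytree(project, backup)  # snapshot to roll back to
    try:
        for rel_path, content in changes.items():
            target = project / rel_path
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_text(content)
        ok = check(project)  # in a real tool: run tests or a build
    except Exception:
        ok = False
    if not ok:
        # Roll back: discard the working copy and restore the snapshot.
        shutil.rmtree(project)
        shutil.copytree(backup, project)
    shutil.rmtree(backup.parent)
    return ok


# Hypothetical project with a single file.
project = Path(tempfile.mkdtemp()) / "proj"
project.mkdir()
(project / "app.py").write_text("VALUE = 1\n")


def passes(root: Path) -> bool:
    # Stand-in for a test run: succeeds only for the expected value.
    return "VALUE = 2" in (root / "app.py").read_text()


kept = apply_tentatively(project, {"app.py": "VALUE = 2\n"}, passes)
rolled_back = apply_tentatively(
    project, {"app.py": "VALUE = 3\n", "stray.py": "x = 1\n"}, passes
)
```

The first call passes the check and its change stays. The second fails, so both the edit to `app.py` and the stray new file are removed.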

fragmede · 2025-03-20T00:16:24 1742429784

If you implement the sandbox as a git branch, then we're on the same page.

danenania · 2025-03-20T00:25:48 1742430348

It's built on top of git, but offers better separation imho than just a separate branch.

For one thing, you have to always remember to check out that branch before you start making changes with the LLM. It's easy to forget.

Second, even if you're on a branch, it doesn't protect you from your own changes getting interleaved with the model's changes. You can get into a situation where you can't easily roll back and instead have to pick apart your work and the model's output.

By defaulting to the sandbox, it 'just works' and you can be sure that nothing will end up in the codebase without being checked first.

fragmede · 2025-03-20T01:15:33 1742433333

If the latest change is bad, how do you go back in your sandbox? How do you go back three steps? If you make a change outside the sandbox, how do you copy it in? How do you copy them out? How do you deinterleave the changes then?

In order for this sandbox to actually be useful, you're going to end up implementing a source control mechanism. If you're going to do that, might as well just use git, even if just on the backend and commit to a branch behind the scenes that the user never sees, or by using worktree, or any other pieces of it.

Take a good long think about how this sandbox will actually work in practice. Switch to the sandbox, LLM some code, save it, handwrite some code, then switch to the sandbox again, LLM some code, switch out. Try to roll back half of the LLM change. Wish you'd committed the LLM changes while you were working on them.

By the time you've got a handle on it, remembering to switch git branches is the least of your troubles.

danenania · 2025-03-20T01:45:47 1742435147

This is all implemented and working, just to be clear, and is being used in production. Everything you mentioned in your comment is covered.

You can also create branches within the sandbox to try different approaches, again with no risk of anything being left behind in your project until it’s ready.

It does use git underneath.

Here are some more details if you’re interested: https://docs.plandex.ai/core-concepts/version-control

fragmede · 2025-03-21T09:54:58 1742550898

So instead of just learning git, which everyone uses, your users now have to learn git AND plandex commands? In addition to knowing git branch -D, I also need to know plandex delete-branch?

I'm sure it's a win for you since I'm guessing you're the writer of plandex, but you do see how that's just extra overhead instead of just learning git, yeah?

I don't know your target market, so maybe there is a PMF to be found with people who are scared of git and would rather the added overhead of yet another command to learn so they can avoid learning git while using AI.

danenania · 2025-03-21T10:15:26 1742552126

I hear you, but I don't think git alone (a single repository, at least) provides what is needed for the ideal workflow. Would you agree there are drawbacks to committing by default compared to a sandbox?

Version control in Plandex is like 4 commands. It’s objectively far simpler than using git directly, providing you the few operations you need without all the baggage. It wouldn't be a win for me to add new commands if only git was necessary, because then the user experience would be worse, but I truly think there's a lot of extra value for the developer in a sandbox layer with a very simple interface.

I should also mention that Plandex integrates with the project's git repo just like aider does, so you can turn on auto-apply for effectively the same functionality if that's what you prefer. Just check out a new branch in git, start the Plandex REPL in a project directory with `plandex`, and run `\set-config auto-apply true`. But if you want additional safety, the sandbox is there for you to use.

fragmede · 2025-03-21T23:21:00 1742599260

The problem is I'm too comfortable with git, so I don't see the drawbacks to committing by default. I'm open to hearing about the shortcomings and how I'd address them, though that may not be reasonable to expect for your users.

The problem isn't the four Plandex version control commands or how hard they are to understand in isolation, it's that users now have to adjust their mental model of the system and bolt that onto the side of their limited understanding of git because there's now a plandex branch and there's a git branch and which one was I on and oh god how do they work together?

zahlman · 2025-03-19T23:51:34 1742428294

FTA:

> Note that it took me about two hours to debug this, despite the problem being freshly introduced. (Because I hadn’t committed yet, and had established that the previous commit was fine, I could have just run git diff to see what had changed).

> In fact, I did run git diff and git diff --staged multiple times. But who would think to look at the import statements? The import statement is the last place you’d expect a bug to be introduced.

fragmede · 2025-03-20T00:11:00 1742429460

git diff != git log.

To expand on that, the problem with only having git diff is that there's no way to go backwards halfway. You can't step back through history until you find the first bad commit, the one right after the last good commit, and then do a precise diff between the two (which is what git bisect automates). Reviewing 300 lines of git diff output to find a bug is much harder than reviewing 10.
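
The bisection idea can be sketched in a few lines of Python. This is only an illustration of the binary search that git bisect performs, not git itself. The commit names and the `check` function are hypothetical, and the sketch assumes the oldest commit is good, the newest is bad, and every commit after the first bad one is also bad.

```python
from typing import Callable, List, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest bad commit, given commits ordered oldest to newest."""
    lo, hi = 0, len(commits) - 1  # lo is known good, hi is known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # bug is at mid or earlier
        else:
            lo = mid  # bug was introduced after mid
    return commits[hi]


# Hypothetical history of ten small commits; the bug lands in "c7".
history = [f"c{i}" for i in range(10)]
tested: List[str] = []


def check(commit: str) -> bool:
    """Stand-in for running the test suite against one commit."""
    tested.append(commit)
    return int(commit[1:]) >= 7


culprit = first_bad(history, check)
print(culprit, tested)
```

With ten commits, the search tests only three of them before landing on the one small diff that introduced the bug. That only works if the history contains small commits to search through.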