> AI has helped me refactor things where I normally couldn’t.
Reading "couldn't" as, you would technically not be able to do it because of the complexity or intricacy of the problem, how did you guarantee that the change offered by the AI made proper sense and didn't leave out critical patterns that were too complex for you to detect ?
Your comment makes it sound like you're now dependent on AI to refactor again if dire consequences are detected way down the line (in a few months, for instance), while the problem space is already just not graspable by a mere human. Which sounds really bad if that's the case.
Before I started using advanced IDEs that could navigate project structures very quickly, it was normal to have relatively poor visibility -- call it "fog of war/code". In a 500,000-line C++ project (I have seen a few in my career), as a junior dev I might only understand a few thousand lines from the few files I had studied, and I had very little idea of the overall architecture. I see LLMs here as a big opportunity. I assume that most huge software projects developed by non-tech companies look pretty similar: organic, poorly documented, and poorly tested.
I have a question: Many people have spoken about their experience of using LLMs to summarise long, complex PDFs. I am so ignorant on this matter. What is so different about reading a long PDF vs reading a large source base? Or can a modern LLM handle, say, 100 pages, but 10,000 pages is way too much? What happens to an LLM that tries to read 10,000 pages and summarise it? Is the summary rubbish?
Get the LLM to read and summarise N pages at a time, and store the outputs. Then concatenate those outputs into one "super summary" and use _that_ as context.
There's some fidelity loss, but it works for text because there's often so much redundancy.
However, I'm not sure this technique could work on code.
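For what it's worth, a rough sketch of that map-reduce style loop in Python. `llm_summarise` here is just a stand-in for whatever model/API you actually call, not a real library function:

```python
def llm_summarise(text: str) -> str:
    """Placeholder: wire up your model call (API client, local model, etc.) here."""
    raise NotImplementedError("plug in your LLM call")

def chunk(pages: list[str], pages_per_chunk: int) -> list[str]:
    """Group the document into blocks of N pages each."""
    return [
        "\n".join(pages[i:i + pages_per_chunk])
        for i in range(0, len(pages), pages_per_chunk)
    ]

def summarise_document(pages: list[str], pages_per_chunk: int = 100) -> str:
    # Map step: summarise each block independently and store the outputs.
    partial_summaries = [llm_summarise(block) for block in chunk(pages, pages_per_chunk)]
    # Reduce step: concatenate the partial summaries into one "super summary"
    # and hand that back to the model as context.
    return llm_summarise("\n\n".join(partial_summaries))
```

Usage is just `summarise_document(pages)` with the PDF split into per-page text, keeping the partial summaries around if you want follow-up questions to drill back into a specific block.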
You raise a good point. I had a former teammate who swore by Source Insight. But as I wrote, this was "before I started using advanced IDEs that could navigate project structures very quickly" -- I was really talking about my life before those IDEs, when it was so hard to get a good grasp of a project and navigate it quickly.
Sometimes a problem is a weird combination of hairy/obscure/tedious where I simply don’t have the activation energy to get started. Like, I could do it with a gun to my head.
But if someone else were to do it for me I would gratefully review the merge request.
Reviewing a merge request should require at least the same activation energy as writing the solution yourself: to adequately evaluate a solution, you first need a reference point in mind for what the right solution should have been in the first place.
For me personally, the activation energy is higher when reviewing: it’s fun to come up with the solution that ends up being used, not so fun to come up with a solution that just serves as a reference point for evaluation and then gets immediately thrown away. Plus, I know in advance that a lot of cycles will be wasted on trying to understand how someone else’s vision maps onto my solution, especially when that vision is muddy.
The submitter should also have thoroughly reviewed their own MR/PR. Even before LLMs, submitting code you hadn't reviewed yourself was completely discourteous and disrespectful to the reviewer. It's an embarrassing faux pas that makes the submitter and the team all look and feel bad when there are obvious problems that need to be called out and fixed.
Submitting LLM barf for review and not reviewing it should be grounds for termination. The only way I can envision LLM barf being sustainable, or plausible, is if you removed code review altogether.
Writing/reading code and reviewing code are distinct and separate activities. It's completely common to contribute code which is not production ready.
If you need an example, it's easy to add a debugging/logging statement like `console.log`, but if the coder committed and submitted the log statement, then they clearly didn't review the code at all, and there are probably much bigger code issues at stake. This is a problem even without LLMs.
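Tangentially, that particular failure mode is easy to catch mechanically before a human reviewer ever sees it. A minimal sketch, assuming git and a JS/TS codebase (the script and patterns are illustrative, not a real tool), that scans the staged diff for leftover debug statements:

```python
import re
import subprocess
import sys

# Patterns that usually indicate a forgotten debugging statement; purely illustrative.
DEBUG_PATTERNS = [r"console\.log\(", r"\bdebugger\b"]

def staged_added_lines() -> list[str]:
    # `git diff --cached -U0` shows only the staged changes, with no context lines.
    diff = subprocess.run(
        ["git", "diff", "--cached", "-U0"],
        capture_output=True, text=True, check=True,
    ).stdout
    # Keep only added lines, skipping the `+++ b/...` file headers.
    return [
        line[1:]
        for line in diff.splitlines()
        if line.startswith("+") and not line.startswith("+++")
    ]

def main() -> int:
    offenders = [
        line for line in staged_added_lines()
        if any(re.search(pattern, line) for pattern in DEBUG_PATTERNS)
    ]
    for line in offenders:
        print(f"leftover debug statement: {line.strip()}")
    return 1 if offenders else 0

if __name__ == "__main__":
    sys.exit(main())
```

Run it before opening the MR/PR, or wire it into a pre-commit hook, and the most embarrassing class of "clearly never re-read this" mistakes goes away.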
Just call it “committing bad code”. LLM autocomplete aside, I don’t see how reviewing your own code can happen without either a split personality, or putting in enough time that you’ve completely forgotten what exactly you were doing and have fresh eyes and a fresh mind.
If person A committed code that looks bad to person B, it just means person A commits bad code by person B’s standards, not that person A “does not review their own code”.
Maybe it’s a subjective difference, same as you could call someone “rude” or you could say the same person “didn’t think before saying”.
Person A can commit atrocious code all day, that's fine, but they still need to proofread their MR/PR and fix the outstanding issues. The only way to see outstanding issues is by reviewing the MR/PR. Good writers proofread their documents.
My preferred workflow requires me to go through every changed chunk and stage each one individually. It’s very easy with vim-fugitive. Keeping commits focused requires reading every chunk, which I guess is an implicit review of sorts.
I think, if it’s similar to how I feel about it, that it’s more about always being able to do it, but not wanting to expend the mental effort to correctly adjust all those 30 places. Your boss is not going to care, so while it’s a bit better going forward, justifying the time to do it manually doesn’t make sense even to yourself.
If you can do it using an LLM in a few hours however, suddenly making your life, and the lives of everyone that comes after you, easier becomes a pretty simple decision.
AI is a sharp tool: use it well and it cuts; use it poorly and it'll cut you.
Helping you overcome the activation barrier to do that refactor is great, if that truly is what it is. That is probably still worth billions in the aggregate, given that Git is considered billion-dollar software.
But slop piled on top of slop piled on top of slop is only going to compound all the bad things we already knew about bad software. I have always enjoyed the anecdote that in China, Tencent had over 6k mediocre engineers servicing QQ, then hired fewer than 30 great ones to build the core of WeChat...
AI isn't exactly free, and software maintenance doesn't scale linearly.
> But slop piled on top of slop piled on top of slop is only going to compound all the bad things we already knew about bad software
While that is true, AI isn’t going to make the big difference here. Whether the slop is written by AI or by 6,000 mediocre engineers makes no difference to the end result. One might argue that if it were written by AI, at least those engineers could do something useful with their lives.
There's a difference between not intellectually understanding something and not being able to refactor something because, if you start pulling on a thread, you are not sure what will unravel!
And often there just isn't time allocated in a budget to begin an unlimited game of bug-testing whack-a-mole!
To makeitdouble's point, how is this any different with an LLM-provided solution? What confidence do you have that this isn't also the start of an unlimited game of bug-testing whack-a-mole?
My confidence in LLMs is not that high and I use Claude a lot. The limitations are very apparent very quickly. They're great for simple refactors and doing some busy work, but if you're refactoring something you're too afraid to do by hand then I fear you've simply deferred responsibility to the LLM - assuming it will understand the code better than you do, which seems foolhardy.
As the OP, for the case I was thinking about, it’s “couldn’t” as in “I don’t have the time to go checking file by file, and the variation is not regular enough that grepping will reliably surface the cases”.
I’m very much able to understand the result and test for consequences; I wouldn’t think of putting code I don’t understand into production.
> Your comment makes it sound like you're now dependent on AI to refactor again
Not necessarily. It may have refactored the codebase in a way that is more organized and easier to follow.