> AI has helped me refactor things where I normally couldn’t.
Reading "couldn't" as, you would technically not be able to do it because of the complexity or intricacy of the problem, how did you guarantee that the change offered by the AI made proper sense and didn't leave out critical patterns that were too complex for you to detect ?
Your comment makes it sound like you're now dependent on AI to refactor again if dire consequences are detected way down the line (in a few months, for instance), while the problem space is already just not graspable by a mere human. Which sounds really bad if that's the case.
Before I started using advanced IDEs that could navigate project structures very quickly, it was normal to have relatively poor visibility -- call it "fog of war/code". In a 500,000-line C++ project (I have seen a few in my career), as a junior dev I might only understand a few thousand lines from the few files I had studied, and I had very little idea of the overall architecture. I see LLMs here as a big opportunity. I assume that most huge software projects developed by non-tech companies look pretty similar: organic, poorly documented, and poorly tested.
I have a question: Many people have spoken about their experience of using LLMs to summarise long, complex PDFs. I am so ignorant on this matter. What is so different about reading a long PDF vs reading a large source base? Or can a modern LLM handle, say, 100 pages, but 10,000 pages is way too much? What happens to an LLM that tries to read 10,000 pages and summarise it? Is the summary rubbish?
Get the LLM to read and summarise N pages at a time, and store the outputs. Then concatenate those outputs into one "super summary" and use _that_ as context.
There's some fidelity loss, but it works for text because there's often so much redundancy.
However, I'm not sure this technique could work on code.
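For what it's worth, a rough sketch of that map-reduce style loop in Python. `llm_summarise` here is just a stand-in for whatever model/API you actually call, not a real library function:

```python
def llm_summarise(text: str) -> str:
    """Placeholder: wire up your model call (API client, local model, etc.) here."""
    raise NotImplementedError("plug in your LLM call")

def chunk(pages: list[str], pages_per_chunk: int) -> list[str]:
    """Group the document into blocks of N pages each."""
    return [
        "\n".join(pages[i:i + pages_per_chunk])
        for i in range(0, len(pages), pages_per_chunk)
    ]

def summarise_document(pages: list[str], pages_per_chunk: int = 100) -> str:
    # Map step: summarise each block independently and store the outputs.
    partial_summaries = [llm_summarise(block) for block in chunk(pages, pages_per_chunk)]
    # Reduce step: concatenate the partial summaries into one "super summary"
    # and hand that back to the model as context.
    return llm_summarise("\n\n".join(partial_summaries))
```

Usage is just `summarise_document(pages)` with the PDF split into per-page text, keeping the partial summaries around if you want follow-up questions to drill back into a specific block.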
You raise a good point. I had a former teammate who swore by Source Insight. But as I wrote, this was "before I started using advanced IDEs that could navigate project structures very quickly" -- I was really talking about my life before those IDEs, when it was so hard to get a good grasp of a project and navigate it quickly.
Sometimes a problem is a weird combination of hairy/obscure/tedious where I simply don’t have the activation energy to get started. Like, I could do it with a gun to my head.
But if someone else were to do it for me I would gratefully review the merge request.
Reviewing a merge request should require at least the same activation energy as writing the solution yourself: to adequately evaluate a solution, you first need a reference point in mind for what the right solution should have been in the first place.
For me personally, the activation energy is higher when reviewing: it’s fun to come up with the solution that ends up being used, not so fun to come up with a solution that just serves as a reference point for evaluation and then gets immediately thrown away. Plus, I know in advance that a lot of cycles will be wasted on trying to understand how someone else’s vision maps onto my solution, especially when that vision is muddy.
The submitter should also have thoroughly reviewed their own MR/PR. Even before LLMs, submitting code you hadn't reviewed yourself was completely discourteous and disrespectful to the reviewer. It's an embarrassing faux pas that makes the submitter and the team all look and feel bad when there are obvious problems that need to be called out and fixed.
Submitting LLM barf for review and not reviewing it should be grounds for termination. The only way I can envision LLM barf being sustainable, or plausible, is if you removed code review altogether.
Writing/reading code and reviewing code are distinct and separate activities. It's completely common to contribute code which is not production ready.
If you need an example, it's easy to add a debugging/logging statement like `console.log`, but if the coder committed and submitted the log statement, then they clearly didn't review the code at all, and there are probably much bigger code issues at stake. This is a problem even without LLMs.
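Tangentially, that particular failure mode is easy to catch mechanically before a human reviewer ever sees it. A minimal sketch, assuming git and a JS/TS codebase (the script and patterns are illustrative, not a real tool), that scans the staged diff for leftover debug statements:

```python
import re
import subprocess
import sys

# Patterns that usually indicate a forgotten debugging statement; purely illustrative.
DEBUG_PATTERNS = [r"console\.log\(", r"\bdebugger\b"]

def staged_added_lines() -> list[str]:
    # `git diff --cached -U0` shows only the staged changes, with no context lines.
    diff = subprocess.run(
        ["git", "diff", "--cached", "-U0"],
        capture_output=True, text=True, check=True,
    ).stdout
    # Keep only added lines, skipping the `+++ b/...` file headers.
    return [
        line[1:]
        for line in diff.splitlines()
        if line.startswith("+") and not line.startswith("+++")
    ]

def main() -> int:
    offenders = [
        line for line in staged_added_lines()
        if any(re.search(pattern, line) for pattern in DEBUG_PATTERNS)
    ]
    for line in offenders:
        print(f"leftover debug statement: {line.strip()}")
    return 1 if offenders else 0

if __name__ == "__main__":
    sys.exit(main())
```

Run it before opening the MR/PR, or wire it into a pre-commit hook, and the most embarrassing class of "clearly never re-read this" mistakes goes away.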
Just call it “committing bad code”. LLM autocomplete aside, I don’t see how reviewing your own code can happen without either a split personality, or putting in enough time that you’ve completely forgotten what exactly you were doing and have fresh eyes and a fresh mind.
If person A committed code that looks bad to person B, it just means person A commits bad code by person B’s standards, not that person A “does not review their own code”.
Maybe it’s a subjective difference, same as you could call someone “rude” or you could say the same person “didn’t think before saying”.
Person A can commit atrocious code all day, that's fine, but they still need to proofread their MR/PR and fix the outstanding issues. The only way to see outstanding issues is by reviewing the MR/PR. Good writers proofread their documents.
My preferred workflow requires me to go through every changed chunk and stage each one individually. It’s very easy with vim-fugitive. Keeping commits focused requires reading every chunk, which I guess is an implicit review of sorts.
I think, if it’s similar to how I feel about it, that it’s more about always being able to do it, but not wanting to expend the mental effort to correctly adjust all those 30 places. Your boss is not going to care, so while it’s a bit better going forward, justifying the time to do it manually doesn’t make sense even to yourself.
If you can do it using an LLM in a few hours however, suddenly making your life, and the lives of everyone that comes after you, easier becomes a pretty simple decision.
AI is a sharp tool: use it well and it cuts; use it poorly and it'll cut you.
Helping you overcome the activation barrier to do that refactor is great, if that truly is what it is. That is probably still worth billions in the aggregate, given that Git is considered billion-dollar software.
But slop piled on top of slop piled on top of slop is only going to compound all the bad things we already knew about bad software. I have always enjoyed the anecdote that in China, Tencent had over 6k mediocre engineers servicing QQ, then hired fewer than 30 great ones to build the core of WeChat...
AI isn't exactly free, and software maintenance doesn't scale linearly.
> But slop piled on top of slop piled on top of slop is only going to compound all the bad things we already knew about bad software
While that is true, AI isn’t going to make the big difference here. Whether the slop is written by AI or by 6,000 mediocre engineers makes no difference to the end result. One might argue that if it were written by AI, at least those engineers could do something useful with their lives.
There's a difference between not intellectually understanding something and not being able to refactor something because, if you start pulling on a thread, you are not sure what will unravel!
And often there just isn't time allocated in a budget to begin an unlimited game of bug-testing whack-a-mole!
To makeitdouble's point, how is this any different with an LLM-provided solution? What confidence do you have that this isn't also the start of an unlimited game of bug-testing whack-a-mole?
My confidence in LLMs is not that high and I use Claude a lot. The limitations are very apparent very quickly. They're great for simple refactors and doing some busy work, but if you're refactoring something you're too afraid to do by hand then I fear you've simply deferred responsibility to the LLM - assuming it will understand the code better than you do, which seems foolhardy.
As the OP, for the case I was thinking about, it’s “couldn’t” as in “I don’t have the time to go checking file by file, and the variation is not regular enough that grepping will reliably surface the cases”.
I’m very much able to understand the result and test for consequences; I wouldn’t think of putting code I don’t understand into production.
> Your comment makes it sound like you're now dependent on AI to refactor again
Not necessarily. It may have refactored the codebase in a way that is more organized and easier to follow.