I'm doing a reasonably sized project with Claude Code doing almost all of the programming, and it's quite challenging.
Vibe coding is easy and fast, but you end up not being an expert in the code base or really having any idea about it. And once it reaches a certain size, the LLM isn't an expert on it either. It is only an expert on cleanly isolated sections of it, which, by the way, it's really bad at producing without lots of good guidance.
Bespoke coding is easy and slow, and you end up an expert in the systems you make.
What I've found is that once the system is beyond the size an LLM can reasonably handle, it's faster to be an expert than to try to get the LLM to do things; in some cases infinitely faster (the LLM can't do it).
Maybe with vibe coding you get a system of that size done in a week instead of three months, and this applies to subsystems as well if they're isolated. But for now you really still want someone to end up an expert in all the code that was produced.
So for now I think there's a big skill point that is still only achieved by humans -- guiding the LLM to produce systems that both the human and the LLM will be good at long term. That is, NOT vibe coding from the beginning, but doing something a bit slower than vibe coding and much faster than bespoke coding, with an eye toward clean, testable, isolated systems - something the LLM is not naturally any good at making but can be done with good guidance.
I spent the last two months "vibe coding". I really think VC as it is defined (keep smashing the accept button and let the LLM eventually "get there") is terrible. My flow has been to use Claude Code as an incredibly amazing code generator where:
1. I know exactly what I want architecturally
2. I know how I want it implemented
In this mode the flow is about frequently validating the code to make sure it matches my "image", rather than dreading not knowing what Rube Goldberg machine it generated after tens or thousands of lines.
Sometimes I even let it get "close" (i.e., it took care of all the nitty-gritty), then I take over and finish the interesting bits, and tell CC what I did and why so it can update the project memory. Frequent checkpointing and sprinkling .md files with the latest understanding is very important (it also has the advantage of making your code LLM-portable).
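To give a sense of the shape (contents entirely illustrative, not a prescription - the function names are made up), one of those sprinkled .md files might look like:

```markdown
# NOTES.auth.md - context for the auth module

## Current state
- Token refresh lives in `refreshSession()`; do NOT add a second refresh path.
- Sessions are stored server-side; the cookie only carries an opaque ID.

## Decisions (and why)
- Failed refreshes retry once, then force re-login; retrying more masked a race.

## Rough edges
- `legacyLogin()` still exists for the old mobile client, slated for removal.
```

The point is that any LLM (not just CC) can slurp one small file and get the why, not just the what.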
I think the biggest irony is that PMs, VPs, and CEOs have traditionally been pretty derisive of "clean code", and yet clean code is absolutely essential to make vibe coding work. Feels like a huge vindication.
And you have to be vigilant, too. You can spend a day or two in full vibe-code mode if you really want to ship a bunch of features fast, and they'll all work and it will feel amazing, all while it's secretly shitting all over your codebase, and you won't know it until it's too late. And not in an "oh, now you just have to fix it up" way - if you go too long, it may be just about as difficult to fix as it would have been to write.
In my experience there is a big difference between front-end and back-end stuff (i.e., formulaic/templated code and code that needs to solve an actual problem). LLMs breeze through the former, typically get stuck in the latter because of context and breadth limitations.
That doesn't mean you can't churn out all the surface accoutrements of an app (and some of the dull back-end stuff like user databases, etc.) faster - but it is very far from replacing someone with system design skills in the long run.
> Bespoke coding is easy and slow, and you end up an expert in the systems you make.
What good is that when the code gets written by a fresher from Accenture, or by someone who ends up leaving the company? (Average job length for a software dev is just over two years.)
I have had a lot of success building small isolated features / projects, but have found Claude to be frustratingly inadequate for any non-trivial work. Ran it on an old and complex C++ codebase, and spent an hour unsuccessfully trying to coax it into fixing a bug. I even helped it by writing a failing test for it by hand. The tools need to improve a ton before software developers can forget how to code.
The LLM debates have been raging for a while now, and I think this is one of the most insightful comments I've read on the topic. What you describe can also be applied to writing with LLMs. In short, your idea of a third way - bespoke coding (or writing), but using LLMs to accelerate it - is the only viable path forward. This is roughly what I have been doing with my own use of LLMs. The problem is that you have to be really careful not to let yourself take mental shortcuts: LLMs can sometimes produce deceivingly dazzling results, and only later do you realize that some of what's in front of you is BS. Still, they produce something useful that you then have to clean up, which leads to net productivity gains.
I've seen a different kind of usage pattern recently, which is to find a problem that the LLM is good at, something which is very local in reasoning, and do it at big scale across the whole codebase.
I've used it to add test coverage. Granted, it wasn't for major new features, but for small features that still needed tests.
IME so far that's what it's best at. As long as there are tests in the codebase that provide enough of a basic skeleton (i.e., if they already test part of a controller, or code that interacts with a DB) to guess what's needed, it can do a decent job. It's still not perfect, though, and when it comes to inventing bespoke tests that are uniquely different from the others, it needs much more of a helping hand.
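To make the "skeleton" point concrete: if the codebase already contains a test like the first block below, the LLM can usually pattern-match its way to the second. (Hypothetical vitest-style example; the controller and helper are made up.)

```typescript
import { describe, it, expect, beforeEach } from "vitest";
import { UserController } from "../src/controllers/user"; // hypothetical controller
import { createTestDb } from "./helpers/db";              // hypothetical test helper

// Existing test: this is the "skeleton" the LLM learns the conventions from.
describe("UserController.getById", () => {
  let controller: UserController;

  beforeEach(async () => {
    controller = new UserController(await createTestDb());
  });

  it("returns the user when it exists", async () => {
    const user = await controller.getById("u1");
    expect(user?.email).toBe("u1@example.com");
  });
});

// What the LLM generates: same setup, same style, new method under test.
describe("UserController.deleteById", () => {
  let controller: UserController;

  beforeEach(async () => {
    controller = new UserController(await createTestDb());
  });

  it("removes the user", async () => {
    expect(await controller.deleteById("u1")).toBe(true);
    expect(await controller.getById("u1")).toBeNull();
  });
});
```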
A big Bay Area data company recently refactored their old JS codebase from var to let and const. It was done at a scale that is way beyond "senior engineer double-checks LLM to see if code is good."
LLMs are terrible at that in my experience. In what world is refactoring "very local in reasoning"?
Switching libraries/frameworks or switching piecemeal to a new language for a codebase that's already well structured seems like it would be noticeably less costly though.
In a situation where the transformations are explicit and almost mechanical, with little to no reasoning involved. With those requirements, the LLM just has to recognize the patterns and "translate" them to the new pattern. Which is almost exactly what they were designed to do.
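For example (TypeScript here, but the var-to-let/const migration mentioned above is exactly this shape):

```typescript
// Before (what the LLM sees):
//   var retries = 0;
//   var maxRetries = 3;
//   for (var i = 0; i < maxRetries; i++) { retries++; }

// After (what it emits). The rewrite is purely local: a binding that is never
// reassigned becomes `const`, and one that is reassigned becomes `let`.
let retries = 0;
const maxRetries = 3;
for (let i = 0; i < maxRetries; i++) {
  retries++;
}
```

No understanding of what the loop is for is required; scanning each binding's scope for reassignment is enough.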
See, using AI as the equivalent of super-IDE snippets, or to generate things in isolation, is probably really good! It's also categorically not the same thing as what all of the AI hypemen (including the OP) are describing, with it replacing wide swathes of software developers. It devolves into a motte-and-bailey argument: it is actually possible for AI to be a useful tool in a programmer's toolbox and make people more productive in isolated ways, without also agreeing with frankly anything the OP thread is saying.
I wrote a comment in a similar thread a few weeks ago describing my LLM-coding experience - here's a copy+paste (so any quote replies will be out of context / not actually replying to your comment):
I'll preface this comment with: I am a recent startup owner (so the only dev, which is important) and my entire codebase has been generated via Sonnet (mostly 3.7, now using 4.0). If you actually looked at the work I'm (personally) producing, I guess I'm more of a product owner/project manager, as I'm really just overseeing the development.
> I have yet to see an LLM-generated app not collapse under its own weight after enough iterations/prompts.
There's a few crucial steps to make an LLM-generated app maintainable (by the LLM):
- _have a very, very strong SWE background_; ideally as a "strong" Lead Dev, _this is critical_
- your entire workflow NEEDS to be centered around LLM-development (or even model-specific):
- use MCPs wherever possible and make sure they're specifically configured for your project
- don't write "human" documentation; use rule + reusable prompt files
- you MUST do this in a *very* granular but specialized way; keep rules/prompts very small (like you would when creating tickets)
- make sure rules are conditionally applied (using globs; see the example below); do not auto-include anything except your "system rules"
- use the LLM to generate said prompts and rules; this forces consistency across prompts, very important
- follow a typical agile workflow (creating epics, tickets, backlogs etc)
- TESTS TESTS AND MORE TESTS; add automated tools (like linters) EVERYWHERE you can
- keep your code VERY modular so the LLM can keep a focused context (see the sketch right after this list); rules should provide all key context (like the broader architecture); the goal is for your LLM to only need to read or interact with files within the strict scope of the current task
- iterating on code is almost always more difficult than writing it from scratch: provided your code is well architected, no single rewrite should be larger than a regular ticket (if the ticket is too large then it needs to be split up)
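As a sketch of what "VERY modular" means in practice (all names hypothetical), each module exposes a tiny surface so the LLM never needs anything else in context:

```typescript
// src/billing/invoice.ts - deliberately small public surface.
// Everything about invoicing lives here; the rest of the app (and the LLM,
// when working elsewhere) only ever sees the exported names.

export interface Invoice {
  id: string;
  lineItems: { description: string; cents: number }[];
}

// Pure function: no DB, no network, trivially testable in isolation.
export function invoiceTotalCents(invoice: Invoice): number {
  return invoice.lineItems.reduce((sum, item) => sum + item.cents, 0);
}

// Unexported helper: it can never leak into another module's context.
function formatCents(cents: number): string {
  return `$${(cents / 100).toFixed(2)}`;
}

export function renderInvoice(invoice: Invoice): string {
  return `Invoice ${invoice.id}: ${formatCents(invoiceTotalCents(invoice))}`;
}
```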
This is off the top of my head so it's pretty broad/messy but I can expand on my points.
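To expand on the glob-conditioned rules, for instance: here's roughly the shape of one in my setup (this is the Cursor-style `.mdc` layout - other tools differ - and the paths/content are purely illustrative):

```markdown
---
description: Conventions for API route handlers
globs: src/api/**/*.ts
alwaysApply: false
---

- Every handler validates input with the shared `zod` schemas before any logic.
- Handlers return typed results; never throw for expected failure cases.
- New endpoints get a test next to the existing ones in `tests/api/`.
```

Because of the `globs` line, the rule only enters the context when the LLM touches files it actually applies to.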
LLM-coding requires a complete overhaul of your workflow so that it is tailored specifically to an LLM, not a human - but this is also a massive learning curve (that takes a lot of time to figure out and optimize). Would I bother doing this if I were still working on a team? Probably not; I don't think it would've saved me much time in a "regular" codebase. As a single developer at a startup? This is the only way I've been able to get "other startup-y" work done while also progressing the codebase - the value is being able to do multiple things at once: let the LLM work, intermittently review its output, and get on with other things in the meantime.
The biggest tip I can give: LLMs struggle at "coding like a human" and are much better at "bad-practice" workflows (e.g. throwing away large parts of code in favour of a total rewrite) - let the LLM lead the development process, with the rules/prompts as guardrails, and try to stay out of its way while it works (instead of saying "hey, X thing didn't work, go fix that now") - hold its hand, but let it experiment before jumping in.