I decided to check this out after seeing the discussion here. I had previously misunderstood that it required a Claude.ai plan, but it actually just uses your API keys.
I did a comparison between Claude Code and Aider (my normal go-to): I asked it to clone a minor feature in my existing app with some minor modifications (specifically, a new global keyboard shortcut in a Swift app).
Claude Code spent about 60 seconds and $0.73 to search the code base and make a +51 line diff. After it finished, I was quite impressed by its results; it did exactly the correct set of changes I would have done.
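For concreteness, the change is roughly of this shape; a minimal sketch of one way to register a global keyboard shortcut via AppKit's NSEvent global monitor, where the hotkey, class name, and notification are hypothetical stand-ins, not the actual diff it produced:

    import AppKit

    // Minimal sketch of a global keyboard shortcut on macOS. The hotkey
    // (Cmd+Shift+K) and the notification it posts are hypothetical stand-ins
    // for the real feature. Global monitors only fire for events delivered to
    // other apps, and require the user to grant Accessibility permission.
    final class GlobalShortcutMonitor {
        private var monitor: Any?

        func start() {
            monitor = NSEvent.addGlobalMonitorForEvents(matching: .keyDown) { event in
                // keyCode 40 is "K" on ANSI keyboard layouts.
                if event.modifierFlags.contains([.command, .shift]) && event.keyCode == 40 {
                    NotificationCenter.default.post(name: .init("ToggleMainWindow"), object: nil)
                }
            }
        }

        func stop() {
            if let monitor { NSEvent.removeMonitor(monitor) }
            monitor = nil
        }
    }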
Now, this is a higher level of task than I would normally give to Aider (because I didn't provide any file names, and it requires changing multiple files), so I was not surprised that Aider completely missed the files it needed to modify to start (asking me to add 1 correct file and 2 incorrect files). I did a second attempt after manually adding the correct files. After doing this, it produced an equivalent diff to Claude Code. Aider did this in 1 LLM prompt, or about 15 seconds, with a total cost of $0.07, about 10% of the Claude Code cost.
Overall, it seems clear that the higher level of autonomy carries a higher cost with it. My project here was 7k SLOC; I would worry about ballooning costs on much larger projects.
Probably about 3 minutes? That's my main usage of these types of coding tools, honestly. I already know generally what I want to happen and validating that the LLM is on the right track / reached the right solution is easy.
I'm not saying that "Claude Code makes me 300% more effective", but I guess it did for this (simple) task.
Are you comparing only the generation time with your coding time? How are the figures if you include the necessary step of checking the generated code? And how do these times change if you are coding in a very complex environment?
To be clear, I explicitly picked this task because my judgement call was that it was going to be faster to use an AI than coding it myself. Checking the generated code was easy for me to do (whether I or the AI wrote it).
I don't know what you mean by "how do these times change if you are coding in a very complex environment", but to restate my original post: I'm fearful that Claude's additional autonomy will allow it to waste money (and time) pursuing useless ideas.
This shouldn't be factored in unless you never debug your own code. At any rate, I don't know about OP but usually when I get an LLM to write out some code, I also have it write out the tests for it as well.
I recently made some changes to a website generated by a non-technical user using Dreamweaver. The initial state was quite poor, with inline CSS and what appeared to be haphazard copy-pasting within a WYSIWYG editor.
Although I’m not proficient in HTML, CSS, or JavaScript, I have some understanding of what good code looks like. Through several iterations, I managed to complete the task in a single evening; doing it myself would have required a week or two of relearning and applying the necessary skills. Not only is the code better organised, it’s half the size, and the website looks better.
Time spent is not the only question. How much thought it takes, however impossible that may be to measure, is another one. If an LLM-assisted programmer is able to solve the problem without deep focus, while responding to emails and attending meetings, vs. the programmer who can't, is time really the only metric we can have here?
My scrum/agile coach says, by parallelizing prompts, a single developer can babysit multiple changes in the same time slice. By having a sequence of prompts ready beforehand, a single developer can pipeline those one after the other. With an IDE that helps schedule such work, a single developer can effectively hyper-thread their developmental workflow. If the developer is epoll'ing at 10x the hertz... that's another force multiplier. Of course context switches & side-channels are of concern, but a voice over my shoulder tells me that as long as memory safety is guaranteed, everything should turn up alrigd3adb33f.
Same here, I did a few small tasks with Claude Code after seeing this discussion here, and it's too expensive for me.
A small change to create a script file (20 LoC) was 10 cents; a quick edit to a README was 7 cents.
Yes yes engineers make more than that blah blah but the cost would quickly jump out of control for bigger tasks. I’d easily burn through $10-20 a day with this, or upwards of $100-$300 a month. Unless you have a Silicon Valley salary, that’s too expensive.
I use other tools like Cody (the tool the author created) or Copilot because I pay $10 a month and that’s it. Yes I get rate limited almost daily but I don’t need to worry that my tool cost is going out of control suddenly.
I hope Anthropic introduces a new plan that bundles Claude Code into it, I’d be much more comfortable using that knowing it won’t suddenly be more than my $50/mo (or whatever)
It's an interesting question. As a freelance consultant, theoretically a tool like this could allow me to massively scale up my income, assuming I could find enough clients.
I'm a bit nervous where I'd end up though - with code I'd "written" but wasn't familiar with, and with who knows what kinds of limitations or subtle bugs baked in.
I currently pay around $200-300 to a combination of Cursor + Anthropic through the API. I have both a full time job and freelance work. It pays for itself. I end up reviewing more than manual coding, to ensure the quality of the results. Funnily, the work I did through this method has received more praise than my usual work.
Did you outgrow the base 500 searches that Cursor gives you per month and connect your API key for usage-based pricing?
I’m having a hard time coming close to the 500 included in the monthly subscription and I use it like, a lot.
Just curious how you’re hitting that 200-300 mark unless you’re talking about paying Anthropic outside of cursor. Which I just now realized is probably the case.
> I'm a bit nervous where I'd end up though - with code I'd "written" but wasn't familiar with
This does seem like quite a big downside. It turns every new feature into “implement this in someone else’s code base”. I imagine you’d very quickly have complete dependency on the AI. Maybe that’s an inevitability in this new world?
It sounds fine as long as you can fully trust the AI to do good work, right?
I don't think there's any current AI that is fully trustworthy this way though.
I wouldn't even put them at 50% trustworthy
I think we are going to see a cliff where they become 80% good, and every tiny bit of improvement past that point will be exponentially more difficult and expensive to achieve. I don't think we reach 100% reliable AI in any of our lifetimes
I think we are going to reach a cliff where a type of old school developers keep saying, "it just can't write code like I can" while at the same time wondering why they can't land a job.
Current AI is likely already beyond 50% trustworthiness, whatever that means.
> "it just can't write code like I can" while at the same time wondering why they can't land a job
People had this same prediction about offshore development
Those old school devs are able to find well paying work fixing broken software churned out by overseas code sweatshops
I predict if you can read and understand code without the help of AI models you will be in even higher demand to fix the endless broken software built by AI assisted coders who cannot function without AI help
> Yes yes engineers make more than that blah blah but the cost would quickly jump out of control for bigger tasks.
Also (most) engineers don't hallucinate answers. Claude still does regularly. When it does it in chat mode via a flat-rate Pro plan, I can laugh it off and modify the prompt to give it the context it clearly didn't understand, but if it's costing me very real money for the LLM to over-eagerly over-engineer an incorrect implementation of the stated feature, it's a lot less funny.
Exactly! Especially agentic tools like Aider and Claude that are designed to pull more files into their context automatically, based on what the LLM thinks it should read. That can very quickly go out of control and result in huge context windows.
Right now with Copilot or other fixed subscriptions I can also laugh it off and just create a new tab with fresh context. Or if I get rate-limited because of too much token use I can wait 1 day. But if these actions are linked to directly costing money on my card, then that's becoming a lot more scary.
Bugs from engineers come from a variety of causes, and most have nothing in common with an LLM hallucinating.
For example, I can’t remember seeing a PR with an API that seems plausible but never ever existed, or an interpretation of the specs so convoluted and edgy that you couldn’t even use sarcasm as a justification for that code.
Don’t get me wrong: some LLMs are capable of producing bugs that look like human ones, but the term “hallucinate” means something else and doesn’t fit most human bugs.
> For example, I can’t remember seeing a PR with an API that seems plausible but never ever existed
A PR is code that has already been tested and refined, which is not comparable to the output of an LLM. The output of an LLM is comparable to the first, untested code that you wrote based off of your sometimes vague memory of how some API works. It's not at all uncommon to forget some details of how an API works, what calls it supports, the details of the parameters, etc.
It’s kind of uncommon to be aware that you have only a vague recall of the API and not go check the documentation or code to refresh your memory. That self knowledge that you knew something and aren’t sure of the details is indeed the thing that these tools lack. So far.
Human programmers have continuous assistance on every keystroke - autocomplete, syntax highlighting, and ultimately, also the compilation/build step itself.
For an LLM-equivalent experience, go open notepad.exe and make substantial changes there, and then rebuild - and let the compiler tell you what's your base rate of hallucinating function names and such.
In the 1990s, that is closer to what making software was like. There, one had an even more heightened awareness of how confident one was in what one was typing. We would then go to the manual (physical in many cases) and look it up.
And we never made up APIs, as there just weren't that many APIs. We would open the .h file for the API we were targeting as we typed into the other window. And the LLMs have ingested all the documentation and .h files (or the modern equivalent) so they don't have a real excuse.
But I use the LLMs all the time for math, and they do profusely hallucinate in a way people do not. I think it's a bit disingenuous to say that LLMs don't have that failure mode that people don't really have.
I use Grok and it's free (even Grok3). I definitely don't hit limits unless it's a pretty heavy day and I do a lot of adjustments. Also, don't send entire codebases to it, just one-off files. What's quite amazing is how it doesn't matter that it doesn't have the source to dependent files, it figures it out and infers what each method does based on its name and context, frigging amazing if you ask me.
And it doesn't fight me like the OpenAI tooling does that logs me out randomly every day and I have to login and spend 4 minutes copying login codes from my email or answering their stupid Captcha test. And this is on their API playground where I pay for every single call - so not like I'm trying to scrape my free chat usage as an API.
> I use Grok and it's free (even Grok3). I definitely don't hit limits unless it's a pretty heavy day and I do a lot of adjustments
Okay, maybe I need to clarify: I hit those limits when I do agentic stuff, which is what Claude Code does: let the LLM automatically pull files into the context it thinks it needs, analyze my codebase, follow imports, add more code, etc. It can quickly balloon out of control when the LLM pulls in too many LoC and the context window gets too big.
Then do a few back and forth actions like "let's refine this plan, instead of X pls do Y", or "hmm I think maybe we should also look into file blah.ts" and you quickly hit 500k tokens.
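Back-of-envelope on why this balloons, assuming roughly Sonnet-class API pricing of $3 per million input tokens and $15 per million output tokens (illustrative assumptions, check current rates), and remembering that every turn resends the whole accumulated context:

    import Foundation

    // Rough cost sketch for one agentic session; illustrative, not real billing.
    let inputPerMTok = 3.0    // assumed $ per 1M input tokens
    let outputPerMTok = 15.0  // assumed $ per 1M output tokens

    // The context grows each turn as the agent pulls in more files,
    // and the full context is re-sent as input on every turn.
    let contextPerTurn = [20_000, 60_000, 90_000, 110_000, 130_000, 150_000]
    let outputPerTurn = 2_000

    let inputTokens = contextPerTurn.reduce(0, +)              // 560k input tokens
    let outputTokens = contextPerTurn.count * outputPerTurn    // 12k output tokens
    let cost = Double(inputTokens) / 1e6 * inputPerMTok +
               Double(outputTokens) / 1e6 * outputPerMTok
    print(String(format: "$%.2f", cost))                       // ≈ $1.86

A handful of "pls do Y instead" rounds on a large codebase and the input side dominates everything.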
If I use Cody only, which has some agentic capabilities but is much more "how can I implement Y in this file @src/file1.ts db models are in @src/models/foo.ts", then I rarely ever hit any rate limitations. That's more similar to what you describe of copying code back and forth, except it's in the editor and you can do it by writing @somefile.
I think that tools like this have to operate on a subscription model like Cursor does in order to make any kind of sense for most users. The pay as you go model for agentic code tools makes you responsible for paying for:
* Whatever context the agent decides to pull in.
* However many iterations the model decides to run.
* Any result you get, regardless of how bad it is.
With pay as you go, the tool developer has no incentive to minimize any of these costs—they get paid more if it's inefficient, as long as it's not so inefficient that no one uses it. They don't even need it to be especially popular, they just need some subset of the population to decide that costs don't matter (i.e. those with Silicon Valley salaries).
With Cursor's model of slow and fast queries, they are taking responsibility for ensuring that the agents are as cost-efficient as possible. The more efficient the agent, the larger their cut. The fewer times that people have to ask a question a second time, the larger their cut. This can incentivize cutting corners, but that's somewhat balanced out by the need to keep people renewing their subscription, and on the whole, for most users it's better to have a flat subscription price and a company that's optimizing their costs than to have a pay-as-you-go model and a company that has no incentive to improve efficiency.
I think this core business model question is happening at all levels in these companies. Each time the model goes in the wrong direction, and I stop it - or I need to go back and reset context and try again - I get charged. The thing is, this is actually a useful and productive way to work sometimes. Like when pairing with another developer, you need to be able to interrupt each other, or even try something and fail.
I don't mind paying per-request, but I can't help but think the daily API revenue graph at Cursor is going up whenever they have released a change that trips up development and forces users to intervene or retry things. And of course that's balanced by churn if users get frustrated and stop or leave. But no product team wants to have that situation.
In the end I think developers will want to pay a fair and predictable price for a product that does a good job most of the time. I don't personally care about switching models, I tend to gravitate towards the one that works best for me. Eventually, I think most coding models will soon be good at most things and the prices will go down. Where will that leave the tool vendors?
I am afraid that the endgame of programming will be who has the biggest budget for an LLM, further consolidating programming into megacorps and raising the barrier to entry.
No, I'm paid much more to do much more than what I did in this simple task. Claude didn't even test the changes (in this case, it does not have the hardware required to do that), or decide that the feature needed to be implemented in the first place. But my comparison wasn't "how do I compare to Claude Code", it was "how does Aider compare to Claude Code". My boss does not use Aider or Claude Code, and would not be happy with the results of replacing me with it (yet).
I said that the AI literally does not have the hardware required to do the testing necessary. But ignoring that, automated testing is not sufficient for shipping software. Imagine shipping a website that has full test coverage but never once opening the browser. This isn't a fundamentally impossible problem for AI, but no amount of "good prompting" is going to get you there today.
I think I pretty directly addressed that point. Yes, it would be more expensive to hire me to do what Claude Code / Aider did, but nobody would be satisfied with my work if I stopped where Claude Code / Aider did.
They aren't necessarily saying it can replace you. They're saying that even though it's expensive, it's cheaper than your time (which can be better spent on other tasks, as you point out.)
The first half is correct, but the conclusion shouldn’t be ‘we’re replicating our software engineers with Claude today’; it should be ‘our experienced engineers just 10x’d their productivity, so we’ll never need to hire an intern’.
Productivity gains decrease exponentially after a few weeks as your engineering skills become rusty very fast (yes, they do, in 100% of cases).
That’s the biggest part everyone misses. It’s all sunshine and rainbows until, a month in, you realize you’ve started asking the LLM to think for you, and at that point the code turns to shit and degrades fast.
Like with everything else “use it or lose it”
If you don’t code yourself, you will lose the ability to do it properly very fast, and you won’t realize it until it’s too late.
If you're using the LLM poorly. Many team leads spend very little time programming, and spend a lot of time reviewing code, which is basically what working with LLMs is. If the LLM writes code that you couldn't have written yourself, you aren't qualified to approve it.
I'm pondering where this "AI-automated programming" trend is heading.
For example: thirty years ago, FX trading was executed by a bunch of human traders. Then, computers arrived on the scene, which made all of them practically obsolete. Nowadays FX trading is executed by a collection of automated algorithms monitored by a few quants.
My question is: is the software development in 2025 basically like what the foreign exchange was in the 2000s?
With industrialisation blacksmiths were replaced by assembly lines.
I'm sure that blacksmiths are more flexible and capable in almost any important dimension, but the economics of factories just made more sense.
I expect that when the dust settles (assuming that the dust settles), that most software will be an industrial product. The humans involved in its creation will be engineers and not craftsmen. Today we have machinists and industrial engineers - not blacksmiths.
Quality and quality assurance processes will become more important; I also expect optimised production processes.
I think a lot of the software ecosystem is a baroque set of over-engineered (or over crafted) steps and processes and this will probably be refactored.
I expect code quality metrics to be super refined.
Craftsmen don't usually produce artifacts to the tolerances that our machines do now - code will be the same.
I expect automated correctness proofs, specification languages, enhanced type systems to have a renaissance.
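As a toy illustration of the sort of type-level guardrail I mean (the names are made up), distinct wrapper types let the compiler reject a whole class of plausible-looking generated code:

    // Distinct ID types mean a generator (human or LLM) can't silently
    // pass the wrong kind of identifier; the mistake fails to compile.
    struct UserID: Hashable { let raw: Int }
    struct OrderID: Hashable { let raw: Int }

    func cancelOrder(_ id: OrderID) { print("cancelling order \(id.raw)") }

    let user = UserID(raw: 42)
    // cancelOrder(user)          // compile-time error: UserID is not an OrderID
    cancelOrder(OrderID(raw: 7))  // only the right identifier type compiles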
I know this is not really in the spirit of the room here, but before I ever dreamed of getting paid to code, I only learned to code at all because I was a cheap and/or poor cook/grad student who wanted to make little artsy musical things on the computer. I remember the first time I just downloaded Pure Data. No torrent, no cracks, it was just there for me, and all it asked for was my patience.
The only reason I ever got into linux at all was because I ended up with some dinky acer chromebook for school but didn't want to stop making stuff. Crouton changed my life in a small way with that.
As I branched out and got more serious, learning web development, emacs, java, I never stopped feeling so irrationally lucky that it was all free, and always would be. Coming on here and other places to keep learning. It is to this day still the lovely forever hole I can excavate that costs only my sleep and electricity.
This is all not gone, but if I was just starting now, I'd find HN and SO and coding Twitter just like I did 10 years ago, but would be immediately turned off by this pervasive sense that "the way to do things now" is seemingly inseparable from a credit card number and monthly charge, however small. I just probably would not have gotten into it. It just wouldn't feel like it's for me: "oh well, I don't really know how to do this anyway, I can't justify spending money on it!" $0.76 for 50 LoC is definitely nuts, but even $0.10 would have turned me way off. I had the same thoughts with all the web3 stuff too...
I know this speaks more to my money eccentricities than anything, and I know we don't really care on here about organic weirdo self-teachers anymore (just productivity, I guess). I am truly not even bemoaning the present situation; everyone has different priorities, and I am sure people are still having the exciting discovery of the computer like I did, on their Cursor IDE or whatever. But I am personally just so so grateful the timeline lined up for me. I don't know if I'd have my passion for this stuff if I was born 10 years later than I was, or otherwise started learning now. But I guess we don't need the passion anymore anyway, it's all been vectorized!
> But I am personally just so so grateful the timeline lined up for me.
I know the feeling. We still have access to the engineering thought processes responsible for some of the most amazing software feats ever accomplished (through source repo history and mailing lists), just with access to the Internet. Of course there's a wealth of info available for free on the web for basically any profession, but for software engineering in particular it's almost direct access to world-class teams/projects to learn from.
> but would be immediately turned off by this pervasive sense that "the way to do things now" is seemingly inseparable from a credit card number and monthly charge
To be effective you still need to understand and evaluate the quality of the output. There will always be a certain amount of time/effort required to get to that point (i.e., there's still no silver bullet).
> But I guess we don't need the passion anymore anyway, its all been vectorized!
We're not running out of things that can be improved. With or without these tools, the better you get, the more of the passion/energy that gets directed at higher levels of abstraction, i.e. thinking more about what to solve, tradeoffs in approaches, etc. instead of the minute details of specific solutions.
This doesn't make much sense to me. Is there some reason a kid today can't still learn to code? On the contrary, you have LLMs available that can answer your exact personalized questions. It's like having a free tutor. It's easier than it's ever been to learn for free.
I’m approaching middle age and have always wanted to learn to code and run servers, but would get caught up somewhere on tutorials and eventually give up in frustration.
Over the past year I have accomplished so much with the ever patient LLMs as my guide. It has been so much fun too. I imagine there are many others in my shoes and those that want to learn now have a much friendlier experience.
Yeah I'm middle aged and a competent programmer. But I hate learning new technologies these days. TypeScript was a huge hurdle to overcome. Working with ChatGPT made it so much more bearable. "Explain this to me. I need to do X, how? Why doesn't this work" etc.
This is definitely true in some ways. I was just talking around this point about spending money incrementally on aider, cursor, etc, and how it would have been a turnoff to me. But yes, all that I had back then people still have, and thats great.
Why learn, if the computer can do it better than you, and by the time you've learned, the ROI on the market approaches zero? This wave of LLMs removed a lot of my interest in coding professionally.
That sounds like the best way to get into coding. (For me it was wanting to realize game ideas to entertain myself.)
Money for a computer when I was getting into it was the credit-card part of it — there were no cheap Chromebooks then. (A student loan took care of the $1200 or so I needed for a Macintosh Plus.)
I suspect that's always the way of it though. There will be an easier way throwing money at a thing and there will be the "programming finds a way" way.
> this pervasive sense that "the way to do things now" is seemingly inseparable from a credit card number and monthly charge
…is true, but it only applies to experienced engineers who can sculpt the whole solution using these tools, not just random code. You need the whole learning effort to be able to ground the code the slop generators make. The passion absolutely helps here.
Note this is valid today. I have concerns that I’ll have different advice in 2027…
In 2028, the question will be who spent more money on lawsuits, and who spent more money on consultants to clean up their code base.
Jokes aside, code tools are best used in the hands of someone who is already trained and can verify bad code, and bad patterns at a glance.
AI code passes many tests. So does a lot of code written by us, for ourselves. When the code gets in front of users, especially the kind of genius users who learn how to fly by forgetting how to fall, then we learn many good habits.
In 2027 we'll have LLMs downloaded to our devices that are as good as Claude Code is today. (But as I have seen, the leading edge of this stuff is always cooler than what you can run locally, so we won't be satisfied then with today's Claude Code.)
Tangential, but this reminds me of something someone said on Twitter that has resonated with me ever since. Startups targeting developers / building developer tooling are arguably one of the worst startups to build, because no matter how much of a steal the price is relative to the value you get, developers insist they can build their own or get by with an open-source competitor. We're as misguided on value as we are on efficiency and automation (more specifically, the old trope of a dev spending hours to automate something that takes minutes to do).
This is also why devs are not in charge of purchase decisions at tech companies. I don't mean FAANG but the usual tech shops.
Someone buys a tool and you have to use it. I think the startups selling dev tools are not selling to developers at all, but to the IT folks of these big tech firms.
Should they pull it off, it's not at all a bad startup to build. However, you now need to invest in a sales force that can sell to the Fortune 500. As a tech founder with no sales chops, this will be incredibly hard to pull off.
I digress, but yeah selling to devs is almost always a terrible idea since we all want to build our own stuff. That spirit may also be waning with the advent of Replit agent, Claude code and other tools.
> I think the startups selling dev tools are not selling to developers at all, but to the IT folks of these big tech firms.
They are often selling to IT managers against the advice of the developers and IT folks, and then they mostly don't get used because they don't actually add any value to the process.
I've noticed this tendency in myself and thought about the 'why' a lot, and I think it comes down to subconsciously factoring in the cost of lock-in, or worse, lack of access to fix/improve a tool I've come to rely on
For me, a larger part than "cost of lock-in" is the "hacker spirit", the curiosity to understand how it works.
Sure, I can pay google or fastmail to host a mailserver for me, but that deprives me of the joy of configuring and updating dovecot/postfix/etc, writing custom backup scripts, writing my own anti-spam tooling, etc. I want to know how all those pieces work.
Sure, I can pay Kagi to search its index of webpages for me, but that deprives me of the joy of creating and running a botnet to index webpages, storing 100s of terabytes of scraped data, and writing my own search code.
I think this spirit is totally lost on most people in this field. It’s tempting to say younger generations but it’s everyone. It always amazes me when I meet someone who has spent 10+ years in this field and doesn’t even care how anything but their shitty Kafka-Flink pipelines work.
If that; I’ve met plenty who only care that they work, not how they work.
As someone who works in infra and dabbles in coding, this is a continual bugbear, because often I’ll find an optimization while troubleshooting “my” problem, and the dev team is uninterested in implementing it. Their endpoints are meeting their SLO, so who cares?
I've honestly thought of hacker spirit as embodying a kind of homesteader ethos in a way. There's this homesteading book I bought a long time ago when I was in college, rich with illustrations on how to do everything from raise animals and grow food to building a house, processing lumber, drilling a well, everything. The same fascination I have with homesteading and DIY culture extends into my interest in technology, and I suspect this is the same with a lot of developers as well.
>We're as misguided on value as we are on efficiency and automation (more specifically, the old trope of a dev spending hours to automate something that takes minutes to do).
But automating something that takes minutes to do is Larry Wall's example of programmer laziness, and is a virtue.
Of course, this needs the obligatory conflicting XKCD comics.
My post above was with sonnet-3.5. When I used sonnet-3.7, it didn't speculate about the files at all; it simply requested that I add the appropriate ones.