IMO, other than the Microsoft IP issue, I think the biggest thing that has shifted since this acquisition was first in the works is that Claude Code has absolutely exploded. Forking an IDE and all the expense that comes with that feels like a waste of effort, considering the number of free/open source CLI agentic tools that are out there.
Let's review the current state of things:
- Terminal CLI agents are several orders of magnitude less $$$ to develop than forking an entire IDE.
- CC is dead simple to onboard (use whatever IDE you're using now, with a simple extension for some UX improvements).
- Anthropic is free to aggressively undercut their own API margins (and middlemen like Cursor) in exchange for more predictable subscription revenue + training data access.
What does Cursor/Windsurf offer over VS Code + CC?
- Tab completion model (Cursor's remaining moat)
- Some UI niceties like "add selection to chat", and etc.
Personally I think this is a harbinger of where things are going. Cursor was fastest to $900M ARR and IMO will be fastest back down again.
Agreed on everything. Just to add: not only is Anthropic offering CC at like a 500% loss, they also restricted Sonnet/Opus 4 access for Windsurf and jacked up the price of their enterprise deal with Cursor. The increase was so big that it forced Cursor to make that disastrous downgrade to their plans.
I think the only way Cursor and other UX wrappers still win is if on-device models, or at least open source models, catch up in the next 2 years. Then I can see a big push for UX if models are truly a commodity. But as long as Claude is much better, then yes, Anthropic holds all the cards. (And they don't have a bigger company to have a civil war with, like OpenAI does.)
The way I am doing the math with my Max subscription and assuming DeepSeek API prices, it is still 5x cheaper. So either DeepSeek is losing money (unlikely) or Anthropic is losing lots of money (more likely). Grok kinda confirms my suspicions: assuming DeepSeek prices, I've probably burned north of $100 of Grok compute, and I didn't pay Grok or Twitter a single cent. $100 is a lot of loss for a single user.
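For anyone who wants to redo this math with their own numbers, here's the shape of the estimate as a minimal sketch; every price and token count below is a placeholder assumption, not an actual rate card:

    # Back-of-envelope: flat subscription vs pay-as-you-go API.
    # All numbers are assumptions - substitute current rate cards and your usage.
    def api_cost(in_mtok, out_mtok, price_in, price_out):
        # Monthly cost in dollars; prices are $/million tokens.
        return in_mtok * price_in + out_mtok * price_out

    in_mtok, out_mtok = 300, 30   # assumed monthly usage, in millions of tokens
    deepseek = api_cost(in_mtok, out_mtok, 0.27, 1.10)   # assumed DeepSeek rates
    opus = api_cost(in_mtok, out_mtok, 15.0, 75.0)       # assumed Opus list rates
    print(f"DeepSeek-priced: ${deepseek:.0f}/mo, Opus-priced: ${opus:.0f}/mo, flat sub: $200/mo")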
Claude API pricing has significant margin baked in. I think it's safe to assume that Anthropic is getting ~80% margin on their API and selling Claude Code for less than that.
To me, Claude usually feels like a bumbling idiot. But in extremely rare cases it feels like a sentient superintelligence. I facetiously assumed that in those cases it ran on the correct RNG seed.
I'm also curious about this. Claude Code feels very expensive to me, but at the same time I don't have much perspective (nothing to compare it to, really, other than Codex or other agent editors I guess. And CC is way better so likely worth the extra money anyway)
Pretty easy to hit $100 an hour using Opus on API credits. The model providers are heavily subsidized, the datacenters appear to be too. If you look at the Coreweave stuff and the private datacenters it starts looking like the telecom bubble. Even Meta is looking to finance datacenter expansion - https://www.reuters.com/business/meta-seeks-29-billion-priva...
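Rough arithmetic behind that figure (a sketch assuming Opus list prices of ~$15/$75 per million input/output tokens, which may change):

    # Tokens needed to burn $100/hour at assumed Opus list prices.
    PRICE_IN, PRICE_OUT = 15.0, 75.0   # $/million tokens (assumed)
    RATIO = 10                         # agent loops are input-heavy: assume ~10:1 in:out
    bundle_cost = RATIO * PRICE_IN + PRICE_OUT   # $ per (10M in + 1M out) bundle
    bundles = 100.0 / bundle_cost
    print(f"~{bundles*RATIO:.1f}M input + {bundles:.1f}M output tokens per hour")
    # ~4.4M input tokens/hour - easy for an agent re-reading big files every turn.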
The reason they are talking about building new nuclear power plants in the US isn't just for a few training runs, it's for inference. At scale, the AI tools are going to be extremely expensive.
Also note China produces twice as much electricity as the United States. Software development and agent demand is going to be competitive across industries. You may think, oh I can just use a few hours of this a day and I got a week of work done (happens to me some days), but you are going to end up needing to match what your competitors are doing - not what you got comfortable with. This is the recurring trap of new technology (no capitalism required.)
There is a danger to independent developers becoming reliant on models. $100-$200 is a customer acquisition cost giveaway. The state of the art models probably will end up costing hourly what a human developer costs. There is also the speed and batching part. How willing is the developer to, for example, get 50% off but maybe wait twice as long for the output. Hopefully the good dev models end up only costing $1000-$2000 a month in a year. At least that will be more accessible.
Somewhere in the future these good models will run on device and just cost the price of your hardware. Will it be the AGI models? We will find out.
I wonder how this comment will age, will look back at it in 5 or 10 years.
Your excellent comments make me grateful that I am retired and just work part time on my own research and learning. I believe you when you say professional developers will need large inference compute budgets.
Probably because I am an old man, but I don’t personally vibe with full time AI assistant use, rather I will use the best models available for brief periods on specific problems.
Ironically, when I do use the best models available to me it is almost always to work on making weaker and smaller models running on Ollama more effective for my interests.
BTW, I have used neural network tech in production since 1985, and I am thrilled by the rate of progress, but worry about such externalities as energy use, environmental factors, and hurting the job market for many young people.
I've been around for a while (not quite retirement age) and this time is the closest to the new feeling I had using the internet and web in the early days. There are simultaneously infinite possibilities but also great uncertainty what pathways will be taken and how things will end up.
There are a lot of parts in the near term to dislike here, especially the consequences for privacy, adtech, energy use. I do have concerns that the greatest pitfalls in the short terms are being ignored while other uncertainties are being exaggerated. (I've been warning on deep learning model use for recommendation engines for years, and only a sliver of people seem to have picked up on that one, for example.)
On the other hand, if good enough models can run locally, humans can end up with a lot more autonomy and choice with their software and operating systems than they have today. The most powerful models might run on supercomputers and just be solving the really big science problems. There is a lot of fantastic software out there that does not improve by throwing infinite resources at it.
Another consideration: while the big tech firms are spending what will likely approach hundreds of billions of dollars in a race to "AGI", what matters to those same companies even more than winning is making sure the race isn't winner-takes-all. In that case, hopefully the outcome looks more like open source.
The SOTA models will always run in data centers, because they have 5x or more VRAM and 10-100x the compute allowance. Plus, they can make good use of scaling w/ batch inference which is a huge power savings, and which a single developer machine doesn’t make full use of.
Yes I do. It's just that for newcomers who are used to Cursor, where without careful prompting you lock yourself out of premium requests, it's not immediately obvious that CC is more dangerous and doesn't work the same way at all.
Of course it requires careful planning but the trap is easy to fall into.
In my experience, it's more about the tool's local indexing, plus aggressive limits on automatic uploads and model usage to keep them (and you) from overpaying.
People are recreating this with local toolchains now.
This is around what Cursor was costing me with Claude 4 Opus before I switched to Claude Code. Sonnet works fine for some things, but for some projects it spews unusable garbage unless the specification is so detailed that it's almost the implementation already.
This is where something like Perplexity's "memory" feature is really great. It treats other threads similarly to web resources.
I would love to understand better just how Perplexity is able to integrate up-to-date sources like other threads (and presumably recent web searches, but I haven't verified this; they could just be from the latest model) into its query responses. It feels seamless.
Have you been human before? Competition for resources and status is an instinctive trait.
It rears its head regardless of what sociopolitical environment you place us in.
You’re either competing to offer better products or services to customers…or you’re competing for your position in the breadline or politburo via black markets.
Even in the Soviet Union there were multiple design bureaus competing for designs of things like aircraft: Tupolev, Ilyushin, Sukhoi, Mikoyan-Gurevich (MiG), Yakovlev, Mil. There were quite a lot. Several (not all; they had their specialisations) provided designs when a requirement was raised. Not too different from the US, yet not capitalist.
Not really, it's possible with any market economy, even a hypothetical socialist one (that is, one where all market actors are worker-owned co-ops).
And, since there is no global super-state, the world economy is a market economy, so even if every state were a state-owned planned economy, North Korea style, this type of competition would still exist between states.
Consider also that VC funds often have pension funds as their limited partners. Workers have a claim to their pension, and thus a claim to the startup returns that the VC invests in.
So yeah it basically comes down to your definition of "worker-owned". What fraction of worker ownership is necessary? Do C-level execs count as workers? Can it be "worker-owned" if the "workers" are people working elsewhere?
Beyond the "worker-owned" terminology, why is this distinction supposed to matter exactly? Supposing there was an SV startup that was relatively generous with equity compensation, so over 50% of equity is owned by non-C-level employees. What would you expect to change, if anything, if that threshold was passed?
> Supposing there was an SV startup that was relatively generous with equity compensation, so over 50% of equity is owned by non-C-level employees. What would you expect to change, if anything, if that threshold was passed?
If the workers are majority owners, then they can, for example, fire a CEO that is leading the company in the wrong direction, or trying to cut their salaries, or anything like that.
>If the workers are majority owners, then they can, for example, fire a CEO that is leading the company in the wrong direction, or trying to cut their salaries, or anything like that.
Why wouldn't the board fire said CEO?
The most common reason to cut salaries is if the company is in dire financial straits regardless. Co-ops are more likely to cut salary and less likely to do layoffs.
Because the board doesn't understand the business at the level that employees do. Or because the board has different goals for the business than employees do. Or because the board is filled with friends of the CEO who let them do whatever.
Also, lots of companies reduce salaries or headcount if they feel they can get away with it. They don't need to be in dire financial straits; it's enough to have a few quarters of no or low growth and to want to show a positive change.
How specifically would you expect a typical SV corp's policies to change if employee equity passes from 49% to 51%?
Remember: if employees own 49% and can persuade just 2% of the other shareholders that a change will be positive for the business, they can make that change. So minority vs. majority ownership is not as significant as it may seem.
Can you give me an idea of how much interaction would be $50-$100 per day? Like are you pretty constantly in a back and forth with CC? And if you wouldn’t mind, any chance you can give me an idea of productivity gains pre/post LLM?
Yes, a lot of usage, I’d guess top 10% among my peers. I do 6-10hrs of constant iterating across mid-size codebases of 750k tokens. CC is set to use Opus by default, which further drives up costs.
Estimating productivity gains is a flame war I don’t want to start, but as a signal: if the CC Max plan goes up 10x in price, I’m still keeping my subscription.
I maintain top-tier subscription to every frontier service (~$1k/mo) and throughout the week spend multiple hours with each of Cursor, Amp, Augment, Windsurf, Codex CLI, Gemini CLI, but keep on defaulting to Claude Code.
I am curious what kind of development you’re doing and where your projects fall on the fast iteration<->correctness curve (no judgment). I’ve used CC Pro for a few weeks now and I will keep it, it’s fantastically useful for some things, but it has wasted more of my time than it saved when I’ve experimented with giving it harder tasks.
It's interesting to work with a number of people using various models and interaction modes in slightly different capacities. I can see where the huge productivity gains are and can feel them, but the same is true for the opposite. I'm pretty sure I lost a full day or more trying to track down a build error because it was relatively trivial for someone to ask CC or something to refactor a ton of files, which it seems to have done a bit too eagerly. On the other hand, that refactor would have been super tedious, so maybe worth it?
Mostly to save money (I am retired), I use Gemini APIs. I used to also use good open weight models on groq.com, but life is simpler just using Gemini.
Ultimately, my not using the best tools for my personal research projects has zero effect on the world but I am still very curious what elite developers with the best tools can accomplish, and what capability I am ‘leaving on the table.’
I’m a founder/CTO of an enterprise SaaS, and I code everything from data modeling, to algos, backend integrations, frontend architecture, UI widgets, etc. All in TypeScript, which is perfectly suited to LLMs because we can fit the types and repo map into context without loading all code.
As to "why": I've been coding for 25 years, and LLMs are the first technology that has a non-linear impact on my output. It's simultaneously moronic and jaw-dropping. I'm good at what I do (e.g., merged fixes into Node), and Claude/o3 regularly find material edge cases in code I was confident in. Then they add a test case (as per our style), write a fix, and update docs/examples within two minutes.
I love coding and the art&craft of software development. I’ve written millions of lines of revenue generating code, and made millions doing it. If someone forced me to stop using LLMs in my production process, I’d quit on the spot.
Why not self host: open source models are a generation behind SOTA. R1 is just not in the same league as the pro commercial models.
> If someone forced me to stop using LLMs in my production process, I’d quit on the spot.
Yup 100% agree. I’d rather try to convince them of the benefits than go back to what feels like an unnecessarily inefficient process of writing all code by hand again.
And I’ve got 25+ years of solid coding experience. Never going back.
> data modeling, to algos, backend integrations, frontend architecture, UI widgets, etc. All in TypeScript, which is perfectly suited to LLMs because we can fit the types and repo map into context without loading all code.
Which frameworks & libraries have you found work well in this (agentic) context? I feel much of the JS library landscape does not do enough to enforce an easily understood project structure that would "constrain" the architecture and force modularity. (I might have this bias from my many years of work with Rails, which is highly opinionated in this regard.)
When you say a generation behind, can you give a sense of what that means in practice for your current use? Slower or lower quality? Would it take more iterations to get what you want?
Context rot. My use case is iterating over a large codebase which quickly grows context. All LLMs degrade with larger context sizes, well below their published limits, but pro models degrade the least. R1 gets confused relatively quickly, despite their published numbers.
I think Fiction LiveBench captures some of those differences via a standardized benchmark that spreads interconnected facts through an increasingly large context to see how models can continue connecting the dots (similar to how in codebases you often have related ideas spread across many files)
> I’ve written millions of lines of revenue generating code
This is a wild claim.
Approx 250 working days in a year. 25 years coding. Just one million lines would be phenom output, at 160 lines per day forever. Now you are claiming multiple millions? Come on.
It's impossible as an IC on a team, or working where a concept of "tickets" exists. It's unavoidable as a solo founder, whether you're building enterprise systems or expanding your vision. Some details -
1. Before wife&kids, every weekend I would learn a library or a concept by recreating it from scratch. Re-implementing jQuery, fetch API via XHR, Promises, barebones React, a basic web router, express + common middlewares, etc. Usually, at least 1,000 lines of code every weekend. That's 1M+ over 25 years.
2. My last product is currently 400k LOCs, 95% built by me over three years. I didn't one-shot it, so assuming 2-3x ongoing refactors, that's more than 1M LOCs written.
3. In my current product repo, GitHub says for the last 6 months I'm +120k,-80k. I code less than I used to, but even at this rate, it's safely 100k-250k per year (times 20 years).
4. Even in open source, there are examples like esbuild, which is a side project from one person (cofounder and architect of Figma). esbuild is currently at ~150k LOCs, and GitHub says his contributions were +600k,-400k.
5. LOCs are not all the same. 10k lines of algorithms can take a month, but 10k lines of React widgets is like a week of work (on a greenfield project where you know exactly what you're building). These days, when a frontend developer says in an interview that their most extensive UI codebase was 100k LOCs, I assume they haven't built a big UI thing.
So yes, if the reference point is "how many sprint tickets is that", it seems impossible. If the reference point is "a creative outlet that aligns with startup-level rewards", I think my statement of "millions of lines" is conservative.
Granted, not all of it was revenue-generating - much was experimental, exploratory, or just for fun. My overarching point was that I build software products for (great) living, as opposed to a marketer who stumbled into Claude Code and now evangelizes it as some huge unlock.
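For what it's worth, the arithmetic behind those points is internally consistent if you take the stated figures at face value (a quick sketch, using only the numbers from the comment above):

    weekends = 1_000 * 52 * 25     # point 1: ~1k LOC per weekend for 25 years
    product  = 400_000 * 2.5       # point 2: 400k LOC with 2-3x refactor churn
    pace     = 120_000 * 2         # point 3: +120k in 6 months, annualized
    print(f"{weekends:,} / {product:,.0f} / {pace:,}/yr")
    # 1,300,000 / 1,000,000 / 240,000/yr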
No, it's not. At all. At the overwhelming majority of companies I've worked for or heard of, even 400-500 lines fully shipped in a week - slightly less than your figure here - would be top-quartile output. But further, it isn't necessarily the point: writing lines of code is a pretty small part of the job at companies with more than about 5-6 engineers on staff. Past that, it's a lot more design, architecture, and LEGO-brick-fitting - or just politicking and policying. Heck, I know folks who wish they could ship 400 lines of code a month but are held back by the bureaucracies of their companies.
Now extrapolate. That’s maybe 50k a year assuming some PTO.
10 years would make 500k and you just cross a million at 20.
So that would have to be 20 years straight of that style of working and you’re still not into plural millions until 40 years.
If someone actually produced multiple millions of lines in 25 years, it would have to be a side effect of some extremely verbose language where trivial changes take up many lines (maybe Java).
i've been using llm-based tools like copilot and claude pro (though not cc with opus), and while they can be helpful – e.g. for doc lookups, repetitive stuff, or quick reminders – i rarely get value beyond that. i've honestly never had a model surface a bug or edge case i wouldn’t have spotted myself.
i've tried agent-style workflows in copilot and windsurf (on claude 3.5 and 4), and honestly, they often just get stuck or build themselves into a corner. they don’t seem to reason across structure or long-term architecture in any meaningful way. it might look helpful at first, but what comes out tends to be fragile and usually something i’d refactor immediately.
sure, the model writes fast – but that speed doesn't translate into actual productivity for me unless it’s something dead simple. and if i’m spending a lot of time generating boilerplate, i usually take that as a design smell, not a task i want to automate harder.
so i’m honestly wondering: is cc max really that much better? are those productivity claims based on something fundamentally different? or is it more about tool enthusiasm + selective wins?
Unless you're getting paid for your commute, you're just giving your employer free productivity. I would recommend doing literally anything else with that time. Read a book, maybe.
If you can't do your job in your 8 hours then you're either not good enough or the requirements are too much and the company should change processes and hire.
Right, I'm not saying anyone should actually be in the office 40 hours a week; that sounds terrible. And even with all the RTO of the last couple of years, that doesn't seem to be expected many places.
Personally I use dev containers on a server, and I have written some template containers for quickly setting up new containers with Claude Code, plus some scripts for easily connecting to the right container, etc. Makes it possible to work on mobile, but there's lots of room for improvement in the workflow still.
The project is just a web backend. I give Claude Code grunt work tasks. Things like "make X operation also return Y data" or "create Z new model + CRUD operations". Asking it to implement well-known patterns like debouncing or caching for an existing operation also works well.
My app builds and runs fine on Termux, so my CLAUDE.md says to always run unit tests after making changes. So I punch in a request, close my phone for a bit, then check back later and review the diff. Usually takes one or two follow-up asks to get right, but since it always builds and passes tests, I never get complete garbage back.
There are some tasks that I never give it. Most of that is just intuition. Anything I need to understand deeply or care about the implementation of I do myself. And the app was originally hand-built by me, which I think is important - I would not trust CC to design the entire thing from scratch. It's much easier to review changes when you understand the overall architecture deeply.
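For reference, the kind of CLAUDE.md instruction described above might look something like this (illustrative wording only, not the commenter's actual file; the test command is an assumption):

    # CLAUDE.md (illustrative excerpt)
    - After every code change, run the full unit test suite (e.g. `make test`; substitute your runner).
    - If the build or any test fails, fix it before reporting the task as done.
    - Never commit or push; leave changes in the working tree for human review.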
you can easily reach $50 per day.
by force-switching the model to opus:
/model opus
it will continue to use opus even though there is a warning about approaching the limit.
i found opus is significantly more capable in coding than sonnet, especially for tasks that are poorly defined; thinking mode can fill in a lot of missing detail and you just need to edit a little before letting it code.
Claude Code with a Claude subscription is the cheap version for current SOTA.
"Agentic" workflows burn through tokens like there's no tomorrow, and the new Opus model is so expensive per-token that the Max plan pays for itself in one or two days of moderate usage. When people report their Claude Code sessions costing $100+ per day, I read that as the API price equivalent - it makes no sense to actually "pay as you go" with Claude right now.
This is arguably the cheapest option on the market right now in terms of results per dollar, but only if you can afford the subscription itself. There's also a time/value component: on Max x5, it's quite easy to hit the usage limits of Opus (fortunately the limit resets every 5 hours or so); Max x20 is only twice the price of Max x5 but gives you 4x more Opus; a better model means less time spent fighting with and cleaning up after the AI. It's expensive to be poor, unfortunately.
>less time spent fighting with and cleaning up after the AI.
I've yet to use anything but copilot in vscode, which is 1/2 the time helpful, and 1/2 wasting my time. For me it's almost break-even, if I don't count the frustration it causes.
I've been reading all these AI-related comment sections and none of it is convincing me there is really anything better out there. AI seems like break-even at best, but usually it's just "fighting with and cleaning up after the AI", and I'm really not interested in doing any of that. I was a lot happier when I wasn't constantly being shown bad code that I need to read and decide about, when I'm perfectly capable of writing the code myself without the hassle of AI getting in my way.
AI burnout is probably already a thing, and I'm close to that point already. I do not have hope that it will get much better than it is, as the core of the tech is essentially just a guessing game.
I tend to agree except for one recent experience: I built a quick prototype of an application whose backend I had written twice before and finally wanted to do right. But the existing infrastructure for it had bit-rotted, and I am definitely not a UI person. Every time I dive into html+js I have to spend hours updating my years-out-of-date knowledge of how to do things.
So I vibe coded it. I was extremely specific about how the back end should operate and pretty vague about the UI, and basically everything worked.
But there were a few things about this one: first, it was just a prototype. I wanted to kick around some ideas quickly, and I didn't care at all about code quality. Second, I already knew exactly how to do the hard parts in the back end, so part of the prompt input was the architecture and mechanism that I wanted.
But it spat out that html app way way faster than I could have.
Claude Code on the Pro plan is ~$20 USD/month and is nearly enough for someone like me who can't use it at work and is just playing around with it after work. I'm loving it.
cursor on a $20/month plan (if you burn through the free credits) or gemini-cli (free) are 2 great ways to try out this kind of stuff as a hobbyist. you can throw in v0 too, $5/month free credits. Supabase's free tier can give you a db as well.
Zed is fantastic. Just dipping my toes in agentic AI, but I was able to fix a failing test I spent maybe 15 minutes trying to untangle in a couple minutes with Zed. (It did proceed to break other tests in that file though, but I quickly reverted that.)
It is also BYOA or you can buy a subscription from Zed themselves and help them out. I currently use it with my free Copilot+ subscription (GitHub hands it out to pretty much any free/open source dev).
You can tell Claude Code to use opus using /model and then it doesn't fall back to Sonnet btw. I am on the $100 plan and I hit rate-limits every now and then, but not enough to warrant using Sonnet instead of Opus.
This is what I don't get about the cost being reported by Claude Code. At work I use it against our AWS Bedrock instance, and most sessions will report $15-20, and I'll have multiple agents running. So I can easily rack up 60 bucks a day in reported cost, yet our AWS Bedrock bill is only a small fraction of that. Why would you overcharge on direct usage of your API?
Seems like the survival strategy for Cursor would be to develop their own frontier coding model. Maybe they can leverage the data from their still somewhat significant lead in the space to make a solid effort.
I don’t think that’s a viable strategy. It is very very hard and not many people can do it. Just look at how much Meta is paying to poach the few people in the world capable of training a next gen frontier model.
The basic concept plus a lot of money spent on compute and training data gets you pretraining. After that to get a really good model there’s a lot more fine-tuning / RL steps that companies are pretty secretive about. That is where the “smart decisions” and knowledge gained by training previous generations of sota models comes in.
We’d probably see more companies training their own models if it was cheaper, for sure. Maybe some of them would do very well. But even having a lot of money to throw at this doesn’t guarantee success, e.g. Meta’s Llama 4 was a big disappointment.
That said, it’s not impossible to catch up to close to state-of-the-art, as Deepseek showed.
I'd also add that no one predicted the emergent properties of LLMs as they followed the scaling-laws hypothesis. GPT showed all kinds of emergent behavior, like reasoning and sentiment analysis, when we went up an order of magnitude in parameter count. We don't actually know what would emerge if we trained a quadrillion-parameter model. SOTA will always be mysterious until we reach those limits, so, no, companies like Cursor will never be on the frontier. It takes too much money and requires seeking out things we haven't ever seen before.
There are plenty of people theoretically capable of doing this, I secretly believe some of the most talented people in this space are randos posting on /r/LocalLlama.
But the truth is to have experience building models at this scale requires working at a high level job at a major FAANG/LLM provider. Building what Meta needs is not something you can do in your basement.
The reality is the set of people who really understand this stuff and have experience working on it at scale is very, very small. And the people in this space are already paid very well.
It's a staggeringly bad deal. It's a hugely expensive task where unless you are the literal best in the world, you would never even see any usage. And even for those who are BOTH best and well known they have to be willing to lose billions on repeat with no end in sight.
It's very, very rare to have winner-takes-all to such an extreme degree as with coding LLMs.
I don't think it's literally "winner takes all" - I regularly cycle between Gemini, DeepSeek and Claude for coding tasks. I'm sure any GPT model would be fine too, and I could even fall back to Qwen in a pinch (exactly what I did when I was in China recently with no ability to access foreign servers).
Claude does have a slight edge in quality (which is why it's my default) but infrastructure/cost/speed are all relevant too. Different providers may focus on one at the expense of the others.
One interesting scenario where we could end up is using large hosted models for planning/logic, and handing off to local models for execution.
I'd recommend reading some of the papers on what it takes to actually train a proper foundation model, such as the Llama 3 Herd of Models paper. It is a deeply sophisticated process.
Coding startups also try to fine-tune OSS models to their own ends. But this is also very difficult, and usually just done as a cost optimization, not as a way to get better functionality.
You need a person that can hit the ground running. Compute for LLM is extremely capital intensive and you’re always racing against time. Missing performance targets can mean life or death of the company.
As an actual user of Windsurf's model, I don't think "tried" is fair. I sometimes use it. It's not as smart as Gemini, but it iterates quicker and is very well aligned with their tool calls.
- Forking VSCode is very easy; you can do it in 1 hour.
- Anthropic doesn't use the inputs for training.
- Cursor doesn't have $900M ARR. That was the raise. Their ARR is ~$500m [1].
- Claude Code already supports the niceties, including "add selection to chat", accessing the IDE's realtime warnings and errors (built-in tool 'ideDiagnostics'), and using the IDE's native diff viewer for reviewing edits.
The cost of a VS Code fork is that Microsoft has restricted the extension marketplace for forks. You have to maintain a separate one; that is the real dealbreaker.
Their base plan is $20/mth. At $900M ARR, that would equal 3.75M people paying a sub to Cursor.
If literally everyone is on their $200/mth plan, then that would be 375K paid users.
There's 50M VS Code + VS users (May 2025). [1] 7% of all VS Code users having switched to Cursor does not match my personal circle of developers. 0.7%... maybe? But that would be if everyone using Cursor were paying $200/month.
Seems impossibly high, especially given the number of other AI subscription options as well.
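Spelling the math out, under the same assumptions as above:

    arr = 900e6                 # the $900M ARR figure used above
    print(arr / (20 * 12))      # 3,750,000 subscribers if all on $20/mo
    print(arr / (200 * 12))     # 375,000 if all on $200/mo
    print(3.75e6 / 50e6)        # ~7.5% of the cited 50M VS Code + VS users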
The $20/month cursor sub is heavily limited though, for basic casual usage that's fine but you VERY soon run into its limits when working at any speed.
I also just prefer CC's UX. I've tried to make myself use Copilot and Roo and I just couldn't. The extra mental overhead and UI context-switching took me out of the flow. And tab completion has never felt valuable to me.
But the chat UX is so simple it doesn't take up any extra brain-cycles. It's easier to alt-tab to and from; it feels like slacking a coworker. I can have one or more terminal windows open with agents I'm managing, and still monitor/intervene in my editor as they work. Fits much nicer with my brain, and accelerates my flow instead of disrupting it
There's something starkly different for me about not having to think about exactly what context to feed to the tool, which text to highlight or tabs to open, which predefined agent to select, which IDE button to press
Just formulate my concepts and intent and then express those in words. If I need to be more precise in my words then I will be, but I stay in a concepts + words headspace. That's very important for conserving my own mental context window
Claude Code is just proving that coding agents can be successful. The interface isn’t magic, it just fits the model and integrates with a system in all the right ways. The Anthropic team for that product is very small comparatively (their most prolific contributor is Claude), and I think it’s more of a technology proof than a core competency - it’s a great API $ business lever, but there’s no reason for them to try and win the “agentic coding UI” market. Unless Generative AI flops everywhere else, these markets will continue to emerge and need focus. The Windsurf kerfuffle is further proof that OpenAI doesn’t see the market as must-win for a frontier model shop.
And so I’d say this isn’t a harbinger of the death of Cursor, instead proof that there’s a future in the market they were just recently winning.
I was being hyperbolic saying their ARR will go to zero. That's obviously not the case, but the point is that CC has revealed their real product was not "agentic coding UI", it was "insanely cheap tokens". I have no doubt they will continue to see success, but their future right now looks closer to being a competitor to free/open tools like Cline/Roo Code, as well as the CLI entrants, not a standalone $500M ARR juggernaut. They have no horse in the race in the token market; they're a middleman.
They either need to create their own model and compete on cost, or hope that token costs come down dramatically so as to be too cheap to meter.
Digging in here more... why would you say it isn't in Anthropic's interest to win the "agentic coding UI" market?
My mental model is that these foundation model companies will need to invest in and win in a significant number of the app layer markets in order to realize enough revenue to drive returns. And if coding / agentic coding is one of the top X use cases for tokens at the app layer, seems logical that they'd want to be a winner in this market.
Is your view that these companies will be content to win at the model layer and be agnostic as to the app layer?
My intuition is that their fundamental business is executing on the models, and any other products are secondary and exist to drive revenue that they can use to compete against Google/OpenAI/Meta, as well as to ensure - and demonstrate - that their models are performant in these new markets. Claude needs to be great at coding, but Anthropic doesn't need to own Coding. Claude Code is growing their core business, just like a Claude Robotics or a Claude Scheduling might, but they can't focus on robotics or scheduling because that takes them away from the core business of models. A strategic relationship with Cursor might have been enough to accomplish this, but it wasn't - maybe Cursor couldn't execute fast enough, or didn't align on priorities, or whatever. I've watched a bunch of interviews with the CC team and I very much get the impression that it was more "holy shit, this works great" than a product strategy.
You may be right about "they need to invest in and win" in order to have __enough__ revenue to outcompete the nation-state-sized competition, but this stuff is moving way too fast for anyone to know.
Cursor sees it coming - it's why they're moving to the web and mobile[0].
The bigger issue is the advantage Anthropic, Google and OpenAI have in developing and deploying their own models. It wasn't that long ago that Cursor was reading 50 lines of code at a time to save on token costs. Anthropic just came out and yolo'd the context window because they could afford to, and it blew everything else away.
Cursor could release a cli tomorrow but it wouldn't help them compete when Anthropic and Google can always be multiples cheaper
> Anthropic just came out and yolo'd the context window because they could afford to
I don’t think this is true at all. The reason CC is so good is that they’re very deliberate about what goes in the context. CC often spends ages reading 5 LOC snippets, but afterwards it only has relevant stuff in context.
Heard a lot of this context BS parroted all over HN; don't buy it. If simply increasing context size could solve the problem, Gemini would be the best model for everything.
I think this is an interesting and cool direction for Cursor to be going in and I don't doubt something like this is the future. But I have my doubts whether it will save them in the short/medium term:
- AI is not good enough yet to abandon the traditional IDE experience if you're doing anything non-trivial. It's hard to find use cases for this right now.
- There's no moat here. There are already a dozen "Claude Code UI" OSS projects with similar basic functionality.
Strictly speaking about large, complex, sprawling codebases, I don't think you can beat the experience that an IDE + coding agent brings with a terminal-based coding agent.
The auto-regressive nature of these things means that errors accumulate, and IDEs are better placed than a terminal agent to give the human that observability. I can course-correct more easily in an IDE with clear diffs and code navigation than by following a terminal timeline.
You can view and navigate the diffs made by the terminal agent in your IDE in realtime, just like Cursor, as well as commit, revert, etc. That’s really all the “integration” you need.
Some excellent points. On “add selection to chat”, I just want to add that the Claude Code VS code extension automatically passes the current selection to the model. :)
I am genuinely curious if any Cursor or Windsurf users who have also tried Claude Code could speak to why they prefer the IDE-fork tools? I’ve only ever used Claude Code myself - what am I missing?
Cursor's tab completion model is legitimately fantastic and for many people is worth the entire $20 subscription. Lint fixes or syntax-level refactors are guessed and executed instantly with TAB with close to 100% accuracy. This is their final moat IMO, if Copilot manages to bring their tab completion up to near parity, very little reason to use Cursor.
Idk. When you're doing something it really gets, it's super nice, but it's also off a lot of the time, and IMO it's super distracting when it constantly pops up. There's no way to explicitly request it instead - other than toggling, which seems to also turn off context/edit tracking, because after toggling it back on it does not suggest anything until you make some edits.
While Zed's model is not as good, the UI is so much better IMO.
Just to offer a different perspective, I use Cursor at work and, coming from emacs (which I still use) with copilot completions only when I request them with a shortcut, Cursor’s behavior drives me crazy.
Which Emacs package do you use for Copilot? I tried copilot.el a long while ago but had problems with it. Is there something new, or does copilot.el fulfill your needs?
I only get suggestions if I use that key (the prefix looks huge, but I have a Keyboardio Model 100 and I have that bound to the Any key, so I intentionally picked a crazy-long prefix hoping to avoid collisions with other shortcuts), and that's the way I like to use these tools (which is why the behaviour in Cursor drives me crazy, though I admit I haven't spent time looking at its configuration; maybe it's something that can be turned off).
I haven't used Cursor or Claude much, how different is it from Copilot? I bounce between desktop ChatGPT (which can update VS Code) and copilot. Is there an impression that those have fallen behind?
IME, one of execution. Copilot is like having your cousin who works at Best Buy try to help you code - it knows what a computer is, and speaks English, but is pretty bad at both.
The story I've heard is that Cursor is making all their money on context management and prompting, to help smooth over the gap between "you know what I meant" and getting the underlying model to "know what you meant"
I haven't had as much experience with Claude or Claude Code to speak to those, but my colleagues speak of them highly
It's quite interesting how little the Cursor power users use tab. Majority of the posts are some insane number of agent edits and close to (or exactly) 0 tabs.
At my company we have an enterprise subscription and we're also all allowed to see the analytics for the entire company. Last I checked, I was literally the number one user of Tab and middle of the pack for agent.
It's interesting when I see videos or Reddit posts about Cursor and people getting rate-limited and being super angry. In my experience tab is the number one feature, and I feel like most people using agent are probably overusing it for tasks that would honestly take less time to do themselves, or using models way smarter than they need to be for the task at hand.
I use cursor strictly for agent edits and do anything else in a proper IDE meaning in a Jetbrains product that I run in a separate window.
Many of my co-workers do the same. VS Code is vastly inferior when it comes to editing and actual IDE features, so it is a non-starter when you do programming yourself.
I once tried AI tab-complete on Zed and it was all right but breaks my flow. Either the AI does the editing or I do it but mixing both annoys me.
I'd like to ask the opposite question: why do people prefer command line tools? I tried both and I prefer working in IDE. The main reason is that I don't trust the LLMs too much and I like to see and potentially quickly edit the changes they make. With an IDE, I can iterate much faster than with the command line tool.
I haven't tried the Claude Code VS Code extension. Has anyone replaced Cursor with this setup?
I replaced it. My opinion: Cursor sucks as an IDE.
Cursor may have average to above-average quality in IDE assistance, but the IDE seems to get in the way. Its entire performance depends on the real-time performance and latency of their servers, and sometimes it is way too slow. The TAB autocomplete that was working for you in the last 30 minutes suddenly stops working at random, or experiences such severe delays that it stops making sense.
Besides that, the IDE seems poorly designed - some navigation options are confusing and it makes way too many intrusive changes (e.g., automatically finishing strings).
I've since gone back to VS Code - with Cline (using OpenRouter and super cheap Qwen Coder models), Windsurf FREE, and Claude Code at $20 per month - and I get great mileage from all of them.
You're looking at (coloured) diffs in your shell is all when it comes to coding. It's pretty easy to setup MCP and have claude be the director. Like I have zen MCP running with an OpenRouter API key, and will ask claude to consult with pro (gemini) or o3, or both to come up with an architecture review / plan.
I honestly don't know how great that is, because it just reiterates what I was planning anyways, and I can't tell if it's just glazing, or it's just drawing the same general conclusions. Seriously though, it does a decent job, and you can discuss / ruminate over approaches.
I assume you can do all the same things in an editor. I'm just comfortable with a shell is all, and as a hardcore Vi user, I don't really want to use Visual Studio.
I also use vim heavily and I've found that I'm really enjoying Cursor + VS Code Vim extension. The cursor tab completion works very nicely in conjunction with vim navigate mode.
heh, including "for diffing" is selling short when our new job as software developers now seems to be reviewing code, of which looking at a diff is only one tiny part. That goes infinitely more for dynamically typed languages, where there is no compiler to catch dumb typos. If I have to actually, no kidding, review code then I want all the introspections, find references, go to declaration, et al for catching the intern trying to cheat me
I can roll back to different checkpoints with Cursor easily. Maybe CC has it but the fact that I haven’t found it after using it daily is an example of Cursor having a better UX for me.
I like using Claude Code through Roo Code (vscode extension). I find it easier to work with text using a mouse, vscode diff viewer etc. I guess if you're very good at vim shortcuts etc you can use that in Claude Code instead of selecting text with a mouse. Claude Code has a vscode extension too so I feel that using Claude Code through vscode just adds a better UI.
Using cline for a bit made me realize cursor was doomed. Everything is just a gpt/anthropic wrapper of fancy prompts.
I can do most of what I want with cline, and I've gone back from large changes to just small changes and been moving much quicker. Large refactors/changes start to deviate from what you actually want to accomplish unless you have written a dissertation, and even then they fail.
I agree with all you've said, but with regard to writing a dissertation for larger changes: have you tried letting it first write a plan for you as markdown (just keep this file uncommitted) and then letting it build a checklist of things to do?
I find just referencing this file over and over works wonders and it respects items that were already checked off really well.
I can get a lot done really fast this way, in small enough chunks that I know every bit of code and how it works (tweaking manually where needed, of course).
But I can blow through some tickets way faster than before this way.
IIRC the problem is that VS Code does not allow extensions to create custom UI in the panel areas except via WebViews(?). That doesn't make for a great experience. Plus, Cursor does a lot with background indexing to make their tab completion model really good - more than would be possible with the extension APIs available.
Not if you want custom UI. There are a lot of things you can do in extension land (continue, cline, roocode, kilocode, etc. are good examples) but there are some things you can't.
One thing I would have thought would be really cool to try is integrating it at the LSP level and using all that good stuff, but apparently the people trying (I think there was a company from .il) either went closed or didn't release anything noteworthy...
When the Copilot extension needs a new VS Code feature it gets added, but it isn't available to third party extensions until months later... Err, years later... well, whenever Microsoft feels like it.
So an extension will never be able to compete with Copilot.
I use Augment extensively and find it superior to Cursor in every way - and it operates as an extension. It has a really handy task-planning interface and a meta prompt refinement feature, and the costs are remarkably low. The quality of the implementation output is higher IMO, and I don't have to do a lot of model selection or get Max-model bill explosions. If there's something Cursor provided that Augment doesn't via an extension, it was not functionally useful enough to notice.
I think Augment has been flying under the radar for many people, and really deserves better marketing.
I've been using Augment for over a year with IntelliJ, and never understood why my colleagues were all raving about Cursor and Windsurf. I gave Cursor a real try, but it wasn't any better, and the value proposition of having to adopt a dedicated IDE wasn't attractive to me.
A plugin to leverage your existing tools makes a lot more sense than an IDE. Or at least until/if AI agents get so smart that you don't need most of the IDE's functionality, which might change what kinds of tooling are needed when you're in the passenger seat rather than the driver's seat.
I've really struggled with using the extensions - their UI/UX is worse, they're much more limited in what they can do and they're much more unstable (in IntelliJ at least).
Then again writing mostly kotlin I cannot get along with the VS Code forks as they're just not that great outside of typescript projects in my experience.
I tend to prompt in cursor/windsurf and refactor in IntelliJ which is okay but a bit of a pain.
One competitor to Claude Code that I don't hear much about is Jetbrains Junie. From my experience, the code it generates is as good as CC, and if you've purchased a Jetbrains license you probably have some amount of free Junie every month.
I'll fill in some context. I think the value of Cursor as an IDE is probably somewhat ephemeral. It's mostly combating Microsoft's ambitions to keep other players busy and box them out of the market. Agents gain a lot of value from model context protocol and there's an amazingly short list of clients that fully support the protocol, but the VSCode Chat window is one of them: https://modelcontextprotocol.io/clients
I actually do prefer the view that having the agent built into an IDE brings me, but I'll be damned if I'm forced to use Copilot/OpenAI. Second to that, the agent does have access to a lot more contextual tools by being built into the editor, like focused linting errors and test failures. Of course, that demands your development environment be set up correctly, and it could be replicated with Claude Code to some extent.
I never got the valuation. I (and many others) have built open source agent plugins that are pretty much just as good, in our free time (check out magenta nvim btw, I think it turned out neat!)
> with a simple extension for some UX improvements
What are the UX improvements?
I was using the Pycharm plugin and didn’t notice any actual integration.
I had problems with PyCharm's terminal - not least of which was the default 5k-line scrollback, which, while easy to change, was the worst part of CC for me at first.
I finally jumped to using iterm and then using pycharm separately to do code review, visual git workflows, some run config etc.
But the actual value of PyCharm (and I've been a real booster of that IDE) has shrunk due to CC, and moving out of the built-in terminal is a threat to my usage of the product.
If the plugin offered some big value I might stick with it but I’m not sure what they could even do.
#1 improvement for VS Code users is giving the agent MCP tools to get diagnostics from the editor LSPs. Saves a tremendous amount of time having the agent run and rerun linting commands.
Does anyone have a comparison between this and OpenAI Codex? I find OpenAI's thing really good actually (vastly better workflow than Windsurf). Maybe I am missing out, however.
Codex CLI is very bad; it often struggles to even find the right file, going on a rampage inside the home directory and commenting on random folders. Using o3/o4-mini in Aider is decent, though.
> What does Cursor/Windsurf offer over VS Code + CC?
Cursor's @Docs is still unparalleled and no MCP server for documentation fetching even comes close. That is the only reason why I still use Cursor, sometimes I have esoteric packages that must be used in my code and other IDEs will simply hallucinate due to not having such a robust docs feature, if any, which is useless to me, and I believe Claude Code also falls into that bucket.
> Cursor's @Docs is still unparalleled and no MCP server for documentation
I strongly disagree. It will put the wrong doc snippets into context 99% of the time. If the docs are slightly long then forget it, it’ll be even worse.
What packages do you use it for? I honestly never had that issue, it's very good in my use cases to find some specific function to call or to figure out some specific syntax.
just curious because I'm inexperienced with all the latest tools here
> - Tab completion model (Cursor's remaining moat)
What is that? I have Gemini Code Assist installed in VSCode and I'm getting tab completion. (yes, LLM based tab completion)
Which, as an aside, I find useful when it works but also often extremely confusing to read. Like, say, in C++ I type
int myVar = 123
The editor might show
int myVar = 123;
And it's nearly impossible to tell that I didn't enter that `;`, so I move on to the next line instead of pressing tab, only to find the `;` wasn't really there. That's also probably an easy example. Literally on 1 of every 6 lines I type, it feels like I can't tell what is actually in the file and what is being suggested. Any tips? Maybe I just need to set some special background color for text being suggested.
and PS: that tiny example is not an example of a great tab completion. A better one is when I start editing 1 of 10 similar lines, I edit the first one, it sees the pattern and auto does the other 9. Can also do the "type a comment and it fills in the code" thing. Just trying to be clear I'm getting LLM tab completion and not using Cursor
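On the "can't tell what's real" problem above: VS Code lets you re-color inline suggestion ghost text so it's visually distinct. Assuming the standard ghost-text theming keys (names may vary by version), something like this in settings.json:

    {
      "workbench.colorCustomizations": {
        "editorGhostText.foreground": "#7a7a7a",
        "editorGhostText.background": "#2f3a2f",
        "editorGhostText.border": "#555555"
      }
    }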
It gets even worse when all three of IntelliSense, AI completion, and the human are all vying for control of the input. This can be very frustrating at times.
I use Windsurf so I remain in the driver's seat. Using AI coding tools too much feels like brain rot where I can't think sharply anymore. Having auto complete guess my next edit as I'm typing is great because I still retain all the control over the code base. There's never any blocks of code that I can't be bothered to look at, because I wrote everything still.
I often use the same setup. Qwen 2.5 coder is very good on its own, but my Emacs setup doesn’t also use web search when that would be appropriate. I have separately been experimenting with the Perplexity Sonar APIs that combine models and search, but I don’t have that integrated with my Emacs and Qwen setup - and that automatic integration would be very difficult to do well! If I could ‘automatically’ use a local Qwen, or other model, and fall back to using a paid service like Perplexity or Gemini grounding APIs just when needed that would be fine indeed.
I am thinking about a new setup as I write this: in Emacs, I explicitly choose a local Ollama model or a paid API like Gemini or OpenAI, so I should just make calling Perplexity Sonar APIs another manual choice. (Currently I only use Perplexity from Python scripts.)
If I owned a company, I would frequently evaluate privacy and security aspects of using commercial APIs. Using Ollama solves that.
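A minimal sketch of that manual dispatch between a local Ollama model and a paid search-grounded API, assuming Ollama's local REST endpoint and Perplexity's OpenAI-compatible chat endpoint (model names and key handling are placeholders):

    import os, requests

    def ask(prompt, backend="local"):
        if backend == "local":
            # Local Ollama chat endpoint; model name is a placeholder.
            r = requests.post("http://localhost:11434/api/chat", json={
                "model": "qwen2.5-coder",
                "messages": [{"role": "user", "content": prompt}],
                "stream": False})
            return r.json()["message"]["content"]
        # Paid, search-grounded fallback: Perplexity Sonar (needs an API key).
        r = requests.post("https://api.perplexity.ai/chat/completions",
            headers={"Authorization": f"Bearer {os.environ['PPLX_API_KEY']}"},
            json={"model": "sonar",
                  "messages": [{"role": "user", "content": prompt}]})
        return r.json()["choices"][0]["message"]["content"]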
Windsurf's big claim to fame was that you could run their model air-gapped, and they said they did not train on GPL code. This was an option available for Enterprise customers until they took it away recently to prevent self-hosting.
I agree it has a good chance of catching up, but the difference in quality is pretty noticeable today. I'd much rather stick with vscode, because I hate all the subtle ways Cursor changes the UI; like taking over the keyboard shortcut for clearing the scrollback in the terminal. But I find it's pretty hard to use Copilot's tab completion after using Cursor for a while.
I think CC is just far more useful; I use it for literally everything and without MCP (except Puppeteer sometimes), as it just writes Python/bash scripts that do the job far better than all those hacked-together MCP garbage bins. It controls my computer & writes code. It made me better as well: now I actually write code, including GUI/web apps, that is always fully scriptable. It helps me, but it definitely helps CC; it can just interrogate/test everything I make without Puppeteer (or other web browser control, which is always brittle as hell).
CC would explode even further if they had an official Team/Enterprise plan (likely in the works - the Claude Code "Waffle" flag) and worked on Windows without WSL (supposedly pretty easy to fix; they just didn't bother). Cursor learned the percentage of Windows users was really high when they started looking, even before they really supported it.
They're likely artificially holding it back, either because it's a loss leader they want to use in a very specific way, or because they're planning the next big boom/launch (maybe with a new model to build hype?).
Cursor's multi-file tab completion and multi-file diff experience are worth $20 easily IMO.
I truly do not understand people's affinity for a CLI interface for coding agents. Scriptability I understand, but surely we could agree that CC with Cursor's UX would be superior to CC's terminal alone, right? That's why CC is pushing IDE integration -- they're just not there yet.
I can't stand the UX, or VS Code's UX in general. I vastly prefer having CC open in a terminal alongside neovim. CC is fully capable of opening diffs in neovim or otherwise completely controlling neovim by talking to its socket.
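For anyone curious what "talking to its socket" looks like: start neovim with a listen address and drive it over RPC, e.g. with the pynvim package (paths and filenames here are made up for illustration):

    # First, in your editing terminal: nvim --listen /tmp/nvim.sock
    import pynvim

    nvim = pynvim.attach("socket", path="/tmp/nvim.sock")
    nvim.command("edit src/main.py")                      # open the file the agent touched
    nvim.command("vert diffsplit /tmp/cc_proposed.py")    # side-by-side diff of its proposal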
Fair enough. I guess a better way to put it is: for people who like Cursor's UX, but prefer Claude Code's performance as an agent, the combination of both would be the true killer app. Having to choose between these feels like a temporary gap in the evolution of these tools, and I'm ready for us to get past it.
I sympathize with your feeling that there is a "gap", but I'm fairly certain that both my ideal workflow and your ideal workflow are unlikely to be anything more than evolutionary dead ends, like early automobiles that inherited the shape of horse-drawn carriages.
I don't know where the evolution of coding agents will take us in the next couple of years, but I would not be shocked if it looks more like a GitHub issue/PR tracker than a code+chat interface with autocomplete, etc. I'm already noticing that I'm starting to rely on tmux + multiple CC instances with independent worktrees instead of babysitting each proposed change.
I strongly agree with you.
I’m more of a CLI guy, and Claude Code just works. Most good projects have a CLI anyway (gcloud, GitHub CLI, Vercel, etc.). I prefer CLI vs MCP’s.
I’m on the $200 plan, and it’s absolutely worth it (never thought I’d say this for a CLI app).
I don’t see how there will be any money to be made in this industry once these models are quantized and all local. It’s going to be one of the most painful bubble deflations we have ever seen and the biggest success of open source in our lifetimes.
The forked IDE thing I don't understand either, but...
During the evaluation at a previous job, we found that Windsurf was waaaay better than anything else. They were expensive (to train on our source code directly), but the solution they offered outperformed the others.
A lot of engineers underestimate the learning curve required to jump from IDE to terminal. Multiple generations of engineers were raised on IDEs. It's really hard to break that mental model.
We are working to resolve this. It's still in preview but expect to see healthy Gemini 2.5 Pro allocations for subscription customers in the near future.
Wait a minute, have you often run out of the gemini cli free daily quota? Their free quota is very generous because they are trying to get market/mind share.
OK, thanks, I understand now what is happening. I use gemini-cli for one specific task at a time and sometimes just one 5 to 10 minute session a day. If I use long work sessions then I will add my API key and pay.
Claude Code is totally different paradigm. You don't edit your files directly so there is no tab autocomplete. It's a chat session.
There are IDE integrations where you can run it in a terminal session while perusing the files through your IDE, but it's not powering any autocomplete there AFAIK.
Yes or running claude code in the cursor/vscode terminal and watching the files change and then reviewing in IDE. I often like to be able to see an entire file when reviewing a diff, rather than just the lines that changed. Plus it's nice to have go-to-definition when reviewing.
Yes, it shows you the file diff. But generally, the workflow is that you git commit a checkpoint, then let it make all the changes it wants freely, then in your IDE, review what has changed since previous commit, iterate the prompts/make your own adjustments to the code, and when you like it, git commit.
Depending on what I'm doing with it I have 3 modes:
Trivial/easy stuff - let it make a PR at the end and review in GitHub. It rarely gets this stuff wrong IME or does anything stupid.
Moderately complex stuff - let it code away, review/test it in my IDE and make any changes myself and tell claude what I've changed (and get it to do a quick review of my code)
Complex stuff - watch it like a hawk as it is thinking and interrupt it constantly asking questions/telling it what to do, then review in my IDE.
Apparently they are, which is crazy to me. Zed agent mode shows modified hunks and you can accept/reject them individually. I can't imagine doing it all through the CLI, it seems extremely primitive.
As far as I can tell, terminal agents are inferior to hosted agents in sandboxed/imaged environments when it comes to concurrent execution, and far inferior to an assisted IDE in terms of UX, so what exactly is the point? The "UI niceties" are the whole point of using Cursor, and somehow everyone else sucks at them.
Done. Now you have a SOTA agentic AI with pretty forgiving usage limits up and running immediately. This is why it's capturing developer mindshare. The simplicity of getting up and going with it is a selling point.
Plus it’s straightforward to make Claude Code run agents in parallel/background just like Codex and Cursor, in local sandboxes: https://github.com/dagger/container-use
You're missing the point tho. The point of the CLI agent is that it's a building block to put this thing everywhere. Look at CC's GitHub plugin; it's great.
CC on GitHub just looks like Codex. I see your point, but it seems like all the big players basically have a CLI agent, and most of them think it's just an implementation detail, so they don't expose it.