For personal use I already did a few months back. Dario is more competent than Sam, but even shadier (IMHO).
Anyway, I switched to OpenRouter through forgecode (or pi/opencode, the jury is still out on this one).
It will take a while, but I believe businesses will also at least hedge against US companies basically being forced to geo-fence their models. For now it is Fable, but they can include any model at any time.
For personal stuff I use forgecode with OpenRouter. Firstly, forgecode is a much better harness than Claude Code (IMHO).
Anyway, regarding the models, my experience is that there is not much difference in terms of quality, but the cost difference is insane. At least for how I use agents. Yesterday's example: I am developing a small DSL for search across complex technical documents. I wanted to add a small operator to it and thought I'd give Fable a spin. It burned through 13 USD, and while it delivered the solution, it wasn't objectively better than what DeepSeek v4 did for 1.7 USD (the exact same task, because I was curious).
For full disclosure, I ask agents for piecemeal stuff. In the DSL case, for example, I designed the operators and then asked agents to implement them one by one. If I asked it to design the whole thing starting from these complex documents, Fable would probably shine, but every time I try to give agents broader-scope tasks they burn through millions of tokens and generate questionable code, which I then have to spend time familiarizing myself with.
It is very basic and I am no DSL expert, but my idea was to build a graph from those complex documents (maintenance manuals) and use that to decide what tools can be used for a given part on a given piece of equipment in a given situation. If there is a path from A to Z, it means you can use that tool given the circumstances. Basically, the DSL is about pruning the graph as you specify things. I could very well have done without it, but it is a fun project to try out Rust, so I said, why not :)
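To make the "prune the graph, then look for a path" idea concrete, here is a minimal Python sketch. It is not the actual Rust DSL: the node names, the edge conditions, and the functions `prune`, `has_path`, and `tool_allowed` are all made up for illustration, and it assumes each edge simply carries the conditions under which it may be traversed.

```python
from collections import deque

# Hypothetical edges: (source, target, conditions required to traverse the edge).
EDGES = [
    ("part:valve", "state:depressurized", {"equipment": "pump-A"}),
    ("part:valve", "state:pressurized", {"equipment": "pump-B"}),
    ("state:depressurized", "tool:torque-wrench", {}),
    ("state:pressurized", "tool:remote-actuator", {}),
]


def prune(edges, facts):
    """Keep only edges whose conditions do not contradict the given facts.

    A condition on a key that the facts do not mention is left alone, so
    specifying more facts can only remove edges, never add them.
    """
    return [
        (src, dst, cond)
        for src, dst, cond in edges
        if all(facts.get(key, value) == value for key, value in cond.items())
    ]


def has_path(edges, start, goal):
    """Breadth-first search over an (already pruned) edge list."""
    adjacency = {}
    for src, dst, _ in edges:
        adjacency.setdefault(src, []).append(dst)
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for nxt in adjacency.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def tool_allowed(edges, part, tool, facts):
    """A tool is usable if a path from the part to the tool survives pruning."""
    return has_path(prune(edges, facts), part, tool)


print(tool_allowed(EDGES, "part:valve", "tool:torque-wrench", {"equipment": "pump-A"}))
print(tool_allowed(EDGES, "part:valve", "tool:torque-wrench", {"equipment": "pump-B"}))
```

With no facts specified every tool is still reachable; each fact you add removes the edges that contradict it, which is the pruning described above.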
Yeah I agree. As the saying goes: trust is built in drops and lost in buckets. A lot of time needs to pass while the US behaves as a proper ally if they want to go back to the status quo. At least IMHO as this is how I feel as an EU citizen.
I believe that country-specific studies seem to support your thesis. For instance, countries that eat less processed food (e.g. Italy) and have stricter rules about pesticides didn't see an increase in things like colorectal cancer [1]. The incidence of some cancers did grow, but that of others decreased, keeping overall incidence more or less the same.
Yeah, it is crazy to me. Yesterday I did the math on how much power it would take to fully replace "just" 1M SWEs: https://news.ycombinator.com/item?id=48382414 . It turns out you need 380 GW of constant power (or 80%+ of current US production). And I conservatively assumed 0.5 J/token, which was a number calculated for Llama 3 8B. Yes, hardware and models are more efficient now, but I expect SOTA models to need at least 10x that, and I don't think there has been a 10x gain in efficiency since Llama 3.
All of this to say that the AI hype is not considering the energy portion of the equation enough. It won't automate everything not because it can't but because there is just not enough energy to go around unless there is a 100x or more efficiency gain just around the corner.
I am pretty familiar with a 500k LOC codebase. If for every feature request or bug the agent has to go through a lot of it, spend a gazillion thinking tokens to understand what it needs to do, plan, and then execute (assuming it gets it right), then given the current cost of tokens I would argue I am often more cost effective.
In fact, I believe the most cost-effective way is a human+agent collaboration. I.e., by giving the agent direction as it goes along with the plan, I can cut the thinking while keeping the speed. Basically, I help the agent go from a breadth-first search to a guided depth-first one, which is much more token efficient.
Additionally, humans have long-term memory and knowledge of the context around your codebase. Agents do not, and while you can fit a lot in a 1M context window, once you fill it the quality goes down considerably.
Not to mention the fact that even the most well-documented codebase will have documentation blind spots about real-world concerns or limitations that LLMs can't know about. Cursor yesterday tried to remove a document format from the codebase because it was convinced that it was nonexistent. It turns out that not only does it exist and is vitally important for our shipping process, but the API it comes from also does not document its existence at all.
This is why you can't be replaced today. I'm not sure you can rely on that remaining true for very long. And this goes for the vast majority of us.
To be clear, I'm also not saying LLMs will definitely displace a lot of us very soon. I'm just saying I wouldn't be surprised by either outcome and I don't know how anyone claims to know one way or another given the past year or so of progress.
I'm curious whether that point comes before it automates away the entire mid-upper management caste.
In a hypothetical world where LLMs have enough context window and "understanding" to have no need for an experienced user to give inputs, I would assume it's also going to have enough information to make most business decisions and provide well-formatted info to the C-suite.
I think you are assuming cost per task will become cheaper and that there is unlimited energy supply.
While token costs are going down, the number of tokens burned is going up and up. Case in point: Sam Altman is complaining about their top token users burning through 100B tokens per month [1]. So you have token prices going down but token usage going up 10x per year (if you extrapolate linearly from what Sam was ranting about). This is happening because people trust LLMs more and more and give them more autonomy and more complex tasks (IMHO).
So if you really need a truly unsupervised agent that replaces SWEs, you probably need much more than that. Say 20x that number (2T tokens/month) for each SWE. I'm gonna focus on the energy part as this is more tangible. Trying with some realistic numbers:
- To replace 1M SWEs for a year you need 2T tokens/month * 12 months * 1M SWEs (= 2.4 × 10^19 tokens)
- Assuming 0.5 J per token you get 1.2 × 10^19 J [2] (I took the number for a Llama 3 8B model; it is probably much more for SOTA models IMHO).
- A year has 31M seconds
- Over a year that is 380 GW of constant power that is needed only for replacing 1M SWEs and that is around 80% of all the current US energy consumption (450GW). And apparently there are 47ish Million SWEs globally as of 2025 [3]
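The back-of-the-envelope numbers in the list above can be reproduced with a few lines of Python. Every input is taken straight from the comment (100B tokens/month for a top user, the guessed 20x multiplier, 0.5 J per token, 450 GW for the US); the variable names are just for illustration, and the result is only as good as those assumptions.

```python
# Inputs, all taken from the comment above (none of them independently verified here).
TOP_USER_TOKENS_PER_MONTH = 100e9    # figure attributed to [1]
AUTONOMY_MULTIPLIER = 20             # the comment's own guess for a fully autonomous agent
SWES_REPLACED = 1_000_000
JOULES_PER_TOKEN = 0.5               # Llama 3 8B figure from [2]
US_AVERAGE_POWER_WATTS = 450e9       # figure used in the comment
SECONDS_PER_YEAR = 365 * 24 * 3600   # about 31.5M seconds

tokens_per_swe_per_month = TOP_USER_TOKENS_PER_MONTH * AUTONOMY_MULTIPLIER  # 2T
tokens_per_year = tokens_per_swe_per_month * 12 * SWES_REPLACED
energy_joules = tokens_per_year * JOULES_PER_TOKEN
average_power_watts = energy_joules / SECONDS_PER_YEAR
share_of_us = average_power_watts / US_AVERAGE_POWER_WATTS

print(f"tokens per year: {tokens_per_year:.2e}")
print(f"energy per year: {energy_joules:.2e} J")
print(f"constant power:  {average_power_watts / 1e9:.0f} GW")
print(f"share of 450 GW: {share_of_us:.0%}")
```

Changing `JOULES_PER_TOKEN` or `AUTONOMY_MULTIPLIER` scales the result linearly, so a 10x error in either assumption moves the answer by 10x.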
I don't think there is enough power capacity to deliver all of this without pivoting all of society into building data centers and power plants.
So unless there is some breakthrough in efficiency/intelligence (ie you need way fewer tokens for what you have to do) your job is gonna be safish at least.
Of course I pulled that 20x out of my ass, but I believe it is somewhat realistic for a truly autonomous agent(s) that replace SWEs.
> - Over a year that is 380 GW of constant power that is needed only for replacing 1M SWEs and that is around 80% of all the current US energy consumption (450GW). And apparently there are 47ish Million SWEs globally as of 2025 [3]
I think the economics here work out as "OK, so we've bought 80% of the electricity in the US and used this to sell software to the 96% of humans not living in the US; this is profitable for the businesses, so nobody with money cares about the Americans who now literally can't afford to keep refrigerators running because we outbid them".
I think it is not so clear cut. I mean, its multi-model nature is pretty neat. Yes, you can use pgvector on PostgreSQL, but here you also have native graph support. If you want both on PostgreSQL you also need to add something like Apache AGE, but arguably that is also a small ecosystem (at least IMHO, as I had never heard of it until I actually started looking for Neo4j alternatives). Also, pgvector has a hard limit on embedding size, while SurrealDB does not. In cases where you have fewer than 1M elements and retrieval performance matters, SurrealDB already has an advantage.
In my personal opinion it is a great overall product. Probably not the best at anything, but close enough without having to fiddle with PostgreSQL extensions or add another piece of machinery to support graph workloads.
The only thing I don't like is that they didn't use either pure SQL or Cypher for the query language. They rolled their own blend, meaning that you will likely need more work to move within the ecosystem and you can't fully use the muscle memory of users who have worked with other DBs before.
> ...add something like Apache AGE, but arguably that is also a small ecosystem (at least IMHO, as I had never heard of it until I actually started looking for Neo4j alternatives)
Outside of the most trivial use cases, I've found that AGE will not get anywhere near Neo4j in terms of performance, and there are a lot of edge cases that just flat out won't work. The interesting types of queries you'd want to do in the graph end up being quite limited in AGE openCypher; I could not write very complex Cypher that would otherwise work well in Neo4j.
I appreciate having the option, but for most use cases on Pg, you are better off just using JOINs or switching to Neo4j for your graph workloads. I switched some workloads back to using different approaches of approximating "connectedness" in Pg (e.g. using Jaccard similarity).
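For readers unfamiliar with it, Jaccard similarity between two nodes is the size of the intersection of their neighbor sets divided by the size of their union. Below is a minimal Python sketch of that definition only. It is not the Pg setup described above, and the `NEIGHBORS` data and the `connectedness` helper are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B|, defined here as 0.0 for two empty sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Hypothetical adjacency data: node -> set of neighbor ids.
NEIGHBORS = {
    "a": {"x", "y", "z"},
    "b": {"y", "z", "w"},
    "c": {"q"},
}


def connectedness(graph, u, v):
    """Approximate how 'connected' two nodes are by their shared neighbors."""
    return jaccard(graph.get(u, ()), graph.get(v, ()))


print(connectedness(NEIGHBORS, "a", "b"))  # 2 shared out of 4 distinct -> 0.5
print(connectedness(NEIGHBORS, "a", "c"))  # nothing shared -> 0.0
```

The appeal of this kind of measure is that it needs only set operations over neighbor lists, with no graph traversal.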
I think that most of western Europe is fucked in that sense. Previous generations piled up a ton of debt, so now there is no breathing room to subsidize affordable housing, parenting benefits, and so on. Add to that that you need two salaries to live, that you must study until you are 25+ to get a degree, that you don't want to have a baby just when your career is starting, that housing is extremely expensive, and that you need to save for your pension because the pension system will clearly be drastically neutered in the future due to the above-mentioned debt. Basically, there are a bunch of factors that stop many, many couples from having children.
I'm Italian, so the situation is similar if not worse back home.
In some countries (like Switzerland) you don't pay any capital gains tax unless you are a professional investor. What makes you a professional investor? One of the things that can elevate you to that status is the number of trades you make.
So I am sure this is not viable for many people, as buying an ETF counts as 1 trade, but investing the same money in the underlying assets counts as tens of trades.
Unless you have a huge amount of money to play with, there is no need to buy dozens of stocks every month. If you already have a portfolio of several stocks, you can buy just one or two every month and grow your portfolio. If you are just starting, you can buy a few more, or decide to start with just the most boring and safe stocks like Coca-Cola or IBM.
Fully agree. We really should punish companies that blatantly push this kind of mercenarism. I mean, every VP and CxO joins a company, makes super short-sighted decisions that push some random metric up a bit, and then leaves with a huge performance bonus, not caring if everything else is worse. They won't be around to cope with the fallout, as they are already at another company doing the same.
I am not against performance bonuses, but they should be attached to better metrics. E.g. the number of happy users is still up in 3 years' time. Or something like that.