
> In the early days of ride share it was an amazing advancement and very cheap, because it was highly subsidized.

This is not an analogous situation.

Inference APIs aren’t subsidised, and I’m not sure the monthly plans are any more either. AI startups burn a huge amount of money on providing free service to drive growth. That’s something they can reduce at any time without raising costs for their customers at all. Not to mention the fact that the cost of providing inference is plummeting by several orders of magnitude.

Uber weren’t providing free service to huge numbers of people, so when they wanted to turn a profit they couldn’t cut costs there and had to raise prices for their customers. And the fees they pay to drivers didn’t drop a thousandfold, so it wasn’t getting vastly cheaper to provide service.


The unit economics of these models and APIs are really ugly. Those saying they are not losing money on inference are likely only doing so with some funky non-GAAP accounting. It’s the old “we’re making money when you ignore all the places we’re spending money” argument.

When you factor in the R&D costs required to make these models and the very limited lifespan of a model (and thus extremely high capital investment depreciation rate) the numbers are pretty nasty.
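
A back-of-envelope sketch of that point (every number below is a made-up placeholder, not a real figure from any provider): a gross margin that looks healthy on inference alone can flip to a loss once the training run is depreciated over the model's short useful life.

    # Hypothetical numbers for illustration only -- not actual figures.
    monthly_inference_revenue = 100e6  # $/month from the API
    monthly_serving_cost = 25e6        # GPUs, power, hosting
    training_capex = 1e9               # cost of the training run
    model_lifespan_months = 8          # until the next SOTA obsoletes it

    gross_margin = (monthly_inference_revenue - monthly_serving_cost) / monthly_inference_revenue
    monthly_depreciation = training_capex / model_lifespan_months  # straight-line

    net = monthly_inference_revenue - monthly_serving_cost - monthly_depreciation
    print(f"gross margin: {gross_margin:.0%}")                 # 75% -- looks great
    print(f"net after depreciation (M$/mo): {net / 1e6:.0f}")  # -50 -- nasty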


Very well said. For what it’s worth, it’s the exact same “logic” public cloud providers also used to get customers off the hardware they already owned and onto hardware they’d rent forever. Some of the most successful businesses out there - tobacco companies, casinos, SaaS, fossil fuel providers, car companies, cloud providers, etc - are masterfully adept at weaving narratives that keep their customers focused on short or mid-term results over long-term costs, and AI is no different.

Sure, if all you ever look at are the token costs, the inferencing costs at the edge, then the narrative that this will never skyrocket in price and the gates to the walled garden won’t ever close seems believable. Once you factor in the R&D, the training, the data acquisition, the datacenters, the electricity and water and real estate and lobbying and shareholder returns…

It’ll be the most expensive product your company pays for, per seat, by miles, once the subsidy period ends and the real bills come due. Even open-weight models are likely to evaporate or shift to some sort of Folding@Home type distributed training model to keep costs low.


R&D costs don't have to be sustainable.

If the trend of staggering AI performance gains stops, you can afford to cut down on R&D and remain competitive. If it doesn't, you hit AGI and break the world economy - with a hope that it'll break in your favor.


If the performance gains stop then everything becomes a commodity and then a race to the bottom on pricing. It’s not a pretty picture.

And companies that already invested into massive amounts of compute for AI training? They're positioned to win that race.

They get to convert that compute to inference compute, pivot their R&D towards "figure out how to make inference cheaper" and leverage all the economies of scale.


Lots of chat about this:

> Inference APIs aren’t subsidised

This is hard to pin down. There are plenty of bare-metal companies providing hosted inference at market rates (i.e. presumably profitably, even as prices head towards some commodity floor). The premise that every single one of these companies is operating at a loss is unlikely. The open question is the "off-book" training cost of the models running on these servers: are your unit economics still positive once you factor in training? And if those training costs are truly off-book, it's not a meritless argument to say the model providers are "subsidizing" the inference industry. But it's not a clear-cut argument either.

Anthropic and OpenAI are their own beasts. Are their unit economics negative? Depends on the time frame you're considering. In the mid-to-long run, they're staking everything on "most decidedly not negative". But what are the rest of us paying on the day OpenAI posts 50% operating margins?
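
To put a rough number on that last question (purely illustrative; the serving cost below is an assumption, not a disclosed figure): at a target operating margin m, the implied price is cost / (1 - m).

    # Illustrative sketch -- assumed serving cost, not a published number.
    cost_per_1m_tokens = 2.00        # $ fully loaded serving cost per 1M tokens
    target_operating_margin = 0.50   # the hypothetical 50%-margin day

    implied_price = cost_per_1m_tokens / (1 - target_operating_margin)
    print(f"implied price: ${implied_price:.2f} per 1M tokens")  # $4.00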


What makes you think these things aren’t subsidized? It would be very impressive if Claude was making money off of their $20/month users that hit their weekly limits.

> What makes you think these things aren’t subsidized?

You can pay Amazon or a great many other hosting providers for inference for a wide variety of models. Do you think all of these hosting providers are burning money for you, when it’s not even their model and they have no lock-in?

> It would be very impressive if Claude was making money off of their $20/month users that hit their weekly limits.

They have been adjusting their limits frequently, and the whole point of those limits is to control the cost of servicing those users.

Also:

> Unit economics of LLM APIs

> As of June 2024, OpenAI's API was very likely profitable, with surprisingly high margins. Our median estimate for gross margin (not including model training costs or employee salaries) was 75%.

> Once all traffic switches over to the new August GPT-4o model and pricing, OpenAI plausibly still will have a healthy profit margin. Our median estimate for the profit margin is 55%.

https://www.lesswrong.com/posts/SJESBW9ezhT663Sjd/unit-econo...

And more discussion on Hacker News here:

https://news.ycombinator.com/item?id=44161270
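
The linked estimate boils down to a throughput calculation. A rough sketch of the method, with placeholder inputs (the post uses its own figures):

    # Reconstruction of the estimation approach; all inputs are guesses.
    gpu_rental_per_hour = 2.50   # $/GPU-hour, assumed market rental rate
    tokens_per_second = 1000     # assumed aggregate throughput per GPU

    cost_per_1m_tokens = gpu_rental_per_hour / (tokens_per_second * 3600) * 1e6
    api_price_per_1m = 10.00     # $ per 1M output tokens, assumed

    gross_margin = 1 - cost_per_1m_tokens / api_price_per_1m
    print(f"serving cost: ${cost_per_1m_tokens:.2f} per 1M tokens")  # ~$0.69
    print(f"gross margin: {gross_margin:.0%}")                       # ~93%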


> As of June 2024, OpenAI's API was very likely profitable, with surprisingly high margins. Our median estimate for gross margin (not including model training costs or employee salaries) was 75%.

> Once all traffic switches over to the new August GPT-4o model and pricing, OpenAI plausibly still will have a healthy profit margin. Our median estimate for the profit margin is 55%.

"likely profitable", "median estimate"... that 75% gross margin is not based on hard numbers.


It doesn't matter if they make any profit off those who hit the limits.

It's about how many of the users hit those limits.


> Inference APIs aren’t subsidised

A lot of people disagreed with this point when I posted it, however Sam Altman said last week:

> We're profitable on inference. If we didn't pay for training, we'd be a very profitable company.

https://www.axios.com/2025/08/15/sam-altman-gpt5-launch-chat...


> inference APIs aren’t subsidised

How do you get to that conclusion? There is no inference without training, so each sale of a single inference token has a cost that includes both the inference as well as the amortised cost of training.
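
As a sketch of what that fully loaded cost looks like (all inputs are assumptions): the amortised training cost per token is the training bill spread over every token the model serves before retirement, added to the marginal serving cost.

    # Hypothetical inputs for the amortisation argument.
    training_cost = 500e6          # $ for the training run, assumed
    lifetime_tokens = 1e15         # tokens served before retirement, assumed
    serving_cost_per_token = 2e-6  # $ marginal inference cost per token, assumed

    amortised_training = training_cost / lifetime_tokens  # $0.50 per 1M tokens
    fully_loaded = serving_cost_per_token + amortised_training
    print(f"fully loaded cost: ${fully_loaded * 1e6:.2f} per 1M tokens")  # $2.50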


That's what the post means. OpenAI doesn't lose money on each request; it makes money on each one, which goes towards recouping the fixed R&D costs.

> OpenAI doesn't lose money on each request; it makes money on each one, which goes towards recouping the fixed R&D costs.

Right, but that is why I used the word "amortise"; there is only a limited time to recoup that cost. If you spend $120 in training, and it takes 6 months for the next SOTA to drop from a competitor, you have to clear $20/m after inference costs.


Sure they are - the big companies are dumping billions of capital into it, and the small companies are getting a firehose of venture, sovereign, and PE money to build stuff.

The way the big AI players are behaving supports the assertion that LLMs are plateauing. The differentiator between OpenAI, Gemini, Copilot, Perplexity, Grok, etc. is the app and how they find novel ways to do stuff. The old GPT models that Microsoft uses are kneecapped and suck, but Copilot for Office 365 is pretty awesome because it can integrate with the Office graph and has a lot of context.


> Inference APIs aren’t subsidised

This made me laugh. Thanks for making my Friday a little bit better.


Almost no business works like this - every additional request doesn't lose OpenAI money, it makes them money.

The fixed cost of R&D is what makes the company unprofitable, not each request. Your line of thinking is a bit ridiculous because OpenAI is never going to lose money per request.


> The fixed cost of R&D is what makes the company unprofitable, not each request. Your line of thinking is a bit ridiculous because OpenAI is never going to lose money per request.

We don't know this for sure. I agree that it would be insane from a business perspective, but I've seen so many SV startups make insane business decisions that I tend to lean towards this being quite possible.


> The fixed cost of R&D is what makes the company unprofitable, not each request.

If the amortisation period is too short (what is it now? 8 months? 6 months?) that "profit" from each inference token has to cover the training costs before the amortisation schedule ends.

In short, if you're making a profit of $1 on each unit sold, but require a capex of $10 in order to provide the unit sold, you need to sell at least 10 of those units to break even.

The training is the capex, the inference profit is the profit/unit sold. When a SOTA model lasts only 8 months, the inference has to make all that back in 8 months in order to be considered profitable.
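
The same logic as a sketch, using the figures from the example above plus an assumed 8-month SOTA window:

    # $1 profit/unit and $10 capex come from the example above; 8 months is assumed.
    profit_per_unit = 1.00   # $ inference profit per unit sold
    capex = 10.00            # $ training cost
    sota_window_months = 8   # time before a competitor's model obsoletes yours

    units_to_break_even = capex / profit_per_unit  # 10 units
    units_per_month = units_to_break_even / sota_window_months
    print(f"need {units_per_month:.2f} units/month just to break even")  # 1.25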


You are describing a subsidy.

If your kid makes $50 with a lemonade stand, she thinks she made $50, because she doesn't account for the cost of the lemonade, table, lawn, etc. You're subsidizing your child.


I agree it's subsidised, but the crucial point is that each API call doesn't cost them money, it earns them profit. If R&D were to be stopped now they would be profitable.

> If R&D were to be stopped now they would be profitable.

Not until the cost of the previous training has been completely amortised.

Even if some company did immediately stop all training, they would only show a profit until the next SOTA model is released by a competitor, and then they would go out of business.

None of them have any moat, other than large amounts of venture capital. Even if there is a single winner at the end of all of this, all it would take is a much smaller amount of capital to catch up.


No, it gives them income. Profit is what's left after all costs are subtracted.

Correct, I misspoke.

Nope, it costs billions to train and run those models, they are operating at a loss.

> AI startups burn a huge amount of money on providing free service to drive growth.

Of the pure-play companies, only OpenAI do this. Like, Anthropic are losing a bunch of money and the vast majority of their revenue comes from API usage.

So, either the training costs completely dominate the inference costs (seems unlikely but maybe) or they're just not great businesses.

I do think that OpenAI/Anthropic are probably hiring a lot of pre- and post-sales tech people to help customers use the products, and that's possibly something they could cut in the future.


> Of the pure-play companies, only OpenAI do this. Like, Anthropic are losing a bunch of money and the vast majority of their revenue comes from API usage.

I’m not sure I understand you. You can use Claude for free just like you can use ChatGPT for free.


> I’m not sure I understand you. You can use Claude for free just like you can use ChatGPT for free.

For basically an hour. Like, have you tried to do this? I have, and ended up subscribing pretty soon.

Additionally, if you look at Anthropic's revenue the vast, vast majority comes from API (along with most of their users). This is not the case for OpenAI, hence my point.


> Inference APIs aren’t subsidised

I may be wrong, but wasn’t compute part of Microsoft’s 2019 or 2023 investment deals with OpenAI?



