
> DeepSeek does not "do for $6M what cost US AI companies billions". I can only speak for Anthropic, but Claude 3.5 Sonnet is a mid-sized model that cost a few $10M's to train (I won't give an exact number). Also, 3.5 Sonnet was not trained in any way that involved a larger or more expensive model (contrary to some rumors).

Wow!




The narrative is running away in every direction.

There are popular infographics floating around today which explain to the layman that DeepSeek invented MoE.


Shit... I was just looking at the DeepSeekMoE paper, from almost exactly 1 year ago:

https://arxiv.org/html/2401.06066v1

The red star in the first chart is so funny.


^ This is publicly new information, and the 2nd part especially contradicts consequential rumours that had been all-but-cemented in closely-following outsiders' understanding of Sonnet and Anthropic. Completely aside from anything else in this article.


Also, though it's not "new information": "Making AI that is smarter than almost all humans at almost all things [...] is most likely to happen in 2026-2027." continues to sail over everybody's head, not a single comment about it, even to shit on it. People will continue to act as though they are literally blind to this, as though they literally don't see it.


> People will continue to act as though they are literally blind to this, as though they literally don't see it.

Or like they see it and have learned the appropriate weight to give unsupported predictions of this type from people with a vested interest in them being perceived as true. It's not only not new information, it's not information at all.


I find that really, really, really hard to believe, given the current approaches.


We're getting used to it.

And, personally, I think that if any CEO in this industry dared to say "we won't get a super AI in 2028",

many people would be disappointed, some people would be scared, and one president would be pissed off.


Anthropic is, according to themselves, using RLAIF... which is basically using an LLM as a judge / reward model. So maybe he means that the models they use for RLAIF are not (much?) more expensive than Sonnet 3.5 (e.g. previous Sonnet or Haiku 3 :)).


Do you have a link to Anthropic saying they use RLAIF?




