> Let me be clear: this is pure speculation. The evidence is public, but there are no leaks or insider rumors that confirm I’m right. In fact, I am building the theory with this post, not just sharing it. I don’t have privileged information—if I did, I’d be under an NDA anyway. The hypothesis feels compelling because it makes sense. And honestly, what more do I need to give the rumor mill a spin?
Maybe that raises eyebrows, but I for one appreciate this disclosure — too many opinions are presented as facts nowadays. In turn, it makes me read the article more seriously.
On the other hand, I have the feeling that we have plateaued quite quickly.
If you dig deep, what current AI companies are doing right now is not really better than what chess and Go engines were doing 30 years ago; the main difference is the amount of storage available and the speed at which AI tools are able to use it.
And I have the feeling that we are already maxing out on the amount and quality of data available, and that we can't really grow and improve much further, because high-quality human content is already drowned out by the sheer amount of uninteresting content currently being produced by AI tools. Also, of what is left of human content, a huge majority is built to serve a marketing or political agenda, both using the same techniques of lies, deception and fraud. What good can we get from tools built on lies and fantasies?
So, if I read this right, the major AI companies are not releasing their highest-performing models because it's too expensive to roll them out en masse; instead they are using them (somehow, IANA ML expert) to make their existing models cheaper to run.
If I'm reading this right, they're peaking on general performance (as a function of GPUs on hamster wheels), and instead are focusing on reducing costs for their most-demanded operating modes.
A testable prediction would then be: "All major AI companies will roll out hyper-specialized models that exceed general models at the same price point," which would simultaneously boost profits. The game is to find the specializations that make the most money?
Well, I think the most interesting point made is that it may not make sense for AI companies to release models _at all_ as they approach AGI. Exposing models allows their capabilities to be easily copied, whereas keeping them internal lets you capture the profit from their output while protecting your IP.
For example, say OpenAI trained a hyper-intelligent model specialising in designing new small molecule drugs. Why make that publicly available at all? Why not just partner with a pharmaceutical manufacturer and make money from selling the drugs?
We’ve become accustomed to AI-as-a-service being the default, via chat or API, but this might just be a blip. After all, OpenAI have made it clear that they see their mission only as making the “benefits” of AGI available to all, which doesn’t necessarily correspond to making AGI itself available to everyone.
Somewhere in this thread (if you can find it), an OpenAI employee hinted at the idea that they have reached a "data singularity": using a model to generate synthetic data to train the next version.
>What stops them suffering the same entropy issues? Is there a reinforcement step with human feedback?
It seems the situation changed with inference scaling/reasoning models (o1 and beyond). I found the tweet I was looking for, but it wasn't from an OpenAI employee as I seemed to recall:
>Now that various cryptic messages have come out from renowned OpenAI scientists, "Gwern" assumes the following:
>- we have reached the threshold of “recursively self-improving”, “where o4 or o5 will be able to automate AI R&D and finish off the rest.”
>- the purpose of o1 is primarily to generate synthetic data for models like o3, which is why he is surprised that o1-pro was released at all
>- he thinks that Anthropic Opus 3.5 is not being released for the same reason: the compute is needed to generate synthetic data
The rest of the tweet is interesting too; it starts from the same facts as this article, but makes the case for "reverse distillation" instead (i.e. using synthetic data to train the next model). In particular:
>If you look at the relevant scaling curves - may I yet again recommend reading Jones 2021?* - the reason for this becomes obvious. Inference-time search is a stimulant drug that juices your score immediately, but asymptotes hard. Quickly, you have to use a smarter model to improve the search itself, instead of doing more.
Yeah that doesn't address the point. The output of an LLM is a compression. It has errors. Recursive training would seem to create iterations that become more noisy. There's no new information, just a lossy distillation of the previous iteration.
I'm not into ML so I can't answer your question specifically, but it seems your circular Google Translate or JPEG comparisons are missing the essential element: self-evaluation.
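To make that concrete, here is a toy sketch of what a self-evaluation step could look like (pure illustration; the function names and the verifier are made up, this is not anyone's actual pipeline): generate several candidates per prompt, score them with an external check, and keep only the ones that pass. The external check is the source of new signal that the copy-of-a-copy analogies leave out.

```python
import random

def generate_candidates(prompt: str, n: int = 8) -> list[str]:
    """Placeholder: sample n candidate answers from the current model."""
    return [f"{prompt} :: candidate {i}" for i in range(n)]

def score(candidate: str) -> float:
    """Placeholder: an external verifier (unit tests, a proof checker,
    a reward model, majority vote) that is cheaper to run than to fool."""
    return random.random()

def build_synthetic_set(prompts: list[str], threshold: float = 0.9) -> list[tuple[str, str]]:
    """Keep only candidates that pass the verifier; the filter is what keeps
    quality from degrading copy-of-a-copy style across generations."""
    kept = []
    for prompt in prompts:
        scored = [(score(c), c) for c in generate_candidates(prompt)]
        best_score, best = max(scored)
        if best_score >= threshold:
            kept.append((prompt, best))
    return kept

if __name__ == "__main__":
    data = build_synthetic_set([f"problem {k}" for k in range(100)])
    print(f"kept {len(data)} of 100 prompts as verified samples for the next training run")
```

Whether that actually escapes the entropy argument depends entirely on how good the verifier is, which is exactly where the disagreement in this thread sits.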
Sorry, but this is not only dumb speculation, it's harmful speculation. It spreads the Kool-Aid that all the AI companies are serving - maybe it actually is a marketing piece - that AI development is accelerating, that (LLM-based) AGI is "almost" here, and that in some secret labs and startups the Real Deal is already cooking.
No one could keep such a breakthrough secret. What is actually being rumored is that GPT-5 is not that much of a leap over current models AND it's expensive to run (especially compared to the amount of money they are already burning with current models). Which is, well, rather believable considering the state of the industry.
It's speculation, but well argued and frankly somewhat convincing, taking into account the overlap with Anthropic.
Remember, they are all raising money hand over fist at high velocity, so being able to throw in a story about improving margins while they march towards AGI is not a bad play.