Oh, this makes sense. ChatGPT results have taken a nosedive in quality lately. I...

wilg · 2025-02-27T20:16:17 1740687377

99% chance that's confirmation bias

nomel · 2025-02-27T21:01:50 1740690110

Sam tweeted that they're running out of compute. I think it's reasonable to think they may serve somewhat quantized models when out of capacity. It would be a rational business decision that would minimally disrupt lower-tier ChatGPT users.

Anecdotally, I've noticed what appear to be drops in quality on some days. When the quality drops, it responds in odd ways when asked what model it is.

anti-soyboy · 2025-02-28T13:17:02 1740748622

Who cares what that clown tweets?

wilg · 2025-02-28T00:18:42 1740701922

I mean, GPT 4.5 says "I'm ChatGPT, based on OpenAI's GPT-4 Turbo model." and o1 Pro Mode can't answer, just says "I’m ChatGPT, a large language model trained by OpenAI."

Asking it what model it is shouldn't be considered a reliable indicator of anything.

fragmede · 2025-02-28T00:20:02 1740702002

Interviewing deepseek as to its identity should absolve anyone of that notion.

nomel · 2025-02-28T03:06:56 1740712016

> Asking it what model it is shouldn't be considered a reliable indicator of anything.

Sure, but a change in response may be, which is what I see (and no, I have no memories saved).

logicallee · 2025-02-27T21:52:32 1740693152

>It couldn't write a simple rename function for me yesterday, still buggy after seven attempts.

I'm surprised and a bit nervous about that. We intend to bootstrap a large project with it!!

Both ChatGPT 4o (fast) and ChatGPT o1 (a bit slower, deeper thinking) should easily be able to do this without fail.

Where did it go wrong? Could you please link to your chat?

About my project: I run the sovereign State of Utopia (will be at stateofutopia.com and stofut.com for short) which is a country based on the idea of state-owned, autonomous AIs that do all the work and give out free money, goods, and services to all citizens/beneficiaries. We've built a chess app (i.e. a free source of entertainment) as a proof of concept, though the founder had to be in the loop to fix some bugs:

https://taonexus.com/chess.html

and a version that shows obvious blunders, by showing which squares are under attack:

https://taonexus.com/blunderfreechess.html

One of the largest and most complicated applications anyone can run is a web browser. We don't have a web browser built, but we do have a buggy minimal version of it that can load and minimally display some web pages, and post successfully:

https://taonexus.com/publicfiles/feb2025/84toy-toy-browser-w...

It's about 1700 lines of code and at this point runs into the limitations of all the major engines. But it does run, can load some web pages and can post successfully.

I'm shocked and surprised ChatGPT failed to get a rename function to work, in 7 attempts.

UrineSqueegee · 2025-02-28T10:42:42 1740739362

With 4.5? Why? It's only meant for creative writing.

logicallee · 2025-02-28T11:03:29 1740740609

No, we used o1.

anti-soyboy · 2025-02-28T13:16:15 1740748575

Yep, I realized that a long time ago; they are literally scammers.