Hacker News

Oh, this makes sense. ChatGPT results have taken a nosedive in quality lately.

It couldn't write a simple rename function for me yesterday, still buggy after seven attempts.

I'm more and more convinced that they dumb down the core product when they plan to release a new version to make the difference seem bigger.



99% chance that's confirmation bias


Sam tweeted that they're running out of compute. I think it's reasonable to think they may serve somewhat quantized models when out of capacity. It would be a rational business decision that would minimally disrupt lower-tier ChatGPT users.

Anecdotally, I've noticed what appear to be drops in quality on some days. When the quality drops, it responds in odd ways when asked what model it is.


Who cares what that clown tweets?


I mean, GPT 4.5 says "I'm ChatGPT, based on OpenAI's GPT-4 Turbo model." and o1 Pro Mode can't answer, just says "I’m ChatGPT, a large language model trained by OpenAI."

Asking it what model it is shouldn't be considered a reliable indicator of anything.


Interviewing DeepSeek as to its identity should disabuse anyone of that notion.


> Asking it what model it is shouldn't be considered a reliable indicator of anything.

Sure, but a change in response may be, which is what I see (and no, I have no memories saved).


> It couldn't write a simple rename function for me yesterday, still buggy after seven attempts.

I'm surprised and a bit nervous about that. We intend to bootstrap a large project with it!

Both ChatGPT 4o (fast) and ChatGPT o1 (a bit slower, deeper thinking) should easily be able to do this without fail.

Where did it go wrong? Could you please link to your chat?

About my project: I run the sovereign State of Utopia (will be at stateofutopia.com and stofut.com for short), which is a country based on the idea of state-owned, autonomous AIs that do all the work and give out free money, goods, and services to all citizens/beneficiaries. We've built a chess app (i.e. a free source of entertainment) as a proof of concept, though the founder had to be in the loop to fix some bugs:

https://taonexus.com/chess.html

and a version that shows obvious blunders, by showing which squares are under attack:

https://taonexus.com/blunderfreechess.html

One of the largest and most complicated applications anyone can run is a web browser. We don't have a web browser built, but we do have a buggy minimal version of one that can load and minimally display some web pages, and post successfully:

https://taonexus.com/publicfiles/feb2025/84toy-toy-browser-w...

It's about 1700 lines of code and at this point runs into the limitations of all the major engines. But it does run, can load some web pages and can post successfully.

I'm shocked and surprised ChatGPT failed to get a rename function to work, in 7 attempts.


With 4.5? Why? It's only meant for creative writing.


No, we used o1.


Yep, I realized that a long time ago. They are literally scammers.



