Earlier in the year I had ChatGPT 4 write a large, complicated C program. It did so remarkably well, and most of the code worked without further tweaking.
Today it's a different story: the thing fills in placeholder comments to skip over the more difficult regions of the code, and it routinely forgets what we were doing.
Setting aside all the recent OpenAI drama, I've been displeased as a paying customer that their products routinely debut at a much higher level of performance than they sustain once they've been in production for a while.
One would expect the opposite unless they're doing a bad job planning capacity. I'm not diminishing the difficulty of what they're doing; nevertheless, from a product perspective this is being handled poorly.
Definitely degraded. I recommend being more specific in your prompting. Also, if you have threads with a ton of content, they get slow as molasses. It sucks, but giving them a fresh context each day helps. I create text expanders for common prompts and for resetting context.
e.g.:
Write clean {your_language} code. Include {whatever_you_use} conventions to make the code readable. Do not reply until you have thought out how to implement all of this from a code-writing perspective. Do not include `/..../` or any filler commentary implying that further functionality needs to be written. Be decisive and create code that can run, instead of writing placeholders. Don't be afraid to write hundreds of lines of code. Include file names. Do not reply unless it's a full-fledged, production-ready code file.
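A minimal sketch of the same idea as a reusable template rather than a text-expander tool; the template wording paraphrases the prompt above, and the placeholder names and helper are illustrative, not any particular tool's API:

```python
# Illustrative sketch only: the template paraphrases the prompt above, and the
# placeholder names (language, conventions, task) are made up for this example.
CODE_PROMPT = (
    "Write clean {language} code. Follow {conventions} conventions to keep it readable. "
    "Do not include placeholder comments implying that further functionality still "
    "needs to be written. Be decisive and produce code that runs. Include file names. "
    "Reply only with a full, production-ready code file.\n\n"
    "Task: {task}"
)

def build_prompt(language: str, conventions: str, task: str) -> str:
    """Fill in the saved template so it can be pasted into a fresh chat."""
    return CODE_PROMPT.format(language=language, conventions=conventions, task=task)

if __name__ == "__main__":
    print(build_prompt("C", "K&R", "Implement a ring buffer with unit tests."))
```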
These models are black boxes with unlabeled knobs. A change that makes things better for one user might make things worse for another. Just because it got worse for you doesn't necessarily mean it got worse on average.
Also, the only way for OpenAI to really know if a model is an improvement or not is to test it out on some human guinea pigs.
My understanding is they reduced the number of ensembles feeding GPT-4 so they could support more customers. I want to say they cut it from 16 to 8. Take that with a grain of salt; it comes through the rumor telephone.
Are you prompting it with instructions about how it should behave at the start of a chat, or just using the defaults? You can get better results by starting a chat with "you are an expert X developer, with experience in xyz and write full and complete programs" and tweak as needed.
Yep, I'm still able to contort prompts to achieve something usable; however, I didn't have to do that at the beginning, and I'd rather pay $100/mo to not have to do so now.
OpenAI just had to pause signups after demo day because of capacity issues. They also switched to making users pay in advance for usage instead of billing them after.
They aren't switching anything with payments. Bad rumor amplified by social contagion and a 100K:1 ratio of people talking about it to people building with it.
I'm not really sure what ChatGPT Plus is serving me. There was a moment when it was suddenly blazing fast; that was around the time Turbo came out. Lately it's been either super slow or super fast, seemingly at random.
Try using the Playground with a more code-specific system prompt, or even put the key points (or the whole thing) into the system prompt. I see better performance compared to the web UI.
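For example, a minimal sketch of that approach through the API rather than the web UI, assuming the openai Python package (v1.x) and an OPENAI_API_KEY in the environment; the model id and the prompts here are placeholders, not anyone's actual setup:

```python
# Sketch of sending a code-specific system prompt via the API instead of the web UI.
# Assumes the openai v1.x Python client and OPENAI_API_KEY set in the environment.
from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4-1106-preview",  # example model id; use whichever you have access to
    messages=[
        {
            "role": "system",
            "content": (
                "You are an expert C developer. Write complete, runnable programs. "
                "Never use placeholder comments in place of real code."
            ),
        },
        {"role": "user", "content": "Write a small CLI tool that tails a log file."},
    ],
)

print(response.choices[0].message.content)
```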