
Am I using it wrong? I have the ChatGPT Plus subscription and can select "gpt4o" from the model list in ChatGPT, but whichever example I try from the list under "Explorations of capabilities" on `https://openai.com/index/hello-gpt-4o/`, my results are worse:

* "Poetic typography" sample: I paste the prompt, and get an image with the typical lack of coherent text, just mangled letters.

* "Visual Narratives: Robot Writer's Block" - Mangled letters also

* "Visual Narratives: Sally the mailwoman" - not following instructions about camera angle. Sally looks different in each subsequent photo.

* "Meeting Notes with multiple speakers" - I uploaded the exact same audio file and used the input 'How many speakers in this audio and what happened?'. gpt4o went off about audio sample rates, speaker diarization models, and torchaudio, then said its execution environment was broken and it couldn't proceed.



They haven't released GPT-4o image capabilities yet; it defaults to DALL-E 3.


Ah, I see. Seems like a weird product release, though. Everything in the UI (and in the new ChatGPT macOS app) says 'gpt4o', so I would expect at least something to work as shown in the demos. Or just don't show 'gpt4o' in the UI at all if it's somehow a completely different 'gpt4o' from the one that can do everything on the announcements page. I don't mind waiting, but it was genuinely confusing to me.


link the chat?



