Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Page 7 of their technical report [0] has a better apples to apples comparison. Why they choose to show apples to oranges on their landing page is odd to me.

[0] https://storage.googleapis.com/deepmind-media/gemini/gemini_...



I assume these landing pages are made for wall st analysts rather than people who understand LLM eval methods.


True, but even some of the apples to apples is favorable to Gemini Ultra 90.04% CoT@32 vs. GPT-4 87.29% CoT@32 (via API).


This isn't apples to apples - they're taking the optimal prompting technique for their own model, then using that technique for both models. They should be comparing it against the optimal prompting technique for GPT-4.


Showing dominance in AI is also targeted at their entreprise customers who spend millions on Google Cloud services.




Consider applying for YC's Fall 2025 batch! Applications are open till Aug 4

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: