
This is so true. Another thing: a model might be better at something in general, but worse when the context gets too long. Given how GLM-4.5 is trained, mostly on short contexts, that may be the case for it.

GPT-5: Exceptional at abstract reasoning, planning, and following the intention behind instructions. Concise and intentional. Not great at manipulating text or generating Python code.

Gemini 2.5 Pro: Exceptional at manipulating text and Python, but not great at abstract reasoning. Verbose. Doesn't follow instructions well.

Another thing I've learned is that models do better when working on code they themselves generated. It's "in distribution" and more comprehensible to them.


