Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

This is an interesting overview, thank you. Different tasks, different models, all-day-usage and pretty complete (while still opinionated, which I like).

However, checking the results my personal overall winner if I had to pick only ONE probably would be

  deepseek/deepseek-chat-v3-0324
which is a good compromise between fast, cheap and good :-) Only for specific tasks (write a poem...) I would prefer a thinking model.


They released deepseek/deepseek-chat-v3.1 shortly after I did the evals, and that's what I now use 20+ times a day for all my questions. It replaces chat-v3 and r1, depending on whether you enable reasoning or not.


You can slightly improve output of non-thinking model if you add ad the end of prompt "output chain of though reasoning before outputting the result".




Consider applying for YC's Winter 2026 batch! Applications are open till Nov 10

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: