Also evidenced by 3B and 7B models becoming ever more capable, gaining qualities that were previously only achieved by models an order of magnitude larger.

Bigger models are more capable, but smaller models can be iterated on faster. For a couple of months now, most of the impressive achievements have been in increasingly small and cheap models. DeepSeek just has the perfect storm of impressive results, accessibility and international rivalry that made it go viral.