
Why not? And I'm not being flippant, but... isn't that the whole point of small models?


I guess different small models will have different points/goals, but you can still have a small model with lots of training effort or a large model with little training effort.

I think the point of most (frontier) small models is usually to provide the best answer possible given small inference resources, rather than to reduce training time.

This is more of a toy model. It's a fun and interesting project, but it doesn't necessarily tell us what the art of the possible is for small models.


For one thing, the model is trained on a language modelling task, not a question-answering task?


As I understand it, the most effective small models are synthesized from larger models.
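The technique being alluded to is knowledge distillation: a small "student" model is trained to match the temperature-softened output distribution of a larger "teacher" model, rather than just the hard labels. A minimal sketch of the core objective, in plain Python (function names and example logits are illustrative, not from any particular library):

```python
import math

def softmax(logits, temperature=1.0):
    # Scale logits by temperature; higher T softens the distribution,
    # exposing the teacher's "dark knowledge" about wrong-but-plausible classes.
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    # KL(teacher || student) on temperature-softened distributions:
    # the student is penalized for diverging from the teacher's soft targets.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# A student whose logits match the teacher incurs (near-)zero loss;
# a student that disagrees is penalized.
teacher = [2.0, 1.0, 0.1]
matched = distillation_loss(teacher, [2.0, 1.0, 0.1])
mismatched = distillation_loss(teacher, [0.1, 1.0, 2.0])
```

In practice this term is combined with the ordinary cross-entropy loss on ground-truth labels, but the soft-target term is what lets the small model inherit behavior from the large one.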



