Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

The Gemma models are too small to be included in this list.

You're right the T5 stuff is very important historically but they're below 11B and I don't have much to say about them. Definitely a very interesting and important set of models though.



> too small

Eh?

* Gemma 1 (2024): 2B, 7B

* Gemma 2 (2024): 2B, 9B, 27B

* Gemma 3 (2025): 1B, 4B, 12B, 27B

This is the same range as some Llama models which you do mention.

> important historically

Aren't you trying to give a historical perspective? What's the point of this?


Since you included GPT-2, everything from Google including T5 would qualify for the list I would think.




Consider applying for YC's Winter 2026 batch! Applications are open till Nov 10

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: