There was no parameter creep with Llama. Llama 8B is effectively a ~7B model, comparable to Mistral 7B, once you strip away the multilingual embeddings and match what Mistral 7B supports.
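
To put rough numbers on that, here is a back-of-the-envelope sketch. The figures are assumptions, since the comment does not say which releases it means: I am taking "Llama 8B" to be Llama 3 8B (vocabulary of 128,256 tokens) and Mistral 7B v0.1 (vocabulary of 32,000 tokens), both with hidden size 4096 and untied input/output embeddings. The helper name `embedding_params` is made up for illustration.

```python
# Assumed configs (not stated in the comment): Llama 3 8B and Mistral 7B v0.1.
HIDDEN = 4096            # hidden size, assumed identical for both models
LLAMA_VOCAB = 128_256    # assumed Llama 3 tokenizer size
MISTRAL_VOCAB = 32_000   # assumed Mistral 7B tokenizer size


def embedding_params(vocab_size: int, hidden_size: int, tied: bool = False) -> int:
    """Parameters in the input embedding plus the output projection (LM head).

    With untied weights there are two vocab x hidden matrices; with tied
    weights there is only one.
    """
    matrices = 1 if tied else 2
    return vocab_size * hidden_size * matrices


llama_embed = embedding_params(LLAMA_VOCAB, HIDDEN)
mistral_embed = embedding_params(MISTRAL_VOCAB, HIDDEN)
extra = llama_embed - mistral_embed  # parameters attributable to the larger vocabulary

print(f"Llama embedding params:   {llama_embed:,}")
print(f"Mistral embedding params: {mistral_embed:,}")
print(f"Extra from larger vocab:  {extra:,} (~{extra / 1e9:.2f}B)")
```

Under those assumptions the larger vocabulary accounts for roughly 0.79B parameters, so taking it out of a nominal 8B model lands at around 7.2B, which is in Mistral 7B territory.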

