
Llama 3.3 70B 8-bit MLX runs on a 128GB MacBook at 7+ tokens per second while running a full suite of other tools, even at a 130k-token context size, and behaves with surprising coherence. Reminded me of this time last year, first trying Mixtral 8x22, which still offers a distinctive je ne sais quoi!

