
Try Llama 3.3 70B, on Groq or similar. It runs on a 64GB MacBook (4-bit quantized, which seems to cost little in quality). Things have come a long way; compare it to Llama 2 70B. It's wild.


Llama 3.3 70B (8-bit, MLX) runs on a 128GB MacBook at 7+ tokens per second while a full suite of other tools is running, even at a 130k-token context, and behaves with surprising coherence. It reminded me of this time last year, first trying Mixtral 8x22B — which still offers a distinctive je ne sais quoi!
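The memory figures in these comments check out with back-of-the-envelope arithmetic. A rough sketch (weights only; the KV cache and activations add more on top, especially at a 130k-token context, and the 70e9 parameter count is a round-number assumption):

```python
def weight_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight memory in GiB for a quantized model."""
    return n_params * bits_per_weight / 8 / 2**30

# Llama 3.3 70B at common quantization levels.
for bits in (16, 8, 4):
    print(f"{bits}-bit: ~{weight_gib(70e9, bits):.0f} GiB")
```

So 8-bit weights (~65 GiB) leave headroom on a 128GB machine, and 4-bit weights (~33 GiB) fit comfortably in 64GB, consistent with both comments above.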



