Llama 3.3 70B 8-bit MLX runs on MacBook 128GB at 7+ tokens per second while runn...

Llama 3.3 70B (8-bit, MLX) runs on a 128GB MacBook at 7+ tokens per second alongside a full suite of other tools, even at a 130k-token context, and behaves with surprising coherence. Reminded me of this time last year, first trying Mixtral 8x22B, which still offers a distinctive je ne sais quoi!