I'm all for running as much on the edge as possible, but we're not even close to being able to do real-time inference on frontier models on Macs or iPads, and that's just for vanilla LLM chatbots. Low-precision Llama 3 8B is awesome, but it isn't a Claude 3 replacement, it drains my battery, and it's slow (M1 Max).
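For context, a minimal sketch of the kind of local setup I mean, using llama-cpp-python with a 4-bit GGUF quant (the model filename and settings here are illustrative assumptions, not a recommendation):

    # Illustrative sketch: 4-bit Llama 3 8B on Apple Silicon via llama-cpp-python.
    # The model path is an assumption; download a GGUF quant and adjust.
    from llama_cpp import Llama

    llm = Llama(
        model_path="Meta-Llama-3-8B-Instruct.Q4_K_M.gguf",  # ~5 GB 4-bit quant
        n_gpu_layers=-1,  # offload every layer to the GPU (Metal on M-series)
        n_ctx=8192,       # context window; larger contexts eat more RAM
    )

    out = llm.create_chat_completion(
        messages=[{"role": "user", "content": "Explain KV caches in two sentences."}]
    )
    print(out["choices"][0]["message"]["content"])

Even with everything offloaded to Metal, sustained generation keeps the GPU pinned, which is exactly where the battery drain comes from.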

Multimodal agent setups are going to be data-center/home-lab-only for at least the next five years.

Apple isn't about to put 80GB of VRAM in an iPad, for about 15 reasons.


