Hacker News

Honestly I think being able to run any kind of LLM on a phone is a miracle. I'm astonished at how good (and how fast) Mistral 7B runs under MLC Chat on iOS, considering the constraints of the device.

I don't use it as more than a cool demo though, because the large hosted LLMs (I tend to mostly use GPT-4) are massively more powerful.

But... I'm still intrigued at the idea of a local, slow LLM on my phone enhanced with function calling capabilities, and maybe usable for RAG against private data.

The rate of improvement in these smaller models over the past 6 months has been incredible. We may well find useful applications for them despite their weaknesses compared to GPT-4 etc.
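A minimal sketch of the RAG-against-private-data idea: retrieve the most relevant note from a local store, then build a prompt for the on-device model. The retrieval here is a naive bag-of-words overlap purely for illustration (a real app would use embeddings), and the final hand-off to the local LLM is left as a comment since the exact MLC Chat API is not assumed.

```python
# Hypothetical on-device RAG sketch: keyword retrieval + prompt assembly.
from collections import Counter

def score(query: str, doc: str) -> int:
    """Naive bag-of-words overlap; stand-in for embedding similarity."""
    q = Counter(query.lower().split())
    d = Counter(doc.lower().split())
    return sum(min(q[w], d[w]) for w in q)

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Return the k docs with the highest overlap score."""
    return sorted(docs, key=lambda d: score(query, d), reverse=True)[:k]

# Private notes that never leave the device.
notes = [
    "Dentist appointment moved to Friday at 3pm",
    "Wifi password for the cabin is hunter2",
    "Mistral 7B runs at a few tokens per second on my phone",
]

question = "what is the cabin wifi password"
context = retrieve(question, notes)[0]
prompt = f"Answer using only this note:\n{context}\n\nQuestion: {question}"
# `prompt` would then be fed to the local model (e.g. via MLC Chat);
# even a slow phone-sized LLM can answer well when the context is this focused.
```

The point of the sketch is that retrieval does the heavy lifting, so a small, slow local model only has to paraphrase a short snippet rather than recall facts itself.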




How do you use GPT-4 frequently with how low the usage cap is?




