Honestly, I think being able to run any kind of LLM on a phone is a miracle. I'm astonished at how well (and how fast) Mistral 7B runs under MLC Chat on iOS, considering the constraints of the device.
I don't use it for anything more than a cool demo though, because the large hosted LLMs (I mostly use GPT-4) are massively more powerful.
But... I'm still intrigued by the idea of a local, slow LLM on my phone enhanced with function calling, and maybe usable for RAG against private data.
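To sketch what I mean, here's roughly the loop I have in mind. Everything below is hypothetical: `local_llm()` is a made-up stand-in for whatever the on-device runtime (MLC or similar) exposes, and the prompt format and `get_battery_level` tool are invented purely for illustration.

```python
import json

# Hypothetical stand-in for the on-device model -- in reality this would
# wrap whatever text-generation API the local runtime (e.g. MLC) provides.
def local_llm(prompt: str) -> str:
    raise NotImplementedError("plug in the on-device model here")

# One example "tool" the model is allowed to call. Imagine this querying
# a private on-device database instead.
def get_battery_level() -> dict:
    return {"battery_percent": 87}

TOOLS = {"get_battery_level": get_battery_level}

SYSTEM = """\
You may call a tool. To do so, reply with JSON only, e.g.
{"tool": "get_battery_level", "arguments": {}}
Available tools: get_battery_level
Otherwise answer the user directly in plain text."""

def chat(user_message: str) -> str:
    reply = local_llm(f"{SYSTEM}\n\nUser: {user_message}")
    try:
        call = json.loads(reply)  # did the model ask for a tool call?
    except json.JSONDecodeError:
        return reply  # plain-text answer, no tool needed
    result = TOOLS[call["tool"]](**call.get("arguments", {}))
    # Feed the tool's result back so the model can compose the final answer.
    return local_llm(
        f"{SYSTEM}\n\nUser: {user_message}\n"
        f"Tool {call['tool']} returned: {json.dumps(result)}\n"
        "Now answer the user using that result."
    )
```

Part of the appeal is that a slow model matters less here: it only has to emit a short, parseable JSON blob and then summarize a tool result, not write a polished essay.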
The rate of improvement in these smaller models over the past six months has been incredible. We may well find useful applications for them despite their weaknesses compared to GPT-4 and its peers.