
People would only be 'throwing their hands up' because commercial LLMs have set unreasonable expectations for folks.

Anyone who has a need for, or understands the value of, a local LLM would be OK with this kind of output.




Everyone would have a need for an on-device LLM, if the response rate were fast!


I have MLC Chat on my old Note 9 phone. It is actually still a great phone, but it has 5GB of RAM, and running an on-device model is the first and only use case where the RAM actually matters.

And it has a headphone jack, OK? I just hate Bluetooth earbuds. And yeah, it is a problem, but I digress.

When I run a 2.5B model, I get respectable output. It takes a minute or two to process the context, then output begins at somewhere on the order of 4 to 10 tokens per second.

So, I just make a query and give it a few and I have my response.
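
(If anyone wants to try roughly the same workflow on a laptop instead of a phone, here is a minimal sketch using llama-cpp-python with a quantized Gemma 2 2B GGUF file. The filename and parameters are just placeholder assumptions; MLC Chat on the phone works differently under the hood.)

    # Minimal sketch: run a small quantized model locally and ask it a question.
    # Assumes llama-cpp-python is installed and a Gemma 2 2B GGUF file has been
    # downloaded; the filename below is hypothetical.
    from llama_cpp import Llama

    llm = Llama(
        model_path="gemma-2-2b-it-Q4_K_M.gguf",  # a couple of GB quantized, fits in a few GB of RAM
        n_ctx=2048,                               # modest context window to keep memory low
    )

    out = llm(
        'Please define "lark" for me, as in "I did it on a lark."',
        max_tokens=256,
    )
    print(out["choices"][0]["text"])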

Here is how I see it:

That little model, which is Gemma 2 2B, sorry, knows a lot of stuff. It has knowledge I don't, and it gives it to me in a reasonable, though predictable, way. Answers always have that certain teacher-reminding-a-student-how-it-all-goes quality.

I don't care. Better is nice, but if I were stuck somewhere with no network, being able to query that model is amazing!

First aid, how to make fires, materials and their uses, fixing stuff, theories of operation, what things mean, and more are in that thing, ready for me to take advantage of.

I consider what I have fast. And it will get one or two orders of magnitude faster over the next few years too.

I did it on a lark (ask the model what that means) and was surprised to see I gained a nice tool.


> First aid, how to make fires, materials and uses

This scares me more than it should...

Please do not trust an AI in actual life and death situations... Sure, if it is literally your only option, but this implies you have a device on you that could make a phone call to an emergency number, where a real human with real training and actually correct knowledge can assist you.

Even as an avid hiker, the number of times I've been out of cell service is minuscule, and I absolutely refresh my knowledge of first aid regularly, and of any potential threats, before a hike somewhere new.


Of course! I do the same. However, I won't deny that being able to get some information in a pinch, even if I must validate it with care, is a great thing.

It's just a tool in the toolbox. Like any tool, one must respect it and use it with care.


I asked the question. It did a respectable job.

For choking, as an example, it outlined all the basics, affirmed it was educational info only, and urged the help of a professional.

All in all, reasonable.

Your point stands, but this little 2.5B model has clear bounds, it seems.

I plan to poke at it a few more times asking for higher risk info, curious about what it may do.


I fired the model up at 8:39; it was ready to chat just a bit before 8:40, at which point I input the following prompt:

Please define "lark" for me, as in "I did it on a lark."

Output began a bit before 8:41, and was complete before 8:43:

---Begin Model Output

"On a lark" means without much planning or reason, impulsively or unexpectedly.

Here's how it works:

"I did it on a lark." This means the person acted without a specific plan or reason. "A lark" refers to the lark bird, a small songbird known for its playful nature.

So, "on a lark" suggests something spontaneous, casual, and often with a touch of playful whimsy.

Here are some other ways to understand it:

Without much thought: The person was likely to do it quickly and without much consideration. Based on a whim: It's something they decided to do without a solid plan.

Let me know if you'd like more examples

---End Model Output

To me, that is pretty damn spiffy!


I opened Dictionary.app and typed in lark and got the definition with etymology in under 30 seconds.

> British English informal used to suggest that an activity is foolish or a waste of time: he's serious about this music lark.


You know that's not the point. You can't even ask the dictionary app a question. You more or less just input a word and it gives you what it has.

I can ask this model some pretty intense things, I get meaningful output back, it doesn't take very long, and I don't need an internet connection. Like I said, it's spiffy.

I have used it, will use it again sans network connection, and will be happy I had the option.

Better will come. That is just fine. I have something very useful today, right now.


I use commercial LLMs every day. The best of them can still be infuriating at times to the point of being unproductive. So I'm not sure I agree here.



