
LLMs (large language models) running locally, in private, are the next killer app. A Mac Studio can already run 180B models that compete with GPT-3.5; in 2-3 more hardware generations, MacBook Airs will be able to run them locally. Apple is already deploying smaller LLMs to macOS. This will drive the next hardware cycle.


I don't know of any 180B models that outperform GPT-3.5, including Falcon (according to benchmarks).

There are fine-tuned 70B models that have higher benchmark scores than GPT-3.5, and when quantized they can run on a 64GB MacBook (rough math below the links).

https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderb...

https://huggingface.co/spaces/gsaivinay/open_llm_leaderboard
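
For a rough sense of why 64GB is enough, here's my own back-of-the-envelope sketch. The exact numbers depend on the quantization format (e.g. llama.cpp's Q4 variants store per-block scales, so effective bits per weight is a bit above 4 -- the 4.5 below is an assumption):

    import Foundation

    // Back-of-the-envelope weight-memory estimate for a quantized 70B model.
    // bitsPerWeight is an assumption: ~4-bit weights plus per-block scales.
    let params = 70_000_000_000.0
    let bitsPerWeight = 4.5
    let weightBytes = params * bitsPerWeight / 8.0
    print(String(format: "~%.1f GiB of weights", weightBytes / 1_073_741_824.0))
    // prints "~36.7 GiB" -- fits in 64GB of unified memory,
    // leaving room for the KV cache and the OS.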


Benchmarks don't necessarily reflect real-world performance, especially given that they poorly measure the more esoteric aspects of a model that, for now, can only be judged qualitatively. I would wait a bit to see what the community comes up with before writing off Falcon-180B.


Note that quantization has a serious impact on a model's capabilities.
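
A toy sketch of where the damage comes from (this is illustrative symmetric 4-bit rounding, not the actual ggml/GPTQ scheme; the weight values are made up):

    import Foundation

    // Toy symmetric 4-bit quantization: 16 levels spanning [-max|w|, +max|w|].
    // Real schemes quantize per-block with stored scales, but the
    // rounding error below is the same basic idea.
    let weights: [Float] = [0.012, -0.031, 0.298, -0.305, 0.0007]
    let scale = weights.map { abs($0) }.max()! / 7.0   // int4 range: -8...7
    let quantized = weights.map { Int(($0 / scale).rounded()) }
    let restored = quantized.map { Float($0) * scale }

    for (w, r) in zip(weights, restored) {
        print(String(format: "%+.4f -> %+.4f (err %+.4f)", w, r, r - w))
    }
    // 0.0120 and 0.0007 both round to 0.0 at this scale --
    // distinctions smaller than the quantization step are lost.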


> Apple is already deploying smaller LLMs to macOS

hm?


I've only dabbled in Xcode development a little; most of my programming on my M1 is C and Rust. But the tooling for Apple apps (macOS, iOS, etc.) in Xcode makes it pretty easy to take a model, bundle it into your app with Core ML, and run it locally on whatever device the app gets installed on. At least that's my understanding from playing with it (someone with more Core ML experience, correct me if I'm wrong). So it's not just Apple themselves: they make it easier for developers to do it too.
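
Roughly what that looks like in practice, as far as I understand it (sketch only: "SentimentClassifier" and its prediction(text:) signature are placeholders for whatever class Xcode generates from the .mlmodel you bundle):

    import CoreML

    // Xcode generates a typed Swift class from the bundled .mlmodel;
    // the class name and input/output properties come from the model.
    let config = MLModelConfiguration()
    config.computeUnits = .all   // let Core ML pick CPU, GPU, or Neural Engine

    do {
        let model = try SentimentClassifier(configuration: config)
        let output = try model.prediction(text: "this laptop is great")
        print(output.label)
    } catch {
        print("Core ML error: \(error)")
    }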


Their improvements to the keyboard auto-correction feature are coming to their new software as an on-device LLM. Basically, it looks like they're improving the Neural Engine on their chips over time so they can run more LLMs on-device to preserve user data privacy.


I don't know about macOS, but they are supposedly including a local, transformer-based language model for autocorrect in iOS 17.


That’s a bingo!

This year we have seen consumer RAM prices fall to the point where 64GB (2x32GB) costs less than $100.

I wouldn’t extrapolate memory price trends out 10 years, because we are getting into weird territory, but from the perspective of product design it’s an enticing prospect.


It's nice, but my Skylake laptop from 2016 also gets pretty good llama.cpp acceleration. Compute might not drive the next hardware cycle as much as memory and storage constraints will.



