I don’t think Apple or Microsoft (via Windows) will dominate AI. There’s just too much value in the AI being in the cloud (big powerful models vs local) and across your devices (more context on you, running on low powered edge devices like watches, glasses, smart home devices), and the idea of an OS being a decisive factor is already fading with how much work people do in a browser or cloud app.
I don’t see how AI won’t end up running on personal devices. It’s like how mainframes were the original computing platform and then we had the PC revolution. If anything, I think Apple is uniquely positioned to pull the rug on a lot of these cloud models. It might take 10 or 15 years, but eventually we’ll see an arms race to do so. There’s too much money on the table, and once cloud providers are tapped out the next logical step is home users. It also makes scaling a lot easier because you don’t need increasingly expensive, complex, and power-hungry data centers.
It wasn’t that long ago (ignoring the current DRAM market shenanigans) that it was unthinkable to have a single machine with over a terabyte of RAM and 192 physical cores. Now that’s absolutely doable in a single workstation. Heck, even my comparatively paltry 96GB of RAM would’ve been absurd in 2010; now there are single prosumer GPUs with that.
With the rate of progress (and in the opposite direction, the physical limitations Intel/AMD/TSMC/etc. are bumping into), there are no guarantees about what a machine will look like a decade from now. But simple logic applies: if the user's machine scales to X amount of RAM, the hyperscaler's rack scales to X*Y RAM, and assuming the performance/scaling relationship we've seen holds true, its AI will be correspondingly far smarter/better/more powerful than the user's.
Maybe that won't matter when the user is asking it a 5th grade question, but for any more complex application of AI than "what's the weather" or "turn on a light", users should want a better AI, particularly if they don't have to pay for all that silicon sitting around unused in their machine for most of the day.
This argument would sound nearly identical if you made it in the 70s or early 80s about mainframes and personal computers.
It's not that mainframes (or supercomputers, or servers, or the cloud) stopped existing, it's that there was a "good enough" point where the personal computer was powerful enough to do all the things that people care about. Why would this be different?*
And aren't we all paying for a bunch of silicon that sits mostly unused? I have a full modern GPU in my Apple SoC capable of throwing a ridiculous number of polygons per second at the screen and I'm using it to display two terminal emulator windows.
* (I can think of a number of reasons why it would in fact turn out different, but none of them have to do with the limits of technology -- they are all about control or economic incentives)
It’s different because of the ubiquity of the internet and the financial incentives of the companies involved.
Right now you can get 20TB hard drives for cheap and set up your own NAS, but way more people spend money every month on Dropbox/iCloud/OneDrive - people value convenience and accessibility over “owning” the product.
Companies also lean into this. Just consider Photoshop. It used to be a one-time purchase, then it became a cloud subscription, now virtually every new AI feature uses paid credits. Despite having that fast SoC, Photoshop will still throw your request to their cloud and charge you for it.
The big point still remains: by the time you can run that trillion parameter model at home, it’s old news. If the personal computer of the 80s was good enough, why’s nobody still using one? AI on edge devices will exist, but will forever remain behind data center AI.
> Right now you can get 20TB hard drives for cheap and set up your own NAS, but way more people spend money every month on Dropbox/iCloud/OneDrive - people value convenience and accessibility over “owning” the product.
Yes, this is a convenience argument, not a technical one. It's not that your PC doesn't have or could have more than enough storage -- it likely does -- it's that there are other factors that make you use Dropbox.
So now the question becomes: do we not believe that personal devices will ever become good enough to run a "good enough" LLM (technical barrier), or do we believe that other factors will make it seem less desirable to do so (social/financial/legal barrier)?
I think there's a very decent chance that the latter will be true, but the original argument was a technical one -- that good-enough LLMs will always require so much compute that you wouldn't want to run one locally even if you could.
> If the personal computer of the 80s was good enough, why’s nobody still using one?
What people want to do changes with time, so your PC XT will no longer hack it in the modern workplace. But the point is that once a personal computer of any kind was good enough, people kept using personal computers. The parallel argument here is that if LLM improvement plateaus and converges with the ability to run something good enough on consumer hardware, why would people not just keep running those good-enough models on their own hardware? The models would get better with time, sure, but so would the hardware running them.
The original point that I was making was never purely a technical one. Performance, economics, convenience, and business trends all play a part in what I think will happen.
Even if LLM improvement slows, it’ll probably result in the same treadmill effect we see in other software.
Consider MS Office, Adobe Creative (Cloud), or just about any pro level software. The older versions aren’t really used, for various reasons, including performance, features, compatibility, etc. Why would LLMs, which seem to be on an even faster trajectory than conventional software, be any different? Users will want to continue upgrading, and in the case of AI, that’ll mean continuing to access the latest cloud model.
No doubt someone will be able to run gpt-oss-120b on device five years from now, but outside of privacy, why would they, when they can get a faster, smarter answer (for free, likely) from a service?
I think there's room for multiple approaches here.
Cloud-based AI obviously has a lot of advantages, e.g. batched processing on the best hardware, low-power edge devices, data sharing, etc.
There's still room for local inference though. I don't know that I want "more context on me" all the time. I want some context, some of the time and I want to be in full control of it.
I'd pay for that. I don't think it will be for everyone, but a number of people would pay a premium for an off-the-shelf product that provides privacy and control that cloud vendors by their nature just can't offer.
Definitely room for multiple approaches, including local LLMs.
But I just don't think for most users that local LLM capabilities will be a deciding factor in either hardware or OS choices.
A cloud subscription model will be the premium offering ($20 for consumers, $100 to $1000 or pay-per-token for businesses), and inevitably something ad-supported at a lower price or free for low-end consumers.
Once Joe Consumer has access to that subscription ChatGPT or free tier, are they really going to run a far-less-powerful model on their laptop? Outside of a few simple tasks like semantic search in your email, notes, photos; or localized transcription, local models will just be too far behind the curve for the public to make much use of them.