
I’m not so sure it’s negligible. My anecdotal experience is that since Apple Silicon chips were found to be “ok” enough to run inference with MLX, more non-technical people in my circle have asked me how they can run LLMs on their Macs.

It's surely a smaller market than gamers or datacenters, though.
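For context, running a model with MLX is only a few lines via the mlx-lm package. A minimal sketch, assuming a 4-bit community conversion from the mlx-community hub (the exact repo name is illustrative):

```python
# Minimal local-inference sketch using Apple's MLX via the mlx-lm package.
# The model repo below is an example of a 4-bit community conversion, not a recommendation.
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/Mistral-7B-Instruct-v0.3-4bit")
text = generate(
    model,
    tokenizer,
    prompt="Explain unified memory in one sentence.",
    max_tokens=128,
    verbose=True,  # streams tokens and prints generation stats
)
print(text)
```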




It's annoying: I do LLMs for work and have a bit of an interest in them, doing stuff with GANs etc.

I have a bit of an interest in games too.

If I could get one platform for both, I could justify $2k, maybe a bit more.

I can't justify that for just one half, and running games on a Mac, right now via Linux: no thanks.

And on the PC side, Nvidia consumer cards only go to 24GB, which is a bit limiting for LLMs, while being very expensive - and I only play games every few months.


The new $2k card from Nvidia will be 32GB, but your point stands. AMD is planning a unified chiplet-based GPU architecture (AI/data center/workstation/gaming) called UDNA, which might alleviate some of these issues. It's been delayed and delayed though - hence the lackluster GPU offerings from team Red this cycle - so I haven't been getting my hopes up.

Maybe (LP)CAMM2 memory will make model usage just cheap enough that I can have a hosting server for it and do my usual midrange gaming GPU thing before then.


Grace + Hopper, Grace + Blackwell, and the discussed GB10 are much like the currently shipping AMD MI300A.

I do hope that an AMD Strix Halo ships with 2 LPCAMM2 slots for a total width of 256 bits.
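For a sense of what that would buy, a back-of-the-envelope bandwidth calculation, assuming two 128-bit LPCAMM2 modules at LPDDR5X-8000 (both figures are assumptions, not announced specs):

```python
# Rough memory bandwidth for a 256-bit LPDDR5X bus.
# Assumptions: two 128-bit LPCAMM2 modules running at 8000 MT/s.
bus_width_bits = 2 * 128        # two LPCAMM2 slots, 128 bits each
transfer_rate_mts = 8000        # mega-transfers per second (LPDDR5X-8000)
bandwidth_gbs = bus_width_bits / 8 * transfer_rate_mts / 1000
print(f"~{bandwidth_gbs:.0f} GB/s")   # ~256 GB/s, vs ~1000 GB/s for a 4090
```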


Unified architecture is still on track for 2026-ish.


32GB as of last night :)


I mean negligible to their bottom line. Whether tons of units get bought or not, the margin on a single datacenter system would buy tens of these.

It’s purely an ecosystem play imho. It benefits the kind of people who will go on to make potentially cool things and will stay loyal.


> It’s purely an ecosystem play imho. It benefits the kind of people who will go on to make potentially cool things and will stay loyal.

100%

The people who prototype on a $3k workstation will also be the people who decide how to architect a 3k-GPU buildout for model training.


> It’s purely an ecosystem play imho. It benefits the kind of people who will go on to make potentially cool things and will stay loyal.

It will be massive for research labs. Most academics have to jump through a lot of hoops to get to play with not just CUDA, but also GPUDirect/RDMA/InfiniBand etc. If you get older/donated hardware, you may have a large cluster but not the newer features.


The academic minimal-bureaucracy purchasing-card limit is about $4k, so the pricing is convenient*2.


Developers, developers, developers - the Ballmer monkey dance - the key to being entrenched is the platform ecosystem.

Also why AWS is giving Trainium credits away for free.


Yes, but people already had their Macs for other reasons.

No one goes to an Apple store thinking "I'll get a laptop to do AI inference".


They have, because until now Apple Silicon was the only practical way for many to work with larger models at home: Macs can be configured with 64-192GB of unified memory, and even the laptops go up to 128GB.

Performance is not amazing (roughly 4060 level, I think?) but in many ways it was the only game in town unless you were willing and able to build a multi-3090/4090 rig.
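To put numbers on "larger models": a rough lower-bound estimate of the memory needed just for quantized weights (this ignores KV cache and runtime overhead, so real usage is higher):

```python
# Lower bound on memory for model weights alone at a given quantization.
# KV cache, activations and runtime overhead come on top of this.
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

for params in (8, 70, 180):
    print(f"{params}B @ 4-bit: ~{weight_memory_gb(params, 4):.0f} GB")
# 8B   @ 4-bit: ~4 GB   (fits almost anywhere)
# 70B  @ 4-bit: ~35 GB  (fits in a 64GB Mac, not a 24GB consumer GPU)
# 180B @ 4-bit: ~90 GB  (needs a 128GB+ configuration)
```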


I would bet that people running LLMs on their Macs, today, is <0.1% of their user base.


People buying Macs for LLMs - sure, I agree.

Since the current macOS ships with small built-in LLMs, that number might be closer to 50%, not 0.1%.


I'm not arguing about whether Macs are capable of doing it, but about whether it's a material force that drives people to buy Macs; it's not.


Higher than that among those buying the top-end machines though, which are very high margin.


All Macs? Yes. But of 192GB Mac configs? Probably >50%.


I'm currently wondering how likely it is I'll get into deeper LLM usage, and therefore how much Apple Silicon I need (because I'm addicted to macOS). So I'm some way closer to your steel man than you'd expect. But I'm probably a niche within a niche.


Tons of people do; my next machine will likely be a Mac, 60% for this reason and 40% because Windows is so user-hostile now.


My $5k M3 Max with 128GB disagrees.


I doubt it; a year ago, running useful local LLMs on a Mac (via something like Ollama) was barely taking off.

If what you say is true, you were among the first 100 people on the planet doing this, which, btw, further supports my argument about how extremely rare that use case is for Mac users.


No, I got a 14” MacBook Pro with an M2 Max and 64GB for LLMs, and that was two generations back.


People were running llama.cpp on Mac laptops in March 2023, and Llama 2 was released in July 2023. People were buying Macs to run LLMs months before M3 machines became available in November 2023.
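For reference, that workflow today looks roughly like the sketch below using the llama-cpp-python bindings; the GGUF file name is a placeholder, and Metal offload is what makes it practical on Apple Silicon:

```python
# Sketch of local inference via llama-cpp-python on an Apple Silicon Mac.
# The model path is a placeholder; any GGUF-quantized model works the same way.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/llama-2-7b-chat.Q4_K_M.gguf",
    n_gpu_layers=-1,   # offload all layers to the GPU via Metal
    n_ctx=4096,        # context window
)
out = llm("Q: Why did people buy Macs for local LLMs? A:", max_tokens=128)
print(out["choices"][0]["text"])
```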



