
The NPUs on a lot of different systems occupy an awkward spot. For extremely small models, they're the way to go for low-power inference. But once you reach LLM or vision transformer size, it makes a lot more sense to switch to GPU shaders for that extra bit of large-model performance. For stuff like Llama and Stable Diffusion, those Neural Engines are practically wasted silicon. The biggest saving grace is projects like ONNX attempting to sew them into a unified non-15-competing-standards API, but even that won't change how underpowered they are.
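To make the ONNX point concrete, here's a minimal sketch of provider selection in Python with onnxruntime (the "model.onnx" path is a placeholder; which providers actually show up depends on how your onnxruntime build was compiled and what hardware is present):

    import onnxruntime as ort

    # Prefer an NPU-backed provider, then GPU, then CPU. Which of these
    # exist depends on the onnxruntime build and the hardware present.
    preferred = [
        "QNNExecutionProvider",     # Qualcomm Hexagon NPU
        "CoreMLExecutionProvider",  # Apple Neural Engine (via Core ML)
        "CUDAExecutionProvider",    # Nvidia GPU
        "CPUExecutionProvider",     # always available
    ]
    available = set(ort.get_available_providers())
    providers = [p for p in preferred if p in available]

    sess = ort.InferenceSession("model.onnx", providers=providers)
    print("running on:", sess.get_providers()[0])

Same model file, same calling code, whichever accelerator happens to exist underneath - that's the unification being attempted.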

Nvidia escapes this by designing their GPU architecture to incorporate NPU concepts at a fundamental level. It's less redundant silicon and enables you to scale a single architecture instead of flip-flopping to whichever one is most convenient.
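Concretely, "NPU concepts at a fundamental level" mostly means tensor cores: you hit them just by issuing low-precision matmuls, with no separate API or runtime. A rough PyTorch sketch (assumes a CUDA device; the sizes are arbitrary):

    import torch

    # On Volta-and-newer Nvidia GPUs, half-precision matmuls like this
    # are routed through tensor cores automatically -- the "NPU" lives
    # inside the same streaming multiprocessors as the shader ALUs.
    a = torch.randn(4096, 4096, device="cuda", dtype=torch.float16)
    b = torch.randn(4096, 4096, device="cuda", dtype=torch.float16)
    c = a @ b  # cuBLAS dispatches this to tensor-core kernels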



It's currently doable for Apple – I think their strategy is to enhance iPhones, bit by bit, with special-purpose models for handling media: photo subject identification, OCR (in every language!), voice transcription, etc. Apple is clearly learning from Microsoft's attempts to make AI stick everywhere.
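Third parties can target the same silicon, for what it's worth: Core ML will schedule a converted model on the Neural Engine if you ask. A minimal coremltools sketch (the toy model is a stand-in for any traced torch module):

    import coremltools as ct
    import torch

    # Toy stand-in for any traced torch module.
    model = torch.nn.Sequential(torch.nn.Linear(64, 32), torch.nn.ReLU()).eval()
    example = torch.rand(1, 64)
    traced = torch.jit.trace(model, example)

    # Ask Core ML to schedule the model on the Neural Engine, with CPU
    # fallback for any ops the ANE doesn't support.
    mlmodel = ct.convert(
        traced,
        inputs=[ct.TensorType(shape=example.shape)],
        convert_to="mlprogram",
        compute_units=ct.ComputeUnit.CPU_AND_NE,
    )
    mlmodel.save("Tiny.mlpackage")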


I think Apple is more interested in features that work consistently than in giving power users the ability to play with essentially alpha or beta AI features.

I would guess that their strategy is to not include powerful client-side hardware, and to supplement that with some kind of "AiCloud" subscription to do the battery-draining, heat-generating stuff in their cloud. They're trading on their branding as a privacy-focused company, under the (probably correct) belief that people will be more willing to upload their data to iCloud's AI than to Microsoft's.

Fwiw, I think they're probably correct. It has always struck me as odd that people want to run AI on their phone. My impression of AI is that it creates very generalized solutions to problems that would be difficult to code, at the cost of being very compute inefficient.

I don't really want code like that running on my phone; it's a poor platform for it. Thermal dissipation and form factor limit the available processing power, and batteries limit how long you can use the processing power you have. I don't really want to waste either trying to do subject identification locally. I'm going to upload the photos to iCloud anyways; let me pay an extra $1/month or whatever to have that identification happen in the cloud, on a server built for it that has data center thermal dissipation and is plugged into the wall.


>I'm going to upload the photos to iCloud anyways; let me pay an extra $1/month or whatever to have that identification happen in the cloud, on a server built for it that has data center thermal dissipation and is plugged into the wall.

You might be in an area with a poor connection and unable to reach the cloud.

One use for AI is speech recognition / transcription for deaf/HoH individuals. Until now it's been done almost exclusively in the cloud, and it works fairly well (depending on conditions). Recently there's been interest in doing it locally, without relying on a network connection.

There are also privacy issues with transmitting this data over a network.
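Local transcription is already pretty approachable. A minimal sketch with the open-source whisper package (the audio filename is a placeholder); nothing leaves the machine:

    import whisper

    # "base" fits comfortably on a laptop; bigger checkpoints trade
    # speed for accuracy. The audio never leaves the device.
    model = whisper.load_model("base")
    result = model.transcribe("meeting.wav")  # placeholder filename
    print(result["text"])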


> It has always struck me as odd that people want to run AI on their phone. My impression of AI is that it creates very generalized solutions to problems that would be difficult to code, at the cost of being very compute inefficient.

I don't equate AI with coding. I want AI locally for photo sorting and album management, for the general question answering/list making that I use GPT for, and for any number of other things.

I try not to upload personal data to sites that aren't E2E encrypted, so iCloud/Google photos is a no-go.
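That sort of local question answering is doable today with quantized models. A rough sketch with llama-cpp-python (the model path is a placeholder for any GGUF checkpoint):

    from llama_cpp import Llama

    # Placeholder path -- any GGUF-quantized checkpoint works.
    llm = Llama(model_path="./models/llama-2-7b.Q4_K_M.gguf", n_ctx=2048)
    out = llm("Sort these albums by decade: ...", max_tokens=128)
    print(out["choices"][0]["text"])

Everything stays on the device, E2E concerns included.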


The pinch (as far as I can see it) is that you're right, and Apple can't sell a freestanding service to save their life. If we do get an AppleGPT pay-as-you-go service, it's certain to be extraordinarily censored and locked down as the exclusive first-party option on the iPhone. It will feature "vertical integration" that no other AI can have, alongside censorship so prudish it would make Maury Povich gasp.

So... I think users will be stuck. They'll want to run uncensored models on their phone, but Apple will want to keep them in the walled garden at any cost. It feels like the whole "Fortnite" situation all over again, where users can agree they want something but Apple can't decide.


Soon our phones will dream beside us every night (integrating new data into our personal model while on the charger).


Well, iPhone already does that with photos. :)


Do you have a link to where they break down what inference for photos happens in real time vs. overnight/charging?


Anyone checked out the NPU on the new iPad? It’s supposed to be a bazillion times better according to Apple but I haven’t had a chance to dig into the reality.

I guess we can assume this is going to be what's used in what's being called Apple's first AI phone, the iPhone 16.


That 38 TOPS figure was a bit weird: it's literally below the baseline (45 TOPS) for the "AI PC" branding Qualcomm/Intel/Microsoft are launching this June, and roughly 10x less than typical GPUs. I think it was just clever marketing, exploiting the fact that the "AI PC" branding hasn't launched yet.


It has 38 TOPS of INT8 performance. Not very remarkable next to consumer Nvidia GPUs, which are one or two orders of magnitude faster.


For reference, Nvidia's Jetson Orin NX robotics platform is 35-50 TOPS on average. Apple is catching up, but Nvidia still has by far the more flexible (and better scaled) platform.
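Back-of-the-envelope, using only the figures cited in this thread:

    # Dense INT8 TOPS, as cited upthread.
    m4_neural_engine = 38
    ai_pc_baseline   = 45              # Qualcomm/Intel/Microsoft floor
    orin_nx_mid      = (35 + 50) / 2   # Jetson Orin NX midpoint

    print(f"vs 'AI PC' baseline: {m4_neural_engine / ai_pc_baseline:.0%}")  # ~84%
    print(f"vs Orin NX midpoint: {m4_neural_engine / orin_nx_mid:.0%}")     # ~89%
    # "One or two orders of magnitude" puts consumer Nvidia GPUs at
    # roughly 380-3800 TOPS by the same measure.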



