Hacker News
pzo | 11 months ago | on: Show HN: Cactus – Ollama for Smartphones
Is this using only llama.cpp as the inference engine? How is its NPU and GPU support these days? I'm not sure whether LLMs can run on an NPU, but many other models, such as STT, TTS, and vision models, can often run much faster on Apple's NPU.
liuliu | 11 months ago
You don't need to guess:
https://github.com/cactus-compute/cactus/tree/main/cpp