rshemet's comments

Yes! Cactus is optimized for mobile CPU inference, and we're finishing internal testing of hybrid kernels that use the NPU as well as other chips.

We don't advise using GPUs on smartphones, since they're very energy-inefficient. Mobile GPU inference is actually the main driver behind the stereotype that "mobile inference drains your battery and heats up your phone".

As for your last question – the short answer is yes, we'll have multimodal support. We currently support voice transcription and image understanding, and we'll be expanding these capabilities with more models, voice synthesis, and much more.


Very exciting, thanks!


indeed, this is exactly the goal! The license grants rights to commercial use, unlocks additional hardware acceleration, includes cloud telemetry, and offers significant savings over using cloud APIs.

In our deployments, we've seen open source models rival and even outperform lower-tier cloud counterparts. Happy to share some benchmarks if you like.

Our pricing is on a per-monthly-active-device basis, regardless of utilization. For voice-agent workflows, you typically hit savings as soon as you process over ≈2 min of daily inference.
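
As a rough illustration of where a figure like ≈2 min/day comes from, here's a minimal break-even sketch. The dollar amounts below are placeholder assumptions chosen for the example, not Cactus's actual pricing or any cloud provider's published rates.

```typescript
// Hypothetical break-even sketch: both rates are placeholder assumptions
// for illustration, not real Cactus or cloud-provider pricing.
const cloudCostPerMinute = 0.005;  // assumed usage-billed cloud cost, $/min of inference
const perDeviceMonthlyFee = 0.30;  // assumed flat fee per monthly-active device, $

// Daily inference minutes at which a flat per-device plan breaks even
// with usage-billed cloud APIs over a 30-day month.
const breakEvenDailyMinutes = perDeviceMonthlyFee / (cloudCostPerMinute * 30);

console.log(`Break-even at ~${breakEvenDailyMinutes.toFixed(1)} min/day`);
// With these placeholder rates, any device averaging more than ~2 minutes
// of daily inference comes out cheaper on the flat per-device plan.
```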


you can run it in Cactus Chat (download from the Play Store)


what model do you use in Cactus Chat? It doesn't seem to be one of the preset models, and ggml-org/gemma-3-270m-GGUF on HF says "Note: This is a base (pre-trained) model. Do not use for chat!" Is there an alternative model you can share that I can load into the Cactus Chat app?


you can also run it on Cactus – either in Cactus Chat from the App Store/Play Store, or by using the Cactus framework to integrate it into your own app
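
For anyone weighing the framework route, here is a minimal sketch of what an integration can look like. It is an illustration only: the package name, the CactusLM class, and the init/completion signatures are assumptions modeled on typical on-device LLM SDKs rather than quoted from the Cactus docs, so check the official repo for the real API.

```typescript
// Hypothetical sketch: 'cactus-react-native' and the CactusLM API shape
// are assumptions, not a verified copy of the Cactus interface.
import { CactusLM } from 'cactus-react-native';

async function askOnDevice(prompt: string): Promise<string> {
  // Load a small instruction-tuned GGUF model shipped with (or downloaded
  // by) the app; the path below is a placeholder.
  const lm = await CactusLM.init({
    model: 'models/gemma-3-270m-it-q8.gguf',
    n_ctx: 2048,
  });

  // Run a chat-style completion entirely on-device (CPU by default).
  const result = await lm.completion(
    [{ role: 'user', content: prompt }],
    { n_predict: 128, temperature: 0.7 },
  );
  return result.text;
}

// Example usage inside the app:
askOnDevice('Summarize this note in one sentence.').then(console.log);
```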


THIS IS THE BOMB!!! So excited for this one. Thanks for putting cool tech out there.


Thank you!


if you ever end up trying to take this in the mobile direction, consider running on-device AI with Cactus –

https://cactuscompute.com/

Blazing-fast, cross-platform, and supports nearly all recent open-source models.


Is this your site? It's missing a <title> tag.


thank you! Very kind feedback – we'll add it to our to-dos.

re: "question would get stuck on the last phrase and keep repeating it without end." - that's a limitation of the model i'm afraid. Smaller models tend to do that sometimes.


say more about "community tools"?


in the app you mean?

Adding shortly!


that's our mission! if you are passionate about the space, we look forward to your contributions!

