Yes! Cactus is optimized for mobile CPU inference, and we're finishing internal testing of hybrid kernels that use the NPU, as well as other chips.
We don't advise using GPUs on smartphones, since they're very energy-inefficient. Mobile GPU inference is actually the main driver behind the stereotype that "mobile inference drains your battery and heats up your phone".
As for your last question – the short answer is yes, we'll have multimodal support. We currently support voice transcription and image understanding, and we'll be expanding these capabilities with more models, voice synthesis, and much more.
Indeed, this is exactly the goal! The license grants rights to commercial use, unlocks additional hardware acceleration, includes cloud telemetry, and offers significant savings over using cloud APIs.
In our deployments, we've seen open source models rival and even outperform lower-tier cloud counterparts. Happy to share some benchmarks if you like.
Our pricing is on a per-monthly-active-device basis, regardless of utilization. For voice-agent workflows, you typically hit savings once a device processes more than ≈2 minutes of inference per day.
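To make that breakeven concrete, here's a rough back-of-the-envelope sketch. All numbers below (per-device fee, cloud token rate, throughput) are purely illustrative assumptions, not our actual pricing:

```python
# Hypothetical breakeven: flat on-device fee vs. metered cloud API,
# per device per month. Every number here is an illustrative assumption.

def cloud_cost_per_month(daily_inference_min: float,
                         tokens_per_min: float = 300.0,    # assumed throughput
                         usd_per_1k_tokens: float = 0.002,  # assumed cloud rate
                         days: int = 30) -> float:
    """Monthly cloud API spend for one device at the given daily usage."""
    tokens = daily_inference_min * tokens_per_min * days
    return tokens / 1000 * usd_per_1k_tokens

DEVICE_FEE = 0.036  # assumed flat fee per monthly-active device, USD

# Cost of one daily minute of inference over a month, then the breakeven point
# where metered cloud spend overtakes the flat per-device fee.
cost_per_daily_min = cloud_cost_per_month(1.0)
breakeven_min = DEVICE_FEE / cost_per_daily_min
print(f"breakeven ≈ {breakeven_min:.1f} min/day")  # ≈ 2.0 under these assumptions
```

With these made-up inputs, a device doing more than about 2 minutes of daily inference costs more on the cloud than the flat fee; plug in your own rates to see where your workload lands.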
What model do you load in Cactus Chat? It doesn't seem to be one of the preset models, and ggml-org/gemma-3-270m-GGUF on HF says "Note: This is a base (pre-trained) model. Do not use for chat!" Is there an alternative model you can share that I can put into the Cactus Chat app?
Thank you, very kind! We'll add your suggestions to our to-dos.
re: "question would get stuck on the last phrase and keep repeating it without end" – that's a limitation of the model, I'm afraid. Smaller models tend to do that sometimes.