You can still use OpenClaw on their API pricing tier as much as you want. What they did was stop allowing subscriptions to power automated third-party workloads, including OpenClaw.
Now, is their messaging around this confusing? Absolutely. The whole thing has been handled shambolically. Everyone knows that they lack the compute to keep up, and likely have lower margins on subscriptions than on the API, but they cannot just say that because investors may be skittish.
It's not blocked; you just can't use the Claude-only subscription endpoint with unauthorized third-party software. (You can use it via the regular API, which is 7x more expensive, and pay per token just fine.)
...Except now you sorta-kinda can: now they auto-detect 3rd party stuff and bill you per-token for it?
I tried TQ for vector search and my findings are not good: it is not worth it if you cannot use a GPU. However, I got the same search quality as 32f using 8-bit quantization.
I wrote an ANN extension for SQLite using TQ. I do save a lot on space, but 32f is still faster despite everything I have tried.
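For anyone wondering where the space saving and the extra step come from, here is a minimal sketch of plain 8-bit scalar quantization. To be clear, this is not TQ itself and not the SQLite extension mentioned above; it is a simple per-vector min/max scheme, and the function names and the random test vector are made up for illustration.

```python
import math
import random
import struct


def quantize_u8(vec):
    """Map each float to a code in 0..255 using the vector's own min/max."""
    lo, hi = min(vec), max(vec)
    scale = (hi - lo) / 255 or 1.0  # fall back to 1.0 for constant vectors
    codes = bytes(min(255, max(0, round((x - lo) / scale))) for x in vec)
    return codes, lo, scale


def dequantize_u8(codes, lo, scale):
    """Approximate inverse: this is the extra step paid at query time."""
    return [lo + c * scale for c in codes]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


random.seed(0)
dim = 128
vec = [random.gauss(0, 1) for _ in range(dim)]  # synthetic stand-in for an embedding

codes, lo, scale = quantize_u8(vec)
restored = dequantize_u8(codes, lo, scale)

f32_bytes = len(struct.pack(f"{dim}f", *vec))  # 4 bytes per dimension
u8_bytes = len(codes) + 8                      # 1 byte per dimension + lo and scale as float32
similarity = cosine(vec, restored)

print(f32_bytes, u8_bytes, similarity)
```

Storage drops to roughly a quarter, and the restored vector stays very close to the original, which fits the "same quality at 8 bits" observation. The decode in `dequantize_u8` is the kind of per-query overhead that can leave 32f ahead on raw speed when there is no GPU or SIMD path to hide it.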
You're right that 32f is faster on raw query time; quantization adds an extra step. The main benefit is download size, since gzip won't help much, which matters most in browser contexts.
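One way to read the gzip remark is that raw float32 data barely compresses, so quantizing is the only real lever on download size. A rough sketch to check that, assuming synthetic Gaussian values stand in for real embeddings (real data may behave differently) and using one global min/max range for simplicity:

```python
import random
import struct
import zlib

random.seed(1)
dim, count = 128, 200
values = [random.gauss(0, 1) for _ in range(dim * count)]  # synthetic, not real embeddings

# Raw float32 payload, as it might be shipped to a browser.
raw = struct.pack(f"{len(values)}f", *values)

# 8-bit codes using a single global range (an assumption made for brevity).
lo, hi = min(values), max(values)
scale = (hi - lo) / 255
codes = bytes(min(255, round((v - lo) / scale)) for v in values)

raw_gz = len(zlib.compress(raw, 9))
codes_gz = len(zlib.compress(codes, 9))

print("float32:", len(raw), "bytes,", raw_gz, "after deflate")
print("uint8:  ", len(codes), "bytes,", codes_gz, "after deflate")
```

The mantissa bytes of a float look like noise to deflate, so most of the size reduction has to come from storing fewer bits per value in the first place.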
I'm 100% sure that all providers are playing with the quantization, KV cache, and other parameters of the models to be able to serve the demand. One of the biggest advantages of running a local model is that you get predictable behavior.
I have tried to solve the problem of agents running wild, and I found two solutions. The first is to mount the workspace folder using WASM to scope any potential damage. The second is to run rquickjs with all APIs and module imports disabled, requiring the agent to call a host function that checks permissions before accessing any files.
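The second approach boils down to one gatekeeper that every file access must pass through. Here is a minimal sketch of that check in Python rather than Rust/rquickjs; `WorkspaceGuard` and its methods are hypothetical names, and a real host function would also need to handle writes, races between check and use, and whatever permission model you actually want.

```python
import tempfile
from pathlib import Path


class WorkspaceGuard:
    """Hypothetical host-side gatekeeper: the sandboxed agent has no file
    APIs of its own and can only reach the disk through these methods."""

    def __init__(self, root):
        self.root = Path(root).resolve()

    def check(self, requested):
        # Resolve symlinks and ".." before comparing, so traversal tricks
        # are judged by where the path really ends up.
        candidate = (self.root / requested).resolve()
        if not candidate.is_relative_to(self.root):  # Python 3.9+
            raise PermissionError(f"outside workspace: {requested}")
        return candidate

    def read_text(self, requested):
        return self.check(requested).read_text()


with tempfile.TemporaryDirectory() as workspace:
    (Path(workspace) / "notes.txt").write_text("hello")
    guard = WorkspaceGuard(workspace)

    allowed = guard.read_text("notes.txt")  # inside the workspace

    denied = False
    try:
        guard.read_text("../outside.txt")   # escapes the workspace root
    except PermissionError:
        denied = True

print(allowed, denied)
```

The point of disabling every built-in API in the sandbox is that a check like this becomes the only path to the filesystem, so there is a single place to audit.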
I would love it if you took the time to instruct Claude to re-implement inference in C/C++ and put an MIT license on it. It would be huge, but only if it actually works.