We use several caching mechanisms on the LLM requests themselves, and we now extend the same approach to guardrails.

So there are two levels of cache: the LLM request itself can be cached (both simple exact-match and semantic), and the guardrail response can be cached as well.

We use a mix of a distributed KV store and a vector DB to actually store the data.
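To make the two levels concrete, here is a minimal sketch of that layering: an exact-match lookup keyed on a hash of the normalized prompt, falling back to a nearest-neighbor search over embeddings. The in-memory dict and list stand in for the distributed KV store and vector DB, and `toy_embed` plus the similarity threshold are illustrative stand-ins, not the commenter's actual setup.

```python
import hashlib
import math

class TwoLevelCache:
    """Sketch of a two-level response cache: simple (exact-match)
    plus semantic (embedding similarity). Illustrative only."""

    def __init__(self, embed, threshold=0.8):
        self.embed = embed        # callable: str -> list[float]
        self.exact = {}           # stands in for a distributed KV store
        self.semantic = []        # stands in for a vector DB: (vector, response)
        self.threshold = threshold

    @staticmethod
    def _cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(x * x for x in b))
        return dot / (na * nb) if na and nb else 0.0

    @staticmethod
    def _key(prompt):
        return hashlib.sha256(prompt.strip().lower().encode()).hexdigest()

    def get(self, prompt):
        # Level 1: simple cache, hash of the normalized prompt.
        if self._key(prompt) in self.exact:
            return self.exact[self._key(prompt)]
        # Level 2: semantic cache, nearest stored embedding above a threshold.
        vec = self.embed(prompt)
        best = max(self.semantic,
                   key=lambda entry: self._cosine(vec, entry[0]),
                   default=None)
        if best and self._cosine(vec, best[0]) >= self.threshold:
            return best[1]
        return None

    def put(self, prompt, response):
        self.exact[self._key(prompt)] = response
        self.semantic.append((self.embed(prompt), response))

def toy_embed(text, dim=64):
    """Hypothetical stand-in for a real embedding model:
    a hashed bag-of-words vector, punctuation stripped."""
    vec = [0.0] * dim
    for word in text.lower().split():
        vec[hash(word.strip("?!.,")) % dim] += 1.0
    return vec
```

Usage: an exact repeat hits level 1, a near-paraphrase hits level 2, and an unrelated prompt misses both:

```python
cache = TwoLevelCache(toy_embed)
cache.put("What is the capital of France?", "Paris")
cache.get("What is the capital of France?")   # exact hit
cache.get("What is the capital of France")    # semantic hit
cache.get("How do I bake bread")              # miss -> None
```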
