
> Typical savings: 60-90% on most requests, since Gemini Flash is often free/cheapest, but you still get Claude or GPT-4 when needed.

This claim seems overstated. Accurately routing arbitrary prompts to the cheapest viable model is a hard problem. If it were reliably solvable, it would fundamentally disrupt the pricing models of OpenAI and Anthropic. In practice, you'd either sacrifice quality on edge cases or end up re-running failed requests on pricier models anyway, eating into those "savings".
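To make the arithmetic concrete, here's a minimal sketch of a cheap-first router with fallback. All of it is hypothetical: the model names, the per-token prices, and especially is_acceptable, which is the hard problem the headline claim glosses over.

    # A minimal two-tier routing sketch; names, prices, and the
    # acceptability check are illustrative assumptions, not real figures.
    CHEAP, CHEAP_PRICE = "flash-tier", 0.10       # $ per 1M tokens, made up
    FRONTIER, FRONTIER_PRICE = "frontier", 15.00  # made up

    def call_model(name: str, prompt: str) -> str:
        raise NotImplementedError  # stand-in for a real API call

    def is_acceptable(answer: str) -> bool:
        # The hard part: reliably detecting a bad cheap-model answer.
        return bool(answer.strip())

    def route(prompt: str) -> tuple[str, float]:
        answer = call_model(CHEAP, prompt)
        cost = CHEAP_PRICE  # pretend one request = 1M tokens for simplicity
        if not is_acceptable(answer):
            answer = call_model(FRONTIER, prompt)
            cost += FRONTIER_PRICE  # the failed cheap attempt is still paid for
        return answer, cost

    # With a 30% escalation rate the blended cost is
    # 0.7*0.10 + 0.3*(0.10 + 15.00) = 4.60 vs 15.00 frontier-only:
    # about 70% savings, and only if is_acceptable is nearly perfect.
    # Every miss either wastes a frontier call or silently ships a bad answer.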



I genuinely wonder what the use cases are where the required accuracy is so low (or, I guess, the prompts are so strong) that you don't need to rigorously use evals to prevent regressions with the model that works best--let alone actually just change models on the fly based on what's cheaper.
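For what it's worth, even a tiny pinned eval set makes "swap models on the fly" a testable claim rather than a hope. A toy sketch, where the golden set, the substring checker, and the threshold are all made-up stand-ins for a real eval suite:

    # A toy regression gate for model swaps; everything here is a
    # hypothetical placeholder, not a real eval framework.
    def call_model(name: str, prompt: str) -> str:
        raise NotImplementedError  # stand-in for a real API call

    GOLDEN = [
        ("What is 2 + 2?", "4"),
        ("What is the capital of France?", "paris"),
    ]

    def safe_to_swap(candidate: str, threshold: float = 0.95) -> bool:
        # Block any model (or router) change that drops accuracy on the
        # pinned set below the threshold.
        hits = sum(
            expected in call_model(candidate, prompt).lower()
            for prompt, expected in GOLDEN
        )
        return hits / len(GOLDEN) >= threshold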


Yes, and in addition, for some reason that use case is also not a fit for a cheap open-source model like qwen or kimi, but must be run on the cheapest model from the big three.



