>For example, I was experiencing a regression in gemini-3's coding capabilities, which I could swear was deliberate, after an initial launch boost
Can you describe what you mean by this more? Like you think there was some kind of canned override put in to add a regression to its response to whatever your input was? genuine question
It is a black box. We don't know what happens on the other side of the RPC call, good or bad, so it could be any number of knobs.
The user has two knobs: the thinking level and the model. So we know there are definitely per-call knobs. Who can tell whether thinking-high actually forks server-side into, e.g., thinking-high-sports-mode versus thinking-high-eco-mode? Or whether there are two slightly different instantiations of the pro model, one with cheaper inference due to some hyperparameter and one with full-on expensive inference? There are infinite ways to implement this, and zero ways for the end user to prove any of them.
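To make the point concrete, here is a minimal sketch of what such invisible server-side routing could look like. Every name here (the variant labels, the `route_request` function, the load threshold) is invented for illustration; nothing suggests any provider actually does this, only that the client-visible API cannot rule it out.

```python
# Hypothetical sketch: a provider could fork "thinking-high" server-side
# into cheaper or pricier variants without the client ever seeing it.
# All identifiers below are invented for illustration.

def route_request(model: str, thinking_level: str, load_factor: float) -> dict:
    """Pick an internal inference config; the caller only ever sees `model`
    and `thinking_level` -- the chosen variant is never exposed."""
    config = {"model": model, "thinking": thinking_level}
    if thinking_level == "high" and load_factor > 0.8:
        # Under heavy load, silently swap in a cheaper variant.
        config["variant"] = "thinking-high-eco-mode"
        config["max_reasoning_tokens"] = 4096
    else:
        # Otherwise, serve the full-cost variant.
        config["variant"] = "thinking-high-sports-mode"
        config["max_reasoning_tokens"] = 32768
    return config

# The response shape is identical either way; only quality may differ.
print(route_request("gemini-3-pro", "high", 0.95)["variant"])
```

From the caller's side both branches return the same schema, which is exactly why an end user perceives it only as a quality regression, not as a different model.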