>For example, I was experiencing a regression in gemini-3's coding capabilities, which I could swear was deliberate, after an initial launch boost
Can you describe what you mean by this more? Like you think there was some kind of canned override put in to add a regression to its response to whatever your input was? genuine question
It is a black box. We don't know what happens on the other side of the RPC call, good or bad, so it could be any number of knobs.
The user has two knobs: the thinking level and the model. So we know there are definitely per-call knobs. Who can tell whether thinking-high actually forks server-side into, e.g., thinking-high-sports-mode versus thinking-high-eco-mode? Or whether there are two slightly different instantiations of the pro model, one with cheaper inference due to some hyperparameter and one with full-on expensive inference? There are infinite ways to implement this, and zero ways for the end user to prove any of them.
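To make the point concrete, here is a minimal sketch of what such invisible server-side routing could look like. Every name here (the variant labels, the `route_request` function, the load threshold) is invented for illustration; nothing suggests any provider actually does this, only that the client-visible API cannot rule it out.

```python
# Hypothetical sketch: a provider could fork "thinking-high" server-side
# into cheaper or pricier variants without the client ever seeing it.
# All identifiers below are invented for illustration.

def route_request(model: str, thinking_level: str, load_factor: float) -> dict:
    """Pick an internal inference config; the caller only ever sees `model`
    and `thinking_level` -- the chosen variant is never exposed."""
    config = {"model": model, "thinking": thinking_level}
    if thinking_level == "high" and load_factor > 0.8:
        # Under heavy load, silently swap in a cheaper variant.
        config["variant"] = "thinking-high-eco-mode"
        config["max_reasoning_tokens"] = 4096
    else:
        # Otherwise, serve the full-cost variant.
        config["variant"] = "thinking-high-sports-mode"
        config["max_reasoning_tokens"] = 32768
    return config

# The response shape is identical either way; only quality may differ.
print(route_request("gemini-3-pro", "high", 0.95)["variant"])
```

From the caller's side both branches return the same schema, which is exactly why an end user perceives it only as a quality regression, not as a different model.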