I'm responding to the parent comment, which suggests we version control the "model" in Docker. There are infra reasons why companies don't do that. Numerical instability is one class of inference issue, but there can also be other bugs in the stack, all of which are separate from the provider intentionally changing the weights or switching to a quantized model.
As for the original forum post:
- Multiple numerical computation bugs can compound to make things worse (we saw this in the latest Anthropic post-mortem)
- OP didn't provide any details on eval methodology, so I don't think it's worth speculating on this anecdotal report until we see more data