They do this because people like to have predictability. A new model may behave quite differently on something that’s important for a use case.
Also, there are a lot of cases where very small models are just fine and others where they are not. It would always make sense to have the smallest highest performing models available.
Also, there are a lot of cases where very small models are just fine and others where they are not. It would always make sense to have the smallest highest performing models available.